The Wayback Machine - http://web.archive.org/web/20220108064611/https://github.com/topics/data-engineering
#

data-engineering

Here are 1,060 public repositories matching this topic...

superset
kvnkho
kvnkho commented Dec 15, 2021

Current behavior

You get an error if you try to upload a file whose name already exists:

azure.core.exceptions.ResourceExistsError: The specified blob already exists.
RequestId:5bef0cf1-b01e-002e-6

Proposed behavior

The task should take in an overwrite argument and pass it to [this line](https://github.com/PrefectHQ/prefect/blob/6cd24b023411980842fa77e6c0ca2ced47eeb83e/src/prefect/
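
The proposed behavior can be illustrated with a minimal sketch. It deliberately avoids the Azure SDK and Prefect: `InMemoryContainer`, `upload_task`, and the local `ResourceExistsError` class are hypothetical stand-ins written only to show an `overwrite` argument being forwarded from a task to the upload call, and they are not the project's actual code.

```python
class ResourceExistsError(Exception):
    """Local stand-in for azure.core.exceptions.ResourceExistsError (assumption)."""


class InMemoryContainer:
    """Hypothetical in-memory replacement for a blob container."""

    def __init__(self):
        self._blobs = {}

    def upload_blob(self, name, data, overwrite=False):
        # Refuse to replace an existing blob unless the caller opts in.
        if name in self._blobs and not overwrite:
            raise ResourceExistsError("The specified blob already exists.")
        self._blobs[name] = data

    def download_blob(self, name):
        return self._blobs[name]


def upload_task(container, name, data, overwrite=False):
    """Hypothetical task body: forwards `overwrite` to the upload call."""
    container.upload_blob(name, data, overwrite=overwrite)
    return name
```

Defaulting `overwrite` to `False` keeps the current behavior for existing callers, so only flows that explicitly opt in would replace a blob.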

Aylr
Aylr commented Dec 28, 2020

Describe the bug
Data docs columns shrink to a width of 1 character when the query is long.

To Reproduce
Steps to reproduce the behavior:

  1. make a batch from a long query string
  2. run validation
  3. render result to data docs
  4. See screenshot
    [Screenshot: Data_documentation_compiled_by_Great_Expectations]
benthos
nossrannug
nossrannug commented Dec 9, 2021

Is your feature request related to a problem? Please describe.
I have a framework that handles the offline store. It creates the tables, indexes, reads data from different data sources, does some transformations, and then inserts into the offline store. As a part of this, I can construct the entities, feature views, feature services, etc., an instance of the ParsedRepo class for Feast. What I n

lakeFS

A comprehensive list of 180+ YouTube Channels for Data Science, Data Engineering, Machine Learning, Deep learning, Computer Science, programming, software engineering, etc.

  • Updated Dec 31, 2021
edublancas
edublancas commented Dec 21, 2021

click has a CliRunner to test CLI applications, but it's limiting (e.g., monkeypatch doesn't work well). So we started to modify the test_cli.py tests to call the functions directly (e.g., install.main(use_lock=True)). But given this change, we are no longer testing that CLI args actually become the right function arguments (e.g., if we pass --use-lock, this should imply we pass `ins
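
One way to check that a CLI flag reaches the underlying function is to inject a spy in place of that function and assert on the call. The sketch below is an assumption-laden illustration, not the project's test suite: it uses the standard library's `argparse` as a stand-in for click so that it runs without dependencies, and `install_main` and `cli` are hypothetical names.

```python
import argparse
from unittest import mock


def install_main(use_lock=False):
    """Hypothetical stand-in for the function the CLI delegates to."""
    return "locked" if use_lock else "unlocked"


def cli(argv, main=install_main):
    """Hypothetical entry point: parse flags, then call `main` with them."""
    parser = argparse.ArgumentParser(prog="install")
    parser.add_argument("--use-lock", action="store_true")
    args = parser.parse_args(argv)
    return main(use_lock=args.use_lock)


def check_flag_forwarding():
    # Replace the delegate with a spy and verify the parsed flag arrives.
    spy = mock.Mock()
    cli(["--use-lock"], main=spy)
    spy.assert_called_once_with(use_lock=True)

    spy.reset_mock()
    cli([], main=spy)
    spy.assert_called_once_with(use_lock=False)
```

Passing the delegate as a parameter avoids patching module attributes, which keeps the argument-mapping test separate from tests of the function's own behavior.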

thatlittleboy
thatlittleboy commented Jan 2, 2022

Background

This thread is born out of the discussion from #968, in an effort to make documentation more beginner-friendly and more understandable.
One of the subtasks mentioned in that thread was to go through the function docstrings and add a minimal working example to each of the public functions in pyjanitor.

Criteria reiterated here for the benefit of discussion:

It sh
mithmatt
mithmatt commented Jan 5, 2022

In a lot of classes we use LoggerFactory to initialize the logger:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class DefaultAuthorizer implements Authorizer {
  private static final Logger LOG = LoggerFactory.getLogger(DefaultAuthorizer.class);

This could be simplified to the following, with no need to initialize logger using LoggerFactory

import lombok.exte
davidradl
davidradl commented Nov 17, 2021

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

A large amount of output goes to the log; this should not happen by default.

Expected Behavior

Much less content in the output of the FVT and the build by default.

Switch on debug in the logging configuration and then see all the output.

Steps To Reproduce

run the build

Env
