data-engineering

BigQuery error is hard to read.

Expected results

In Explore, when creating a bad expression (say DATE_TRUNC(column_that_dont_exist, DAY)) in BigQuery, the DatabaseError is shown as an UnknownError. In SQL Lab, DatabaseErrors are surfaced properly and rendered in a monospace font so that the formatting is preserved. For most databases, the formatting doesn't matter much, but for BigQ

Description

I have set up a custom, remote Prefect server.

However, when registering a flow, only localhost is displayed in the Flow URL:

$ prefect register flow --file ./myflow.py -p sandbox
Result check: OK
Flow URL: http://localhost:8080/default/flow/9235a237-f6bc-41c7-89bc-132db233b49e
 └── ID: a09a47b0-1292-412f-bd70-89c8bf4dcf1e

Describe the bug
When trying to run the scaffolding (profiling) command, it fails because of commas in columns.

To Reproduce
Steps to reproduce the behavior:

Run great_expectations suite scaffold scaffold-name on a datasource whose columns contain commas
Bug: pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 5323 saw 2
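
The field-count mismatch in the error above can be illustrated with a minimal sketch. It uses only Python's standard csv module rather than pandas or great_expectations, and the sample line is invented for illustration. It shows that an unquoted comma inside a value adds an extra field, while a quoted value stays in one field.

```python
import csv
import io


def count_fields(line: str) -> int:
    """Return how many fields the csv module sees in one line of text."""
    return len(next(csv.reader(io.StringIO(line))))


def to_csv_line(values: list) -> str:
    """Serialize one row; the writer quotes any value that contains a comma."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(values)
    return buffer.getvalue().rstrip("\r\n")


# Hypothetical row: the second value contains a comma.
unquoted_line = "1,alpha, beta"
quoted_line = to_csv_line(["1", "alpha, beta"])

unquoted_count = count_fields(unquoted_line)  # comma splits the value in two
quoted_count = count_fields(quoted_line)      # quotes keep the value whole
```

Whether quoting at the source is an option for the datasource in this report is not stated, so treat this only as a way to reason about the parser error.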

Expected behavior
D

Enable delete repository action from the UI

Problem description

When I use the function that concatenates multiple columns, I find that it does not handle null values as expected.

This is the current output

df.concatenate_columns(["cat_1","cat_2","cat_3"],"cat",sep=",")

	cat_1	cat_2
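
The report does not say what handling of nulls is expected. As one possible reading, the sketch below skips null values when joining columns. It is written in plain pandas; the helper name concat_skip_nulls and the sample values are hypothetical and are not part of pyjanitor.

```python
import pandas as pd


def concat_skip_nulls(df: pd.DataFrame, columns: list, sep: str = ",") -> pd.Series:
    """Join the given columns row by row, leaving out null values."""

    def join_row(row: pd.Series) -> str:
        # Keep only non-null cells, then join them with the separator.
        return sep.join(str(value) for value in row if pd.notna(value))

    return df[columns].apply(join_row, axis=1)


# Hypothetical sample data; the values are invented for illustration.
sample = pd.DataFrame(
    {
        "cat_1": ["a", None, None],
        "cat_2": ["b", "c", None],
        "cat_3": [None, "d", None],
    }
)
sample["cat"] = concat_skip_nulls(sample, ["cat_1", "cat_2", "cat_3"])
```

A row in which every column is null yields an empty string under this assumption; a different convention (for example, returning a null) may be what the reporter wants.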

If they are not class methods, then the method would be invoked for every test and a session would be created for each of those tests.

class PySparkTest(unittest.TestCase):
    @classmethod
    def suppress_py4j_logging(cls):
        logger = logging.getLogger('py4j')
        logger.setLevel(logging.WARN)

    @classmethod
    def create_testing_pyspark_session(cls):
        return Sp
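
The point about class methods can be shown with a small, self-contained sketch using only the standard unittest module. FakeSession is a hypothetical stand-in for an expensive resource such as a Spark session, so the example runs without PySpark. It counts how many sessions are created when setup happens once per class in setUpClass.

```python
import io
import unittest


class FakeSession:
    """Hypothetical stand-in for an expensive resource such as a Spark session."""

    created = 0  # number of sessions constructed so far

    def __init__(self):
        FakeSession.created += 1


class SharedSessionTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Runs once for the whole class, so every test reuses one session.
        cls.session = FakeSession()

    def test_first(self):
        self.assertIsNotNone(self.session)

    def test_second(self):
        self.assertIsNotNone(self.session)


def run_suite() -> unittest.TestResult:
    """Run the test case quietly and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SharedSessionTest)
    return unittest.TextTestRunner(stream=io.StringIO(), verbosity=0).run(suite)
```

If the session were created in setUp instead, the constructor would run before each test method, which is the per-test cost the comment above describes.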

Egeria's open metadata labs use Python notebooks to drive sequences of REST API calls to Egeria's runtime platform, called the OMAG Server Platform. One function, printAssetUniverse, needs work. This function is designed to provide a data scientist with detailed information about an Asset (such as a file or a database). This includes name, description, its location, content,

Here are 693 public repositories matching this topic...

apache / superset

eugeneyan / applied-ml

PrefectHQ / prefect

datastacktv / data-engineer-roadmap

great-expectations / great_expectations

Jeffail / benthos

adilkhash / Data-Engineering-HowTo

kantord / just-dashboard

awslabs / aws-data-wrangler

quiltdata / quilt

GoogleCloudPlatform / data-science-on-gcp

treeverse / lakeFS

san089 / goodreads_etl_pipeline

ericmjl / pyjanitor

AlexIoannides / pyspark-example-project

rich-iannone / pointblank

oleg-agapov / data-engineering-book

san089 / Udacity-Data-Engineering-Projects

automaticmode / active_workflow

kevintpeng / Learn-Something-Every-Day

gunnarmorling / awesome-opensource-data-engineering

dataform-co / dataform

odpi / egeria

Cascading / cascading

alexklibisz / elastik-nearest-neighbors

sderosiaux / every-single-day-i-tldr

abhishek-ch / around-dataengineering

aiguofer / gspread-pandas

awslabs / aws-serverless-data-lake-framework

LGE-ARC-AdvancedAI / auptimizer
