etl

It would be useful to see how riko compares to other stream processors. Possible metrics to track are open sockets, bandwidth, CPU, and memory usage.
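As a rough illustration of two of those metrics (CPU and memory), the following sketch times a job with the standard library only. It does not use riko; `measure` and `toy_pipeline` are hypothetical names introduced here, and `tracemalloc` only sees memory allocated by Python itself, so treat the numbers as a starting point rather than a benchmark.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Run func once; return its result plus CPU seconds and peak traced bytes."""
    tracemalloc.start()
    cpu_start = time.process_time()
    try:
        result = func(*args, **kwargs)
        cpu_seconds = time.process_time() - cpu_start
        # get_traced_memory() returns (current, peak) in bytes
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, {"cpu_seconds": cpu_seconds, "peak_bytes": peak_bytes}


def toy_pipeline(n):
    # Hypothetical stand-in for a stream-processing job (not riko code)
    return sum(x * 2 for x in range(n) if x % 3 == 0)


result, stats = measure(toy_pipeline, 100_000)
print(result, stats)
```

Running the same wrapper around equivalent jobs in each processor would give comparable figures; open sockets and bandwidth would need OS-level tooling that is not shown here.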

https://github.com/compose/transporter/blob/9e154e76b7d2977d9ac7756660779b512cace87f/adaptor/rabbitmq/writer.go#L36

This makes the adapter not really suitable to aid in keeping remote resources in sync.

Why only publish these two ops, and not delete?

Need test cases to cover error handling in batch_work_executor.py

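Installing linkis jobtypes
I followed the official installation documentation for the automated install, and the last step of running sh install.sh fails with: {"error":"Missing required parameter 'execid'."}. The message the documentation describes ("if the installation succeeds, the last line printed is {"status":"success"}") never appears, but the installed linkis job plugin is visible in azkaban's /plugins/jobtypes directory. Investigation shows that the last step of the install script calls "curl http://azkaban_ip:executor_port/executor?action=reloadJobTypePlugins" to refresh the plugins. After restarting the azkaban executor, the log shows that the plugin has been loaded: `INFO [JobTypeManager][Azkaban] Loaded jobtype linkis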

Description

Add documentation about how to apply helper functions

Acceptance Criteria

Docs in the rules/policies pages on applying helpers, best practices, and patterns

I found in the Issues that it is possible to keep the BOM character out of the export, but this is missing from the documentation.

Please add a reference for uFEFF={false}.

Summary and Descriptive Statistics

The first operation to perform after importing data is to get some sense of what it looks like. For numerical columns, knowing the descriptive summary statistics can help a lot in understanding the distribution of your data. The transformer "describe" returns a DataFrame containing information such as number of non-null entries (count), mean, standard deviati
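To make the three statistics named above concrete, here is a minimal plain-Python sketch that computes them for a single numeric column. It is not the PySpark API; the `describe` function below is a hypothetical stand-in, and the choice of the sample standard deviation (`statistics.stdev`) is an assumption of this sketch.

```python
import statistics


def describe(values):
    """Summarize one numeric column, ignoring null (None) entries."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),                   # number of non-null entries
        "mean": statistics.mean(present),
        "stddev": statistics.stdev(present),     # sample standard deviation
    }


# Hypothetical column with one missing value
summary = describe([2.0, 4.0, None, 6.0, 8.0])
print(summary)
```

The null entry is excluded from the count, so the summary covers four values with a mean of 5.0.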

Hi, I want to use Koop to integrate with the Waze live feed. After reading through the readme, I was totally lost about how to even start using Koop.

I'd really like to see more detailed documentation or a getting started guide. Additionally, I'd be interested in helping to get a dedicated Waze repository up and running.

The actual tables are migrating with the same name. Thus for the following csv:
CUSTOMER,Customer,False

The files are turning up in MySQL (from Oracle) as 'CUSTOMER' but then it tries to add indices and FKs to Customer and falls down in a heap.

Want to back this issue? **[Post a bounty on it!](https://www.bountysource.com/issues/45471371-table-transform-only-ha

Seems like the published docs https://www.2ndquadrant.com/en/resources/pglogical/pglogical-docs/

are not up to date with the info in the readme https://github.com/2ndQuadrant/pglogical

This wasted a lot of time for me because of a couple pieces of missing info. Might be good to merge them, or take one down.

Thanks for a great project!

As outlined in #16, it's often useful to extend fine-grained control of sharding to the user. It can be solved by wrapping integers with an identity hash function, but that seems less than ideal. It might be useful to provide this functionality as part of bigslice.Reshuffle.
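The identity-hash workaround can be pictured with a small sketch. This is Python rather than Go and does not use the bigslice API; `partition` and `identity` are hypothetical helpers, and it assumes that a shard is chosen as the hash of the key modulo the number of shards.

```python
def identity(key):
    """Identity 'hash': the integer key is used as-is."""
    return key


def partition(keys, num_shards, hash_fn):
    """Assign each key to shard hash_fn(key) % num_shards."""
    shards = [[] for _ in range(num_shards)]
    for key in keys:
        shards[hash_fn(key) % num_shards].append(key)
    return shards


# With the identity function, the caller decides the shard directly:
# key k always lands in shard k % num_shards.
shards = partition([0, 1, 2, 3, 4, 5], 3, identity)
print(shards)  # [[0, 3], [1, 4], [2, 5]]
```

Under that assumption, wrapping integers this way gives the user full control over placement, which is the behavior the issue suggests exposing more directly.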

The GUI currently includes social links to Twitter, LinkedIn etc. We should add our new Gitter channel to these links.

Is it possible to have a testmetrics configuration that contains all the configuration?

We currently have limited documentation for testing the metric files.

The documentation has one example config file for a job and a metric.

Our prior recommendation, removed in 4bb8d76177c40c9a0405ca66da9a40dcbded4505, caused issues with Makefiles on Ubuntu 16.x (xenial), i.e., on our staging server, such that all recipes failed with the error "No such file or directory". Do some research on directives, identify the culprit, and add a new, portable set of default directives.

@BenBirt

We should add some stuff to contributors.md. Something like:

when opening a PR, feel free to immediately request a review, probably from @BenBirt or @lewish
one reviewer is fine, add two or more though if you want to get something in faster / want more eyes reviewing
after resolving a round of PR comments, hit the "re-request review" button
once the PR is approved & you have resolved any



Here are 1,047 public repositories matching this topic...

nerevu / riko

mara / mara-pipelines

thbar / kiba

compose / transporter

thenaturalist / awesome-business-intelligence

awslabs / aws-data-wrangler

blockchain-etl / ethereum-etl

WeBankFinTech / DataSphereStudio

panther-labs / panther

react-csv / react-csv

PhantomInsights / baby-names-analysis

singer-io / getting-started

ananas-analytics / ananas-desktop

koopjs / koop

AlexIoannides / pyspark-example-project

dotnetcore / SmartCode

seanharr11 / etlalchemy

2ndQuadrant / pglogical

grailbio / bigslice

datacleaner / DataCleaner

reugn / go-streams

appbaseio / abc

Cascading / cascading

Cinchoo / ChoETL

WeBankFinTech / WeDataSphere

deeplearning4j / DataVec

YotpoLtd / metorikku

datamade / data-making-guidelines

JoeyBling / webkettle

dataform-co / dataform
