A stream processor for mundane tasks written in Go
Updated Jul 28, 2020 - Go
A Python stream processing engine modeled after Yahoo! Pipes
A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow
Data processing & ETL framework for Ruby
Sync data between persistence engines, like ETL only not stodgy
Actively curated list of awesome BI tools. PRs welcome!
Pandas on AWS
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ
Detect threats with log data and improve cloud security posture
React components to build CSV files on the fly, based on an array or object literal of data
A Go daemon that syncs MongoDB to Elasticsearch in real time
This repository is a getting started guide to Singer.
Data ETL & Analysis on the dataset 'Baby Names from Social Security Card Applications - National Data'.
A hackable data integration & analysis tool to enable non-technical users to edit data processing jobs and visualise data on demand.
Example project implementing best practices for PySpark ETL jobs and applications.
SmartCode = IDataSource -> IBuildTask -> IOutput => Build Everything!!!
Extract, Transform, Load: Any SQL Database in 4 lines of Code.
Logical Replication extension for PostgreSQL 12, 11, 10, 9.6, 9.5, 9.4 (Postgres), providing much faster replication than Slony, Bucardo or Londiste, as well as cross-version upgrades.
Go stream processing library
The premier open source Data Quality solution
Power of appbase.io via CLI, with nifty imports from your favorite data sources
Cascading is a feature-rich API for defining and executing complex and fault-tolerant data processing workflows on various cluster computing platforms. Please see https://github.com/cwensel/cascading for access to all WIP branches.
WeDataSphere is a financial-grade, one-stop open-source suite for big data platforms. The source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!
ETL Library for Machine Learning - data pipelines, data munging and wrangling
A simplified, lightweight ETL Framework based on Apache Spark