ETL Library for Machine Learning - data pipelines, data munging and wrangling
Updated Jun 1, 2020 - Java
Terraform module designed to easily back up EFS filesystems to S3 using DataPipeline
Domain-specific language to help build and maintain AWS Data Pipelines
Spark package to "plug" holes in data using SQL based rules
TensorFlow 2 tutorials (use TensorFlow and Keras in a better way!)
Building a JSON data pipeline within Snowflake using Streams and Tasks
Kedro CLI plugin for generating a static Kedro-Viz site (HTML, CSS, JS) that can be deployed on many serverless tools.
High speed message passing between various queues and services
Reactive Streams distributed data pipeline for data processing. Now supports Kafka, JDBC, Kudu, Elasticsearch, HDFS, etc.
Materials for the course Introduction to Data Engineering: data pipelines
Go library that provides easy-to-use interfaces and tools for TensorFlow users, in particular allowing them to train existing TF models on .tar and .tgz datasets
Simple Airflow on Kubernetes (GKE)
A GitHub Action to lint, test, build docs for, package, and run your kedro pipelines. Supports any Python version you give it, as long as pyenv also supports it.
Easily process any type of data in your projects and control the flow of your data.
A simple framework for writing data pipelines in Python
A data pipeline that pulls public transport data daily from the opentransportdata.swiss portal. The pipeline has three tasks: pull the right data from opentransportdata.swiss, push the data to S3 for storage, and transform the data and load it into a database. Hopefully this repository helps people explain ETL / batch data pipelines.
Global Tree Cover Loss Analysis using GeoTrellis and Spark
A data pipeline to analyze real-time cryptocurrency prices
CLI tool to simplify AWS DataPipeline deployment and management
A project that demonstrates how to create a data pipeline by scraping data with the Twitter API and creating a Kinesis Firehose delivery stream to ingest the data into Amazon S3.
Contrail Data Pipeline for OpenStack Clouds
The data pipes website datapies.tech
A micro-batch processing pipeline