A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow (Python, updated Sep 9, 2020)
Community-curated list of software packages and data resources for single-cell data, including RNA-seq, ATAC-seq, etc.
Upserts, Deletes And Incremental Processing on Big Data.
Fast, sensitive and accurate integration of single-cell data with Harmony
An example mini data warehouse for python project stats, template for new projects
Hetionet: an integrative network of disease
scikit-fusion: Data fusion via collective latent factor models
NicheNet: predict active ligand-target links between interacting cells
WInte.r is a Java framework for end-to-end data integration. The WInte.r framework implements well-known methods for data pre-processing, schema matching, identity resolution, data fusion, and result evaluation.
The Common Core Ontology Repository holds the current released version of the Common Core Ontology suite.
Business Intelligence and Data Warehousing
A data-centric integration platform
Utilities for creating ETL pipelines with mara
Toolbox for including enzyme constraints on a genome-scale model.
A .NET class library that allows you to import data from different sources into a unified destination
Installer for Thymeflow, a personal knowledge management system.
Research data management in biomedical and machine learning applications
Generation and Applications of Knowledge Graphs in Systems and Networks Biology
Scripts and resources to create Hetionet v1.0, a heterogeneous network for drug repurposing
Some of the projects I made when starting to learn R for Data Science at university
Development of the Gellish Communicator reference application and tools for universal data exchange and data integration, supporting Formal English and other Gellish formalized natural languages.
A collection of resources related to Apache Hudi
R package for high-dimensional data analysis and integration with O2PLS
An Efficient RML-Compliant Engine for Knowledge Graph Construction
Repo for Data Warehouse Concepts, Design, and Data Integration by University of Colorado System (Coursera): notes, assignments, quizzes, and research papers
Once the compiler is ready (it may even be possible now), we could, if there is demand, create graphs to design REST extensions to serve data, and then use those graphs to generate the REST extensions. Investigate whether there is demand for this.
Some ideas:
An Input block with GET/POST/PUT/DELETE output node.
An output node with Content/StatusCode/StatusMessage.
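The input and output blocks sketched above could be modeled as small data classes, one mapping HTTP methods to handlers and one carrying Content/StatusCode/StatusMessage. This is a minimal sketch under stated assumptions; all class and field names are hypothetical, not part of any existing codebase:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass
class RestOutput:
    """Output node: response content plus HTTP status code and message."""
    content: str
    status_code: int = 200
    status_message: str = "OK"

@dataclass
class RestInput:
    """Input block: routes a GET/POST/PUT/DELETE request to a handler node."""
    path: str
    handlers: Dict[str, Callable[[dict], RestOutput]] = field(default_factory=dict)

    def handle(self, method: str, payload: dict) -> RestOutput:
        handler = self.handlers.get(method.upper())
        if handler is None:
            return RestOutput(content="", status_code=405,
                              status_message="Method Not Allowed")
        return handler(payload)

# Usage: wire a GET handler into an input block.
node = RestInput(path="/items",
                 handlers={"GET": lambda payload: RestOutput(content="[]")})
print(node.handle("GET", {}).status_code)     # 200
print(node.handle("DELETE", {}).status_code)  # 405
```

A generator could then walk such a graph and emit the actual REST extension code.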
Match schema attributes of relational databases by value similarity. As a study assignment this isn't well documented, but you can contact me with questions, and I may add docs if I sense enough interest.
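The core idea of value-based schema matching can be sketched in a few lines: treat each column as a set of values and pair columns whose value sets overlap most, e.g. by Jaccard similarity. This is an illustrative sketch, not the project's actual implementation; the table data and threshold are made-up assumptions:

```python
def jaccard(a, b):
    """Jaccard similarity of two value collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def match_columns(table_a, table_b, threshold=0.5):
    """Pair each column of table_a with its most value-similar column of table_b,
    keeping only pairs whose similarity meets the threshold."""
    matches = {}
    for col_a, vals_a in table_a.items():
        best = max(table_b, key=lambda col_b: jaccard(vals_a, table_b[col_b]))
        score = jaccard(vals_a, table_b[best])
        if score >= threshold:
            matches[col_a] = (best, score)
    return matches

# Hypothetical tables with differently named but value-similar columns.
customers = {"country": ["DE", "US", "FR"], "name": ["Ada", "Bob", "Eve"]}
orders = {"ship_country": ["DE", "US"], "cust_name": ["Ada", "Bob"]}
print(match_columns(customers, orders))
# matches "country" to "ship_country" and "name" to "cust_name"
```

Real matchers typically combine such value overlap with name similarity and type checks.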
Complete algorithm: http://web.cecs.pdx.edu/~mpj/pubs/polyrec.html