big-data
Here are 2,140 public repositories matching this topic...
The Patterns of Scalable, Reliable, and Performant Large-Scale Systems
Updated Aug 6, 2020
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Updated Jul 24, 2020 - Python
PredictionIO, a machine learning server for developers and ML engineers.
Updated May 7, 2020 - Scala
An open source cybersecurity protocol for syncing decentralized graph data.
Updated Aug 6, 2020 - JavaScript
ClickHouse is a free analytics DBMS for big data
Updated Aug 10, 2020 - C++
CMAK is a tool for managing Apache Kafka clusters
Updated Jul 12, 2020 - Scala
The most widely used Python to C compiler
Updated Aug 10, 2020 - Python
A fast, scalable, high-performance gradient boosting on decision trees library, used for ranking, classification, regression, and other machine learning tasks in Python, R, Java, and C++. Supports computation on CPU and GPU.
Updated Aug 10, 2020 - C++
Open Source Fast Scalable Machine Learning Platform For Smarter Applications: Deep Learning, Gradient Boosting & XGBoost, Random Forest, Generalized Linear Modeling (Logistic Regression, Elastic Net), K-Means, PCA, Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Updated Aug 10, 2020 - Jupyter Notebook
Apache CouchDB
Updated Aug 10, 2020 - Erlang
Reproducible Data Science at Scale!
Updated Aug 7, 2020 - Go
Stream Framework is a Python library that lets you build news feeds, activity streams, and notification systems using Cassandra and/or Redis. The authors of Stream Framework also provide a cloud service for feed technology.
Updated Jul 15, 2020 - Python
Moloch is an open source, large scale, full packet capturing, indexing, and database system.
Updated Aug 10, 2020 - C
Open Source In-Memory Data Grid
Updated Aug 10, 2020 - Java
BigDL: Distributed Deep Learning Framework for Apache Spark
Updated Aug 5, 2020 - Scala
Apache Ignite
Updated Aug 10, 2020 - Java
Vespa is an engine for low-latency computation over large data sets.
Updated Aug 10, 2020 - Java
An easy-to-use, self-service, open BI reporting and dashboard platform.
Updated Jun 16, 2020 - TSQL
Bare-bones examples of machine learning in TensorFlow
Updated Mar 14, 2017 - Python

