Apache Spark
Apache Spark is an open-source, distributed, general-purpose cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
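The programming model behind that implicit parallelism is a chain of transformations over a distributed collection. A minimal word-count sketch in plain Python that mirrors the RDD API (no Spark installation assumed — the real pyspark calls appear only in the comments; the local list operations are stand-ins for what Spark would run in parallel across a cluster):

```python
from collections import Counter

lines = ["spark makes clusters easy", "spark is fast"]

# flatMap: split each line into words
# (in Spark: rdd.flatMap(lambda line: line.split()))
words = [w for line in lines for w in line.split()]

# map: pair each word with a count of 1
# (in Spark: rdd.map(lambda w: (w, 1)))
pairs = [(w, 1) for w in words]

# reduceByKey: sum the counts per word
# (in Spark: rdd.reduceByKey(lambda a, b: a + b))
counts = Counter()
for word, n in pairs:
    counts[word] += n

print(dict(counts))
```

With Spark, the same three-step chain runs unchanged whether the input is a small local list or terabytes spread across a cluster.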
Here are 5,205 public repositories matching this topic...
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command-line tools.
Updated Oct 1, 2020 - Python
Learn and understand Docker technologies, with real DevOps practice!
Updated Nov 15, 2020 - Go
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Updated Nov 23, 2020 - JavaScript
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
Updated Nov 23, 2020 - Python
Is your feature request related to a problem? Please describe.
It is idiomatic for JWTs to be accepted using a header format of Authorization: Bearer <JWT> (see the jwt.io introduction). Historically, the RFCs surrounding the Authorization header have taken care to specify the authorization scheme as the first part of the header value (e.g. Basic, Di
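The scheme-first convention the RFCs describe can be sketched as a small parser (a hypothetical helper, not taken from any particular framework):

```python
def parse_bearer_token(header_value):
    """Extract a JWT from an 'Authorization: Bearer <JWT>' header value.

    The scheme ('Bearer') comes first, followed by the credentials;
    per RFC 7235, the scheme comparison is case-insensitive.
    Returns None if the header does not use the Bearer scheme.
    """
    parts = header_value.split(None, 1)  # split on the first run of whitespace
    if len(parts) != 2:
        return None
    scheme, token = parts
    if scheme.lower() != "bearer":
        return None
    return token.strip()

print(parse_bearer_token("Bearer abc.def.ghi"))  # the token
print(parse_bearer_token("Basic dXNlcjpwYXNz"))  # None: wrong scheme
```

Checking the scheme before touching the credentials is what lets a server accept multiple Authorization modes on the same endpoint.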
Flink learning blog. http://www.flink-learning.com Covers Flink basics, concepts, principles, hands-on practice, performance tuning, source-code analysis, and more. Includes learning examples for Flink Connectors, Metrics, Libraries, the DataStream API, and the Table API & SQL, plus case studies from large production Flink deployments (PV/UV counting, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). Please support my column, "Big Data Real-Time Computing Engine Flink: Practice and Performance Optimization."
Updated Nov 11, 2020 - Java
List of Data Science Cheatsheets to rule the world
Updated Oct 31, 2019
Open-source IoT Platform - Device management, data collection, processing and visualization.
Updated Nov 23, 2020 - Java
A Flexible and Powerful Parameter Server for large-scale machine learning
Updated Nov 23, 2020 - Java
macOS development environment setup: Easy-to-understand instructions with automated setup scripts for developer tools like Vim, Sublime Text, Bash, iTerm, Python data analysis, Spark, Hadoop MapReduce, AWS, Heroku, JavaScript web development, Android development, common data stores, and dev-based OS X defaults.
Updated Jun 20, 2020 - Python
Open Source Fast Scalable Machine Learning Platform For Smarter Applications: Deep Learning, Gradient Boosting & XGBoost, Random Forest, Generalized Linear Modeling (Logistic Regression, Elastic Net), K-Means, PCA, Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Updated Nov 23, 2020 - Jupyter Notebook
Alluxio, data orchestration for analytics and machine learning in the cloud
Updated Nov 23, 2020 - Java
PipelineAI Kubeflow Distribution
Updated Apr 24, 2020 - Jsonnet
BigDL: Distributed Deep Learning Framework for Apache Spark
Updated Nov 23, 2020 - Scala
TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
Updated Jul 27, 2020 - Python
Coolplay Spark: Spark source-code analysis, Spark libraries, and more.
Updated May 26, 2019 - Scala
Interactive and Reactive Data Science using Scala and Spark.
Updated Jun 2, 2020 - JavaScript
Hi, if my Spark app uses two storage types, both S3 and Azure Data Lake Storage Gen2, can I set spark.delta.logStore.class=org.apache.spark.sql.delta.storage.AzureLogStore, org.apache.spark.sql.delta.storage.S3SingleDriverLogStore?
Thanks in advance
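For reference, this is how a single LogStore implementation is normally set through the session builder. Whether spark.delta.logStore.class accepts a comma-separated list of classes depends on the Delta Lake release, so treat this as a configuration sketch rather than an answer to the multi-store question:

```python
# Configuration sketch: assumes pyspark and the delta-core package are on the
# classpath. spark.delta.logStore.class takes a fully qualified class name.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("delta-logstore-example")
    .config("spark.delta.logStore.class",
            "org.apache.spark.sql.delta.storage.S3SingleDriverLogStore")
    .getOrCreate()
)
```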
Used Spark version
2.4.3
Used Spark Job Server version (released version, git branch, or docker image version)
0.9.0-SNAPSHOT
Deployed mode (client/cluster on Spark Standalone/YARN/Mesos/EMR or default)
client, Spark Standalone
Actual (wrong) behavior
curl -d "input.string = a b c a b see hello world ssdsds " 'localhost:8090/jobs?appName=test&classPath=spark.jobserver.WordCo
The Hunting ELK
Updated Nov 21, 2020 - Jupyter Notebook
I have a simple regression task (using a LightGBMRegressor) where I want to penalize negative predictions more than positive ones. Is there a way to achieve this with the default LightGBM regression objectives (see https://lightgbm.readthedocs.io/en/latest/Parameters.html)? If not, is it possible to define and pass a custom regression objective (as many examples do for the default LightGBM model)?
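A custom objective is one way to get this behavior. A sketch of an asymmetric squared-error objective whose gradient is scaled up whenever the prediction is negative (the NEG_WEIGHT value and function name are illustrative; with the scikit-learn API it would be passed as LGBMRegressor(objective=asymmetric_l2), which is omitted here so the sketch has no LightGBM dependency):

```python
import numpy as np

NEG_WEIGHT = 5.0  # illustrative: how much harder to penalize negative predictions

def asymmetric_l2(y_true, y_pred):
    """Custom squared-error objective returning (gradient, hessian).

    Loss is w * (y_pred - y_true)^2 with w = NEG_WEIGHT where y_pred < 0,
    else 1. The weight's own derivative is ignored, as is conventional
    for piecewise-weighted objectives.
    """
    residual = y_pred - y_true
    weight = np.where(y_pred < 0, NEG_WEIGHT, 1.0)
    grad = 2.0 * weight * residual
    hess = 2.0 * weight
    return grad, hess

# Same absolute residual, but the negative prediction gets a 5x gradient.
grad, hess = asymmetric_l2(np.array([1.0, 1.0]), np.array([-1.0, 2.0]))
print(grad, hess)
```

Because boosting fits each tree to the gradient, scaling the gradient for negative predictions pushes the model away from producing them.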
Problem
Since Java 8 was introduced, there is no need to use Joda-Time, as it has been replaced by the native java.time Date-Time API.
Solution
Ideally, grepping and replacing the text should work (mostly).
Additional context
Need to check if de/serializing will still work.
Created by Matei Zaharia
Released May 26, 2014
- Repository: apache/spark
- Website: spark.apache.org
- Wikipedia


At the moment, the relu_layer op doesn't allow threshold configuration, while the legacy RELU op does.
We should add a threshold configuration option to relu_layer.
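The requested behavior is straightforward to state: a thresholded ReLU passes values above the threshold and zeros everything else. A NumPy sketch of the semantics (the function name is illustrative; threshold=0.0 reproduces the current relu_layer behavior):

```python
import numpy as np

def relu_with_threshold(x, threshold=0.0):
    """ReLU with a configurable threshold: x where x > threshold, else 0.

    threshold=0.0 is the standard ReLU; a nonzero value corresponds to the
    configurable threshold the legacy RELU op exposes.
    """
    x = np.asarray(x, dtype=float)
    return np.where(x > threshold, x, 0.0)

print(relu_with_threshold([-1.0, 0.5, 2.0]))                 # standard ReLU
print(relu_with_threshold([-1.0, 0.5, 2.0], threshold=1.0))  # thresholded
```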