A command-line tool for launching Apache Spark clusters.
Updated Aug 3, 2020 - Python
A guide on how to set up Jupyter with PySpark painlessly on AWS EC2 clusters, with S3 I/O support.
This package contains the code for calculating external clustering validity indices in Spark. The package includes Chi Index among others.
This project contains customizations, such as custom data sources and plugins, written for distributed systems like Apache Spark and Apache Ignite.
Implementations of the Markov Cluster Algorithm (MCL) and the Regularized Markov Cluster Algorithm (R-MCL) in Apache Spark.
Apache Spark cluster lab.
Analysis performed on data from the Steam platform using Apache Spark and Cloud services such as Amazon Web Services.