apache / spark
Apache Spark - A unified analytics engine for large-scale data processing
See what the GitHub community is most excited about today.
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
The pure asynchronous runtime for Scala
Chisel: A Modern Hardware Design Language
The Community Maintained High Velocity Web Framework For Java and Scala.
Mill is a fast JVM build tool that supports Java, Scala, Kotlin and many other languages. 2-4x faster than Gradle and 4-10x faster than Maven for common workflows, Mill aims to make your project’s build process performant, maintainable, and flexible
Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.
Source code for Twitter's Recommendation Algorithm
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Definition of the Viper intermediate verification language.
workbench identity and access management
Scientific workflow engine designed for simplicity & scalability. Trivially transition from one-off use cases to massive-scale production environments
An open protocol for secure data sharing
An Agile RISC-V SoC Design Framework with in-order cores, out-of-order cores, accelerators, and more
Modern Load Testing as Code
Rocket Chip Generator
Berkeley's Spatial Array Generator
A Git platform powered by Scala with easy installation, high extensibility & GitHub API compatibility
The Scala 3 compiler, also known as Dotty.
Removes large or troublesome blobs like git-filter-branch does, but faster, and written in Scala