delta-io / delta
This connector allows Apache Spark™ to read from and write to Delta Lake.
See what the GitHub community is most excited about today.
A Table Structure Storage on Data Lakes to Unify Batch and Streaming Data Processing
A Scala API for Apache Beam and Google Cloud Dataflow.
A fault tolerant, protocol-agnostic RPC system
Feathr – An Enterprise-Grade, High Performance Feature Store
Simple and Distributed Machine Learning
Apache Spark - A unified analytics engine for large-scale data processing
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Open-source high-performance RISC-V processor
Spark RAPIDS plugin - accelerate Apache Spark with GPUs
Parking ticket machine finder
ZIO — A type-safe, composable library for async and concurrent programming in Scala
TheHive: a Scalable, Open Source and Free Security Incident Response Platform
Scala 2 compiler and standard library. For bugs, see scala/bug
Apache Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark
An open protocol for secure data sharing
Removes large or troublesome blobs like git-filter-branch does, but faster. And written in Scala
Apache Spark Connector for Azure Cosmos DB
Notebook service
Mirror of Apache Livy (Incubating)
Open-source code analysis platform for C/C++/Java/Binary/JavaScript/Python/Kotlin based on code property graphs
sbt, the interactive build tool
Scientific workflow engine designed for simplicity & scalability. Trivially transition from one-off use cases to massive-scale production environments
Monitor Kafka Consumer Group Latency with Kafka Lag Exporter