A large-scale entity and relation database supporting aggregation of properties
Updated Aug 7, 2020 - Java
Quilt is a versioned data portal for S3
The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can be used from pure Python code.
High performance distributed data processing engine
80+ DevOps & Data CLI Tools - AWS, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, Ambari, Blueprints, CloudFormation, Elasticsearch, Solr, Pig, IPython - Python / Jython Tools
A tool for batch loading data files (JSON, Parquet, CSV, TSV) into Elasticsearch
Manipulate arrays of complex data structures as easily as NumPy.
Fully asynchronous, pure JavaScript implementation of the Parquet file format
A complete example of a big data application using: Kubernetes (kops/AWS), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, Node.js, Angular, GraphQL
A SQLite vtable extension to read Parquet files
Simple Windows desktop application for viewing & querying Apache Parquet files
Schema registry for CSV, TSV, JSON, Avro and Parquet schemas. Supports schema inference and a GraphQL API.
Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.
Amazon S3 Find and Forget is a solution to handle data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the European General Data Protection Regulation (GDPR)
Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs that allow data workers to efficiently execute streaming, machine learning or SQL workloads that require fast iterative access to datasets. This project will have sample programs for Spark in the Scala language.