An open-source big data platform designed and optimized for the Internet of Things (IoT).
Updated Oct 13, 2020 - C
A curated list of awesome big data frameworks, resources and other awesomeness.
An easy-to-use BI server built for SQL lovers. Power data analysis in SQL and gain faster business insights.
This is more a question than a feature request.
When parsing JSON files, I need to sanitize the field names so that "field with spaces" becomes "field_with_spaces".
I want to preserve the original name as well, metadata about the column if you like :)
There is a metadata field on StructField, but it is internal.
Why is this internal, is it possible or desirable to expose it?
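The sanitizing step described above can be sketched in plain Python, independent of any Spark binding. This is only an illustration under assumptions: `sanitize_field_name` and `sanitize_record` are hypothetical helper names, the rule "replace every run of non-alphanumeric characters with one underscore" is a guess at what "sanitize" means here, and the returned mapping merely stands in for per-column metadata.

```python
import json
import re
from typing import Any, Dict, Tuple


def sanitize_field_name(name: str) -> str:
    """Replace each run of characters outside [0-9A-Za-z_] with one underscore.

    Assumed rule for illustration; the original post does not define it.
    """
    return re.sub(r"[^0-9A-Za-z_]+", "_", name.strip())


def sanitize_record(record: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Return (clean_record, original_names) for one flat JSON object.

    original_names maps each sanitized name back to the name found in the
    file, which is the information one would want to keep as column metadata.
    """
    clean: Dict[str, Any] = {}
    original_names: Dict[str, str] = {}
    for key, value in record.items():
        new_key = sanitize_field_name(key)
        if new_key in clean:
            # Two different source names collapsed to the same sanitized name.
            raise ValueError(f"name collision on {new_key!r}")
        clean[new_key] = value
        original_names[new_key] = key
    return clean, original_names


if __name__ == "__main__":
    raw = json.loads('{"field with spaces": 1, "ok": 2}')
    clean, original_names = sanitize_record(raw)
    print(clean)
    print(original_names)
```

Keeping the mapping next to the data is a workaround, not a substitute for schema-level metadata: it has to be carried along by hand wherever the schema goes.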
Upserts, Deletes And Incremental Processing on Big Data.
Distributed Big Data Orchestration Service
GridDB is a next-generation open source database that makes time series IoT and big data fast and easy.
Is this a BUG REPORT or FEATURE REQUEST?:
/kind feature
/help wanted
Description:
Currently, we publish the Helm chart at https://github.com/volcano-sh/charts and keep the Helm charts up to date at https://github.com/volcano-sh/volcano/tree/master/installer, and the two are already mismatched. It would be better to make charts a submodule of volcano to keep them in sync.
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
The Programming Language Designed For Big Data and AI
C# and F# language binding and extensions to Apache Spark
Google, Naver multiprocess image web crawler (Selenium)
Lightweight real-time big data streaming engine over Akka
An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
A Kubernetes batch scheduler for high-performance workloads, e.g. AI/ML, BigData, HPC
We need a new endpoint that functions as a getIntegrationById endpoint.
We currently fetch all integrations via AppSync (or, more specifically, a sub-category of integrations based on integrationType) and iterate until we find one that matches the integrationId passed.
Although, we
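Notes from my studies, along with e-books and video resources I have gone through, and blogs, websites, and tools I have collected and consider worthwhile. Covers the major big data components, Python machine learning and data analysis, Linux, operating systems, algorithms, networking, and more.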
A book about running Elasticsearch
Fast topic modeling platform
Hello,
Considering your amazing efficiency on pandas, numpy, and more, it would seem to make sense for your module to work with even bigger data, such as audio (for example .mp3 and .wav). This would help a lot considering the nature of audio (i.e. one of the lowest and most common sampling rates is still 44,100 samples/sec). For a use case, I would consider vaex.open('Hu