A topic-centric list of HQ open datasets.
Updated
Jun 14, 2021
pix2code: Generating Code from a Graphical User Interface Screenshot
Hi! I've noticed the docs reference Postgres for setting up data storage (https://labelstud.io/guide/storedata.html). Is it possible to use another database, such as MySQL, for the same task? If so, roughly what would the process look like? I'm trying to set up label-studio, and my environment only allows MySQL databases.
Thanks!
The error occurs in Step 5/9 of the docker build process:
fetch http://dl-cdn.alpinelinux.org/alpine/v3.11/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.11/community/x86_64/APKINDEX.tar.gz
WARNING: Ignoring http://dl-cdn.alpinelinux.org/alpine/v3.11/main/x86_64/APKINDEX.tar.gz: BAD signature
WARNING: Ignoring http
AKShare is an elegant and simple financial data interface library for Python, built for human beings! An open-source financial data interface library.
We need description, citation, license, and version meta info to be added to the dataset.
Some datasets need this info inside them for legal reasons.
Easy to implement, won't hurt for sure.
Currently, we have all metadata store
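The four fields requested in this issue could be carried in a small record that travels with the dataset. The following is only a minimal sketch: `DatasetMeta`, its field names, and the JSON round trip are hypothetical and are not taken from the project's actual API.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DatasetMeta:
    """Hypothetical container for the metadata fields named in the issue."""

    description: str
    citation: str
    license: str
    version: str

    def to_json(self) -> str:
        # Serialize with sorted keys so the output is stable across runs.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "DatasetMeta":
        # Rebuild the record from its JSON form.
        return cls(**json.loads(text))


# Placeholder values for illustration only.
meta = DatasetMeta(
    description="Example dataset",
    citation="Placeholder citation",
    license="CC-BY-4.0",
    version="1.0.0",
)
restored = DatasetMeta.from_json(meta.to_json())
```

Storing the record as JSON next to the data is one way to keep the license and citation inside the dataset itself, which is what the legal requirement in the issue asks for.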
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
A curated list of awesome JSON datasets that don't require authentication.
Search all Chinese NLP datasets, with commonly used English NLP datasets included
I want to search for posts by title (e.g. in the Comments for a Post source), so I can quickly find the post I want to watch for new events. Reddit implements a global search API that we can restrict to returning links, which should allow us to implement an endpoint with search and async options.
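A title search restricted to links might be requested as sketched below. This only builds the request URL and makes no network call. The endpoint path, the `title:` query prefix, and the `type=link` and `limit` parameters are assumptions about Reddit's search API that should be checked against its documentation, and `build_title_search_url` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint for Reddit's global search; verify before relying on it.
SEARCH_ENDPOINT = "https://www.reddit.com/search.json"


def build_title_search_url(title: str, limit: int = 10) -> str:
    """Build a search URL that asks only for links whose title matches."""
    params = {
        "q": f'title:"{title}"',  # assumed title-scoped query syntax
        "type": "link",           # assumed filter: return links (posts) only
        "limit": limit,
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


url = build_title_search_url("open datasets", limit=5)

# Parse the query string back to confirm what would be sent.
sent = parse_qs(urlsplit(url).query)
```

An async variant of the endpoint could pass the same URL to whichever HTTP client the project already uses.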
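Language Understanding Evaluation benchmark for Chinese: datasets, baselines, pre-trained models, corpus, and leaderboard
Data interfaces: Baidu, Google, Toutiao, and Weibo indices; macroeconomic data; interest rate data; currency exchange rates; Qianlima and unicorn companies; Xinwen Lianbo transcripts; film and TV box office data; university lists; epidemic data…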
Datasets, tools, and benchmarks for representation learning of code.
Resources for deep learning with satellite & aerial imagery
Issue to track tutorial requests:
In-memory tabular data in Julia
Benchmark datasets, data loaders, and evaluators for graph machine learning
A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of languages, domains and entity types.
Large datasets for conversational AI
Currently, the API manually throws its own messages and errors. We should move them to werkzeug exceptions.
Community list of transit APIs, apps, datasets, research, and software
A large collection of system log datasets for AI-powered log analytics
Machine learning datasets used in tutorials on MachineLearningMastery.com
A repository of topic-centric, high-quality public data sources for Recommender Systems (RS)
Hi
The Wikiann dataset needs a "spans" column, which is necessary to use the dataset, but this column is missing from huggingface datasets. Could you please have a look? Thank you, @lhoestq