The Wayback Machine - http://web.archive.org/web/20200918003122/https://github.com/topics/data-mining
Here are 2,895 public repositories matching this topic...
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.
Updated Sep 16, 2020 · Python
📝 An awesome Data Science repository to learn and apply for real world problems.
The "Python Machine Learning (1st edition)" book code repository and info resource
Updated Aug 10, 2020 · Jupyter Notebook
Ready-to-use OCR with 40+ languages supported including Chinese, Japanese, Korean and Thai
Updated Sep 15, 2020 · Python
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
Machine Learning for Cyber Security
Anomaly detection related books, papers, videos, and toolboxes
Updated Sep 14, 2020 · Python
A library of extension and helper modules for Python's data analysis and machine learning libraries.
Updated Sep 16, 2020 · Python
extract text from any document. no muss. no fuss.
Updated Sep 17, 2020 · Python
Curated list of Python resources for data science.
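AI learning roadmap: nearly 200 hands-on cases and projects, with free accompanying course materials, taking beginners from zero to job-ready practice. Covers popular areas including Python, mathematics, machine learning, data analysis, deep learning, computer vision, natural language processing, PyTorch, tensorflow, machine-learning, deep-learning, data-analysis, data-mining, mathematics, data-science, artificial-intelligence, python, tensorflow, tensorflow2, caffe, keras, pytorch, algorithm, numpy, pandas, matplotlib, seaborn, nlp, cv, and more.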
A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.
Updated Oct 26, 2018 · Python
A curated list of awesome machine learning interpretability resources.
List of tools & datasets for anomaly detection on time-series data.
📝 A collection of machine learning resources
HTML5 based online tool to extract numerical data from plot images.
Updated Sep 10, 2020 · JavaScript
Dex : The Data Explorer -- A data visualization tool written in Java/Groovy/JavaFX capable of powerful ETL and publishing web visualizations.
Updated Feb 12, 2019 · JavaScript
eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
An offline recommender system backend based on collaborative filtering written in Go
AIL framework - Analysis Information Leak framework
Updated Sep 14, 2020 · Python
Highly cited and useful papers related to machine learning, deep learning, AI, game theory, reinforcement learning
Multi-class confusion matrix library in Python
Updated Sep 14, 2020 · Python
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Summary
The R package currently contains many internal function calls that pass arguments only by position. Change them to use keyword arguments for extra safety.
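To illustrate the risk the summary describes, here is a minimal sketch in Python rather than R (the issue itself concerns R code, but the idea carries over). The function `train_model` and its parameters are hypothetical and are not taken from the package in question.

```python
def train_model(data, label, num_iterations=100, learning_rate=0.1):
    """Hypothetical internal helper; it only echoes back what it received."""
    return {
        "data": data,
        "label": label,
        "num_iterations": num_iterations,
        "learning_rate": learning_rate,
    }


# Positional call: the last two values are accidentally swapped, and
# nothing warns us that 0.05 was taken as the iteration count.
positional = train_model([1, 2, 3], [0, 1, 0], 0.05, 200)

# Keyword call: each value is bound to the name the caller intended,
# regardless of the order in which the arguments are written.
keyword = train_model(
    data=[1, 2, 3],
    label=[0, 1, 0],
    learning_rate=0.05,
    num_iterations=200,
)
```

With keyword arguments, reordering or inserting parameters in the function signature later does not silently change what existing calls mean.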
I've added this issue to provide a small, focused contribution opportunity for Hacktoberfest 2020 participants. If you are an experienced open source contributor, please leave this