Statistics
Statistics is a mathematical discipline concerned with developing and studying mathematical methods for collecting, analyzing, interpreting, and presenting large quantities of numerical data. Statistics is a highly interdisciplinary field of study with applications in fields such as physics, chemistry, life sciences, political science, and economics.
Here are 8,724 public repositories matching this topic...
aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
Updated Nov 7, 2021 - Jupyter Notebook
Updated Nov 3, 2021 - TypeScript
Golang library for reading and writing Microsoft Excel™ (XLSX) files.
Updated Nov 8, 2021 - Go
Simple, open-source, lightweight (< 1 KB) and privacy-friendly web analytics alternative to Google Analytics.
Updated Nov 9, 2021 - Elixir
Umami is a simple, fast, website analytics alternative to Google Analytics.
Updated Nov 8, 2021 - JavaScript
Create HTML profiling reports from pandas DataFrame objects
Updated Nov 9, 2021 - Jupyter Notebook
Collection of follow-ups to #5827. These can/should be broken out into individual PRs. Many are relatively straightforward and would make a good first PR.
General
- Documentation (none was added in original PR).
- Release notes.
- Example notebook.
- Double-check how sm.tsa.arima.ARIMA works with fix_params (it should fail except when the fit method is statespace
Updated Oct 12, 2021 - HTML
Updated Oct 11, 2021 - Python
Describe the issue linked to the documentation
In the section https://imbalanced-learn.org/stable/under_sampling.html#prototype-selection the selected subset S' should be a (strict) subset of S, not an element of S.
Suggest a potential alternative/fix
Change \in to \subset in doc/under_sampling.rst.
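As a rough illustration of the proposed fix, the notation change would look something like the following. The second form, which spells out strictness, is an assumption about the intended meaning and is not taken from the issue itself.

```latex
% current wording: reads "S' is an element of S"
S' \in S

% proposed wording: reads "S' is a subset of S"
S' \subset S

% optional variant if strictness should be explicit (assumption)
S' \subsetneq S
```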
Statistical Machine Intelligence & Learning Engine
Updated Oct 20, 2021 - Java
▁▅▆▃▅ Git quick statistics is a simple and efficient way to access various statistics in git repository.
Updated Aug 19, 2021 - Shell
A probabilistic programming language in TensorFlow. Deep generative models, variational inference.
Updated Oct 22, 2019 - Jupyter Notebook
Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
Updated Nov 9, 2021 - Go
Machine learning, computer vision, statistics and general scientific computing for .NET
Updated Nov 18, 2020 - C#
A Python based monitoring and tracking tool for Plex Media Server.
Updated Nov 8, 2021 - Python
Are there any plans to add a Zero-Inflated Poisson (ZIP) and Zero-Inflated Negative Binomial (ZINB) to TFP? Those are usually very common distributions in other packages, and it shouldn't be hard to implement.
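For readers unfamiliar with the distribution being requested, here is a minimal pure-Python sketch of a Zero-Inflated Poisson probability mass function. It is not TFP code and does not reflect any TFP API; the function name `zip_pmf` and the parameter names `rate` and `pi` are hypothetical, chosen only for illustration. The model assumed here mixes a point mass at zero (with probability `pi`) with an ordinary Poisson distribution.

```python
import math


def zip_pmf(k: int, rate: float, pi: float) -> float:
    """P(X = k) under a zero-inflated Poisson (illustrative sketch).

    pi   : probability of a structural (extra) zero, in [0, 1]
    rate : mean of the underlying Poisson component, > 0
    """
    if k < 0:
        return 0.0
    # Ordinary Poisson probability of observing k events.
    poisson = math.exp(-rate) * rate**k / math.factorial(k)
    if k == 0:
        # Zeros come from either the point mass or the Poisson component.
        return pi + (1.0 - pi) * poisson
    # Positive counts can only come from the Poisson component.
    return (1.0 - pi) * poisson
```

Setting `pi` to 0 recovers the plain Poisson distribution, which is a convenient sanity check for any implementation.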
High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.
Updated Jan 28, 2021 - C++
Curated list of Python resources for data science.
Updated Nov 9, 2021
Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.
Updated Nov 5, 2021 - Java
Since the default output is meant to be human-readable, would it make sense to add thousands separators to make the output more easily readable?
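To make the suggestion concrete, here is a small Python sketch of what thousands separators look like in practice. It is not taken from the project in question, whose implementation language is not stated here; `format_count` is a hypothetical helper name used only for this illustration.

```python
def format_count(n: int) -> str:
    """Render an integer with comma thousands separators."""
    # The "," format specifier inserts a comma every three digits.
    return f"{n:,}"


print(format_count(8724))     # prints 8,724
print(format_count(1234567))  # prints 1,234,567
```

A tool that also offers machine-readable output would presumably keep the plain, unseparated form there, so that downstream parsers are not affected.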
Java dataframe and visualization library
Updated Nov 3, 2021 - Java
Updated Nov 9, 2021 - JavaScript
simple statistics for node & browser javascript
Updated Nov 1, 2021 - JavaScript
Math.NET Numerics
Updated Nov 8, 2021 - C#
A Laravel package to retrieve pageviews and other data from Google Analytics
Updated Oct 28, 2021 - PHP


These examples take quite a long time to run, and they frequently cause our documentation CI to fail with timeouts. It would be nice to speed them up a little.
To contributors: if you want to work on an example, first have a look at the example, and if you think you're comfortable working on it, please mention which one you're working on.