Data Science
Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
Here are 17,131 public repositories matching this topic...
Most functions in scipy.linalg (e.g. svd, qr, eig, eigh, pinv, pinv2 ...) have a default kwarg check_finite=True that we typically leave at its default value in scikit-learn.
As we already validate the input data for most estimators in scikit-learn, this check is redundant and can cause significant overhead, especially at predict / transform time. We should probably a
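A minimal sketch of the idea, assuming the input has already been validated once by the caller: the explicit np.isfinite check below is only a stand-in for an estimator's own input validation, and the array X is made-up illustration data. It shows passing check_finite=False to scipy.linalg.svd so that SciPy does not scan the array a second time.

```python
import numpy as np
from scipy import linalg

# Made-up illustration data.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))

# Validate once up front (a stand-in for the estimator's own input validation).
if not np.isfinite(X).all():
    raise ValueError("Input contains NaN or infinity.")

# The data is known to be finite, so skip the redundant scan inside SciPy.
U, s, Vt = linalg.svd(X, full_matrices=False, check_finite=False)

# Rebuild X from its factors to confirm the decomposition is unchanged.
X_reconstructed = (U * s) @ Vt
```

Skipping the check is only safe when finiteness is guaranteed elsewhere; with check_finite=False, non-finite input may lead to crashes or non-termination in the underlying LAPACK routines.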
aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
Updated Jan 4, 2021 - Jupyter Notebook
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Updated Jan 28, 2021 - Python
Updated Feb 6, 2021 - Python
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.
Updated Dec 21, 2020 - Python
Updated Feb 3, 2021
Describe your feature request
Hi guys,
It would be awesome to add an API that has the same output as the ray memory command.
Also, it would be good to add some additional output info for ray.objects(). For example, node IP, IDs of objects which are created in in-process stores, IDs of objects from remote calls (when remote calls are still being executed).
Thanks in advance!
Travis is not going to automatically offer the free tier for all open source projects; we likely want to migrate away from Travis.
Setting up GitHub Actions to replace Travis would be a welcome contribution.
In recent versions (can't say from exactly when), there seems to be an off-by-one error in dcc.DatePickerRange. I set max_date_allowed = datetime.today().date(), but in the calendar, yesterday is the maximum date allowed. I see it in my apps, and it is also present in the first example on the DatePickerRange documentation page.
Summary
When a function has print('sth', file=sys.stderr) in the body I get:
InternalHashError: [Errno 2] No such file or directory: '<stderr>'
While caching the body of eval_models_on_all_data(), Streamlit encountered an object of type _io.TextIOWrapper, which it does not know how to hash.
Steps to reproduce
Code snippet:
@st.cache
def f():
    prin
Each of these should be removed and format fixed in separate PRs
Context: #5739
Tracker:
- E203
- E231
- W504: Just remove, should not need formatting changes
Contributors are welcome
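For orientation, here is a small sketch of what code looks like once these three checks pass. The variable names are made up for illustration; the rule meanings are those of pycodestyle (E203: whitespace before ':', ',' or ';'; E231: missing whitespace after ',', ';' or ':'; W504: line break after a binary operator).

```python
values = [1, 2, 3, 4]

# E203: no whitespace before ':' (flagged form: values[1 :3]).
head = values[1:3]

# E231: a space follows each ',' (flagged form: (1,2)).
point = (1, 2)

# W504: break the line before the binary operator, not after it.
total = (values[0]
         + values[1]
         + values[2])
```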
Not a high priority at all, but it'd be more sensible for such a tutorial/testing utility corpus to be implemented elsewhere, maybe under /test/ or some other data- or doc-related module, rather than in gensim.models.word2vec.
Originally posted by @gojomo in RaRe-Technologies/gensim#2939 (comment)
VIP cheatsheets for Stanford's CS 229 Machine Learning
Updated May 20, 2020
The fastai book, published as Jupyter Notebooks
Updated Feb 6, 2021 - Jupyter Notebook
A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.
Updated Jan 4, 2021
The "Python Machine Learning (1st edition)" book code repository and info resource
Updated Oct 16, 2020 - Jupyter Notebook
Dive into Machine Learning with Python Jupyter notebook and scikit-learn!
Updated Jul 31, 2020
A curated list of awesome big data frameworks, resources and other awesomeness.
Updated Feb 2, 2021
more details at: allenai/allennlp#2264 (comment)
Deep learning library featuring a higher-level API for TensorFlow.
Updated Jan 25, 2021 - Python
Best Practices on Recommendation Systems
Updated Feb 4, 2021 - Python
I'm using mxnet to do some work, but I find nothing when I search for an mxnet trial or example.
Roadmap to becoming an Artificial Intelligence Expert in 2021
Updated Jan 19, 2021 - JavaScript
The current PyTorch implementation ignores the argument split_f in the function train_batch_ch13, as shown below.
def train_batch_ch13(net, X, y, loss, trainer, devices):
    if isinstance(X, list):
        # Required for BERT Fine-tuning (to be covered later)
        X = [x.to(devices[0]) for x in X]
    else:
        X = X.to(devices[0])
    ...
Todo: Define the argument `
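As a rough, framework-free illustration of what honoring a split_f argument could look like, the sketch below threads a splitting callback through a training-step function. Everything here is hypothetical: split_batch, train_batch, and the plain Python lists standing in for tensors are assumptions made for illustration, not the book's code or the PyTorch API.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Shards = Tuple[List[list], List[list]]


def split_batch(X: Sequence, y: Sequence, devices: Sequence[str]) -> Shards:
    """Hypothetical default: split X and y into one contiguous shard per device."""
    n = len(devices)
    size = (len(X) + n - 1) // n  # ceiling division so that no item is dropped
    X_shards = [list(X[i * size:(i + 1) * size]) for i in range(n)]
    y_shards = [list(y[i * size:(i + 1) * size]) for i in range(n)]
    return X_shards, y_shards


def train_batch(X: Sequence, y: Sequence, devices: Sequence[str],
                split_f: Callable[..., Shards] = split_batch) -> Dict[str, int]:
    """Use the caller-supplied split_f instead of hard-coding the data placement."""
    X_shards, y_shards = split_f(X, y, devices)
    # Placeholder for the forward/backward pass: report the shard size per device.
    return {dev: len(xs) for dev, xs in zip(devices, X_shards)}


def all_on_first(X: Sequence, y: Sequence, devices: Sequence[str]) -> Shards:
    """Hypothetical custom splitter: send the whole batch to the first device."""
    empty = [[] for _ in devices[1:]]
    return [list(X)] + empty, [list(y)] + empty


default_sizes = train_batch([1, 2, 3, 4], [0, 1, 0, 1], ["gpu0", "gpu1"])
custom_sizes = train_batch([1, 2, 3, 4], [0, 1, 0, 1], ["gpu0", "gpu1"],
                           split_f=all_on_first)
```

Making the splitter a parameter with a default keeps existing callers unchanged while letting a caller swap in a different placement strategy.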
Tutorials, assignments, and competitions for MIT Deep Learning related courses.
Updated Jan 15, 2021 - Jupyter Notebook
Statistical data visualization using matplotlib
Updated Feb 5, 2021 - Python
The expressions editor window includes a Help tab that has some plain-text information about each variable and function.
I am editing this text right now to match the user manual. I'd like the fields to render with links (they currently just have full URLs that users can copy-pa

(e.g. for links and images), because some of these examples are now being rendered in the docs.
Added by @fchollet in requests for contributions.