Natural language processing
Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through natural language. In 1950, Alan Turing published an article, "Computing Machinery and Intelligence", that proposed a measure of intelligence now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and other natural-language tasks.
Here are 9,941 public repositories matching this topic...
AiLearning: Machine Learning (ML), Deep Learning (DL), Natural Language Processing (NLP)
Updated Jun 1, 2020 - Python
When you look at the variables in the pretrained base uncased BERT, the variables look like list 1. When you train from scratch, 2 additional variables per layer are introduced, with suffixes adam_m and adam_v. It would be nice if someone could explain what these variables are and what their significance is to the training process.
If one were to manually initialize variables from a pri
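For context on the question above: in the Adam optimizer, each trainable variable carries two extra slot variables, a running average of its gradients (first moment, m) and a running average of its squared gradients (second moment, v). Suffixes such as adam_m and adam_v conventionally name these slots. They are optimizer state, so they matter when resuming training but are not needed for inference. The following is a minimal pure-Python sketch of one textbook Adam step; the function name `adam_step` and the hyperparameter values are illustrative assumptions and are not taken from the BERT code base, whose optimizer may differ in details such as bias correction.

```python
import math

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One textbook Adam update for a single scalar parameter.

    m and v are the per-variable optimizer slots (first and second moment).
    """
    m = beta1 * m + (1.0 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1.0 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1.0 - beta1 ** t)                # bias correction for early steps
    v_hat = v / (1.0 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Minimise f(x) = x^2 starting from x = 1.0; both slots start at zero.
x, m, v = 1.0, 0.0, 0.0
for t in range(1, 101):
    grad = 2.0 * x
    x, m, v = adam_step(x, grad, m, v, t)
```

Because m and v are created per variable by the optimizer, a checkpoint saved during training contains them alongside the weights, while a checkpoint prepared only for inference can omit them.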
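Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, semantic dependency parsing, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing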
Updated Jun 15, 2020 - Python
I was going through the existing enhancement issues again and thought it'd be nice to collect ideas for spaCy plugins and related projects. There are always people in the community who are looking for new things to build, so here's some inspiration.
If you have questions about the projects I suggested,
Oxford Deep NLP 2017 course
Updated Jun 12, 2017
Your new Mentor for Data Science E-Learning.
Updated Jun 16, 2020 - Jupyter Notebook
Example (from TfidfTransformer):
    if isinstance(docs[0], tuple):
        docs = [docs]
    return [self.gensim_model[doc] for doc in docs]
This method expects a list of tuples, instead of an iterable. This means that the entire corpus has to be stored as a lis
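One way to avoid holding the whole corpus in memory would be to accept any iterable and yield results lazily. The sketch below is an assumption about how that could look, not gensim's actual code: `LazyTransformer` and `ToyModel` are hypothetical stand-ins, and `ToyModel.__getitem__` only mimics indexing a model with a bag-of-words document.

```python
from itertools import chain

class ToyModel:
    """Hypothetical stand-in for a model indexed with a bag-of-words document."""
    def __getitem__(self, doc):
        total = sum(count for _, count in doc)
        return [(token_id, count / total) for token_id, count in doc]

class LazyTransformer:
    """Hypothetical wrapper that streams documents instead of requiring a list."""
    def __init__(self, model):
        self.model = model

    def transform(self, docs):
        docs = iter(docs)
        try:
            first = next(docs)
        except StopIteration:
            return  # empty corpus: yield nothing
        if isinstance(first, tuple):
            # A single document was passed as an iterable of (id, count) tuples.
            yield self.model[[first, *docs]]
            return
        # Otherwise stream the documents one at a time.
        for doc in chain([first], docs):
            yield self.model[doc]

def corpus():
    # A generator: documents are produced one at a time, never stored as a list.
    yield [(0, 1), (1, 3)]
    yield [(2, 2)]

results = list(LazyTransformer(ToyModel()).transform(corpus()))
```

Peeking at the first element with `next` replaces the `docs[0]` lookup, which is what forces the original code to require an indexable list.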
Updated Jun 1, 2020
A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.
Updated May 24, 2020
Looping the process of writing images into the .tfrecords-file works fine, but how do I read multiple images from a .tfrecords-file?
Is there a simple solution? It would be great if it were added to the code.
Hello and thanks for the great library!
I've been having some difficulty getting the doc.lists() method to work, it seems to be very sensitive to the way lists are structured and the items in the list. The example material at https://observablehq.com/@spencermountain/compromise-lists contains an example of this sensitivity, where the code sample for `nlp('he eats, shoots, and leaves.').list
This output is unexpected: stemming 'In' with PorterStemmer returns the capitalized 'In'.
    >>> from nltk.stem import PorterStemmer
    >>> porter = PorterStemmer()
    >>> porter.stem('In')
    'In'
More details on https://stackoverflow.com/q/60387288/610569
I tried selecting hyperparameters for my model following "Tutorial 8: Model Tuning" below:
https://github.com/flairNLP/flair/blob/master/resources/docs/TUTORIAL_8_MODEL_OPTIMIZATION.md
Although I got the "param_selection.txt" file in the result directory, I am not sure how to interpret the file, i.e. which parameter combination to use. At the bottom of the "param_selection.txt" file, I found "
My feature request is to include an option on a button made from choice skill, to redirect a link to an external url...
Here's a detailed explanation including screenshots
This option will be really beneficial using choice skill buttons since at the moment, you can only add an ext
Feature request: separate logging for model computed loss and regularization loss in tensorboard
It would be nice to separately log model computed loss from regularization loss in tensorboard. Involves minor changes to the Trainer.
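As a rough illustration of the request, the sketch below computes the model loss and an L2 regularization term separately and returns them under distinct tags, so each could be written to its own TensorBoard scalar. The helper names `l2_penalty` and `loss_scalars`, the tag names, and the penalty strength are all hypothetical; this does not reflect the Trainer's actual code.

```python
def l2_penalty(weights, strength):
    """L2 regularization term: strength * sum of squared weights."""
    return strength * sum(w * w for w in weights)

def loss_scalars(model_loss, weights, strength=0.01):
    """Return the scalars one might log separately, keyed by a tag name."""
    reg_loss = l2_penalty(weights, strength)
    return {
        "loss/model": model_loss,            # loss computed by the model itself
        "loss/regularization": reg_loss,     # penalty added by the regularizer
        "loss/total": model_loss + reg_loss, # value actually optimized
    }

scalars = loss_scalars(model_loss=0.5, weights=[1.0, -2.0])
```

Keeping the total as a third tag means existing dashboards that track the combined loss would keep working.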
I propose this topic as a feature request, but it's also a documentation issue, because the user guide paragraph lacks details: https://rasa.com/docs/rasa/core/actions/#custom-actions.
What is specified in the paragraph Execute Actions in Other Code is obscure to me, and details at the API documentation link [Action Server](]https://rasa.com/docs/rasa/api/acti
Prerequisites
Please fill in by replacing [ ] with [x].
- Are you running the latest bert-as-service?
- Did you follow the installation and the usage instructions in README.md?
- Did you check the [FAQ list in README.md](https://github.com/hanxiao/bert-as-se
As per the StanfordCoreNLP documentation for CoreLabel, the functions after() and before() should return the whitespace strings between the token and the next/previous tokens respectively.
However, they always return an empty string, even if there is some whitespace, when the tokenizer option **normalizeOth
The words and sentences properties are helpers that use the textblob.tokenizers.WordTokenizer and textblob.tokenizers.SentenceTokenizer classes, respectively.
You can use other tokenizers, such as those provided by NLTK, by passing them into the TextBlob constructor then accessing the t
Excuse me, the comment at https://github.com/graykode/nlp-tutorial/blob/master/1-1.NNLM/NNLM-Torch.py#L50 may be wrong. It should be: X = X.view(-1, n_step * m) # [batch_size, n_step * m]
Sorry for disturbing you.
All kinds of text classification models and more with deep learning
Updated May 20, 2020 - Python
Hi, I would like to propose a better implementation for 'test_indices'.
We can remove the unneeded np.array casting:
Cleaner/New:
    test_indices = list(set(range(len(texts))) - set(train_indices))
Old:
    test_indices = np.array(list(set(range(len(texts))) - set(train_indices)))
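A small self-contained check of the proposed expression, using made-up `texts` and `train_indices` values. One caveat worth keeping in mind: a set difference has no guaranteed order, so the sketch also shows a `sorted` variant for cases where a deterministic order matters.

```python
texts = ["a", "b", "c", "d", "e"]   # made-up sample data
train_indices = [0, 2, 3]           # made-up training split

# Proposed version: a plain list, with no NumPy cast.
test_indices = list(set(range(len(texts))) - set(train_indices))

# Variant with a deterministic order.
sorted_test_indices = sorted(set(range(len(texts))) - set(train_indices))
```

Whether the plain list is a safe drop-in depends on how `test_indices` is used afterwards, for example whether later code relies on NumPy array methods.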
Hi, can the batchify method only batch one doc per file, not two docs in the same file? Why is the EOD flag not used to distinguish different docs in data_utils.py?
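This project covers the knowledge points and code implementations commonly tested in Machine Learning, Deep Learning, and NLP interviews, which are also the essential theoretical foundations for an algorithm engineer.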
Updated Apr 20, 2020 - Jupyter Notebook
TensorFlow 2.x tutorials and examples, including CNN, RNN, GAN, Auto-Encoders, FasterRCNN, GPT, BERT examples, etc. Introductory example code and hands-on tutorials for TF 2.0.
Updated Jun 9, 2020 - Jupyter Notebook
Large Scale Chinese Corpus for NLP
Updated Dec 1, 2019
Description
Add a ReadMe file in the GitHub folder.
Explain usage of the Templates
Other Comments
Principles of NLP Documentation
Each landing page at the folder level should have a ReadMe which explains -
○ Summary of what this folder offers.
○ Why and how it benefits users
○ As applicable - Documentation of using it, brief description etc
Scenarios folder:
○
Created by Alan Turing
- Wikipedia


Many models have identical implementations of prune_heads; it would be nice to store that implementation as a method on PretrainedModel and reduce the redundancy.
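A minimal sketch of the refactoring idea, in plain Python with hypothetical names (`BaseModel`, `Layer`, `ModelA`, and `ModelB` are illustrative and are not the library's real classes): the shared loop lives once on the base class, and each model only says where its layers are.

```python
class Layer:
    """Hypothetical attention layer that tracks which heads remain."""
    def __init__(self, num_heads):
        self.heads = set(range(num_heads))

    def prune(self, heads):
        self.heads -= set(heads)

class BaseModel:
    """Hypothetical base class holding the single shared implementation."""
    def get_layers(self):
        raise NotImplementedError

    def prune_heads(self, heads_to_prune):
        # heads_to_prune maps a layer index to the head indices to remove.
        layers = self.get_layers()
        for layer_idx, heads in heads_to_prune.items():
            layers[layer_idx].prune(heads)

class ModelA(BaseModel):
    def __init__(self):
        self.encoder_layers = [Layer(4) for _ in range(2)]

    def get_layers(self):
        return self.encoder_layers

class ModelB(BaseModel):
    def __init__(self):
        self.blocks = [Layer(8) for _ in range(3)]

    def get_layers(self):
        return self.blocks

model_a = ModelA()
model_a.prune_heads({0: [1, 2]})
model_b = ModelB()
model_b.prune_heads({2: [0]})
```

With this shape, a model whose layers live under a different attribute only overrides `get_layers`, and the pruning loop itself is written once.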