text-processing
Here are 737 public repositories matching this topic...
Diff Match Patch is a high-performance library in multiple languages that manipulates plain text.
Updated May 28, 2020 - Python
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
Updated Jun 18, 2020 - Python
I assume most users who come across sd also use other popular Rust-based CLI tools like fd and rg:
- sharkdp/fd@app.rs#L70-L75
- [BurntSushi/ripgrep@app.rs#L1175-L1178](https://github.com/BurntSushi/ripgrep/blob/8ebc113847926922bb85fb5a01c175319fb1e8d4/src/app.rs#L1
Text Classification Algorithms: A Survey
Updated Jun 15, 2020 - Python
Is the pyparsing class diagram still accurate? The last version it was generated for is 1.5.2, but it looks about right.
I'd like to be able to run commands on all lines of a file. For example, bsed wrap lines with " should execute on all lines of the file. The current workaround is to include a trivial filter, such as wrap lines containing '.' with "
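To make the requested behaviour concrete, here is a minimal Python sketch of "wrap every line" with no filter. It only illustrates the desired result and is not how bsed is implemented; the function name `wrap_lines` and its `wrapper` parameter are hypothetical.

```python
def wrap_lines(text, wrapper='"'):
    """Wrap every line of text in the given wrapper string, with no filtering."""
    return "\n".join(f"{wrapper}{line}{wrapper}" for line in text.splitlines())


print(wrap_lines("first line\nsecond line"))
# "first line"
# "second line"
```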
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
Updated Mar 13, 2019 - Python
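As a rough illustration of the n-gram extraction and frequency-list tasks mentioned in the description, here is a minimal standard-library sketch. It does not use PyNLPl's own API; the function name `ngrams` and the sample sentence are assumptions made for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Yield each run of n consecutive tokens as a tuple."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])


# Hypothetical sample input; whitespace splitting stands in for a real tokenizer.
tokens = "to be or not to be".split()

# A frequency list is just a count of how often each n-gram occurs.
bigram_counts = Counter(ngrams(tokens, 2))
print(bigram_counts.most_common(1))  # [(('to', 'be'), 2)]
```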
A simple Python module for parsing human names into their individual components
Updated Feb 27, 2020 - Python
Open Korean Text Processor - An Open-source Korean Text Processor
Updated Aug 7, 2018 - Scala
Ekphrasis is a text processing tool geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction, using word statistics from two large corpora (English Wikipedia and Twitter, 330 million English tweets).
Updated Jun 10, 2020 - Python
A fast implementation of Aho-Corasick in Rust.
Updated May 7, 2020 - Rust
I think it is necessary to add an experiment that compares the target model's test accuracy on the original text with its accuracy on the adversarial text examples, to judge whether the adversarial examples really reduce accuracy.
pip install -r requirements.txt
python -m spacy download en_core_web_sm # depends on which language you want to use: https://spacy.io/usage/models
Currently, the readme links to ["Characterization of a global germplasm collection and its potential utilization for analysis of complex quantitative traits in maize"](https://www.researchgate.net/profile/XU_Shutu/publication/229032602_Characterization_of_a_global_germplasm_collection_and_its_potential_utilization_for_analysis_of_complex_quantitative_traits_in_maize/links/02bfe50f914d04c837000000.
Support for stdin
A tool that lets you detect and translate text.
Updated Sep 10, 2019 - Python
Text vectorization tool to outperform TFIDF for classification tasks
Updated Feb 27, 2020 - Python
Util collection for Japanese text processing. Hiraganize, Katakanize, and Romanize.
Updated May 23, 2020 - JavaScript
Extract indicators of compromise from text, including "escaped" ones.
Updated Apr 19, 2020 - Go
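To show what extracting "escaped" indicators can involve, here is a small Python sketch that undoes a few common defanging conventions before matching URLs. It is not the listed tool's implementation: the set of escapes handled (`hxxp`, `[.]`, `(.)`, `[dot]`), the URL pattern, and the names `refang` and `extract_urls` are all assumptions chosen for illustration.

```python
import re

# Assumed defanging conventions; real-world reports use many more variants.
SCHEME_RE = re.compile(r"hxxp", re.IGNORECASE)
DOT_RE = re.compile(r"\[\.\]|\(\.\)|\[dot\]", re.IGNORECASE)
# Deliberately simple URL pattern, good enough for a sketch.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def refang(text):
    """Turn escaped forms such as hxxp:// and [.] back into plain text."""
    text = SCHEME_RE.sub("http", text)
    return DOT_RE.sub(".", text)


def extract_urls(text):
    """Return URLs found in text after undoing the escapes above."""
    return URL_RE.findall(refang(text))


sample = "Beacon seen at hxxp://evil[.]example/payload and hxxps://c2(.)example"
print(extract_urls(sample))
# ['http://evil.example/payload', 'https://c2.example']
```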
A web app to create and browse text visualizations for automated customer listening.
Updated Jun 7, 2020 - TypeScript
Stanford NLP group's shared Python tools.
Updated Mar 14, 2018 - Python
Python library for Natural Language Preprocessing (NLPre)
Updated Nov 15, 2019 - Python
Documentation
I am happy to see a documentation website for the library
I suggest using sphinx-rtd theme which supports RTL languages
https://sphinx-rtd-theme.readthedocs.io/en/stable/
https://github.com/readthedocs/sphinx_rtd_theme#contributing-or-modifying-the-theme
thank you
Pure-Python Japanese character interconverter for Hiragana, Katakana, Hankaku and Zenkaku
Updated Feb 24, 2020 - Python
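For a sense of how Hiragana/Katakana interconversion can work, here is a minimal sketch based on the fixed offset between the two Unicode blocks. It is not taken from the listed library, ignores Hankaku/Zenkaku conversion entirely, and the function names `hiraganize` and `katakanize` are assumptions for this example.

```python
# Katakana U+30A1..U+30F6 sit exactly 0x60 above hiragana U+3041..U+3096.
OFFSET = 0x60
KATAKANA_RANGE = (0x30A1, 0x30F6)
HIRAGANA_RANGE = (0x3041, 0x3096)


def _shift(text, source_range, delta):
    """Shift code points inside source_range by delta; leave the rest alone."""
    low, high = source_range
    return "".join(
        chr(ord(ch) + delta) if low <= ord(ch) <= high else ch for ch in text
    )


def hiraganize(text):
    """Convert katakana characters to their hiragana counterparts."""
    return _shift(text, KATAKANA_RANGE, -OFFSET)


def katakanize(text):
    """Convert hiragana characters to their katakana counterparts."""
    return _shift(text, HIRAGANA_RANGE, OFFSET)


print(hiraganize("カタカナ"))  # かたかな
print(katakanize("ひらがな"))  # ヒラガナ
```

Characters outside the two ranges, such as the long vowel mark ー or Latin letters, pass through unchanged.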
Text Mining and Topic Modeling Toolkit for Python with parallel processing power
Updated Mar 31, 2020 - Python
Preprocessing Library for Natural Language Processing
Updated Feb 5, 2020 - Python
Explain the need to download models and how to change CogComp versions.
Talk about the PyJNIUS way to run a JVM and point to the CogComp/cogcomp-nlp
A Golang library for processing Asciidoc files.
Updated Jun 18, 2020 - Go


I think it would be a good idea to mention the tee command, probably somewhere in the "Cat, Less, Tail and Head" chapter.