The Wayback Machine - http://web.archive.org/web/20200806182235/https://github.com/topics/text-processing

text-processing

Here are 759 public repositories matching this topic...

learnbyexample / Command-line-text-processing

⚡ From finding text to search and replace, from sorting to beautifying text and more 🎨

ruby linux command-line regex perl ebook awk sed text-processing grep

Updated Apr 3, 2020
Shell

google / diff-match-patch

Diff Match Patch is a high-performance library in multiple languages that manipulates plain text.

diff match patch text-processing difference

Updated Jun 24, 2020
Python

fastnlp / fastNLP

fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.

natural-language-processing deep-learning text-classification chinese-nlp text-processing nlp-parsing nlp-library

Updated Aug 5, 2020
Python

chmln / sd

Intuitive find & replace CLI (sed alternative)

rust terminal command-line regex text-processing

Updated Aug 3, 2020
Rust

kk7nc / Text_Classification

Text Classification Algorithms: A Survey

deep-learning random-forest text-classification recurrent-neural-networks naive-bayes-classifier dimensionality-reduction logistic-regression document-classification convolutional-neural-networks text-processing decision-trees boosting-algorithms support-vector-machines hierarchical-attention-networks nlp-machine-learning conditional-random-fields k-nearest-neighbours deep-belief-network rocchio-algorithm deep-neural-network

Updated Jun 15, 2020
Python

pyparsing / pyparsing

Python library for creating PEG parsers

python parsing parser-combinators python3 parsing-expression-grammar python-3 text-processing python-2 python2 parsing-library peg-parsers

Updated Jul 31, 2020
Python

abadojack / whatlanggo

Natural language detection library for Go

nlp go language text-processing

Updated Mar 6, 2019
Go

derek73 / python-nameparser

A simple Python module for parsing human names into their individual components

python text-processing text-parser python-module

Updated Feb 27, 2020
Python

proycon / pynlpl

PyNLPl, pronounced 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as extracting n-grams and frequency lists and building simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).

python nlp machine-learning natural-language-processing library linguistics computational-linguistics text-processing nlp-library search-algorithms evaluation-metrics folia language-modelling

Updated Mar 13, 2019
Python

andrewbihl / bsed

Simple SQL-like syntax on top of Perl text processing.

python csv perl awk sed text-processing grep domain-specific-language

Updated Jul 2, 2019
Python

open-korean-text / open-korean-text

Open Korean Text Processor - An Open-source Korean Text Processor

natural-language-processing tokenizer korean text-processing korean-text-processing korean-tokenizer

Updated Aug 7, 2018
Scala

cbaziotis / ekphrasis

Ekphrasis is a text processing tool geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction, using word statistics from two large corpora (English Wikipedia and a Twitter corpus of 330 million English tweets).

nlp tokenizer text-processing semeval nlp-library word-segmentation spelling-correction tokenization text-segmentation spell-corrector word-normalization

Updated Aug 5, 2020
Python

BurntSushi / aho-corasick

A fast implementation of Aho-Corasick in Rust.

search finite-state-machine text-processing aho-corasick substring-matching

Updated Jul 1, 2020
Rust

airbnb / artificial-adversary

🗣️ Tool to generate adversarial text examples and test machine learning models against them

python spam data-science machine-learning text-mining data-mining text-classification metrics text text-analysis python3 classification text-processing python2 spam-filtering spam-detection spam-classification adversarial-examples black-box-attacks black-box-benchmarking

Updated Oct 14, 2018
Python

textpipe / textpipe

Textpipe: clean and extract metadata from text

nlp text-analysis named-entities named-entity-recognition text-processing language-identification

Updated Aug 3, 2020
Python

BurntSushi / regex-automata

A low level regular expression library that uses deterministic finite automata.

rust automata regex regexp text-processing nfa automaton dfa regex-engine

Updated Jul 12, 2020
Rust

open-i18n / rust-unic

UNIC: Unicode and Internationalization Crates for Rust

rust unicode internationalization cldr crates unicode-characters text-processing unic locale-data unicode-algorithms

Updated Jul 24, 2019
Rust

s3nh / text-detector

A tool that lets you detect and translate text.

nlp recognition deep-learning text craft pytorch text-recognition text-processing ocr-recognition crnn scene-text-detection scene-text-detectors

Updated Sep 10, 2019
Python

linuxscout / pyarabic

pyarabic

text-processing nlp-library arabic-language

Updated Aug 3, 2020
Python

textvec / textvec

Text vectorization tool to outperform TFIDF for classification tasks

python nlp machine-learning natural-language-processing text-classification text-analysis tf-idf text-processing

Updated Feb 27, 2020
Python

hakatashi / japanese.js

Utility collection for Japanese text processing: Hiraganize, Katakanize, and Romanize.

javascript utility katakana hiragana japanese text-processing romanize

Updated May 23, 2020
JavaScript

assafmo / xioc

Extract indicators of compromise from text, including "escaped" ones.

ioc text-mining data-mining command-line regex regexp extract extraction command-line-tool text-processing iocs defang indicators-of-compromise escaping

Updated Apr 19, 2020
Go

NIHOPA / NLPre

Python library for Natural Language Preprocessing (NLPre)

python nlp natural-language-processing text-processing nlp-parsing

Updated Nov 15, 2019
Python

microsoft / browsecloud

A web app to create and browse text visualizations for automated customer listening.

visualization nlp text-classification text-processing bayesian-networks counting-grids

Updated Jul 31, 2020
TypeScript

stanfordnlp / stanza-old

Stanford NLP group's shared Python tools.

python nlp natural-language-processing text-analysis text-processing

Updated Mar 14, 2018
Python

ikegami-yukino / jaconv

Pure-Python Japanese character interconverter for Hiragana, Katakana, Hankaku and Zenkaku

japanese-language text-processing pure-python preprocessing character-converter japanese-kana

Updated Aug 2, 2020
Python

WZBSocialScienceCenter / tmtoolkit

Text Mining and Topic Modeling Toolkit for Python with parallel processing power

python nlp evaluation topic-modeling text-processing parallel-processing socialscience

Updated Aug 3, 2020
Python

lyeoni / prenlp

Preprocessing Library for Natural Language Processing

nlp natural-language-processing text-processing text-preprocessing preprocessing-library

Updated Feb 5, 2020
Python

CogComp / cogcomp-nlpy

CogComp's light-weight Python NLP annotators

nlp natural-language-processing text-mining data-mining text-processing

Updated Feb 18, 2019
Python

bytesparadise / libasciidoc

A Golang library for processing Asciidoc files.

golang library asciidoc text-processing

Updated Aug 2, 2020
Go
