Implementation of various string similarity and distance algorithms: Levenshtein, Jaro-Winkler, n-Gram, Q-Gram, Jaccard index, Longest Common Subsequence edit distance, cosine similarity ...
Updated Jan 22, 2022 - Java
A .NET port of java-string-similarity
Set of functions and operators for executing similarity queries
Ruby & C implementation of the Jaro-Winkler distance algorithm that supports UTF-8 strings.
Golang metrics for calculating string similarity and other string utility functions
Ruby gem (native extension in Rust) providing implementations of various string metrics
String similarity metrics for Elixir
A fuzzy matching string distance library for Scala and Java that includes Levenshtein distance, Jaro distance, Jaro-Winkler distance, Dice coefficient, N-Gram similarity, Cosine similarity, Jaccard similarity, Longest common subsequence, Hamming distance, and more.
Spark functions to run popular phonetic and string matching algorithms
String similarity functions and string distances: Jaccard, Levenshtein, Hamming, Jaro-Winkler, Q-grams, N-grams, LCS (Longest Common Subsequence), Cosine similarity...
Fast batch Jaro-Winkler distance implementation in C99 with Ruby, OCaml and Python bindings.
PySpark phonetic and string matching algorithms
Edit distance algorithms including Jaro, Damerau-Levenshtein, and Optimal Alignment
A collection of metrics and phonetic algorithms for fuzzy string matching in Elixir.
String metric functions in Go (Levenshtein, Damerau-Levenshtein, Jaro, Jaro-Winkler, and additionally a BK-tree) for autocorrect
String Comparison in C#.NET
A project to help prevent typos. TySug (Typo Suggestions) suggests alternative words with respect to keyboard layouts
Distance-related functions (Damerau-Levenshtein, Jaro-Winkler, longest common substring & subsequence) implemented as an SQLite run-time loadable extension. Any UTF-8 strings are supported.
jSimilarity is a library that implements various similarity measures
Python library for fast approximate string matching using Jaro and Jaro-Winkler similarity
A string similarity utility that uses the Jaro-Winkler algorithm
Go port of the Python jellyfish module for approximate and phonetic matching of strings.
A measure of distance between words with the Jaro-Winkler algorithm
A text similarity metric library, ranging from edit distances (Levenshtein, Gotoh, Jaro, etc.) to other metrics (e.g. Soundex, Chapman). This library is compiled against the .NET Standard and includes a lot of useful extension methods.
Create a user example (see #2) which shows how StringCompare can be used to match business names.
That is, suppose we have a long list L of business names. Given another business name provided by a user, we want to be able to find the name in L which most closely matches it.
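As a rough illustration of the matching task just described, here is a minimal pure-Python sketch that scores a user-provided name against every entry of L with Jaro-Winkler similarity and returns the highest-scoring one. It does not use StringCompare's API; the function names (`jaro`, `jaro_winkler`, `best_match`), the case-folding step, and the sample business names are all assumptions made for this example.

```python
from typing import Sequence


def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means the strings are identical."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only if they are within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2  # number of transpositions
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for a shared prefix of up to max_prefix characters."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def best_match(query: str, names: Sequence[str]) -> str:
    """Return the entry of names most similar to query (hypothetical helper).

    Case-folding is an assumption of this sketch; raises ValueError if names is empty.
    """
    q = query.casefold()
    return max(names, key=lambda name: jaro_winkler(q, name.casefold()))


if __name__ == "__main__":
    # Made-up business names standing in for the list L.
    L = ["Acme Hardware Inc", "Apex Software LLC", "Zenith Bakery"]
    print(best_match("Acme Hardwear", L))
```

This linear scan compares the query against every name, so its cost grows with the length of L; for a very long list, some form of indexing or candidate filtering would likely be needed before scoring.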
We can address this problem in a few steps: