Introduction

The smart-match module contains functions for calculating the similarity of strings and sets.

Concept

similarity: A value in a range of [0, 1], which represents how similar the two strings are. The larger the value, the more similar the two strings are.
dissimilarity: A value in a range of [0, 1], which represents how dissimilar the two strings are. The larger the value, the more dissimilar the two strings are. For a pair of strings, similarity = 1 - dissimilarity.
distance: How far apart the two strings are. Note that not all methods provide a distance.
score: The larger the score, the more similar the two strings are. Note that not all methods provide a score.
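
To make the relationship between these quantities concrete, here is a minimal sketch that uses Hamming distance on two equal-length strings. It is plain Python and does not call smart-match; the function names and the normalization by string length are assumptions made for illustration, not the library's implementation.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def hamming_dissimilarity(a: str, b: str) -> float:
    """Scale the distance into [0, 1] by dividing by the string length (assumed normalization)."""
    if not a:
        return 0.0  # two empty strings are treated as identical
    return hamming_distance(a, b) / len(a)


def hamming_similarity(a: str, b: str) -> float:
    """similarity = 1 - dissimilarity, as defined in the Concept section."""
    return 1.0 - hamming_dissimilarity(a, b)


# "karolin" and "kathrin" differ in three of seven positions.
print(hamming_distance("karolin", "kathrin"))
print(hamming_dissimilarity("karolin", "kathrin"))
print(hamming_similarity("karolin", "kathrin"))
```

The distance is an unbounded count, while similarity and dissimilarity are the same information rescaled into [0, 1].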

We support three levels of string matching.

char: Similarity computation based on characters in the strings.
term: Similarity computation based on terms in the strings.
gram: Similarity computation based on q-grams in the strings.
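
The three levels differ only in how a string is split into units before comparison. The sketch below shows one possible tokenization for each level and applies the same Jaccard set similarity to all three. The helper names, whitespace splitting for terms, and unpadded q-grams with q = 2 are assumptions; smart-match may tokenize differently.

```python
def chars(s: str) -> list:
    """char level: every character is a unit."""
    return list(s)


def terms(s: str) -> list:
    """term level: whitespace-separated words are the units (assumed splitting rule)."""
    return s.split()


def qgrams(s: str, q: int = 2) -> list:
    """gram level: overlapping substrings of length q, without padding."""
    return [s[i:i + q] for i in range(len(s) - q + 1)]


def jaccard(x, y) -> float:
    """Jaccard similarity of two token collections, compared as sets."""
    sx, sy = set(x), set(y)
    if not sx and not sy:
        return 1.0  # two empty inputs are treated as identical
    return len(sx & sy) / len(sx | sy)


a, b = "new york city", "new york"
for name, tokenize in (("char", chars), ("term", terms), ("gram", qgrams)):
    print(name, tokenize(b), jaccard(tokenize(a), tokenize(b)))
```

The same pair of strings can receive a different similarity at each level, so the level should match what counts as a meaningful unit in the data.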

Methods

We support the following methods.

Method	similarity	dissimilarity	distance	score
Levenshtein (default)	✅	✅	✅	❌
Euclidean	✅	✅	✅	❌
Damerau Levenshtein	✅	✅	✅	❌
Block Distance	✅	✅	✅	❌
Cosine	✅	✅	❌	❌
Tanimoto Coefficient	✅	✅	❌	❌
Dice	✅	✅	❌	❌
Simon White	✅	✅	❌	❌
Longest Common Substring	✅	✅	✅	✅
Longest Common Subsequence	✅	✅	✅	✅
Overlap Coefficient	✅	✅	❌	❌
Generalized Overlap Coefficient	✅	✅	❌	❌
Jaccard	✅	✅	❌	❌
Generalized Jaccard	✅	✅	❌	❌
Hamming	✅	✅	✅	❌
Jaro	✅	✅	❌	❌
Jaro Winkler	✅	✅	❌	❌
Needleman Wunsch	✅	✅	❌	✅
Smith Waterman	✅	✅	❌	✅
Smith Waterman Gotoh	✅	✅	❌	✅
Monge Elkan	✅	✅	❌	❌
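
As an illustration of a method that naturally yields a score, the sketch below computes the length of the longest common subsequence of two strings with the standard dynamic-programming recurrence. Treating that length as the score is an assumption made here for illustration; the function name `lcs_length` is hypothetical, and the README does not state how smart-match defines the score for this method.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b.

    Only the previous row of the DP table is kept, so memory use is O(len(b)).
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the best subsequence that ends before both characters.
                curr.append(prev[j - 1] + 1)
            else:
                # Skip one character from either string and keep the better result.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


# A longer common subsequence means the strings share more characters in order.
print(lcs_length("hello", "hero"))
```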

Installation

pip install smart-match

Usage

import smart_match
print(smart_match.similarity('hello', 'hero'))
print(smart_match.dissimilarity('hello', 'hero'))
print(smart_match.distance('hello', 'hero'))

Output:

0.6
0.4
2
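
The output above is consistent with the default Levenshtein method if the distance is divided by the length of the longer string. The following sketch reproduces the numbers in plain Python. The normalization is an assumption inferred from this one example, and `levenshtein` is a hypothetical helper, not part of the smart-match API.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distance from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


s, t = "hello", "hero"
distance = levenshtein(s, t)
dissimilarity = distance / max(len(s), len(t))  # assumed normalization
similarity = 1 - dissimilarity
print(round(similarity, 2), round(dissimilarity, 2), distance)
```

Two edits turn 'hello' into 'hero' (delete one 'l', substitute the other with 'r'), and the longer string has five characters, which gives 2 / 5 = 0.4 and 1 - 0.4 = 0.6.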

Check the Wiki for more details.

License

smart-match is free software. See the LICENSE file for the full text.

jiayingwang/smart-match