Adding QWERTY support to DL distance #92

Open
DocShahrukh opened this issue Dec 21, 2017 · 4 comments
Comments

@DocShahrukh

This adjusts the substitution cost in DL (Damerau-Levenshtein) distance for QWERTY keypad mistakes, and could maybe be extended to others too.
Please take a look when you're free.

key_pairs = [{'q','a'},{'q','w'},{'w','a'},{'w','e'},{'w','s'},{'e','s'},{'e','d'},{'e','r'},{'r','d'},{'r','f'},{'r','t'},{'t','g'},{'t','y'},{'y','g'},{'y','h'},{'y','u'},{'u','h'},{'u','j'},{'u','i'},{'i','j'},{'i','k'},{'i','o'},{'o','k'},{'o','l'},{'o','p'},{'l','k'},{'m','k'},{'m','n'},{'n','j'},{'n','b'},{'b','h'},{'b','v'},{'v','g'},{'v','c'},{'c','f'},{'c','x'},{'x','d'},{'x','z'},{'z','s'}]

def damerau_levenshtein_cost(a, b):
    if a == b:
        return 0
    elif set([a, b]) in key_pairs:
        return .25
    return 1

cost = damerau_levenshtein_cost(s1[i-1],s2[j-1])
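For anyone wanting to try the idea outside the library, here is a minimal, self-contained sketch of how a cost function like the one above could plug into a distance loop. It uses the restricted (optimal string alignment) variant of Damerau-Levenshtein and is not the library's actual implementation. The names `KEY_PAIRS`, `substitution_cost`, and `weighted_osa_distance` are made up for this example, and the pair list is abbreviated.

```python
# Abbreviated adjacent-key list for illustration; the full list is in the post above.
KEY_PAIRS = [{'q', 'a'}, {'q', 'w'}, {'w', 'e'}, {'e', 'r'}, {'r', 't'}]


def substitution_cost(a, b):
    """Return 0 for equal chars, 0.25 for adjacent keys, 1 otherwise."""
    if a == b:
        return 0
    if {a, b} in KEY_PAIRS:
        return 0.25
    return 1


def weighted_osa_distance(s1, s2, cost_fn=substitution_cost):
    """Optimal string alignment distance with a pluggable substitution cost.

    Insertions, deletions, and transpositions keep a cost of 1 (an assumption;
    the original post only reweights substitutions).
    """
    rows, cols = len(s1) + 1, len(s2) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = float(i)  # delete all of s1[:i]
    for j in range(cols):
        d[0][j] = float(j)  # insert all of s2[:j]

    for i in range(1, rows):
        for j in range(1, cols):
            cost = cost_fn(s1[i - 1], s2[j - 1])
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (possibly discounted)
            )
            # Transposition of two adjacent characters.
            if (i > 1 and j > 1
                    and s1[i - 1] == s2[j - 2]
                    and s1[i - 2] == s2[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]
```

With the abbreviated list, `weighted_osa_distance("qat", "wat")` comes out lower than `weighted_osa_distance("cat", "cut")`, because `q` and `w` are neighbours while `a` and `u` are not.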

I hope this hack finds some stack.

@jsfenfen

This is really interesting. @DocShahrukh would this approach work for OCR errors too, assuming one came up with a useful weighting? So '1' paired with 'l', etc.

@jamesturk
Owner

jamesturk commented Dec 21, 2017 via email

@DonaldTsang

Don't forget QWERTZ (Germany, Austria and Eastern Europe)!

@DimitriPapadopoulos
Contributor

And AZERTY, Dvorak, and other layouts. Cost function really needs to be configurable, perhaps with a few standard costs.
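One way a configurable cost could look, as a rough sketch rather than a proposed API: a small factory that turns any table of confusable character pairs into a cost function. The name `make_cost_fn`, the `near_cost` parameter, and the sample tables are all hypothetical. The tables are tiny illustrative fragments, not complete layouts, and the OCR weight is an arbitrary placeholder.

```python
def make_cost_fn(confusable_pairs, near_cost=0.25):
    """Build a substitution-cost function from a table of confusable pairs."""
    # frozenset makes each pair order-independent and hashable for O(1) lookup.
    pairs = {frozenset(p) for p in confusable_pairs}

    def cost(a, b):
        if a == b:
            return 0
        if frozenset((a, b)) in pairs:
            return near_cost
        return 1

    return cost


# Illustrative fragments only. On QWERTZ the 'z' key sits where 'y' is on QWERTY.
QWERTY_SAMPLE = [('q', 'w'), ('w', 'e'), ('t', 'y'), ('y', 'u')]
QWERTZ_SAMPLE = [('q', 'w'), ('w', 'e'), ('t', 'z'), ('z', 'u')]
# OCR-style confusions; '1'/'l' is from this thread, '0'/'O' is an added guess.
OCR_SAMPLE = [('1', 'l'), ('0', 'O')]

qwerty_cost = make_cost_fn(QWERTY_SAMPLE)
qwertz_cost = make_cost_fn(QWERTZ_SAMPLE)
ocr_cost = make_cost_fn(OCR_SAMPLE, near_cost=0.5)  # placeholder weight
```

Each resulting function has the same `(a, b) -> cost` shape as the one in the original post, so a distance routine that accepts a cost callable would not need to know which layout or error model is behind it.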

> This is really interesting. @DocShahrukh would this approach work for OCR errors too, assuming one came up with a useful weighting? So '1' paired with 'l', etc.

Indeed, see for example:
