Adding QWERTY support to DL distance #92
Comments
This is really interesting. @DocShahrukh would this approach work for OCR errors too, assuming one came up with a useful weighting? So '1' paired with 'l', etc.
I think it’d make sense to make the cost function configurable; that’d let
people do this for different layouts or OCR-specific functions. I’d be glad
to incorporate such a PR if anyone has time.
On Thu, Dec 21, 2017 at 12:54 PM Jacob Fenton wrote:
This is really interesting. @DocShahrukh <https://github.com/docshahrukh>
would this approach work for OCR errors too, assuming one came up with a
useful weighting? So '1' paired with 'l', etc.
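One way the configurable cost function could look is sketched below. This is only an illustration under stated assumptions: `weighted_dl_distance`, `unit_cost`, `ocr_cost`, and `OCR_PAIRS` are hypothetical names, not part of the library, and the sketch uses the restricted (optimal string alignment) variant of Damerau-Levenshtein, which may differ from what the library implements. The OCR pairs and the 0.25 weight are made up for the example.

```python
from typing import Callable


def unit_cost(a: str, b: str) -> float:
    """Default substitution cost: 0 for a match, 1 otherwise."""
    return 0.0 if a == b else 1.0


def weighted_dl_distance(
    s1: str,
    s2: str,
    substitution_cost: Callable[[str, str], float] = unit_cost,
) -> float:
    """Restricted Damerau-Levenshtein distance with a pluggable
    substitution cost (hypothetical helper, not a library function)."""
    rows, cols = len(s1) + 1, len(s2) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = float(i)  # delete all of s1[:i]
    for j in range(cols):
        d[0][j] = float(j)  # insert all of s2[:j]

    for i in range(1, rows):
        for j in range(1, cols):
            cost = substitution_cost(s1[i - 1], s2[j - 1])
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (weighted)
            )
            # Transposition of two adjacent characters, fixed cost 1.
            if (i > 1 and j > 1
                    and s1[i - 1] == s2[j - 2]
                    and s1[i - 2] == s2[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


# Illustrative OCR confusions; both the pairs and the weight are assumptions.
OCR_PAIRS = [{'1', 'l'}, {'0', 'O'}, {'5', 'S'}]


def ocr_cost(a: str, b: str) -> float:
    if a == b:
        return 0.0
    return 0.25 if {a, b} in OCR_PAIRS else 1.0
```

With this shape, `weighted_dl_distance("he1lo", "hello", ocr_cost)` treats the `1`/`l` confusion as a cheap substitution, while calling it without a cost function falls back to the usual 0/1 cost.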
Don't forget QWERTZ (Germany, Austria and Eastern Europe)!
And AZERTY, Dvorak, and other layouts. The cost function really needs to be configurable, perhaps with a few standard costs provided.
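As a rough sketch of how a few standard costs for different layouts might be derived rather than typed out by hand: the snippet below builds neighbouring-key pairs from the letter rows of a layout. All names here (`adjacent_pairs`, `make_cost`, `QWERTY_ROWS`, `QWERTZ_ROWS`) are hypothetical, the stagger rule (a key touches the keys at columns c-1 and c in the row below) is a simplifying assumption, and only the basic Latin letter keys are modelled, so extra keys on real national layouts are ignored.

```python
def adjacent_pairs(rows):
    """Return unordered pairs of keys that are neighbours within a row
    or in the row directly below (assuming a simple left stagger)."""
    pairs = set()
    for r, row in enumerate(rows):
        # Horizontal neighbours within the same row.
        for a, b in zip(row, row[1:]):
            pairs.add(frozenset((a, b)))
        # Neighbours in the next row down.
        if r + 1 < len(rows):
            below = rows[r + 1]
            for c, a in enumerate(row):
                for k in (c - 1, c):
                    if 0 <= k < len(below):
                        pairs.add(frozenset((a, below[k])))
    return pairs


def make_cost(pairs, near=0.25):
    """Build a substitution cost function from a set of neighbour pairs.
    The 0.25 default mirrors the weight used in the issue text."""
    def cost(a, b):
        if a == b:
            return 0.0
        return near if frozenset((a, b)) in pairs else 1.0
    return cost


# Letter rows only; treat these as illustrative, not authoritative, layouts.
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
QWERTZ_ROWS = ["qwertzuiop", "asdfghjkl", "yxcvbnm"]

qwerty_cost = make_cost(adjacent_pairs(QWERTY_ROWS))
qwertz_cost = make_cost(adjacent_pairs(QWERTZ_ROWS))
```

Under these assumptions `qwerty_cost('t', 'y')` is the reduced cost, whereas `qwertz_cost('t', 'y')` is the full cost, because `y` sits on the bottom row of a QWERTZ keyboard.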
Indeed, see for example:
This adjusts the substitution cost in the DL (Damerau-Levenshtein) distance for QWERTY keyboard mistakes; the same idea may apply to other layouts too.
Please take a look when you're free.
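# Pairs of keys that sit next to each other on a QWERTY keyboard.
key_pairs = [{'q','a'},{'q','w'},{'w','a'},{'w','e'},{'w','s'},{'e','s'},{'e','d'},{'e','r'},{'r','d'},{'r','f'},{'r','t'},{'t','g'},{'t','y'},{'y','g'},{'y','h'},{'y','u'},{'u','h'},{'u','j'},{'u','i'},{'i','j'},{'i','k'},{'i','o'},{'o','k'},{'o','l'},{'o','p'},{'l','k'},{'m','k'},{'m','n'},{'n','j'},{'n','b'},{'b','h'},{'b','v'},{'v','g'},{'v','c'},{'c','f'},{'c','x'},{'x','d'},{'x','z'},{'z','s'}]

def damerau_levenshtein_cost(a, b):
    # Identical characters cost nothing.
    if a == b:
        return 0
    # Neighbouring keys are a likely typo, so they get a reduced cost.
    elif set([a, b]) in key_pairs:
        return .25
    # Any other substitution keeps the full cost.
    return 1

# In the distance loop, compute the cost for the current pair of characters:
cost = damerau_levenshtein_cost(s1[i-1], s2[j-1])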
I hope this hack finds a place in the stack.