Asking for 'make this work on mixed strings' #12

Open
i30817 opened this issue Jun 26, 2022 · 1 comment

i30817 commented Jun 26, 2022

When I handle roman numerals, I want to convert them to integers inside mixed strings so that they compare cleanly in fuzzy matching. False positives (text that merely looks like a roman numeral) aren't a problem as long as both sides of the comparison are translated the same way. Since roman numerals are almost always upper case in my dataset, I only recognize upper-case ones.

So a method that translates every roman numeral sequence in a mixed string (there may be several) and leaves every non-roman character alone would be valuable.

I'm sure this can be done manually outside the library, by taking a list of roman numerals and looking ahead in the string to extract valid ones. But I'm worried that an algorithm built on a library like this would choke on 'illegal' roman numbers. Imagine passing 'IIII'. It might get translated to '13' or '31', but a library like this might also just throw an exception.

I'm asking for a permissive method, perhaps as an option, that does this kind of best-effort translation of mixed strings. The documentation should state that the method is deliberately permissive, that it is only best effort, and that illegal numbers will get mistranslated into two (or more) valid numbers.

Alternatively, a DIY method that accepts strings consisting only of roman numerals but allows 'illegal' sequences by splitting them into multiple numbers. (I'm not asking for all possible splits, mind you; since I want it for fuzzy matching, one deterministic split suffices, but someone else might want them all.)
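
For illustration, the permissive behaviour requested above could be sketched roughly as follows. This is a standalone sketch that does not use this library's API. The names `split_run` and `translate_mixed` are hypothetical, and the splitting rule (greedily peel off a prefix that a strict pattern accepts, then repeat) is just one deterministic choice among several. Only upper-case letters are treated as roman, as described in the request.

```python
import re

# Strict pattern for a single valid upper-case numeral (1..3999).
# Assumption: this sketch defines its own pattern instead of calling the library.
_VALID = re.compile(r"M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})")
# A "run" is any maximal stretch of upper-case roman letters.
_RUN = re.compile(r"[IVXLCDM]+")
_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def _to_int(numeral):
    """Convert one already-valid numeral to an integer."""
    total = 0
    prev = 0
    for ch in reversed(numeral):
        value = _VALUES[ch]
        # A smaller value to the left of a larger one is subtracted (IV, XC, ...).
        total += -value if value < prev else value
        prev = value
    return total


def split_run(run):
    """Split a roman-letters-only string into valid numerals, left to right.

    Hypothetical helper: 'illegal' input is not rejected, it simply yields
    several numbers, e.g. 'IIII' becomes [3, 1].
    """
    numbers = []
    pos = 0
    while pos < len(run):
        # Every roman letter is itself a valid numeral, so at least one
        # character is consumed on each pass.
        end = _VALID.match(run, pos).end()
        numbers.append(_to_int(run[pos:end]))
        pos = end
    return numbers


def translate_mixed(text):
    """Replace every run of roman letters with digits; leave the rest alone."""
    def replace(match):
        return "".join(str(n) for n in split_run(match.group(0)))
    return _RUN.sub(replace, text)


print(translate_mixed("Final Fantasy VII"))  # Final Fantasy 7
print(translate_mixed("Rocky IIII"))         # Rocky 31
```

Because no word boundaries are used, capital letters inside ordinary words (the `C` in `Civilization`, say) are translated too. That is the kind of false positive the request accepts; anchoring `_RUN` with `\b` on both sides would limit matches to whole words.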

@dataflake
Member

A PR is always welcome (but please sign the contributor agreement first, see https://www.zope.dev/developer/becoming-a-committer.html)
