parsing tags on word boundaries #319

MatthewBaggins · 2023-08-12T14:09:01Z

closes #318

Aprillion · 2023-08-12T21:37:16Z

utilities/question_query_utils.py

@@ -59,7 +59,7 @@ def parse_tag(text: str) -> Optional[str]:
        return None
    tag_val = match.group(1)
    tag_pat = _tag_pat.replace(r"\s", " ")
-    tag_idx = tag_pat.lower().find(tag_val.lower())
+    tag_idx = tag_pat.lower().find(rf"\b{tag_val.lower()}\b") + 2


what's the + 2 part about, please?

it's to offset the index so that it points at the start of the tag inside the pattern string
the literal "\b" in front of the tag is 2 characters long

(inb4: I considered getting rid of the regex and just iterating over all the tags, returning the first one that occurs in the string (if any). But in that case we wouldn't be able to match on word boundaries, so the simplest reliable approach is to keep the regex and leave it as is.)

huh? \b is an escape sequence that matches zero characters at a word boundary, so the length of the regex input shouldn't matter, only the match, no? or what am I missing here?

oh, it's a non-regex find inside a string that represents a regex for future match 🤦
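For anyone else tripping over this, here is a minimal sketch of what the `+ 2` does. The value of `tag_pat` below is a made-up stand-in (the real `_tag_pat` is not shown in this thread), and `find_tag_start` is a hypothetical helper, not something in the codebase.

```python
from typing import Optional

# Hypothetical stand-in for `_tag_pat` after the `\s` -> " " replacement.
tag_pat = r"(\bmain\b|\bai\b)"
tag_val = "ai"

# Plain substring search: lands on the "ai" inside "main".
naive_idx = tag_pat.lower().find(tag_val.lower())

# Searching for the literal text `\bai\b` only hits the standalone tag.
# The hit starts at the backslash, so skip the two characters of `\b`.
needle = rf"\b{tag_val.lower()}\b"
tag_idx = tag_pat.lower().find(needle) + 2


def find_tag_start(pattern: str, tag: str) -> Optional[int]:
    """Same lookup, but report a miss as None (str.find returns -1 on a miss)."""
    pos = pattern.lower().find(rf"\b{tag.lower()}\b")
    return None if pos == -1 else pos + 2
```

With this stand-in pattern, `naive_idx` points into `main` while `tag_idx` points at the standalone `ai`. One thing the sketch makes visible: if the tag is missing, `find(...) + 2` evaluates to 1 rather than -1, so a caller that checks for -1 would need a guard like the one in `find_tag_start`.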

I'm not sure why we have a list of tags in the format of a string that represents regex, but if it's not possible to use an actual list of tags like ['tag1', 'tag2'], then I guess the code is fine...
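As a rough illustration of the list idea, assuming the tags could be kept as plain strings: word-boundary matching still works if each tag is wrapped in `\b` at match time. `TAGS` and `find_first_tag` are hypothetical names for this sketch, and it still uses a regex per tag, so it is not the regex-free variant mentioned above.

```python
import re
from typing import Optional, Sequence

# Hypothetical tag list; the real tags live in `_tag_pat`.
TAGS = ["main", "ai"]


def find_first_tag(text: str, tags: Sequence[str] = TAGS) -> Optional[str]:
    """Return the first tag (in list order) that occurs in text as a whole word."""
    for tag in tags:
        # re.escape keeps any regex metacharacters in a tag literal.
        if re.search(rf"\b{re.escape(tag)}\b", text, flags=re.IGNORECASE):
            return tag
    return None
```

Here `find_first_tag("tell me about AI safety")` would pick `ai`, while a word like "maintain" matches neither tag because neither occurrence sits on word boundaries.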

parsing tags on word boundaries

92fde0f

MatthewBaggins requested review from Aprillion, Lovkush-A and jknowak August 12, 2023 14:09

Aprillion reviewed Aug 12, 2023

Aprillion approved these changes Aug 16, 2023

MatthewBaggins merged commit 997d3df into master Aug 16, 2023
2 checks passed
