Skip to content

Conversation

@Kiuh
Copy link

@Kiuh Kiuh commented Aug 21, 2025

User description

When use domonic, first import emits warning with new python version.


PR Type

Bug fix


Description

  • Fix regex pattern to use raw string literal

  • Resolve Python deprecation warning for invalid escape sequence


Diagram Walkthrough

flowchart LR
  A["Regex pattern with escape sequence"] --> B["Raw string literal"] 
  B --> C["Warning resolved"]
Loading

File Walkthrough

Relevant files
Bug fix
dom.py
Fix regex escape sequence warning                                               

domonic/dom.py

  • Convert regex pattern string to raw string literal
  • Fix invalid escape sequence warning in class name matching
+1/-1     

@qodo-merge-pro
Copy link

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 1 🔵⚪⚪⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Regex Robustness

The regex concatenates user-provided class names without escaping. Class names with regex metacharacters (e.g., '+', '.', '[]') will alter matching. Consider wrapping class_name with re.escape() to ensure literal matching.

    r"(^|\s)" + class_name + r"(\s|$)", fnd.getAttribute("class")
):
Word Boundary Accuracy

Using (\s|$) may miss cases with non-space separators or multiple spaces; consider splitting the class attribute on whitespace and comparing tokens, or using a more precise regex like r'(?:(?<=^)|(?<=\s))' and r'(?=\s|$)'. Token-based comparison is simpler and faster.

    r"(^|\s)" + class_name + r"(\s|$)", fnd.getAttribute("class")
):

@qodo-merge-pro
Copy link

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Escape user-supplied regex input

Escape class_name before interpolating it into the regex to avoid unintended
regex metacharacter interpretation and potential ReDoS on crafted inputs. Use
re.escape around class_name to ensure literal matching of class names.

domonic/dom.py [2210-2213]

 if fnd.getAttribute("class") and re.search(
-    r"(^|\s)" + class_name + r"(\s|$)", fnd.getAttribute("class")
+    r"(^|\s)" + re.escape(class_name) + r"(\s|$)", fnd.getAttribute("class")
 ):
     context.append(fnd)
  • Apply / Chat
Suggestion importance[1-10]: 9

__

Why: The suggestion correctly identifies a potential security vulnerability (ReDoS) and correctness issue where class_name, which can contain user-controlled characters, is not escaped before being used in a regular expression.

High
  • More

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant