Add linter checking remote_repository_url #1581
The value should be the git URL of the tool, i.e. it should share a common suffix with the path of the repository under consideration.
def lint_shed_remote_repository_url(realized_repository: "RealizedRepository", lint_ctx):
    path = realized_repository.real_path.rstrip(" /")
Why is this doing rstrip with a whitespace and a slash? Can you add a comment explaining this longest-common-suffix heuristic?
I appreciate the extra comments, but I'm still lost on what the while loop does, so I asked and this is what I got:
Current Implementation Issues
This code attempts to find a common suffix between a file path and a URL by iterating backwards through both strings. However, there are several problems:
Logic Error: The condition checks if characters match, but path[i:] captures everything from position i to the end, which grows longer as i becomes more negative. This doesn't correctly identify the longest common suffix.
String Comparison Confusion: Comparing individual characters at negative indices doesn't guarantee meaningful path segment matching. For example, /tool in a path might accidentally match ool in "school" in the URL.
Unclear Purpose: The docstring mentions checking for "common prefix" but the code looks for a suffix, creating confusion.
Weak Validation: Only checking for "/" in the common part is insufficient - it could match arbitrary substrings.
This was a replacement suggestion:
from pathlib import PurePosixPath

def lint_shed_remote_repository_url(realized_repository: "RealizedRepository", lint_ctx):
    """
    Verify that remote_repository_url contains the repository path as a suffix.
    Expected URL format: https://gitserver/organisation/tree/main/path
    where 'path' should match the repository's filesystem path.
    """
    path = PurePosixPath(realized_repository.real_path)
    remote_repository_url = realized_repository.config.get("remote_repository_url", "").rstrip(" /")
    if not remote_repository_url:
        return  # No URL to validate
    # Get path parts (segments) excluding empty strings
    path_parts = path.parts
    # Check if URL ends with a reasonable portion of the path
    # Look for at least 2 path segments to avoid false positives
    min_segments = min(2, len(path_parts))
    for i in range(len(path_parts) - min_segments + 1):
        suffix = "/".join(path_parts[i:])
        if remote_repository_url.endswith(suffix):
            # Found a match with at least min_segments
            return
    # If no match found, issue warning
    lint_ctx.warn(
        f"remote_repository_url may be incorrect: expected it to end with "
        f"repository path '{path}' or a significant portion of it"
    )
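Two details of the suggestion above are worth checking before adopting it. The short sketch below only exercises standard-library behaviour; the path and URL are made-up examples, not values from this repository. It shows that `PurePosixPath.parts` keeps the root `/` as its own element for absolute paths, and that `str.endswith` does not respect segment boundaries.

```python
from pathlib import PurePosixPath

# Hypothetical absolute checkout path, used only for illustration.
parts = PurePosixPath("/home/user/repo/tools/bwa").parts

# For an absolute path the first element is the root "/",
# so joining all parts with "/" yields a doubled leading slash.
joined = "/".join(parts)

# endswith() compares raw characters, so a segment that merely ends
# with "tools" (here "xtools") still counts as a match.
loose_match = "https://example.org/org/xtools/bwa".endswith("tools/bwa")

print(parts[0], joined, loose_match)
```

Neither point changes the warning for typical inputs, but both argue for comparing whole segments rather than joined strings.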
Logic Error: The condition checks if characters match, but path[i:] captures everything from position i to the end, which grows longer as i becomes more negative. This doesn't correctly identify the longest common suffix.
This is why I'm not yet convinced of AI :) Of course, checking the last, second-to-last, third-to-last, ... characters for equality will determine the longest common suffix. Even though efficiency is not relevant here, note that this is also more efficient than repeatedly constructing candidate suffixes and comparing them (O(n) vs. O(n^2)) ... but I should move longest_common_suffix = path[i:] to the else branch :)
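For readers following along, the character-wise scan described here can be sketched as follows. This is not the code from the PR; the function name and the example path and URL are assumptions chosen for illustration. It walks backwards while the characters agree and slices once at the end, which keeps the scan linear.

```python
def longest_common_suffix(a: str, b: str) -> str:
    """Return the longest string that both a and b end with."""
    n = 0
    limit = min(len(a), len(b))
    # Compare the last, second-to-last, ... characters until they differ.
    while n < limit and a[-1 - n] == b[-1 - n]:
        n += 1
    # Slice once after the loop; len(a) - n also handles n == 0 correctly.
    return a[len(a) - n:]


# Hypothetical inputs: a local checkout path and a remote URL.
suffix = longest_common_suffix(
    "/home/user/repo/tools/bwa",
    "https://example.org/org/repo/tree/main/tools/bwa",
)
print(suffix)
```

The result is a raw character suffix, so it can start in the middle of a path segment; that is the limitation the segment-based idea addresses.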
String Comparison Confusion: ...
Unclear Purpose: ...
Weak Validation: ...
This is why I still make use of it: checking for the longest common suffix of path segments is indeed a better idea.
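A minimal sketch of that segment-based variant, under stated assumptions: the helper name `common_suffix_segments` is hypothetical, the URL is split with `urllib.parse.urlparse`, and the example inputs are invented. It is not the implementation that ended up in the PR.

```python
from pathlib import PurePosixPath
from typing import List
from urllib.parse import urlparse


def common_suffix_segments(path: str, url: str) -> List[str]:
    """Return the trailing path segments shared by a local path and a URL."""
    # Drop the root element ("/" or "//") that PurePosixPath keeps for absolute paths.
    path_parts = [p for p in PurePosixPath(path).parts if p.strip("/")]
    # Split only the path component of the URL and ignore empty segments.
    url_parts = [p for p in urlparse(url).path.split("/") if p]
    common: List[str] = []
    # Walk both segment lists from the end until they disagree.
    for p, u in zip(reversed(path_parts), reversed(url_parts)):
        if p != u:
            break
        common.append(p)
    common.reverse()
    return common


shared = common_suffix_segments(
    "/home/user/repo/tools/bwa",
    "https://example.org/org/repo/tree/main/tools/bwa/",
)
print(shared)
```

Because whole segments are compared, a path ending in `tool` no longer matches a URL ending in `school`; a linter could then warn when the returned list is empty or shorter than some chosen threshold.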