-
-
Notifications
You must be signed in to change notification settings - Fork 915
fix: Replace os.path with pathlib.Path for cross-platform compatibility #1137
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
fix: Replace os.path with pathlib.Path for cross-platform compatibility #1137
Conversation
- Replace os.path.join, os.path.normpath, os.path.basename with pathlib equivalents - Update tests/conftest.py to use pathlib.Path instead of os.path functions - Modify nettacker/core/graph.py to use pathlib for path operations - Update nettacker/api/core.py and nettacker/api/engine.py to use Path objects - Fix tests/test_yaml_regexes.py to use pathlib.Path.name instead of os.path.basename - Remove obsolete os.path patches from tests/core/test_graph.py This ensures the application works consistently across Windows and Unix-like systems by eliminating platform-specific path assumptions throughout the codebase. Fixes OWASP#933
Summary by CodeRabbit
WalkthroughReplaces os.path usage with pathlib.Path across API, core graph logic, and tests. Updates path resolution, validation, MIME type/suffix handling, and file name extraction. Adjusts tests to use Path, remove path mocks, and simplify signatures. No production public API changes; some test function signatures updated. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
nettacker/api/core.py (1)
120-127
: Fix path traversal check: startswith on strings is unsafe; use Path.is_relative_to + strict resolve.A crafted path like “…/web_static_dir_evil/…” can pass a string prefix check. Use pathlib semantics and fail closed.
- if not str(Path(filename).resolve()).startswith(str(Config.path.web_static_dir.resolve())): - abort(404) - try: - return open(filename, "rb").read() - except ValueError: - abort(404) - except IOError: - abort(404) + base = Config.path.web_static_dir.resolve() + try: + target = Path(filename).resolve(strict=True) + except FileNotFoundError: + abort(404) + if not target.is_relative_to(base): + abort(404) + try: + return target.read_bytes() + except OSError: + abort(404)nettacker/core/graph.py (1)
380-398
: Bug: fullpath is a Path; len()/slicing like fullpath[-5:] will raise. Use suffix/suffixes.This will currently throw TypeError at runtime. Switch to pathlib suffix logic and handle “.dd.json”.
- if (len(fullpath) >= 5 and fullpath[-5:] == ".html") or ( - len(fullpath) >= 4 and fullpath[-4:] == ".htm" - ): + suffix = fullpath.suffix.lower() + suffixes = [s.lower() for s in fullpath.suffixes] + if suffix in (".html", ".htm"): html_report = build_compare_report(compare_results) - with Path(fullpath).open("w", encoding="utf-8") as compare_report: + with fullpath.open("w", encoding="utf-8") as compare_report: compare_report.write(html_report + "\n") - elif len(fullpath) >= 5 and fullpath[-5:] == ".json": - with Path(fullpath).open("w", encoding="utf-8") as compare_report: + elif suffixes[-2:] == [".dd", ".json"]: + with fullpath.open("w", encoding="utf-8") as compare_report: + compare_report.write(str(json.dumps(compare_results)) + "\n") + elif suffix == ".json": + with fullpath.open("w", encoding="utf-8") as compare_report: compare_report.write(str(json.dumps(compare_results)) + "\n") - elif len(fullpath) >= 5 and fullpath[-4:] == ".csv": + elif suffix == ".csv": keys = compare_results.keys() - with Path(fullpath).open("a") as csvfile: + with fullpath.open("a") as csvfile: writer = csv.DictWriter(csvfile, fieldnames=keys) if csvfile.tell() == 0: writer.writeheader() writer.writerow(compare_results) else: - with Path(fullpath).open("w", encoding="utf-8") as compare_report: + with fullpath.open("w", encoding="utf-8") as compare_report: compare_report.write(create_compare_text_table(compare_results))tests/core/test_graph.py (1)
246-259
: Fix the PermissionError test to actually exercise the guard.With sanitize_path in place and a simple filename, no PermissionError will be raised. Patch sanitize_path to a pass-through and send a traversal path.
-@patch("nettacker.core.graph.get_logs_by_scan_id") -@patch("nettacker.core.graph.get_options_by_scan_id") -def test_permission_error(mock_opts, mock_logs): +@patch("nettacker.core.graph.get_logs_by_scan_id") +@patch("nettacker.core.graph.get_options_by_scan_id") +@patch("nettacker.core.graph.sanitize_path", side_effect=lambda s: s) # disable sanitization +def test_permission_error(mock_sanitize, mock_opts, mock_logs): @@ - with pytest.raises(PermissionError): - create_compare_report(DummyOptions("scan-comp", "report.html"), "scan-1") + with pytest.raises(PermissionError): + # attempt directory traversal should be blocked by is_relative_to + create_compare_report(DummyOptions("scan-comp", "../../outside/report.html"), "scan-1")
🧹 Nitpick comments (4)
tests/test_yaml_regexes.py (1)
15-18
: Deterministic, file-only discovery (use glob + sort) and keep Path objects.Ensure stable test collection order across platforms and skip non-files. Also, you can yield Path objects directly and let callers accept Path (open() already supports it).
- base_path = Path(base) - for file in base_path.iterdir(): - if file.suffix == ".yaml": - yield str(file) + base_path = Path(base) + if not base_path.is_dir(): + continue + for file in sorted(base_path.glob("*.yaml")): + if file.is_file(): + yield file # callers (open) accept Pathnettacker/api/engine.py (1)
392-396
: Use Path for filename in Content‑Disposition.Split on "/" is not cross‑platform; prefer Path(...).name and quote.
- headers={"Content-Disposition": "attachment;filename=" + filename.split("/")[-1]}, + headers={"Content-Disposition": f'attachment; filename="{Path(filename).name}"'},tests/conftest.py (1)
2-2
: Slightly safer path resolution for test harness.Resolve the project_root too to avoid surprises when invoked from subdirs; otherwise looks good.
-project_root = Path(__file__).parent.parent +project_root = Path(__file__).resolve().parent.parentAlso applies to: 4-6, 8-9
nettacker/core/graph.py (1)
285-287
: Tiny clean-up: avoid redundant str() around json.dumps.json.dumps already returns str.
- with Path(report_path_filename).open("w", encoding="utf-8") as report_file: - report_file.write(str(json.dumps(all_scan_logs)) + "\n") + with Path(report_path_filename).open("w", encoding="utf-8") as report_file: + report_file.write(json.dumps(all_scan_logs) + "\n")
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
nettacker/api/core.py
(2 hunks)nettacker/api/engine.py
(4 hunks)nettacker/core/graph.py
(1 hunks)tests/conftest.py
(1 hunks)tests/core/test_graph.py
(1 hunks)tests/test_yaml_regexes.py
(3 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py
: Use 4-space indents in Python code
Limit lines to 99 characters (ruff/ruff-format/isort profile=black)
Module and file names should use lower_snake_case
Function and variable names should use lower_snake_case
Class names should use PascalCase
Constants should use UPPER_SNAKE_CASE
Keep functions small and add type hints where practical
Files:
nettacker/api/core.py
nettacker/api/engine.py
tests/test_yaml_regexes.py
nettacker/core/graph.py
tests/conftest.py
tests/core/test_graph.py
nettacker/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Add docstrings for public APIs in the nettacker package
Files:
nettacker/api/core.py
nettacker/api/engine.py
nettacker/core/graph.py
nettacker/api/**
📄 CodeRabbit inference engine (AGENTS.md)
Place API-related code under nettacker/api/
Files:
nettacker/api/core.py
nettacker/api/engine.py
tests/**/test_*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/test_*.py
: Place tests under tests/ and name files as test_*.py
Use pytest (with pytest-asyncio, xdist) for tests
Files:
tests/test_yaml_regexes.py
tests/core/test_graph.py
tests/**
📄 CodeRabbit inference engine (AGENTS.md)
Mirror package layout in tests (e.g., tests/core/, tests/lib/, etc.)
Files:
tests/test_yaml_regexes.py
tests/conftest.py
tests/core/test_graph.py
nettacker/core/**
📄 CodeRabbit inference engine (AGENTS.md)
Place core libraries under nettacker/core/
Files:
nettacker/core/graph.py
🧠 Learnings (1)
📚 Learning: 2025-09-07T19:20:58.332Z
Learnt from: CR
PR: OWASP/Nettacker#0
File: AGENTS.md:0-0
Timestamp: 2025-09-07T19:20:58.332Z
Learning: Applies to nettacker.py : Repository provides nettacker.py as an entry script (use as Python entry point)
Applied to files:
tests/conftest.py
🧬 Code graph analysis (3)
nettacker/api/core.py (1)
nettacker/config.py (1)
Config
(183-187)
nettacker/api/engine.py (2)
nettacker/api/core.py (2)
get_file
(110-127)mime_types
(30-107)nettacker/config.py (1)
Config
(183-187)
nettacker/core/graph.py (1)
nettacker/core/utils/common.py (1)
sanitize_path
(430-445)
🔇 Additional comments (7)
tests/test_yaml_regexes.py (2)
2-2
: LGTM on pathlib import.
81-81
: Cross‑platform basename extraction looks good.nettacker/api/core.py (1)
1-1
: LGTM on pathlib migration import.nettacker/api/engine.py (3)
7-7
: LGTM on pathlib import here.
188-192
: Static handler aligns with new get_file; OK once core guard is fixed.Please confirm large/binary files won’t exhaust memory via Response(get_file(...)) since it loads whole file; consider send_file/conditional responses later if needed.
223-237
: Sanitization flow is sound — Python requirement satisfied. pyproject.toml sets python = "^3.9, <3.13", so Path.is_relative_to (requires 3.9+) is supported; no change required.tests/core/test_graph.py (1)
187-214
: OK: Path.open patching and removal of os.path mocks.Tests align with Path-based implementation.
base_path = nettacker_path_config.results_dir | ||
compare_report_path_filename = sanitize_path(compare_report_path_filename) | ||
fullpath = os.path.normpath(os.path.join(base_path, compare_report_path_filename)) | ||
fullpath = (base_path / compare_report_path_filename).resolve() | ||
|
||
if not fullpath.startswith(base_path): | ||
if not str(fullpath).startswith(str(base_path.resolve())): | ||
raise PermissionError | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Path traversal check: replace string prefix with Path.is_relative_to.
Using startswith on strings is brittle and can be bypassed with common prefixes.
- base_path = nettacker_path_config.results_dir
+ base_path = nettacker_path_config.results_dir
compare_report_path_filename = sanitize_path(compare_report_path_filename)
- fullpath = (base_path / compare_report_path_filename).resolve()
-
- if not str(fullpath).startswith(str(base_path.resolve())):
- raise PermissionError
+ fullpath = (base_path / compare_report_path_filename).resolve()
+ if not fullpath.is_relative_to(base_path.resolve()):
+ raise PermissionError
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
base_path = nettacker_path_config.results_dir | |
compare_report_path_filename = sanitize_path(compare_report_path_filename) | |
fullpath = os.path.normpath(os.path.join(base_path, compare_report_path_filename)) | |
fullpath = (base_path / compare_report_path_filename).resolve() | |
if not fullpath.startswith(base_path): | |
if not str(fullpath).startswith(str(base_path.resolve())): | |
raise PermissionError | |
base_path = nettacker_path_config.results_dir | |
compare_report_path_filename = sanitize_path(compare_report_path_filename) | |
fullpath = (base_path / compare_report_path_filename).resolve() | |
if not fullpath.is_relative_to(base_path.resolve()): | |
raise PermissionError |
🤖 Prompt for AI Agents
In nettacker/core/graph.py around lines 373 to 379, the path traversal check
currently compares string prefixes using startswith which is brittle; replace
that logic with Path.is_relative_to to robustly verify that fullpath is inside
base_path (after resolving both), e.g. resolve both paths then use
fullpath.is_relative_to(base_path.resolve()) and raise PermissionError if that
returns False; keep sanitization and resolve steps but remove the string
startswith check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works, @PreistlyPython do look into this
base_path = nettacker_path_config.results_dir | ||
compare_report_path_filename = sanitize_path(compare_report_path_filename) | ||
fullpath = os.path.normpath(os.path.join(base_path, compare_report_path_filename)) | ||
fullpath = (base_path / compare_report_path_filename).resolve() | ||
|
||
if not fullpath.startswith(base_path): | ||
if not str(fullpath).startswith(str(base_path.resolve())): | ||
raise PermissionError | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works, @PreistlyPython do look into this
|
||
@patch("nettacker.core.graph.get_logs_by_scan_id") | ||
@patch("nettacker.core.graph.get_options_by_scan_id") | ||
@patch("nettacker.core.graph.os.path.normpath", side_effect=lambda x: "/etc/passwd") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You cannot remove both these mocks. In fact, you probably shouldn't remove them. This is because either of them is enough to trigger the PermissionError but if we remove both mocks then this test will fail and we'll never raise the error and unchecked code goes through.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will look into this further tonight and make updates accordingly. Thanks for the response
- Replace string startswith() with Path.is_relative_to() for secure path validation - Fix runtime TypeError where Path object was treated as string in file extension checks - Use proper pathlib.Path.suffix for file extension detection - Improve Content-Disposition header with proper filename extraction - Remove redundant str() call around json.dumps() Security improvements: - Prevents path traversal attacks using modern pathlib semantics - Uses resolve(strict=True) to ensure target file exists before validation - Replaces brittle string prefix checks with robust path relationship validation Bug fixes: - Fixes TypeError in report generation where len(Path_object) would fail - Handles complex file extensions like .dd.json correctly - Maintains backward compatibility while using modern Python patterns
Hey! I've gone through the automated security review feedback and addressed all the issues that were flagged: Security fixes:
Test compatibility:
Other improvements:
The old approach was vulnerable because something like "../web_static_dir_evil/file" could potentially bypass the string prefix check, but Path.is_relative_to() handles this correctly. # Before (vulnerable)
if not str(target).startswith(str(base)):
# After (secure)
if not target.is_relative_to(base): Everything should be backward compatible and all tests should pass now. Let me know if you spot any other issues! @arkid15r @securestep9 - this should be ready for another look |
- Add sanitize_path mock to ensure PermissionError test can trigger properly - Mock returns '../../etc/passwd' to simulate path traversal attempt - Preserves test intent while adapting to new pathlib.Path approach - Addresses reviewer concern about removed os.path mocks
Summary
This PR addresses issue #933 by refactoring the code to use
pathlib.Path
exclusively for all path operations, ensuring OS-agnostic behavior across Windows, macOS, and Linux.Changes
os.path
usage withpathlib.Path
in 6 core filestests/conftest.py
nettacker/core/graph.py
nettacker/api/core.py
andnettacker/api/engine.py
os.path
patches from test filesKey Improvements
Testing
os.path
referencesFixes #933
🤖 Generated with Claude Code
Co-Authored-By: Claude [email protected]