python: Add extract filter for tarfile.extractall #3340
07d4db6
to
2464589
Compare
Should the (few) other uses be fixed as well?
|
Since we can't be sure that the user's Python version is the latest (i.e. one with the backport), the call should at least be wrapped in a try/except clause that falls back to the supported call. Perhaps make a wrapper for more general use? |
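A rough sketch of what such a wrapper could look like, assuming the fallback is triggered by the `TypeError` that older interpreters raise for the unknown `filter` keyword. The name `extract_tar` is hypothetical and is not taken from this PR:

```python
import tarfile


def extract_tar(archive, path=".", members=None):
    """Hypothetical helper: extract with the 'data' filter when it is supported."""
    try:
        # Python 3.12+, and the security releases of 3.8 to 3.11 that carry
        # the backport, accept the `filter` keyword.
        archive.extractall(path=path, members=members, filter="data")
    except TypeError:
        # Older interpreters reject the unknown keyword before extracting
        # anything, so falling back to the legacy call is safe to retry.
        archive.extractall(path=path, members=members)
```

Callers would then write `extract_tar(tar, "/some/dir")` instead of calling `tar.extractall(...)` directly. One caveat of this approach is that a `TypeError` raised for an unrelated reason would also trigger the fallback.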
I’ll copy the full alert here:

Extracting files from a malicious tar archive without validating that the destination file path is within the destination directory can cause files outside the destination directory to be overwritten, due to the possible presence of directory traversal elements ([…]

Tar archives contain archive entries representing each file in the archive. These entries include a file path for the entry, but these file paths are not restricted and may contain unexpected special elements such as the directory traversal element ([…]

For example, if a tar archive contains a file entry […]

Recommendation

Ensure that output paths constructed from tar archive entries are validated to prevent writing files to unexpected locations.

The recommended way of writing an output file from a tar archive entry is to check that "[…]

Example

In this example an archive is extracted without validating file paths. If archive.tar contained relative paths (for instance, if it were created by something like […]

    import sys
    import tarfile
    with tarfile.open(sys.argv[1]) as tar:
        #BAD : This could write any file on the filesystem.
        for entry in tar:
            tar.extract(entry, "/tmp/unpack/")

To fix this vulnerability, we need to check that the path does not contain any […]

    import sys
    import tarfile
    import os.path
    with tarfile.open(sys.argv[1]) as tar:
        for entry in tar:
            #GOOD: Check that entry is safe
            if os.path.isabs(entry.name) or ".." in entry.name:
                raise ValueError("Illegal tar archive entry")
            tar.extract(entry, "/tmp/unpack/")

References

Snyk: Zip Slip Vulnerability. |
Yes they should |
To really fix that specific issue reported, it should probably be something like the loop of the example. But using the 'data' filter solves way more. |
(I have opened the current macOS |
Is there some usages of
https://www.python.org/downloads/release/python-3817/ -> no installers are made since 3.8.10 (source only as in security fixes only)
https://www.python.org/downloads/release/python-3917/ -> no installers since 3.9.13
https://www.python.org/downloads/release/python-31012/ -> no installers since 3.10.11
https://www.python.org/downloads/release/python-3114/ -> installers available |
... not (yet?):
|
I didn't forget about this one, it's just a little too involved to do correctly only on a phone in the evening. It's on my weekend list. |
173022d
to
936760b
Compare
I think a good summary for this PR (maybe the final commit message contents) could be:
|
936760b
to
d933b65
Compare
…herwise and use old behavior See https://peps.python.org/pep-0706/#backporting-forward-compatibility for examples
d933b65
to
29b16ed
Compare
I took a look at how the changes were implemented and first suggested, and at what other changes were made elsewhere to use this security backport. I found a couple of issues/PRs by the author of the CPython fix, or reviewed by him. One of these is pypa/build#675, which had a comment with a suggestion here pypa/build#675 (comment), but I didn't like either of them; they are quite long. pypa/pip#12214 is even longer. Another reference: https://packaging.python.org/en/latest/specifications/source-distribution-format/#unpacking-with-the-data-filter
But then I found PEP 706, which was written for this issue and suggests some usages:
On one file, I used debug instead of warning, since there was a debug function that wrapped grass.script to prevent circular dependencies. |
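For readers who have not opened the PEP, the forward-compatibility idea from its backporting section looks roughly like the sketch below. This is not the code from this PR: the helper name `extractall_compat` is hypothetical, and the standard `logging` module stands in for whichever warning or debug function a given file uses:

```python
import logging
import tarfile

logger = logging.getLogger(__name__)


def extractall_compat(tar, path="."):
    """Hypothetical helper: use the 'data' filter if this Python provides it."""
    if hasattr(tarfile, "data_filter"):
        # Extraction filters are available (3.12, or a release with the backport).
        tar.extractall(path=path, filter="data")
    else:
        # Old behavior; this branch can go once every supported Python has filters.
        logger.warning("Extracting may be unsafe; consider updating Python")
        tar.extractall(path=path)
```

Checking for `tarfile.data_filter` up front means the code never relies on catching an exception to detect support, and the warning makes the unfiltered fallback visible to the user.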
It seems only r.unpack has tests that use any of the files I changed. But they are not found by gunittest, since the shell script and data are in a |
Ready to review! |
Nice! Good solution!
Addressing security-related issues is important in general, even if the risk of exploitation seems very low, especially if there is such a good fix available.
Fixed a CodeQL CWE-22 type issue (named Arbitrary file write during tarfile extraction), found when running an extended analysis (instead of only default).
The solution suggested in the alert (extracting files one by one) didn’t seem appropriate, but I found that this issue was fixed in Python 3.12 and backported to versions 3.8 through 3.11. The extraction filter chosen is the strictest one: since the archives that are supposed to be extracted are purely data, we do not expect any rare or advanced behaviour.
https://docs.python.org/3/library/tarfile.html#extraction-filters
https://docs.python.org/3/library/tarfile.html#tarfile.TarFile.extractall