`INVAL`/`OBJECT_NAME_INVALID` in `fs` APIs incorrectly marked as unreachable
#15607
Comments
…aths

There are many different types of Windows paths, and there are a few different possible namespaces on top of that. Before this commit, NT namespaced paths were somewhat supported, and for Win32 paths (those without a namespace prefix), only relative and drive absolute paths were supported. After this commit, all of the following are supported:

- Device namespaced paths (`\\.\`)
- Verbatim paths (`\\?\`)
- NT-namespaced paths (`\??\`)
- Relative paths (`foo`)
- Drive-absolute paths (`C:\foo`)
- Drive-relative paths (`C:foo`)
- Rooted paths (`\foo`)
- UNC absolute paths (`\\server\share\foo`)
- Root local device paths (`\\.` or `\\?` exactly)

Plus:

- Any of the path types and namespace types can be mixed and matched together as appropriate.
- All of the `std.os.windows.*ToPrefixedFileW` functions will accept any path type, prefixed or not, and do the appropriate thing to convert them to an NT-prefixed path if necessary.

This is achieved by making the `std.os.windows.*ToPrefixedFileW` functions behave like `ntdll.RtlDosPathNameToNtPathName_U`, but with a few differences:

- Does not allocate on the heap (this is why we can't use `ntdll.RtlDosPathNameToNtPathName_U` directly, it does internal heap allocation).
- Relative paths are kept as relative unless they contain too many `..` components, in which case they are treated as 'drive relative' and resolved against the CWD (this is how it behaved before this commit as well).
- Special case device names like COM1, NUL, etc are not handled specially (TODO)
- `.` and space are not stripped from the end of relative paths (potential TODO)

Most of the non-trivial conversion of non-relative paths is done via `ntdll.RtlGetFullPathName_U`, which AFAIK is used internally by `ntdll.RtlDosPathNameToNtPathName_U`.

Some relevant reading on Windows paths:

- https://googleprojectzero.blogspot.com/2016/02/the-definitive-guide-on-win32-to-nt.html
- https://chrisdenton.github.io/omnipath/Overview.html

Closes #8205
Might close (untested) #12729

Note:

- This removes checking for illegal characters in `std.os.windows.sliceToPrefixedFileW`, since the previous solution (iterate the whole string and error if any illegal characters were found) was naive and won't work for all path types. This is further complicated by things like file streams (where `:` is used as a delimiter, e.g. `file.ext:stream_name:$DATA`) and things in the device namespace (where a path like `\\.\GLOBALROOT\??\UNC\localhost\C$\foo` is valid despite the `?`s in the path and is effectively equivalent to `C:\foo`). Truly validating paths is complicated and would need to be tailored to each path type. The illegal character checking being removed may open up users to more instances of hitting `OBJECT_NAME_INVALID => unreachable` when using `fs` APIs.
  + This is related to #15607
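The path types listed in the commit message can largely be told apart by their leading characters. The following is a rough C sketch of that idea only, not the logic used by `std.os.windows.*ToPrefixedFileW`. The enum, its member names, and `classify_win_path` are hypothetical, and the sketch assumes `\` is the only separator (it does not treat `/` as one).

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical names for the path kinds listed in the commit message. */
enum win_path_kind {
    PATH_RELATIVE,          /* foo                 */
    PATH_DRIVE_RELATIVE,    /* C:foo               */
    PATH_DRIVE_ABSOLUTE,    /* C:\foo              */
    PATH_ROOTED,            /* \foo                */
    PATH_UNC_ABSOLUTE,      /* \\server\share\foo  */
    PATH_DEVICE,            /* \\.\...             */
    PATH_VERBATIM,          /* \\?\...             */
    PATH_NT,                /* \??\...             */
    PATH_ROOT_LOCAL_DEVICE  /* exactly \\. or \\?  */
};

/* Classify a path by its prefix only.
 * Assumption: '\\' is the only separator; '/' is ignored in this sketch. */
enum win_path_kind classify_win_path(const char *p)
{
    assert(p != NULL);
    size_t n = strlen(p);

    /* NT namespace prefix: \??\ */
    if (n >= 4 && memcmp(p, "\\??\\", 4) == 0)
        return PATH_NT;

    /* Everything that starts with two backslashes. */
    if (n >= 2 && p[0] == '\\' && p[1] == '\\') {
        if (n >= 3 && (p[2] == '.' || p[2] == '?')) {
            if (n == 3)
                return PATH_ROOT_LOCAL_DEVICE;
            if (p[3] == '\\')
                return p[2] == '.' ? PATH_DEVICE : PATH_VERBATIM;
        }
        return PATH_UNC_ABSOLUTE;
    }

    /* A single leading backslash: rooted on the current drive. */
    if (n >= 1 && p[0] == '\\')
        return PATH_ROOTED;

    /* Drive letter followed by ':' and, optionally, a separator. */
    if (n >= 2 && isalpha((unsigned char)p[0]) && p[1] == ':')
        return (n >= 3 && p[2] == '\\') ? PATH_DRIVE_ABSOLUTE
                                        : PATH_DRIVE_RELATIVE;

    return PATH_RELATIVE;
}
```

Classifying by prefix says nothing about whether the rest of the path is valid, which is the part the commit message describes as needing per-type handling.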
Coming back to this from #14533, I don't see the benefits the Given that OS APIs must internally perform validation, and they have no issues recovering from failing to do so, why should Zig force its users to introduce redundant validation logic in the form of a The broader design question in this context is: how should we handle a generic As @squeek502 pointed out, it's impossible to perform path validation in advance on POSIX systems, and it's easy to find many similar instances of |
Contributes to ziglang#15607

Although the case is not handled in `openatWasi` (as I could not get a working wasi environment to test the change) I have added a FIXME addressing it and linking to the issue.
I think this can be closed since #19833 was merged?
Currently, Zig treats passing invalid paths to `fs`-related APIs as programmer error, meaning that the APIs treat returns like `INVAL` or `OBJECT_NAME_INVALID` as `unreachable`. I've (tentatively) come to the conclusion that this strategy is untenable, and that these errors should be treated as reachable.

From #15382 (comment):
Here's my attempt at a summary of whether or not such a path validation function is possible per-platform:

- `" * / : < > ? \ |` among other things. That is, these disallowed filenames can be created on NTFS filesystems if the file is created from Linux or if `FILE_FLAG_POSIX_SEMANTICS` is used when calling the Windows APIs (see Naming Files, Paths, and Namespaces and CreateFileW).
- `/` and `NUL` are disallowed on the common filesystems, but it's the underlying filesystem limitations that ultimately matter. The same call to something like `openat` may or may not hit `INVAL` depending on the underlying filesystem (from `man openat`: `EINVAL O_CREAT was specified in flags and the final component ("basename") of the new file's pathname is invalid (e.g., it contains characters not permitted by the underlying filesystem)`).

Here's an example that demonstrates the problem on Linux:
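A minimal sketch of such a call, written here in C directly against `openat` rather than through Zig's `fs` APIs (that choice is an assumption made for illustration). The helper `try_create`, the `create_result` enum, and a file name such as `foo|bar` are hypothetical. The point is that `EINVAL` is returned to the caller as an ordinary outcome instead of being treated as impossible:

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical result type: EINVAL is a normal, reportable outcome. */
enum create_result {
    CREATE_OK,
    CREATE_BAD_PATH_NAME, /* openat failed with EINVAL */
    CREATE_OTHER_ERROR    /* any other errno value */
};

/* Try to create `name` inside the directory referred to by `dirfd`.
 * Whether EINVAL occurs for a given name depends on the filesystem
 * backing `dirfd`, so it is reported rather than assumed away. */
enum create_result try_create(int dirfd, const char *name)
{
    assert(name != NULL);

    int fd = openat(dirfd, name, O_CREAT | O_WRONLY, 0644);
    if (fd >= 0) {
        close(fd);
        return CREATE_OK;
    }
    return errno == EINVAL ? CREATE_BAD_PATH_NAME : CREATE_OTHER_ERROR;
}
```

A caller would pass `AT_FDCWD` as `dirfd` and a name such as `"foo|bar"`, then branch on the result.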
Works fine when run on an `ext4` fs:

But when run on a `vfat` fs (that disallows `|` characters at the filesystem level):

I'm unsure if there's a way to know in advance what the underlying filesystem is, but I'm assuming there isn't, and that this means that this `unreachable` is in fact inherently reachable.

Related to:
Further reading: