Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows: Support UNC, rooted, drive relative, and namespaced/device paths #15768

Merged
merged 1 commit into from
May 29, 2023

Conversation

squeek502
Copy link
Collaborator

@squeek502 squeek502 commented May 19, 2023

There are many different types of Windows paths, and there are a few different possible namespaces on top of that. Before this commit, NT namespaced paths were somewhat supported, and for Win32 paths (those without a namespace prefix), only relative and drive absolute paths were supported. After this commit, all of the following are supported:

  • Device namespaced paths (\\.\)
  • Verbatim paths (\\?\)
  • NT-namespaced paths (\??\)
  • Relative paths (foo)
  • Drive-absolute paths (C:\foo)
  • Drive-relative paths (C:foo)
  • Rooted paths (\foo)
  • UNC absolute paths (\\server\share\foo)
  • Root local device paths (\\. or \\? exactly)

Plus:

  • Any of the path types and namespace types can be mixed and matched together as appropriate.
  • All of the std.os.windows.*ToPrefixedFileW functions will accept any path type, prefixed or not, and do the appropriate thing to convert them to an NT-prefixed path if necessary.

This is achieved by making the std.os.windows.*ToPrefixedFileW functions behave like ntdll.RtlDosPathNameToNtPathName_U, but with a few differences:

  • Does not allocate on the heap (this is why we can't use ntdll.RtlDosPathNameToNtPathName_U directly, it does internal heap allocation).
  • Relative paths are kept as relative unless they contain too many .. components, in which case they are treated as 'drive relative' and resolved against the CWD (this is how it behaved before this commit as well).
  • Special case device names like COM1, NUL, etc are not handled specially (TODO, not planned for this PR though)
  • . and space are not stripped from the end of relative paths (potential TODO, needs to be decided on)

Most of the non-trivial conversion of non-relative paths is done via ntdll.RtlGetFullPathName_U, which AFAIK is used internally by ntdll.RtlDosPathNameToNtPathName_U.

Some relevant reading on Windows paths:

Closes #8205
Might close (untested) #12729
Supersedes #15283

Note:

  • This removes checking for illegal characters in std.os.windows.sliceToPrefixedFileW, since the previous solution (iterate the whole string and error if any illegal characters were found) was naive and won't work for all path types. This is further complicated by things like file streams (where : is used as a delimiter, e.g. file.ext:stream_name:$DATA) and things in the device namespace (where a path like \\.\GLOBALROOT\??\UNC\localhost\C$\foo is valid despite the ?s in the path and is effectively equivalent to C:\foo). Truly validating paths is complicated and would need to be tailored to each path type. The illegal character checking being removed may open up users to more instances of hitting OBJECT_NAME_INVALID => unreachable when using fs APIs.

cc @marler8997

Finally, here's a screenshot of zig fmt being able to act on files within a test .\fmt directory using a lot of different path types:

zig-windows-paths

…aths

There are many different types of Windows paths, and there are a few different possible namespaces on top of that. Before this commit, NT namespaced paths were somewhat supported, and for Win32 paths (those without a namespace prefix), only relative and drive absolute paths were supported. After this commit, all of the following are supported:

- Device namespaced paths (`\\.\`)
- Verbatim paths (`\\?\`)
- NT-namespaced paths (`\??\`)
- Relative paths (`foo`)
- Drive-absolute paths (`C:\foo`)
- Drive-relative paths (`C:foo`)
- Rooted paths (`\foo`)
- UNC absolute paths (`\\server\share\foo`)
- Root local device paths (`\\.` or `\\?` exactly)

Plus:

- Any of the path types and namespace types can be mixed and matched together as appropriate.
- All of the `std.os.windows.*ToPrefixedFileW` functions will accept any path type, prefixed or not, and do the appropriate thing to convert them to an NT-prefixed path if necessary.

This is achieved by making the `std.os.windows.*ToPrefixedFileW` functions behave like `ntdll.RtlDosPathNameToNtPathName_U`, but with a few differences:

- Does not allocate on the heap (this is why we can't use `ntdll.RtlDosPathNameToNtPathName_U` directly, it does internal heap allocation).
- Relative paths are kept as relative unless they contain too many .. components, in which case they are treated as 'drive relative' and resolved against the CWD (this is how it behaved before this commit as well).
- Special case device names like COM1, NUL, etc are not handled specially (TODO)
- `.` and space are not stripped from the end of relative paths (potential TODO)

Most of the non-trivial conversion of non-relative paths is done via `ntdll.RtlGetFullPathName_U`, which AFAIK is used internally by `ntdll.RtlDosPathNameToNtPathName_U`.

Some relevant reading on Windows paths:

- https://googleprojectzero.blogspot.com/2016/02/the-definitive-guide-on-win32-to-nt.html
- https://chrisdenton.github.io/omnipath/Overview.html

Closes ziglang#8205
Might close (untested) ziglang#12729

Note:
- This removes checking for illegal characters in `std.os.windows.sliceToPrefixedFileW`, since the previous solution (iterate the whole string and error if any illegal characters were found) was naive and won't work for all path types. This is further complicated by things like file streams (where `:` is used as a delimiter, e.g. `file.ext:stream_name:$DATA`) and things in the device namespace (where a path like `\\.\GLOBALROOT\??\UNC\localhost\C$\foo` is valid despite the `?`s in the path and is effectively equivalent to `C:\foo`). Truly validating paths is complicated and would need to be tailored to each path type. The illegal character checking being removed may open up users to more instances of hitting `OBJECT_NAME_INVALID => unreachable` when using `fs` APIs.
  + This is related to ziglang#15607
@andrewrk
Copy link
Member

Reverted in b5fad3a due to master branch failures. I didn't look into why yet

@andrewrk
Copy link
Member

andrewrk commented May 29, 2023

Looks like the failures occurred due to exposing a latent bug in the x86 machine code backend. The next step to getting this re-landed will be to open a new PR to revert the revert, and then either add an additional commit to fix the exposed bug in the x86 backend, or to disable the affected tests for that backend, so that the bug can be fixed later without blocking these changes.

@squeek502
Copy link
Collaborator Author

squeek502 commented May 30, 2023

The error is confusing to me for two reasons:

  • It seems like it's failing because abiSize of the return of sliceToPrefixedFileW is overflowing a u16 during an @intCast, but the return type of sliceToPrefixedFileW is the same in this branch as on master (same error set [I'll double check this though, any change would have been inadvertent], same return type [PathSpace, which was unchanged])
    • EDIT: Looks like on master, the failing airBitCast call is not happening for the return type of sliceToPrefixedFileW, but in this branch it is
  • This branch was passing the tests before it was merged, so the problem seems to have been introduced between when this branch was created and now?
    • EDIT: Looks like the failing @intCast was introduced in 729daed

Will try to reproduce it locally and look into it more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unreachable code reached when using UNC paths on Windows
3 participants