Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement readUser on Windows #3248

Conversation

gabriel-samfira
Copy link
Collaborator

This change allows us to parse the SAM registry hive to fetch the information associated with the user we set in the USER stanza. The Security Accounts Manager (SAM) file is a registry hive that holds Windows users, password hashes, information about the user (last login, full name, etc). In essence, it is a combination of /etc/passwd, /etc/shadow and last log.

Parsing this file allows us to properly set permissions on files and folders we create when we use ADD, COPY, WORKDIR, etc when USER contains a user that does not have a Well Known SID like ContainerAdministrator or ContainerUser. This also removes Windows as a special case in a number of code paths.

This means we can create new users in containers using:

RUN net user example /ADD /ACTIVE:yes

Then use that user instead of the two built in ones.

Sadly, the format of the SAM registry hive is quite obscure. I was unable to find proper documentation on how to parse it, aside from an old article that is no longer available outside of the web archive project.

https://web.archive.org/web/20190423074618/http://www.beginningtoseethelight.org:80/ntsecurity/index.htm

Depends on: tonistiigi/fsutil#138

Signed-off-by: Gabriel Adrian Samfira [email protected]

@gabriel-samfira gabriel-samfira marked this pull request as draft November 1, 2022 13:26
This change allows us to parse the SAM registry hive to fetch the user
information associated with the user we set in the USER stanza. This is
the Windows equivalent of parsing the /etc/passwd file. Unfortunately
the format of that hive is obscure.

Signed-off-by: Gabriel Adrian Samfira <[email protected]>
@tonistiigi
Copy link
Member

ping @TBBle

@TBBle
Copy link
Collaborator

TBBle commented Nov 5, 2022

I'm not really in a position to review this properly, but the principle seems sound. The only other way I can see of making this work would be to start the container and execute a command to dump the relevant user SID.

Having had a flick through, I see at least one spot that still has a debugging path (GetUserInfoFromOfflineSAMHive doesn't use samPath), but this is a draft pending tonistiigi/fsutil#138, so that's not important right now.

Having also had a flick through the linked reference doc, I was wondering why the decision was made to walk the SAM for all users and then look for a match, rather than attempting to fetch the desired name from the Names subkey, and then the relevant numbered key. Do we have a use-case for "list all known users in the image"? The Linux version works this way because /etc/passwd etc is not structured, but the Windows SAM is.

My only concern (and I assume this flow has been tested) is whether the WCIFS local-mounting actually does the right thing with the registry hives from a multi-layer image. I would have assumed the differencing-registry setup was handled by the container execution environment, but if this code works for the described use-case, then my assumption is wrong and WCIFS is setting it up as well, which is nice. This source suggests that the layers are separately stored in the layer images, so maybe the Offline Registry Library itself understanders layered hives?

@gabriel-samfira
Copy link
Collaborator Author

The only other way I can see of making this work would be to start the container and execute a command to dump the relevant user SID.

That was my first option, but it meant building a helper binary which works in every Windows container (the Current() function in os/user doesn't work on nanoserver - missing netapi32.dll) which we needed to add into every container to fetch the SID, then remove it. Felt brittle and hacky. I cannot argue that parsing an undocumented hive is any better, but the info is there, and we can fetch it without much of a fuss.

Having also had a flick through the linked reference doc, I was wondering why the decision was made to walk the SAM for all users and then look for a match, rather than attempting to fetch the desired name from the Names subkey, and then the relevant numbered key.

The Names subkey is just a list of user names. While it is an ordered list of names that you could probably match to the proper user one level up HKLM:\SAM\Domains\Account\Users, You'd still have to loop through that list to look for the name, then go back and parse the V value associated with the proper user.

This is what the user listing looks like in the SAM hive:
Screenshot from 2022-11-05 21-03-14

The hex value you see under Name is in fact the RID of the user. The V value holds all the information about that user, including the username, SID (well, for everyone except the Guest account it seems), password hash, last login attempt, etc. We only care about the username and the SID, though.

I considered parsing Names first, then matching it to the proper key and only parsing the V value just for that one user, but I didn't feel it added much value. I also thought that if at some point we may need to do that for more than one user, we'd have to open the SAM hive and parse it for every user we care about. This way we only have to read the SAM hive once.

There are not many local users inside a typical container, nor on any single instance for that matter. Fetching them all didn't seem too much of a waste of cycles.

My only concern (and I assume this flow has been tested) is whether the WCIFS local-mounting actually does the right thing with the registry hives from a multi-layer image. I would have assumed the differencing-registry setup was handled by the container execution environment, but if this code works for the described use-case, then my assumption is wrong and WCIFS is setting it up as well, which is nice. This source suggests that the layers are separately stored in the layer images, so maybe the Offline Registry Library itself understanders layered hives?

That is the best part. The layered keys seem to be activated when the container layer is activated. I locally rebased your two PRs: containerd/containerd#4419 and microsoft/hcsshim#901, applied the changes I proposed for both: TBBle/containerd#1 and TBBle/hcsshim#1.

With the changes in this PR, plus some quick and nasty hacks in the containerd executor of buildkit, I managed to build an image on Windows using the following Dockerfile:

FROM mcr.microsoft.com/windows/nanoserver:ltsc2022

USER ContainerAdministrator

RUN net user gabriel /ADD /ACTIVE:yes

USER gabriel
WORKDIR C:\\workdir

And the result is as follows:

Screenshot from 2022-11-05 22-31-59

The WORKDIR stanza prompted buildkit to create the folder and change the owner. The SID corresponds to the SID for the gabriel user inside the image, which was created above.

There is still a lot of work to do in various parts of buildkit, but most things seem to be minor.

@TBBle
Copy link
Collaborator

TBBle commented Nov 6, 2022

The Names subkey is just a list of user names. While it is an ordered list of names that you could probably match to the proper user one level up HKLM:\SAM\Domains\Account\Users, You'd still have to loop through that list to look for the name, then go back and parse the V value associated with the proper user.

My understanding from the linked doc was that the Names subkey is a list of keys, named for the user with the @ value then being the hex key name from the level up for that name, hence trading any loop-through in favour of an extra indirection.

No particularly strong feelings either way about it, I agree that container images are unlikely to have so many users that O(N)/2 will be observably worse that 2*O(1).

@gabriel-samfira
Copy link
Collaborator Author

gabriel-samfira commented Nov 6, 2022

This is what the contents of the Names subkey looks like:

Screenshot from 2022-11-06 14-00-54

That article is old. Really old. Things may have changed since then. Luckily what we need has stayed the same since the Windows XP era 😄 .

Edit: To clarify, there does not seem to be an @ value:

PS C:\Users\Administrator\work\playground> Get-ItemProperty HKLM:\external\SAM\Domains\Account\Users\Names\gabriel -Name "@"
Get-ItemProperty : Property @ does not exist at path
HKEY_LOCAL_MACHINE\external\SAM\Domains\Account\Users\Names\gabriel.
At line:1 char:1
+ Get-ItemProperty HKLM:\external\SAM\Domains\Account\Users\Names\gabri ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidArgument: (@:String) [Get-ItemProperty], PSArgumentException
    + FullyQualifiedErrorId : System.Management.Automation.PSArgumentException,Microsoft.PowerShell.Commands.GetItemPr
   opertyCommand
PS C:\Users\Administrator\work\playground> Get-ItemPropertyValue HKLM:\external\SAM\Domains\Account\Users\Names\gabriel -Name "@"
Get-ItemPropertyValue : Property @ does not exist at path
HKEY_LOCAL_MACHINE\external\SAM\Domains\Account\Users\Names\gabriel.
At line:1 char:1
+ Get-ItemPropertyValue HKLM:\external\SAM\Domains\Account\Users\Names\ ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidArgument: (:) [Get-ItemPropertyValue], PSArgumentException
    + FullyQualifiedErrorId : Argument,Microsoft.PowerShell.Commands.GetItemPropertyValueCommand
PS C:\Users\Administrator\work\playground> ls HKLM:\external\SAM\Domains\Account\Users\Names\gabriel\
PS C:\Users\Administrator\work\playground>

@gabriel-samfira
Copy link
Collaborator Author

I've been thinking about this PR and I think we'll end up defining a reexec utility in buildkit, and using that to fetch the SID from a container. It's not as clean as just reading a file, but parsing undocumented data structures is generally frowned upon.

Will leave this as a draft for now and focus on getting the containerd worker executor code working. We need to do that anyway, and we can use it to get the SID using documented APIs, even if it means we create a temporary container.

As a side note, it seems that HCS does not carry ownership information from layer to layer. Only DACLs are preserved. The owner of the file is reset to the built-in Administrators group. We still need the SID to create proper ACLs, and at some point in the future when HCS will preserve ownership info, we'll already have that as well.

@gabriel-samfira
Copy link
Collaborator Author

Closing this as it was replaced by #3518

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants