RFC: VFS #377
Comments
@bugaevc I hope you don't mind me asking, but has there been any progress with adding back VFS? If not, I would like to help get the ball rolling (so to speak). Skimming through #222, I believe that we need to modify the syscall in Now, about those features you want.
Like I said in my earlier post, some file systems already offer this feature. If we need to, we can implement it, but I would rather have the OS take care of this (mainly for performance reasons). We also need to make sure there is a way to disable this. iOS is case-sensitive by default.
I haven't looked into how Darling handles this, but I was thinking of having a hidden file in each directory (we could call it This file would only be created if you are in the Darling virtual drive. If you are accessing the Linux file system through Darling or adding a file to a writable DMG, this file won't be created.
I am not sure how to properly implement this part since I don't have any experience with VFS. With that being said, I was brainstorming some ideas for how the If this idea is a good idea, we would do something similar for
Dumb question, but I am going to assume that this allows each process to keep track of its own CWD, right?
From what I understand, |
The current plan is to do something like this with help from the kernel, dubbed vchroot. There was even a WIP branch.
Yes
No, dyld should use the same libkernel stuff to access the file system as everything else uses.
I don't think we need to track the fake owners. I don't remember the exact details, but IIRC it's pretty clear what the fake owner needs to be, it's faking itself that is important.
That's basically how per-thread CWD is implemented now, yes. With shared CWD the kernel keeps track of it (i.e. we change the CWD of the Linux process), and when a thread wants to switch to a per-thread CWD the userspace implementation takes over.
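A minimal sketch of that split between shared and per-thread CWD, written in Python only for illustration. The class and method names (`CwdTracker`, `thread_chdir`, `resolve`) are hypothetical and are not taken from Darling's code; the real implementation lives in C inside libkernel.

```python
import os
import threading


class CwdTracker:
    """Shared CWD is left to the kernel; per-thread CWD is tracked in userspace."""

    def __init__(self):
        # Thread-local storage: each thread sees only its own override.
        self._local = threading.local()

    def chdir(self, path):
        # Shared case: change the CWD of the whole (Linux) process.
        os.chdir(path)

    def thread_chdir(self, path):
        # Per-thread case: remember an absolute override for this thread only.
        # Passing None drops the override and falls back to the shared CWD.
        if path is None:
            self._local.cwd = None
        else:
            self._local.cwd = os.path.normpath(os.path.join(self.getcwd(), path))

    def getcwd(self):
        override = getattr(self._local, "cwd", None)
        return override if override is not None else os.getcwd()

    def resolve(self, path):
        # Relative paths are resolved against the calling thread's view of the CWD.
        return os.path.normpath(os.path.join(self.getcwd(), path))
```

With this shape, every path-taking call would have to go through something like `resolve()` once a thread has an override, because the kernel's idea of the CWD no longer matches what that thread expects.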
They do not exist, so it's unlikely that Darwin software would look for them, so it's not hurting compatibility to expose them. The reason we want them in our container is simple: |
While the VFS stuff might be too advanced for me, I would love to at least try and help you or LubosD with this.
Thanks for the link. I am going to assume that this code is outdated and will probably be removed eventually, right? Regardless, I am going to take a look at the first commit to see what LubosD changed.
I guess the majority of applications don't need to change ownership. Most of the files and folders on a real Mac are owned by the |
The vchroot branch is based on a kernel-based "acceleration", instead of doing relatively slow userspace resolution. At the moment, the kernel code is buggy (or simply wrong). It is based on some tricks with dentries, but this area of the Linux kernel isn't very well documented (just like the rest of it), so it works only in specific scenarios and needs a rework. |
I have resumed my work on this and hopefully already figured out some of the troubles - once again caused by the kernel not exporting some very useful functions... :-( Such as
Case Insensitivity
For now, As a long-term solution (if we choose to fork overlayfs, for instance), we could maybe reuse some of the insensitive lookup logic from sdcardfs. |
Small change of plans: I keep finding scenarios where my kernel-based vchroot just doesn't work. I'm starting to believe that what we need cannot be implemented by making a few simple Linux kernel API calls (or I'm just doing something wrong). Either way, to speed things up, I'll now implement a user-space based vchroot, with a possible later upgrade to a kernel implementation. Because we really need this issue to be resolved - the sooner the better! |
The branch now works to the point that For the first time in god knows how long, I can just start HelloWorld.app without any additional hassle. But it hangs after the first click - does it ring any bells, @bugaevc? |
Please test various stuff with |
Hmmmmm, but the test case I wrote back then now works (the vchroot branch uses the new XNU). I did some quick grepping, and I can't find What do you think? |
Hmmmmm (2), I see I've changed how this works relatively recently: darlinghq/darling-cocotron@56ba4b6 The change itself makes sense, and I remember wanting to do it. So I guess we now need to add |
As of the moment you can't build the |
Also, where is the source for |
@TheBrokenRail I fixed the fakechroot stuff. |
They add Making a commit now. |
When building the LKM I got:
|
@TheBrokenRail Strange, the file gets generated during build as Try cleaning your build directory. I've seen that the LKM build sometimes forgets to generate new files. |
TODO: 32-bit binaries are broken in the branch. |
It seems |
@TheBrokenRail Looks like it's me who should clean the build tree. Fixed. As well as the 32-bit binaries. |
The LKM build is now failing with:
|
I created a PR to fix it: darlinghq/darling-newlkm#10. |
When I tried to run
|
and running
despite the file existing:
|
I don't know about |
It seems Darling is unable to launch anything when being strace-d. |
Core dump for
|
I can confirm that something is wiping I think strace doesn't work because it interferes with the parent/child relationship as seen by the LKM and this relationship is required for passing down the vchroot information. I say we should provide a guaranteed working LLDB build for debugging purposes. |
If you're running |
I am using:
|
Fixed. |
@LubosD I agree that including a working LLDB would be very useful! |
I guess we can close this, #600, and #415? @TheBrokenRail please open separate GitHub issues if there still are issues with this. |
Summary
Bring back VirtualPrefix and add more stuff on top of it, getting rid of mount namespaces and overlayfs
(sounds like a regression, doesn't it?)
Background
(I'm trying to both make a proposal and document what goes on, hence this section)
What is this all about?
Linux and macOS filesystem layouts, while similar, differ significantly enough that we can't present the host filesystem to programs running under Darling, and when they do agree (e.g. both put executables under `/bin` & `/usr/bin`) we want programs to see our versions of those directories.
No matter what exact mechanism we use for that, we deal with the above by having our own macOS-like "chroot" in `libexec/darling/`, on top of which we overlay so-called "prefixes" aka dprefixes (more about them in the wiki).
What is VirtualPrefix?
Darling used to implement chroot emulation for macOS executables by making use of the fact that we ship a complete libSystem (libc + other stuff) library instead of linking to the one used by the host. While we try not to modify most of libSystem (compared to what Apple ships/publishes), we do change libkernel/libsyscall to bridge between Darwin and Linux syscalls, and that gives us a chance to transform paths passed between libSystem and the Linux kernel.
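To illustrate the kind of path transformation described above, here is a rough sketch, in Python for brevity. It is not the VirtualPrefix code: the function names and the example prefix directory are made up, and the normalization is purely lexical, so symlinks inside the prefix are not handled.

```python
import posixpath


def to_host_path(guest_path, prefix_root, cwd="/"):
    """Map a path as seen by a Darwin program to a host path under prefix_root.

    prefix_root is assumed to be an absolute host path without a trailing slash.
    """
    # Relative paths are interpreted against the guest's own working directory.
    if not guest_path.startswith("/"):
        guest_path = posixpath.join(cwd, guest_path)
    # For an absolute path, normpath collapses "." and "..", and ".." at the
    # root stays at the root, so the result cannot climb above the guest "/".
    normalized = posixpath.normpath(guest_path)
    relative = normalized.lstrip("/")
    return posixpath.join(prefix_root, relative) if relative else prefix_root


def to_guest_path(host_path, prefix_root):
    """Reverse mapping, e.g. for paths returned to the program.

    Returns None for host paths outside the prefix.
    """
    if host_path == prefix_root:
        return "/"
    if host_path.startswith(prefix_root + "/"):
        return host_path[len(prefix_root):]
    return None
```

For example, with a hypothetical prefix of `/home/user/.darling`, `to_host_path("/usr/bin/env", "/home/user/.darling")` gives `/home/user/.darling/usr/bin/env`. A real implementation has to resolve symlinks component by component, since an absolute symlink target inside the prefix must itself be translated.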
The overlaying was achieved by just copying `libexec/darling` into each prefix each time anything changed (this is also what Wine does).
Why was VirtualPrefix removed?
I've found a few bugs in how VirtualPrefix worked but decided that it'd be easier to do away with it altogether than to debug and fix those bugs. We were just moving to a mount namespace + overlayfs mechanism for overlaying prefix contents over `libexec/darling`, and chrooting into the resulting directory was a very natural and simple extension of that idea.
To quote myself from #197,
It seemed the only "small problem" with this new layout was that `ld-linux`, Linux's dynamic loader, was unable to find native ELF libraries when invoked from inside the container. We worked around that by setting up a copy of `/etc/ld.so.conf` and `/etc/ld.so.cache` at installation time.
See #222 for the merge request that removed VirtualPrefix.
Was it a good idea to remove VirtualPrefix?
I'm not so sure anymore. Using Linux's native mount namespaces and `chroot`/`pivot_root` implementation is indeed a much cleaner solution than reimplementing all that logic ourselves, but it has caused us a lot more headaches since then.
Besides `ld.so`, other things that we had or would have to patch or work around in some way, because native files are not where they are expected to be, are the X11 and Wayland sockets (we would have to symlink them), fontconfig config files and fonts themselves (which we hacked around by again making a modified copy of the config at installation time), Mesa "drivers" (relatively cleanly fixed by symlinking `/usr/lib64/` to that of the host), GTK+ and Qt themes (and icons, and cursors, and the rest of `/usr/share`; could be fixed by modifying `XDG_DATA_DIRS`), native open/save file dialogs displaying prefixes instead of the host's filesystem layout (my latest idea was to ask the DE to open an appropriate dialog over D-Bus using the `org.freedesktop.portal` API) and probably many more that we haven't thought about / stumbled upon yet.
As you can see, these are spread throughout the stack and are all worked around differently, and yet, incidentally or not, they would all go away if only the native libc saw the host's filesystem layout (which was the case with VirtualPrefix).
Do all those justify bringing VirtualPrefix back? I'm not sure of that either. VirtualPrefix still is an ugly & buggy hack and native solutions are still nicer.
But wait,
There's more to the story
There are other filesystem-related things we have/want to tackle in some way:
- `/dev` layout (currently we symlink `/dev` from the host)
- `/Volumes` (see New /Volumes design #220)
And if you think of it, mount namespaces and overlayfs are parts of the filesystem story too, which brings us to the
Proposal
(well, maybe not a proposal, but an idea)
Let's revive [the idea of] VirtualPrefix, rewrite it from scratch to be fully correct and turn it into a more complete Virtual File System (VFS) implementation that would handle all of the above, including case-insensitivity, faking ownership, tracking CWD, tweaking directory layout, mounting sub-`/Volumes` the way we want them and overlaying the prefix on top of `libexec/darling`.
Now that sounds like a much cleaner solution than what we have today.
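As one concrete piece of such a VFS, case-insensitive lookup on top of a case-sensitive host filesystem could look roughly like the sketch below. This is an illustration only: the function name is hypothetical, `str.casefold()` is just an approximation of the case-folding rules Apple's filesystems use, and listing every directory on the way is slow without a cache.

```python
import os


def resolve_case_insensitive(root, guest_path):
    """Walk guest_path under root, matching each component while ignoring case.

    Returns the host path using the on-disk spelling, or None if a component
    cannot be found. An exact match wins over a case-folded one. ".." is never
    a directory entry, so it is rejected instead of escaping root.
    """
    current = root
    for component in guest_path.split("/"):
        if component in ("", "."):
            continue
        try:
            entries = os.listdir(current)
        except OSError:
            # Not a directory, missing, or unreadable.
            return None
        if component in entries:
            name = component
        else:
            wanted = component.casefold()
            # Sort so the choice is deterministic if several names fold alike.
            matches = sorted(e for e in entries if e.casefold() == wanted)
            if not matches:
                return None
            name = matches[0]
        current = os.path.join(current, name)
    return current
```

Creating files needs extra care that this sketch ignores: before creating `foo`, the VFS would have to check that no `Foo` already exists in that directory.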
Why reimplement overlayfs functionality?
...we don't have lots of hacks because of overlayfs, do we?
Not as many as we have because of other things, no, but there are a few problems with it. Firstly, modifying underlying filesystems while overlayfs is mounted is undefined behavior (meaning we can't update our `libexec` files while a container is running, and putting `.init.pid` inside the prefix directory is/was UB too). Secondly, it doesn't support encrypted home folders (see #242) nor some other interesting filesystems. Last but not least, we can't make it support case-insensitivity without basically reimplementing it.
Reimplementing overlayfs would mean that we would also no longer need mount namespaces.
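To give an idea of how little is needed for the basic overlay behaviour in userspace, here is a rough sketch with two layers: the prefix as the writable upper layer and `libexec/darling` as the read-only lower layer. The `Overlay` class and its method names are invented for this example, and deletions (whiteouts), renames and directory copy-up are left out entirely.

```python
import os
import shutil


class Overlay:
    """Two-layer overlay: lookups prefer upper, writes only ever touch upper."""

    def __init__(self, upper, lower):
        self.upper = upper  # e.g. the prefix directory (writable)
        self.lower = lower  # e.g. libexec/darling (never modified)

    def _in(self, layer, path):
        return os.path.join(layer, path.lstrip("/"))

    def lookup(self, path):
        # The first layer that has the entry wins, so upper shadows lower.
        for layer in (self.upper, self.lower):
            candidate = self._in(layer, path)
            if os.path.lexists(candidate):
                return candidate
        return None

    def listdir(self, path):
        # A directory listing is the union of the names in both layers.
        names = set()
        for layer in (self.upper, self.lower):
            directory = self._in(layer, path)
            if os.path.isdir(directory):
                names.update(os.listdir(directory))
        return sorted(names)

    def copy_up(self, path):
        # Before the first write, copy the file from lower into upper and
        # return the upper path, which is where the write should go.
        target = self._in(self.upper, path)
        if not os.path.lexists(target):
            source = self._in(self.lower, path)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            if os.path.isfile(source):
                shutil.copy2(source, target)
        return target
```

Because each lookup consults the layers at call time, updating files in the lower layer while programs are running is not a problem in this model, unlike with a mounted overlayfs.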
Unresolved issues
`/proc` and `/dev/shm`, but with no kernel-level-chroot, there's nowhere to mount them. Ooops.
`proc` to e.g. `/proc2` (or `$DPREFIX/proc`). I don't quite understand how `/dev/shm` works even now with `/dev` being a symlink.
Alternatives
(Note: RFC stands for 'Request For Comments')