Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

read_link should be allow absolute paths #353

Closed
asomers opened this issue May 14, 2024 · 14 comments
Closed

read_link should be allow absolute paths #353

asomers opened this issue May 14, 2024 · 14 comments

Comments

@asomers
Copy link
Contributor

asomers commented May 14, 2024

The read_link function will return an error if the link's destination is an absolute path. There are three problems with this:

  • It won't return an error if the link's destination is a relative path that uses ../ to point outside of the sandbox. That's inconsistent.
  • Sometimes a symlink with an absolute path doesn't escape the sandbox. For example /sandbox/link -> /sandbox/target.
  • More importantly, sometimes it's important to read a link that points outside of the sandbox. For example, a backup application could copy all of the files from the root directory of one computer to a subdirectory of another. Symlinks would be broken, but would work again after a restore operation. The application that performs the restore would need to be able to read link targets, even if they point outside of the root. Or, a jail file system could contain absolute symlinks that resolve correctly for a jailed process, but not for an unjailed one. But it would still sometimes be useful for an unjailed process to read the links.

I suggest simply removing the absolute path check.

asomers added a commit to asomers/cap-std that referenced this issue May 14, 2024
Previously it would return an error if the link's destination were
absolute.  But there were three problems with this:

* It wouldn't return an error if the link's destination were a relative
  path that uses ../ to point outside of the sandbox. That's
  inconsistent.
* Sometimes a symlink with an absolute path doesn't escape the sandbox.
  For example /sandbox/link -> /sandbox/target.  But read_link wouldn't
  allow that.
* More importantly, sometimes it's important to read a link that points
  outside of the sandbox. For example, a backup application could copy
  all of the files from the root directory of one computer to a
  subdirectory of another. Symlinks would be broken, but would work
  again after a restore operation. The application that performs the
  restore would need to be able to read link targets, even if they point
  outside of the root. Or, a jail file system could contain absolute
  symlinks that resolve correctly for a jailed process, but not for an
  unjailed one. But it would still sometimes be useful for an unjailed
  process to read the links.

Fixes bytecodealliance#353
asomers added a commit to asomers/cap-std that referenced this issue May 14, 2024
Previously it would return an error if the link's destination were
absolute.  But there were three problems with this:

* It wouldn't return an error if the link's destination were a relative
  path that uses ../ to point outside of the sandbox. That's
  inconsistent.
* Sometimes a symlink with an absolute path doesn't escape the sandbox.
  For example /sandbox/link -> /sandbox/target.  But read_link wouldn't
  allow that.
* More importantly, sometimes it's important to read a link that points
  outside of the sandbox. For example, a backup application could copy
  all of the files from the root directory of one computer to a
  subdirectory of another. Symlinks would be broken, but would work
  again after a restore operation. The application that performs the
  restore would need to be able to read link targets, even if they point
  outside of the root. Or, a jail file system could contain absolute
  symlinks that resolve correctly for a jailed process, but not for an
  unjailed one. But it would still sometimes be useful for an unjailed
  process to read the links.

Fixes bytecodealliance#353
asomers added a commit to asomers/cap-std that referenced this issue May 14, 2024
Previously it would return an error if the link's destination were
absolute.  But there were three problems with this:

* It wouldn't return an error if the link's destination were a relative
  path that uses ../ to point outside of the sandbox. That's
  inconsistent.
* Sometimes a symlink with an absolute path doesn't escape the sandbox.
  For example /sandbox/link -> /sandbox/target.  But read_link wouldn't
  allow that.
* More importantly, sometimes it's important to read a link that points
  outside of the sandbox. For example, a backup application could copy
  all of the files from the root directory of one computer to a
  subdirectory of another. Symlinks would be broken, but would work
  again after a restore operation. The application that performs the
  restore would need to be able to read link targets, even if they point
  outside of the root. Or, a jail file system could contain absolute
  symlinks that resolve correctly for a jailed process, but not for an
  unjailed one. But it would still sometimes be useful for an unjailed
  process to read the links.

Fixes bytecodealliance#353
asomers added a commit to asomers/cap-std that referenced this issue May 14, 2024
Previously it would return an error if the link's destination were
absolute.  But there were three problems with this:

* It wouldn't return an error if the link's destination were a relative
  path that uses ../ to point outside of the sandbox. That's
  inconsistent.
* Sometimes a symlink with an absolute path doesn't escape the sandbox.
  For example /sandbox/link -> /sandbox/target.  But read_link wouldn't
  allow that.
* More importantly, sometimes it's important to read a link that points
  outside of the sandbox. For example, a backup application could copy
  all of the files from the root directory of one computer to a
  subdirectory of another. Symlinks would be broken, but would work
  again after a restore operation. The application that performs the
  restore would need to be able to read link targets, even if they point
  outside of the root. Or, a jail file system could contain absolute
  symlinks that resolve correctly for a jailed process, but not for an
  unjailed one. But it would still sometimes be useful for an unjailed
  process to read the links.

Fixes bytecodealliance#353
@sunfishcode
Copy link
Member

The read_link function will return an error if the link's destination is an absolute path. There are three problems with this:

* It _won't_ return an error if the link's destination is a relative path that uses `../` to point outside of the sandbox.  That's inconsistent.

It's not about preventing paths from being followed, because indeed, cap-std already does that. It is about presenting a view of the filesystem which isn't inherently rooted. If applications create absolute-path symlinks, they will be unwittingly leaking information about the host filesystem hierarchy outside the sandbox. It's an admittedly debatable stance, and I'm open to adding other options (I'll suggest something below).

* Sometimes a symlink with an absolute path _doesn't_ escape the sandbox.  For example `/sandbox/link -> /sandbox/target`.

That's true, however absolute paths that point to locations inside the sandbox still aren't permitted in general, because they leak information about the filesystem hierarchy outside the sandbox. For example, even though /home/sunfishcode/../../sandbox/target might end up inside the sandbox, if opening it succeeds, it reveals that /home/sunfishcode exists.

So perhaps what we should eventually do there is make readlink rewrite /sandbox/target to ./target in cases like that, so that it resolves as expected.

* More importantly, sometimes it's important to read a link that points outside of the sandbox.  For example, a backup application could copy all of the files from the root directory of one computer to a subdirectory of another.  Symlinks would be broken, but would work again after a restore operation.  The application that performs the restore would need to be able to read link targets, even if they point outside of the root.  Or, a jail file system could contain absolute symlinks that resolve correctly for a jailed process, but not for an unjailed one.  But it would still sometimes be useful for an unjailed process to read the links.

Would it work for your purposes if we added a new readlink_verbatim function, which just always returned the verbatim contents of the symlink? It could take an AmbientAuthority argument to indicate that it can view absolute paths. That way it'd still be easily accessible to anyone who wants to use it, such as eg. backup software, without changing any existing assumptions of existing users. Would that work here?

@cgwalters
Copy link
Contributor

xref #252

@asomers
Copy link
Contributor Author

asomers commented May 14, 2024

Adding a new method would work for me, as long as it doesn't need an AmbientAuthority. My application runs with Capsicum, so I can't get an AmbientAuthority except during startup.

@asomers
Copy link
Contributor Author

asomers commented May 14, 2024

@cgwalters unfortunately cap_primitives::fs::read_link_contents won't work here, because that requires a &std::fs::File argument. But all of the functions in cap_std want a cap_std::fs::Dir. And there's no method to get the former from the latter. Should I add a cap_std::fs::Dir::read_link_contents method?

@sunfishcode
Copy link
Member

@cgwalters Ah, thanks. I forget we added that.

So yes, adding a cap_std::fs::Dir::read_link_contents function that calls that makes sense.

And yeah, thinking about it more, I don't think this needs an AmbientAuthority for this either.

@cgwalters
Copy link
Contributor

Somewhat related to this...the current README contains an argument why cap-std doesn't use RESOLVE_IN_ROOT.

I would like to at least have a discussion about supporting that. This isn't the first time for me I've wanted it (it just came up again in containers/bootc#659 (comment) ). Now to be clear, it's not really onerous to just drop down to the rustix APIs and bypass Dir for this (and nice work on ensuring rustix has an ergonomic and safe api too!) but still...WDYT about adding something like:

diff --git a/cap-primitives/src/fs/mod.rs b/cap-primitives/src/fs/mod.rs
index 62ef8c9..d94863e 100644
--- a/cap-primitives/src/fs/mod.rs
+++ b/cap-primitives/src/fs/mod.rs
@@ -51,6 +51,8 @@ use maybe_owned_file::MaybeOwnedFile;
 pub(crate) use file_path_by_searching::file_path_by_searching;
 pub(crate) use open_unchecked_error::*;
 
+#[cfg(not(windows))]
+pub use super::rustix::fs::open_beneath;
 #[cfg(not(windows))]
 pub(crate) use super::rustix::fs::*;
 #[cfg(windows)]
diff --git a/cap-primitives/src/rustix/linux/fs/canonicalize_impl.rs b/cap-primitives/src/rustix/linux/fs/canonicalize_impl.rs
index 87dd9eb..7fdc988 100644
--- a/cap-primitives/src/rustix/linux/fs/canonicalize_impl.rs
+++ b/cap-primitives/src/rustix/linux/fs/canonicalize_impl.rs
@@ -19,6 +19,7 @@ pub(crate) fn canonicalize_impl(start: &fs::File, path: &Path) -> io::Result<Pat
             .read(true)
             .follow(FollowSymlinks::Yes)
             .custom_flags(OFlags::PATH.bits() as i32),
+        false,
     );
 
     // If that worked, call `readlink`.
diff --git a/cap-primitives/src/rustix/linux/fs/open_entry_impl.rs b/cap-primitives/src/rustix/linux/fs/open_entry_impl.rs
index b5bc0f6..16e1419 100644
--- a/cap-primitives/src/rustix/linux/fs/open_entry_impl.rs
+++ b/cap-primitives/src/rustix/linux/fs/open_entry_impl.rs
@@ -7,7 +7,7 @@ pub(crate) fn open_entry_impl(
     path: &OsStr,
     options: &OpenOptions,
 ) -> io::Result<fs::File> {
-    let result = open_beneath(start, path.as_ref(), options);
+    let result = open_beneath(start, path.as_ref(), options, false);
 
     match result {
         Ok(file) => Ok(file),
diff --git a/cap-primitives/src/rustix/linux/fs/open_impl.rs b/cap-primitives/src/rustix/linux/fs/open_impl.rs
index edb4c8b..09f3a9f 100644
--- a/cap-primitives/src/rustix/linux/fs/open_impl.rs
+++ b/cap-primitives/src/rustix/linux/fs/open_impl.rs
@@ -36,7 +36,7 @@ pub(crate) fn open_impl(
     // [seccomp policy]: https://android-developers.googleblog.com/2017/07/seccomp-filter-in-android-o.html
     #[cfg(target_os = "linux")]
     {
-        let result = open_beneath(start, path, options);
+        let result = open_beneath(start, path, options, false);
 
         // If we got anything other than a `ENOSYS` error, that's our result.
         match result {
@@ -53,10 +53,11 @@ pub(crate) fn open_impl(
 /// unavailable, mark it so for future calls. If `openat2` is unavailable
 /// either permanently or temporarily, return `ENOSYS`.
 #[cfg(target_os = "linux")]
-pub(crate) fn open_beneath(
+pub fn open_beneath(
     start: &fs::File,
     path: &Path,
     options: &OpenOptions,
+    resolve_in_root: bool,
 ) -> io::Result<fs::File> {
     static INVALID: AtomicBool = AtomicBool::new(false);
     if INVALID.load(Relaxed) {
@@ -74,6 +75,11 @@ pub(crate) fn open_beneath(
         Mode::empty()
     };
 
+    let mut resolveflags = ResolveFlags::BENEATH | ResolveFlags::NO_MAGICLINKS;
+    if resolve_in_root {
+        resolveflags |= ResolveFlags::IN_ROOT;
+    }
+
     // We know `openat2` needs a `&CStr` internally; to avoid allocating on
     // each iteration of the loop below, allocate the `CString` now.
     path.into_with_c_str(|path_c_str| {
@@ -82,13 +88,7 @@ pub(crate) fn open_beneath(
         // times, because there's no limit on how often this can happen. The actual
         // number here is currently an arbitrarily chosen guess.
         for _ in 0..4 {
-            match openat2(
-                start,
-                path_c_str,
-                oflags,
-                mode,
-                ResolveFlags::BENEATH | ResolveFlags::NO_MAGICLINKS,
-            ) {
+            match openat2(start, path_c_str, oflags, mode, resolveflags) {
                 Ok(file) => {
                     let file = fs::File::from_into_fd(file);
 
diff --git a/cap-primitives/src/rustix/linux/fs/stat_impl.rs b/cap-primitives/src/rustix/linux/fs/stat_impl.rs
index 93d37e9..385c10b 100644
--- a/cap-primitives/src/rustix/linux/fs/stat_impl.rs
+++ b/cap-primitives/src/rustix/linux/fs/stat_impl.rs
@@ -27,6 +27,7 @@ pub(crate) fn stat_impl(
             .read(true)
             .follow(follow)
             .custom_flags(OFlags::PATH.bits() as i32),
+        false,
     );
 
     // If that worked, call `fstat`.
diff --git a/cap-std/src/fs/dir.rs b/cap-std/src/fs/dir.rs
index 7426838..aed174a 100644
--- a/cap-std/src/fs/dir.rs
+++ b/cap-std/src/fs/dir.rs
@@ -46,6 +46,8 @@ use {
 ///
 /// [functions in `std::fs`]: https://doc.rust-lang.org/std/fs/index.html#functions
 pub struct Dir {
+    #[cfg(any(target_os = "android", target_os = "linux"))]
+    resolve_in_root: bool,
     std_file: fs::File,
 }
 
@@ -59,7 +61,13 @@ impl Dir {
     /// has access to.
     #[inline]
     pub fn from_std_file(std_file: fs::File) -> Self {
-        Self { std_file }
+        #[cfg(any(target_os = "android", target_os = "linux"))]
+        return Self {
+            std_file,
+            resolve_in_root: false,
+        };
+        #[cfg(not(any(target_os = "android", target_os = "linux")))]
+        return Self { std_file };
     }
 
     /// Consumes `self` and returns a [`std::fs::File`].
@@ -105,6 +113,24 @@ impl Dir {
     /// Attempts to open a directory.
     #[inline]
     pub fn open_dir<P: AsRef<Path>>(&self, path: P) -> io::Result<Self> {
+        #[cfg(any(target_os = "android", target_os = "linux"))]
+        if self.resolve_in_root {
+            return Err(io::Error::new(
+                io::ErrorKind::InvalidInput,
+                "directory is configured with RESOLVE_IN_ROOT",
+            ));
+        }
+        let dir = open_dir(&self.std_file, path.as_ref())?;
+        Ok(Self::from_std_file(dir))
+    }
+
+    /// Attempts to open a directory. This uses RESOLVE_IN_ROOT semantics on Linux,
+    /// which will resolve symbolic links relative to this directory. It is
+    /// an error to attempt to open
+    #[cfg(any(target_os = "android", target_os = "linux"))]
+    pub fn open_rootdir<P: AsRef<Path>>(&self, path: P) -> io::Result<Self> {
+        let options = &dir_options();
+        let dir = cap_primitives::fs::open_beneath(&self.std_file, path.as_ref(), options, false)?;
         let dir = open_dir(&self.std_file, path.as_ref())?;
         Ok(Self::from_std_file(dir))
     }

(Doesn't compile, just a sketch)

@cgwalters
Copy link
Contributor

(and to elaborate on the connection here, we'd pass down the self.resolve_in_root to the symlink reading APIs and they would just allow absolute symlinks in that mode)

@cgwalters
Copy link
Contributor

I also did coreos/cap-std-ext#54 - and there I think the RootDir API's read_link would always allow absolute links for example. (I didn't actually add that, but I can trivially)

But as written that API hard requires Linux (openat2 RESOLVE_IN_ROOT) today; I haven't investigated at all userspace emulation.

@sunfishcode
Copy link
Member

@cgwalters Yes, I'm open to supporting this, though I don't have much bandwidth for writing code for it at the moment. I tend to prefer a RootDir-like API like coreos/cap-std-ext#54 does, rather than adding a flag to Dir, because from what I understand, the use cases will be different and can just use different types rather than needing to switch between them at runtime, but if there's a reason the flag is better, I'd be open to that.

@cgwalters
Copy link
Contributor

I'm OK to leave it in cap-std-ext for now. But let's ask @asomers - does that cap-std-ext implementation fit your needs if we added more apis like read_link to it? Are the relevant projects Linux-only?

@asomers
Copy link
Contributor Author

asomers commented Jul 22, 2024

RESOLVE_IN_ROOT isn't really relevant for what I'm doing. I just need a read_link method that returns that literal value of the symlink and nothing else, like read_link_contents does. And for me, the relevant project isn't Linux-only, in fact it's non-Linux (specifically, FreeBSD) only.

@cgwalters
Copy link
Contributor

OK. Then indeed let's just do this:

Should I add a cap_std::fs::Dir::read_link_contents method?

Since I think everyone is in agreement.

@sunfishcode
Copy link
Member

I believe this is already done, in #356. Dir has a read_link_contents function here.

@sunfishcode
Copy link
Member

As commented previously, I believe this is fixed. Please re-open or file new issues if there's anything else needed here!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants