Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for decoding erofs into lcfs_node, and some tooling using it #185

Merged
merged 19 commits into from
Sep 15, 2023

Conversation

alexlarsson
Copy link
Collaborator

This will help with fixing
#182

@alexlarsson
Copy link
Collaborator Author

I updated the composefs-info escaping to be a bit more strict.

The dump format is very simple and readable, yet efficiently streamable, I wonder if that is a good format for streaming data to mkcomposefs. Its maybe not super extensible though...

libcomposefs/lcfs-writer-erofs.c Show resolved Hide resolved
libcomposefs/lcfs-writer-erofs.c Outdated Show resolved Hide resolved
tools/composefs-info.c Outdated Show resolved Hide resolved
tools/composefs-info.c Outdated Show resolved Hide resolved
tools/composefs-info.c Outdated Show resolved Hide resolved
tools/composefs-dump.c Outdated Show resolved Hide resolved
@alexlarsson
Copy link
Collaborator Author

Also, this PR needs work once #175 and #187 lands.

@alexlarsson
Copy link
Collaborator Author

Rebased on latest and fixed some minor issues. This is still not ready.

@alexlarsson
Copy link
Collaborator Author

alexlarsson commented Sep 13, 2023

TODO:

  • Make cfs-fuse and image-to-node handle the new inlined files
  • Share more code between cfs-fuse and image-to-node
  • Add some more testing
  • Handle encoded whiteouts

Signed-off-by: Alexander Larsson <[email protected]>
trusted.overlay.verity doesn't exist anymore, its just
trusted.overlay.metacopy.

Signed-off-by: Alexander Larsson <[email protected]>
Right now these are only used from cfs-fuse.c, but they will be reused
more later.

Signed-off-by: Alexander Larsson <[email protected]>
This allows a whiteout in the composefs mount be used as a whiteout
both with a regular and an userxattr overlayfs mount.

Signed-off-by: Alexander Larsson <[email protected]>
We want to be able to read back everything that we can set.

Signed-off-by: Alexander Larsson <[email protected]>
This is useful to figure out the content of a composefs iamge.

Signed-off-by: Alexander Larsson <[email protected]>
This lets you list the content of an image, and dump the objects
it references.

Signed-off-by: Alexander Larsson <[email protected]>
For all the images we generate from json, try also
to regenerated them and ensure we get the same bits.

To do the dumping we add the internal composefs-dump tool.

Signed-off-by: Alexander Larsson <[email protected]>
This lets you test lcfs_node_set_content()

Signed-off-by: Alexander Larsson <[email protected]>
This tesets various special things like device nodes,
escaped xattrs whiteouts and inlined files.

Signed-off-by: Alexander Larsson <[email protected]>
This may catch any weird stuff we might generate.

Signed-off-by: Alexander Larsson <[email protected]>
Now uses more of the new helpers and code.
Also supports inlined data.

Signed-off-by: Alexander Larsson <[email protected]>
@alexlarsson
Copy link
Collaborator Author

I rebased on master, implemented missing features in fuse, generally cleaned up the code and added a few more tests.

I think with this in we're ready to freeze the on-disk format.

Once this is in I will focus on adding even more tests.

@alexlarsson alexlarsson marked this pull request as ready for review September 14, 2023 16:02
Copy link
Contributor

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of code here, I gave this a semi-skim, only found one actual minor bug (that can be fixed in a followup even); the rest is just nits/questions.

Comment on lines +122 to +131
res = malloc(prefix_len + name_len + 1);
if (res == NULL) {
errno = ENOMEM;
return NULL;
}
memcpy(res, prefix, prefix_len);
memcpy(res + prefix_len, name, name_len);
res[prefix_len + name_len] = 0;

return res;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is all just return asprintf("%s%.*s", prefix, name_len, name) right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but I'm not super found of asprintf like this, because I always have to look up the length modifier in the manpage to get it right.

And also, I always worry about the integer format conversion in varargs calls like this.
Like, name_len is size_t, which could be larger than int in some compilers, and the field width must be int. So, do we need to cast? etc.

return s2;
}

static inline char *str_join(const char *a, const char *b)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional: This is just code motion, so OK, but same question re asprintf - can be a followup.

@@ -29,6 +29,28 @@
*/
#define LCFS_BUILD_INLINE_FILE_SIZE_LIMIT 64

#define OVERLAY_XATTR_USER_PREFIX "user."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aweseome, I think this helps a lot to have these. Then we can also followup with some links to the kernel sources.

@@ -1116,12 +1116,20 @@ static int add_overlayfs_xattrs(struct lcfs_node_s *node)
"", 0);
if (ret < 0)
return ret;
ret = lcfs_node_set_xattr(node, OVERLAY_XATTR_USERXATTR_WHITEOUT,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, interesting...but can you help me understand this? Do we need the userxattrs for unprivileged overlayfs mounts? This seems tricky because it's redundant data in the privileged case, no? Or maybe it isn't much data and doesn't matter.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, a traditional whiteout is a chardev(0,0), and having that in the lowerdir will work as a whiteout for both regular and userxattr overlayfs mounts. Unfortunately they don't work in composefs, because the outer overlayfs eats them.

So, we have a new type of whiteout. If we set an escaped whiteout xattr like trusted.overlay.overlay.whiteout on a file in the erofs, then in the overlayfs mount it will show up as trusted.overlay.whiteout, and a (new) regular overlayfs mount will treat it as a whiteout. However, to a userxattr overlayfs mount it will not be special in any way, because it looks for user.overlay.whiteout files. So, composefs needs to set both of these for the resulting file to always work as a whiteout.

In terms of redundancy. Its not ideal, but it is what we need, and I don't think we'll have tons of whiteouts in practice.


node = lcfs_load_node_from_image(image_data, image_data_size);
if (node == NULL) {
errsv = errno;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a handy PROTECT_ERRNO macro btw.


static void digest_to_string(const uint8_t *csum, char *buf)
{
static const char hexchars[] = "0123456789abcdef";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, this would literally be the fourth copy of this in tree...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooops :)

get_objects(node, ht);

size_t n_objects = hash_get_n_entries(ht);
char **objects = calloc(n_objects, sizeof(char *));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing NULL check

Comment on lines +253 to +257
const char *bin = argv[0];
int fd;
cleanup_node struct lcfs_node_s *root = NULL;
const char *image_path = NULL;
const char *command;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So not really related to this exactly but...can I arm twist you into using declare-and-initialize style in this codebase? I find it much more readable than "big dump of variables at the top of the function". It plays more nicely with __attribute__(cleanup) too because there's no footgun around forgetting to init to NULL etc.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

Comment on lines +967 to +982
const erofs_inode *cino = cfs_get_erofs_inode(ino);
uint32_t mode;
uint64_t file_size;
uint16_t xattr_icount;
size_t xattr_size;
size_t isize;
uint64_t n_blocks;
uint64_t last_oob_block;
bool tailpacked;
size_t tail_size;
uint32_t raw_blkaddr;
off_t oob_size;
const uint8_t *tail_data;
const uint8_t *oob_data;
struct iovec iov[2];
int i;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(IMO, this is crying out for declare-and-initialize...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not to completely belabor this point but...basically all other sane modern languages allow declaring (and initializing) a variable inside a function.

In the case of this particular function, it would probably be half the size if done this way.

I know you work on the Linux kernel which isn't yet using this style, and I know even some others like systemd haven't really adopted it but IMO it really just makes things much more readable. Particularly in this example!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the kernel is moving toward using attribute(cleanup), and as a result will be moving towards this style too (at least partially)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexlarsson alexlarsson merged commit ca46d09 into containers:main Sep 15, 2023
8 checks passed
cgwalters added a commit to cgwalters/storage that referenced this pull request Jun 4, 2024
I was just reading the code and I have a mental checklist item
for "invoking open without O_CLOEXEC" that triggered here.
(See also e.g.
containers/composefs#185 (comment)
)

It has security-relevant properties for us, xref
CVE-2024-21626 for example.

This isn't the only missing variant of this in this codebase,
just using this targeted PR to test the waters for more PRs.

Signed-off-by: Colin Walters <[email protected]>
cgwalters added a commit to cgwalters/storage that referenced this pull request Jun 4, 2024
I was just reading the code and I have a mental checklist item
for "invoking open without O_CLOEXEC" that triggered here.
(See also e.g.
containers/composefs#185 (comment)
)

It has security-relevant properties for us, xref
CVE-2024-21626 for example.

This isn't the only missing variant of this in this codebase,
just using this targeted PR to test the waters for more PRs.

Signed-off-by: Colin Walters <[email protected]>
cgwalters added a commit to cgwalters/storage that referenced this pull request Jun 5, 2024
I was just reading the code and I have a mental checklist item
for "invoking open without O_CLOEXEC" that triggered here.
(See also e.g.
containers/composefs#185 (comment)
)

It has security-relevant properties for us, xref
CVE-2024-21626 for example.

This isn't the only missing variant of this in this codebase,
just using this targeted PR to test the waters for more PRs.

Signed-off-by: Colin Walters <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants