FreeBSD: Correct _PC_MIN_HOLE_SIZE #17750
4174951 to a748c90
While I wonder if apps could actually request it more specifically. As I see,
That sounds reasonable, I'll give it a shot. But I wonder if we should still report 1 in the ctldir case.
Does this look reasonable?
I don't remember us having any files in the ctldir that could potentially have holes, so I wonder if we should just delete that part instead.
Not bad. I just wonder what the best APIs to use here are. Functionally, though, for files of only one block we should report the maximum of the file block size and the dataset record size, since the file block size may increase (but never decrease) as the file grows, and hole reporting is not really productive for a file of one block.
Does the ctldir even ever contain any files? afaict the only entries it contains are
Isn't that just the record size? I've been trying various things and it looks like files smaller than the record size can contain holes only if the entire file is a hole. As soon as there is even one nonzero byte anywhere in the file, the entire file gets allocated. So I'm starting to think the dataset record size is the correct value in all cases.
No, I don't think it does now, but I think it was discussed at some point to expose some information/controls that way.
The dataset record size might change, but that does not affect existing files that have more than one block or whose block size is already bigger than the new value. So a couple more conditions could give a better result.
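The rule suggested in this exchange can be sketched as a small standalone function. This is only an illustrative model, not ZFS code: the function name and its parameters (`single_block`, `file_blksz`, `dataset_recsize`) are hypothetical stand-ins for whatever the implementation would actually read from the znode and the dataset.

```c
#include <stdint.h>

/*
 * Hypothetical model of the suggestion above (not the ZFS implementation).
 * single_block:    nonzero if the file currently consists of one block
 * file_blksz:      the file's current block size in bytes
 * dataset_recsize: the dataset record size in bytes
 */
static uint64_t
suggested_min_hole_size(int single_block, uint64_t file_blksz,
    uint64_t dataset_recsize)
{
    if (single_block) {
        /* The block size may still grow, so report the larger value. */
        return (file_blksz > dataset_recsize ?
            file_blksz : dataset_recsize);
    }
    /* Multi-block files keep their block size if the record size changes. */
    return (file_blksz);
}
```

For example, a one-block file with a 4096-byte block on a dataset with a 131072-byte record size would report 131072 under this rule, while a multi-block file written with 131072-byte blocks keeps reporting 131072 even if the record size is later lowered.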
a748c90 to 9212c0b
I note that I paper over this in our version of openrsync today in a way that kind of sucks, too. See, e.g., https://gist.github.com/kevans91/87ff85f9d85cf8c6f93369928a5bdb74
When the file is freshly created we report an st_blksize of the recordsize, which shrinks down as the file is actually written (and scales up to the recordsize, as you noted), but that's not really helpful. The rsync protocol means that I don't get to see holes in the original file as a hint to try and speed things up, so I have to check all incoming blocks at possible hole-boundaries for opportunities to create holes. So:
For us, --sparse is worded in such a way that we're not going to be in trouble if we miss a chance to punch holes smaller than the recordsize and a larger
ping
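To illustrate why the reported granularity matters to a receiver that has to look for holes itself, here is a minimal sketch of scanning an incoming buffer for all-zero chunks at hole-size boundaries. It is not the openrsync code from the gist; the function names are hypothetical, and the sketch assumes the buffer starts at a file offset that is a multiple of `hole_size`.

```c
#include <stddef.h>

/* Return 1 if buf[0..len) contains only zero bytes, otherwise 0. */
static int
is_all_zero(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0)
            return (0);
    }
    return (1);
}

/*
 * Count the hole_size-aligned chunks of buf that are entirely zero and
 * could therefore be skipped or punched instead of written.
 * Assumption: buf begins at a file offset aligned to hole_size.
 */
static size_t
count_hole_candidates(const unsigned char *buf, size_t len, size_t hole_size)
{
    size_t n = 0;

    if (hole_size == 0)
        return (0);
    for (size_t off = 0; off + hole_size <= len; off += hole_size) {
        if (is_all_zero(buf + off, hole_size))
            n++;
    }
    return (n);
}
```

With a 512-byte granularity the scanner tests many small chunks that the filesystem may end up allocating anyway; with a larger reported value it tests fewer, larger chunks.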
9212c0b to 1bfe710
@dag-erling I don't see any reaction on my proposal to take into account dataset recordsize for files of one block. I mean something like this:
1bfe710 to 059ae68
@amotin are you certain
Looking at the code, the value returned in
are you certain nblocks is correct here?
Not any more. I should have checked the function semantics, but I jumped to a conclusion. Sorry.
I wonder: what if, instead of this, we just get zp as in the code below and, as in the original code in zfs_write(), use zp->z_size <= zp->z_blksz?
The actual minimum hole size on ZFS is variable, but we always report SPA_MINBLOCKSIZE, which is 512. This may lead applications to believe that they can reliably create holes at 512-byte boundaries and waste resources trying to punch holes that ZFS ends up filling anyway.

* In the general case, if the vnode is a regular file, return its current block size, or the record size if the file is smaller than its own block size. If the vnode is a directory, return the dataset record size. If it is neither a regular file nor a directory, return EINVAL.

* In the control directory case, always return EINVAL.

Signed-off-by: Dag-Erling Smørgrav <[email protected]>
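The decision described in the commit message for the general case can be modelled as follows. This is a sketch under stated assumptions, not the code from the commit: `enum node_type`, the function name, and all parameters are hypothetical, and the real code works on a vnode and znode rather than plain integers.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the vnode type. */
enum node_type { NODE_REG, NODE_DIR, NODE_OTHER };

/*
 * Model of the general-case rule from the commit message.
 * Returns 0 and stores the value in *out, or returns EINVAL.
 */
static int
model_min_hole_size(enum node_type type, uint64_t file_size,
    uint64_t file_blksz, uint64_t dataset_recsize, uint64_t *out)
{
    switch (type) {
    case NODE_REG:
        /* File smaller than its own block size: use the record size. */
        *out = (file_size < file_blksz) ? dataset_recsize : file_blksz;
        return (0);
    case NODE_DIR:
        *out = dataset_recsize;
        return (0);
    default:
        return (EINVAL);
    }
}
```

In this model the control directory case needs no branch of its own, since the commit message says it always returns EINVAL.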
059ae68 to 4381f21
Motivation and Context
The actual minimum hole size on ZFS is variable, but we always report SPA_MINBLOCKSIZE, which is 512. This may lead applications to believe that they can reliably create holes at 512-byte boundaries and waste resources trying to punch holes that ZFS ends up filling anyway.
Description
In zfs_pathconf(), if the vnode is a regular file, return its block size; if it is a directory, return the dataset record size; if it is neither, return EINVAL.
In zfsctl_common_pathconf(), always return EINVAL for _PC_MIN_HOLE_SIZE.
How Has This Been Tested?
Tested in FreeBSD 16.0-CURRENT.
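From the application side, the reported value can be read with fpathconf(). The helper below is a minimal sketch, not part of this change: the name `min_hole_size_or` and its `fallback` parameter are hypothetical, and the `#ifdef` is there because `_PC_MIN_HOLE_SIZE` is not defined on every platform.

```c
#include <unistd.h>

/*
 * Return the minimum hole size reported for fd, or fallback if the
 * platform does not define _PC_MIN_HOLE_SIZE or the query fails
 * (for example with EINVAL or EBADF).
 */
static long
min_hole_size_or(int fd, long fallback)
{
#ifdef _PC_MIN_HOLE_SIZE
    long v = fpathconf(fd, _PC_MIN_HOLE_SIZE);

    if (v > 0)
        return (v);
#else
    (void)fd;    /* unused when the name is not available */
#endif
    return (fallback);
}
```

A caller would open the file or directory of interest, pass its descriptor together with a conservative default, and use the result as the alignment for hole detection.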