make ItemSliceSync safe #1231
Comments
Thanks for bringing this up! I also would love to get rid of all If we somehow would allow Thus I believe it's impossible for this data structure to end up in a place where it violates its assumption. Maybe that's wrong somehow, but I don't see it yet.
Even if the performance-drop when doing that would be negligible I'd think that the increase in memory consumption will not be that. Last time I eyeballed the memory consumption Would you consider closing the issue until there is more evidence something needs to be done?
Ah, I see. I agree that it should be safe then.
I'll defer that to @Manishearth who did the unsafe review of
As an unsafe reviewer the standard I tend to apply is that it should be very easy to verify that the invariant is upheld (not just a comment claiming it is). So I think what I would need is:
Unless the comment is sufficiently self-evident and does not require jumping around a lot of code to verify. Even then I'd like the code it references to also have comments to that effect.
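To illustrate the kind of locally verifiable safety documentation described above, here is a minimal sketch of the common Rust convention: a `# Safety` section on the unsafe function and a `// SAFETY:` comment at each call site, placed next to the check that upholds the invariant. The function names `get_unchecked_mut_demo` and `bump_first` are made up for this example and do not come from gix-pack.

```rust
/// Returns a mutable reference to the element at `index` without bounds checking.
///
/// # Safety
/// `index` must be less than `items.len()`.
unsafe fn get_unchecked_mut_demo<T>(items: &mut [T], index: usize) -> &mut T {
    // SAFETY: the caller guarantees `index < items.len()`.
    unsafe { items.get_unchecked_mut(index) }
}

/// Increments the first element, if there is one, and returns its new value.
fn bump_first(items: &mut [u32]) -> Option<u32> {
    if items.is_empty() {
        return None;
    }
    // SAFETY: the slice was just checked to be non-empty, so index 0 is in bounds.
    let first = unsafe { get_unchecked_mut_demo(items, 0) };
    *first += 1;
    Some(*first)
}
```

The point of the sketch is that a reviewer can confirm the invariant by reading the two adjacent lines at the call site, without jumping to other modules.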
Inspired by #1231 (comment) . - make sure the type in question isn't used outside of its designated module - improve documentation around safety to make the underlying data structure more obvious.
Thanks for chiming in - I like your standard as it's easy to follow and thus hopefully just as easy to apply. Trying just that, I created #1233.
Access control was improved by restricting it more.
I didn't propagate-up the It's interesting that the
@Byron I think that's a start but still not easily reviewed. I'm also not convinced by that Send and Sync bound; the safety comments on the Send and Sync bounds are definitely not correct, at the very least.
I think the main culprit is that By itself there is no way this can be considered safe, as it is only 'not unsafe' if it's used correctly, and that usage is described in detail, but the code for that is in a couple of places and certainly not too easy to follow. With that said, I don't know what else to do about this.
Yeah the main thing to be careful about is that a RefCell is Send, but not Sync, so an
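The Send-versus-Sync distinction mentioned here can be checked at compile time. The following is a small sketch using two hypothetical helper functions, `assert_send` and `assert_sync`, whose only purpose is to make the compiler verify a trait bound. `Mutex` appears purely as a familiar standard-library comparison for a `Sync` impl that requires only `T: Send`; it is not a claim about what the truncated comment went on to say.

```rust
use std::cell::RefCell;
use std::sync::Mutex;

// Compile-time probes: a call only type-checks if the bound holds.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

fn check_auto_traits() -> bool {
    // `RefCell<String>` may be moved to another thread...
    assert_send::<RefCell<String>>();
    // ...but it may not be shared by reference between threads. Uncommenting
    // the next line is a compile error because `RefCell<String>` is not `Sync`:
    // assert_sync::<RefCell<String>>();

    // `Mutex<T>` is `Sync` whenever `T: Send`, because the lock ensures that
    // only one thread touches the inner value at a time.
    assert_sync::<Mutex<RefCell<String>>>();
    true
}
```

A hand-written `unsafe impl Sync ... where T: Send` makes the same promise as the `Mutex` case, but without a lock to enforce it, so the exclusivity has to be guaranteed by the surrounding code.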
I wish this could be reflected somehow, after all the
No, your type is logically an
This definitely compiles even though it shouldn't:

diff --git a/gix-pack/src/cache/delta/traverse/util.rs b/gix-pack/src/cache/delta/traverse/util.rs
index 95789b50b..79602ada1 100644
--- a/gix-pack/src/cache/delta/traverse/util.rs
+++ b/gix-pack/src/cache/delta/traverse/util.rs
@@ -59,3 +59,18 @@ unsafe impl<T> Send for ItemSliceSync<'_, T> where T: Send {}
// `get_mut()`, we only ever access one T at a time.
#[allow(unsafe_code)]
unsafe impl<T> Sync for ItemSliceSync<'_, T> where T: Send {}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use std::cell::RefCell;
+
+    #[test]
+    fn type_cannot_be_refcell() {
+        type Test<'a> = ItemSliceSync<'a, RefCell<String>>;
+        let mut item = RefCell::new("hi".into());
+        let value = Test::new(std::slice::from_mut(&mut item));
+        fn use_it(_slice: &Test<'_>) {}
+        use_it(&value);
+    }
+}

To me it's a matter of unsafe abstraction - the way Now the question is how this "array with members that are (trust me) owned by a single thread at a time (but accessed by reference)" can be expressed so that ideally Rust's other safeguards aren't removed like what's done here.
There are a couple patterns that help with this. If you're doing parallel tree traversal the best thing is to have a little bit of unsafe in the splitting code (effectively a vector based The core problem here is that the tree is not structured in an ownership-based way, so you're having to stitch this together across different parts of the code. That said the main invariant is on
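As a rough illustration of the splitting idea, here is a sketch that hands disjoint mutable sub-slices to separate threads. It assumes the standard library's safe `split_at_mut` and `std::thread::scope` as stand-ins; the comment above suggests the real traversal would need its own small piece of unsafe splitting code, which this example does not attempt to reproduce. The function `double_in_parallel` is hypothetical.

```rust
use std::thread;

/// Doubles every element, processing the two halves on separate threads.
fn double_in_parallel(items: &mut [u64]) {
    let mid = items.len() / 2;
    // `split_at_mut` hands out two non-overlapping `&mut` sub-slices, so
    // exclusivity is established once, at the split, and nowhere else.
    let (left, right) = items.split_at_mut(mid);
    thread::scope(|scope| {
        scope.spawn(|| left.iter_mut().for_each(|x| *x *= 2));
        scope.spawn(|| right.iter_mut().for_each(|x| *x *= 2));
    });
}
```

With this shape, each worker owns its part exclusively by construction, and no `Sync` impl on a shared wrapper is needed.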
Okay, the invariant is trickier because
I dug in and added this chunk of text. In summary, this is for handling a special case where ref-delta entries point to their base by id which is in the future of the current entry, not in its past. Thus these yet-to-be-seen parent objects are hooked up with their children when finishing the tree. Such packs exist only at rest and are untypical.

diff --git a/gix-pack/src/cache/delta/mod.rs b/gix-pack/src/cache/delta/mod.rs
index 435008edf..c9d498ef1 100644
--- a/gix-pack/src/cache/delta/mod.rs
+++ b/gix-pack/src/cache/delta/mod.rs
@@ -53,6 +53,18 @@ pub struct Tree<T> {
last_seen: Option<NodeKind>,
/// Future child offsets, associating their offset into the pack with their index in the items array.
/// (parent_offset, child_index)
+ /// In thin-packs, ref-deltas entries point to an object by its id. If this is a thin pack then this
+ /// would have been handled by [`LookupRefDeltaObjectsIter`](crate::data::input::LookupRefDeltaObjectsIter)
+ /// already.
+ /// Thus `child_index` points to a resolved ref-delta entry which pointed to an object within this pack
+ /// that wasn't seen yet, hence it points to a yet-to-be-seen pack entry.
+ /// And these we store and resolve later once we have seen all entries in the pack.
+ /// Note that such pack are atypical, as packs at rest should only ever use ofs-deltas to refer to parent entries.
+ /// In practice though, these packs with ref-delta entries exist at reset as well.
+ /// They are even produced by [Azure Git servers](https://github.com/Byron/gitoxide/issues/1025),
+ /// which is when we cannot handle them at all as we cannot know where an entry is in its pack by knowing its ID
+ /// without having an index yet, which we don't have as we are streaming the pack right now and thus try to produce
+ /// said index.
future_child_offsets: Vec<(crate::data::Offset, usize)>,
}
It's fine, I figured it out, I've got a partial PR. It wasn't that it's confusing, it's that the invariant has multiple caveats, which is tricky for invariants of this form.
Current behavior 😯
ItemSliceSync::get_mut() takes a non-mutable reference and an index. It doesn't check if there are multiple callers accessing the same index, so that's left to the caller to do (the function is therefore marked unsafe). I can't tell if the callers are actually guaranteed not to access the same index concurrently. My guess - without looking very carefully - is that the requirement can be violated by a bad packfile.
Expected behavior 🤔
Maybe ItemSliceSync should be replaced by a list of RefCell to make it safe?
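For readers unfamiliar with the pattern, the following sketch shows the general shape of a slice wrapper whose `get_mut()` takes `&self` and returns `&mut T`. `SharedSlice` is a hypothetical stand-in written for this example, not the gix-pack type, and its fields and bounds are assumptions.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for the pattern described above.
struct SharedSlice<'a, T> {
    ptr: *mut T,
    len: usize,
    // Ties the wrapper to the exclusive borrow of the original slice.
    _borrow: PhantomData<&'a mut [T]>,
}

impl<'a, T> SharedSlice<'a, T> {
    fn new(items: &'a mut [T]) -> Self {
        SharedSlice { ptr: items.as_mut_ptr(), len: items.len(), _borrow: PhantomData }
    }

    /// # Safety
    /// No other reference to the element at `index` may exist while the
    /// returned reference is alive. Nothing in this function checks that.
    unsafe fn get_mut(&self, index: usize) -> &mut T {
        assert!(index < self.len, "index out of bounds");
        // SAFETY: `index` is in bounds (checked above); exclusivity is the caller's job.
        unsafe { &mut *self.ptr.add(index) }
    }
}

fn demo() -> Vec<u32> {
    let mut data = vec![1, 2, 3];
    {
        let shared = SharedSlice::new(&mut data);
        // SAFETY: indices 0 and 2 are distinct, so the two `&mut` never alias.
        let (a, b) = unsafe { (shared.get_mut(0), shared.get_mut(2)) };
        *a += 10;
        *b += 20;
    }
    data
}
```

Calling `get_mut` twice with the same index while both results are alive would be undefined behavior, which is the hazard this issue asks about. A per-element `RefCell` would turn that into a runtime panic instead, though `RefCell` is not `Sync`, so it would not by itself allow sharing the slice across threads.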