-
Notifications
You must be signed in to change notification settings - Fork 14
feat(persistent): Redesign CommitStateSpace, bound Commit lifetime #2534
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
29c3e36
to
0dfe318
Compare
0dfe318
to
722896c
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #2534 +/- ##
==========================================
+ Coverage 82.99% 83.00% +0.01%
==========================================
Files 254 255 +1
Lines 48040 48086 +46
Branches 43550 43596 +46
==========================================
+ Hits 39873 39916 +43
+ Misses 6098 6096 -2
- Partials 2069 2074 +5
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
0f77b7b
to
fd61770
Compare
Could you explain what the purpose of the lifetime parameter is in more detail? It looks to me as if the |
Of course, sorry that was not clear. I've expanded the docs of /// # Lifetime of commits
///
/// A commit remains valid as long as the [`CommitStateSpace`] containing it is
/// alive. Note that it is also sufficient that a [`PersistentHugr`] containing
/// the commit is alive, given that the [`CommitStateSpace`] is guaranteed to
/// be alive as long as any of its contained [`PersistentHugr`]s. In other
/// words, the lifetime dependency is:
/// ```ignore
/// PersistentHugr -> CommitStateSpace -> Commit
/// ```
/// where `->` can be read as "is outlived by" (or "maintains a strong reference
/// to"). Note that the dependencies are NOT valid in the other direction: a
/// [`Commit`] only maintains a weak reference to its [`CommitStateSpace`].
///
/// When a [`CommitStateSpace`] goes out of scope, all its commit become
/// invalid. The implementation uses lifetimes to ensure at compile time that
/// the commit is valid throughout its lifetime. All constructors of [`Commit`]
/// thus expect a reference to the state space that the commit should be added
/// to, which fixes the lifetime of the commit.
///
/// Methods that directly modify the lifetime are marked as `unsafe`. It is up
/// to the user to ensure that the commit is valid throughout its updated
/// lifetime. I hope that clarifies your question! |
c3a228a
to
167e0c1
Compare
167e0c1
to
a3df91c
Compare
pub(super) base_commit: CommitId, | ||
#[repr(transparent)] | ||
pub struct CommitStateSpace { | ||
registry: Rc<RefCell<Registry<CommitData, ()>>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the Rc
really make sense here? That means you can cheaply clone
a CommitStateSpace, but the clones will then share the state hidden inside the RefCell. I would not expect cloning a CommitStateSpace to be something you want to do often....???
(One can imagine a "fork" operation but TBH I'm not really sure why you would want to do that either ;-) unless Ids are scarce or something)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good comment. You are right that cloning could mean two different things here: i) cloning handles to a same state space or ii) creating a new state space that as you say "forks" the state space.
The semantics of cloning here are the former. We need to have multiple handles to the same state space, as every PersistentHugr
contains an Rc to the state space it belongs to.
I've considered deleting the Clone
implementation (as we often don't want to clone handles ourselves), but there are cases where this is useful, e.g. in testing where we first create PersistentHugr, and then want to create new PersistentHugrs in the same state space as the first.
All in all, I think keeping this as is is the simplest solution. I've added a sentence in the docs specifying the behaviour of cloning.
/// - there is a unique commit of variant [`CommitData::Base`] and its ID | ||
/// is `base_commit_id`. | ||
graph: HistoryGraph<CommitData, ()>, | ||
/// The unique root of the commit graph. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"(Cache of)" perhaps in that it could be computed (by following the parent pointer from any commit, result would be the same regardless of which commit you started from)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. Have added "cache of" as well as written explicitly the invariant that all paths from a commit in self through ancestors will lead to this root.
/// the same node. | ||
/// - there is a unique commit of variant [`CommitData::Base`] and its ID | ||
/// is `base_commit_id`. | ||
graph: HistoryGraph<CommitData, ()>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is the bit that might be expensive to clone. I'm not terribly clear from the docs of HistoryGraph as to whether it's cheap, or whether it's actually a complete+owned petgraph constructed from (a cache of) the parent pointers of some set of commits. It might be worth nothing that a persistent (im::) list/set of "leaf commits" would store the same information (perhaps in less accessible form) and really would be cheap to clone...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, it's not that cheap to clone, so a persistent list (inside HistoryGraph) would be better - but storing the complete list of commits (as it does, rather than just leaves) might be best because then clones would only ever append to the list rather than removing nodes that are no longer leaves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes agreed; I will create an issue for this, to be picked up when this becomes a bottleneck.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR updates the relrc dependency to v0.5.0 and completely redesigns the
CommitStateSpace
struct. The result is much cleaner:PersistentHugr
is an immutable data type following the design of the types of theim
crate, i.e. it is a mutable Rust type that is cheap to clone and mutate (immutability -- if desired -- is achieved by cloning before mutating).Commit
is a "transaction", i.e. one atomic mutation that was applied to aPersistentHugr
. It has an ID, obtainable throughcommit.id()
and a lifetime parameter that ensures it does not stay in scope beyond the lifetime of thePersistentHugr
PersistentHugr
belongs to a uniqueCommitStateSpace
. ACommitStateSpace
is a registry of all commits of thePersistentHugr
s within that state space. It keeps the map between commits and IDs.This is a breaking change, but does not trigger a breaking HUGR release because it is hidden behind an unstable feature.