datastore: insert via BSATN instead of via PV #2069
base: master
which asserts that `table.insert(..)` does the same as using the bsatn path. Sprinkles various `Debug, PartialEq, Eq` derives to achieve this. Also uses `confirm_insertion` more to get that more under test. 2. Review and justify some unsafes.
@@ -1,4 +1,4 @@
-#![forbid(unsafe_op_in_unsafe_fn)]
+#![deny(unsafe_op_in_unsafe_fn)]
Sadly caused by a rustc bug interacting with `thread_local!`.
TIL that `deny` and `forbid` are different lint levels. I have no clue which is worse.
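For reference, `forbid` is the stricter of the two: a lint set to `deny` can still be relaxed locally with `#[allow(..)]`, whereas the same `#[allow(..)]` under `forbid` is rejected by the compiler. A minimal sketch follows; the module and function names are hypothetical, and the lint is set on a module rather than crate-wide only to keep the example self-contained.

```rust
// `deny` turns the lint into an error, but still permits local overrides.
#[deny(unsafe_op_in_unsafe_fn)]
mod lints {
    /// # Safety
    /// `p` must point to a valid, initialized `u32`.
    pub unsafe fn read_explicit(p: *const u32) -> u32 {
        // SAFETY: guaranteed by the caller.
        unsafe { *p }
    }

    /// # Safety
    /// `p` must point to a valid, initialized `u32`.
    // Accepted under `deny`. If the module used `forbid` instead,
    // this `allow` would itself be a compile error.
    #[allow(unsafe_op_in_unsafe_fn)]
    pub unsafe fn read_implicit(p: *const u32) -> u32 {
        *p
    }
}
```

This is presumably why a macro that emits its own `#[allow(..)]` can clash with a crate-wide `forbid`, while `deny` tolerates it.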
I have only reviewed the `table` and `locking_tx_datastore` changes with any particular care; everything else I glanced at, but didn't worry too much about, on the grounds that it's all safe code, it's barely changing and it's tested. The `table` and `datastore` parts I looked at quite carefully. I agree with your safety reasoning, and your benchmark results speak for themselves. I've left a few minor comment-related requests, like I always do. Great work on this! And thanks again for writing such a detailed PR description; it made review much smoother.
/// A [`StaticLayout`] for fast BFLATN <-> BSATN conversion,
/// if the [`RowTypeLayout`] has a static BSATN length and layout.
Could you update this comment to also mention the `StaticBsatnValidator`?
/// This does not check for set semantic or unique constraints.
///
/// This is also useful when we need to insert a row temporarily to get back a `RowPointer`.
/// In this case, A call to this method should be followed by a call to [`delete_internal_skip_pointer_map`].
Can you expand this comment with a short paragraph about what will happen if this is called with a `row` that is not valid BSATN at the table's row type? Doesn't have to be super in-depth, but just to set our expectations between "Precisely detects all type errors" and "corrupts the table and burns down your house."
// as `row_ty` was derived from the same schema as `seq` is part of.
let elem_ty = unsafe { &row_ty.elements.get_unchecked(seq.col_pos.idx()) };
// SAFETY:
// - `elem_ty` appears as a column in th row type.
Suggested change:
-// - `elem_ty` appears as a column in th row type.
+// - `elem_ty` appears as a column in the row type.
/// # Safety
///
/// - `self.is_row_present(row)` must hold.
/// - `col_id` must be a valid column, with a primiive type, of the row type.
Suggested change:
-/// - `col_id` must be a valid column, with a primiive type, of the row type.
+/// - `col_id` must be a valid column, with a primitive integer type, of the row type.
Spelling on "primitive." Being an integer is not a safety requirement as written, but I think it could be (see other comment). Whether or not we actually invoke UB if the column is non-integer, I think it's simplest in terms of docs to just say that we don't define behavior in that case, i.e. to make it a safety requirement as I'm doing here.
PrimitiveType::Bool | PrimitiveType::F32 | PrimitiveType::F64 => {
    panic!("`{:?}` is not a sequence integer type", &elem_ty.ty)
}
It feels mildly odd that we panic here, but `unreachable_unchecked` above. I don't feel strongly, but I might replace this with `unreachable_unchecked` just for consistency's sake.
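To make the trade-off concrete, here is a minimal sketch of the two options. The `Prim` enum and both function names are hypothetical stand-ins, not the PR's actual types. The panicking version stays a safe function with defined behavior on bad input; the `unreachable_unchecked` version moves "must be an integer type" into the safety contract, so violating it is undefined behavior.

```rust
use std::hint::unreachable_unchecked;

/// Hypothetical stand-in for a primitive column type.
#[derive(Debug, Clone, Copy)]
pub enum Prim {
    Bool,
    U8,
    U32,
    F32,
}

/// Safe variant: a non-integer type is detected and reported via panic.
pub fn int_width_or_panic(ty: Prim) -> usize {
    match ty {
        Prim::U8 => 1,
        Prim::U32 => 4,
        Prim::Bool | Prim::F32 => panic!("`{:?}` is not a sequence integer type", ty),
    }
}

/// # Safety
///
/// `ty` must be an integer type; behavior is undefined otherwise.
pub unsafe fn int_width_unchecked(ty: Prim) -> usize {
    match ty {
        Prim::U8 => 1,
        Prim::U32 => 4,
        // SAFETY: the caller promised `ty` is an integer type.
        Prim::Bool | Prim::F32 => unsafe { unreachable_unchecked() },
    }
}
```

Either choice is defensible; the point is that the doc comment and the match arm should agree on whether a non-integer column is a checked error or a violated precondition.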
}

/// Performs all the checks necessary after having fully decided on a rows contents.
///
Suggested change:
-///
+///
+/// This includes inserting the row into any applicable indices and/or the pointer map.
+///
    })
}

/// Insert a row, encoded in BSATN, into a table.
///
/// Requires:
/// - `TableId` must refer to a valid table for the database at `database_address`.
/// - `row` must be a valid row for the table at `table_id`.
///
Suggested change:
-///
+///
+/// Zero placeholders in auto-inc columns in the new row will be replaced with generated values
+/// if and only if `GENERATE` is true.
+/// This method is called with `GENERATE` false when updating the `st_sequence` system table.
+///
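The `GENERATE` behavior described in this suggestion can be illustrated with a const generic flag. The following is only a sketch under simplifying assumptions: `Sequence`, `insert_row`, and the two-column `[u64; 2]` row with an auto-inc column at index 0 are all hypothetical and do not reflect the datastore's real signatures.

```rust
/// Hypothetical sequence that hands out increasing values.
pub struct Sequence {
    pub next: u64,
}

impl Sequence {
    pub fn next_value(&mut self) -> u64 {
        let v = self.next;
        self.next += 1;
        v
    }
}

/// Inserts `row` into `table`. Column 0 is treated as the auto-inc column:
/// a zero there is a placeholder that is replaced with a generated value
/// if and only if `GENERATE` is true.
pub fn insert_row<const GENERATE: bool>(
    table: &mut Vec<[u64; 2]>,
    seq: &mut Sequence,
    mut row: [u64; 2],
) -> [u64; 2] {
    if GENERATE && row[0] == 0 {
        row[0] = seq.next_value();
    }
    table.push(row);
    row
}
```

Because the flag is a compile-time constant, each instantiation (`insert_row::<true>` and `insert_row::<false>`) is compiled separately, so the `false` variant carries no sequence-generation branch at all.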
Description of Changes
This PR changes the datastore to insert via BSATN rather than via a PV. That isn't entirely true yet: we only use the fast path for BSATN -> BFLATN, and in the general case we still do BSATN -> PV -> BFLATN. Later, I'll fully remove the intermediate step. One of the other required changes here is that we now write and generate sequence values directly in BFLATN.
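As a rough illustration of the two paths, here is a toy sketch. It makes several assumptions: the `ColTy`, `Value`, and `Path` types, the little-endian, length-prefixed byte format, and the `Vec<Vec<u8>>` row store are all simplified stand-ins, not the real BSATN, PV, or BFLATN representations. Rows whose columns are all fixed-size take the fast path (length check plus byte copy); everything else goes through an intermediate decoded value.

```rust
/// Toy column types; only `U32` has a fixed encoded size.
#[derive(Clone, Copy)]
pub enum ColTy { U32, Str }

/// Hypothetical stand-in for the intermediate decoded value (the "PV" step).
pub enum Value { U32(u32), Str(String) }

#[derive(Debug, PartialEq)]
pub enum Path { Fast, General }

/// Fixed encoded row length, if every column is fixed-size.
fn static_len(cols: &[ColTy]) -> Option<usize> {
    cols.iter().map(|c| match c { ColTy::U32 => Some(4), ColTy::Str => None }).sum()
}

/// Reads a little-endian `u32` and advances the cursor.
fn take_u32(bytes: &mut &[u8]) -> Option<u32> {
    let cur = *bytes;
    if cur.len() < 4 { return None; }
    let (head, rest) = cur.split_at(4);
    *bytes = rest;
    Some(u32::from_le_bytes([head[0], head[1], head[2], head[3]]))
}

/// General path, step 1: decode the bytes into intermediate values.
fn decode(cols: &[ColTy], mut bytes: &[u8]) -> Option<Vec<Value>> {
    let mut out = Vec::with_capacity(cols.len());
    for c in cols {
        out.push(match c {
            ColTy::U32 => Value::U32(take_u32(&mut bytes)?),
            ColTy::Str => {
                let len = take_u32(&mut bytes)? as usize;
                if bytes.len() < len { return None; }
                let (s, rest) = bytes.split_at(len);
                bytes = rest;
                Value::Str(String::from_utf8(s.to_vec()).ok()?)
            }
        });
    }
    if bytes.is_empty() { Some(out) } else { None }
}

/// General path, step 2: write the intermediate values into the toy row format.
fn write_flat(vals: &[Value]) -> Vec<u8> {
    let mut row = Vec::new();
    for v in vals {
        match v {
            Value::U32(x) => row.extend_from_slice(&x.to_le_bytes()),
            Value::Str(s) => {
                row.extend_from_slice(&(s.len() as u32).to_le_bytes());
                row.extend_from_slice(s.as_bytes());
            }
        }
    }
    row
}

pub fn insert_bsatn(cols: &[ColTy], bsatn: &[u8], rows: &mut Vec<Vec<u8>>) -> Option<Path> {
    if let Some(n) = static_len(cols) {
        // Fast path: fixed layout, so validate the length and copy the bytes.
        if bsatn.len() != n { return None; }
        rows.push(bsatn.to_vec());
        return Some(Path::Fast);
    }
    // General path: bytes -> intermediate values -> row storage.
    let vals = decode(cols, bsatn)?;
    rows.push(write_flat(&vals));
    Some(Path::General)
}
```

In this toy every 4-byte pattern is a valid `u32`, so the fast path only checks the length; a real fast path would also have to validate the bytes against the row type before copying.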
Fixes #2017.
Perf numbers on master
with this PR:
That is, this removes ~0.4516s and 0.3887s from the benchmarks respectively.
Flamegraph of `InstanceEnv::insert`:
Full flamegraph: https://flamegraph.com/share/14b571da-c9ba-11ef-9832-26c3e5347170
The next step after this PR is to avoid the pointer map in case there is a unique index.
After that, the next step is to add an `update` ABI.
API and ABI breaking changes
None
Expected complexity level and risk
4 -- limited scope, but lots of unsafe code and complicated logic.
Testing
The internal `spacetimedb_table` tests still use the old `table.insert`, so they do not exercise the new path. To compensate for this, without duplicating tests, a new proptest, `insert_bsatn_same_as_pv`, is added asserting that the result and side-effects of inserting via PV and BSATN are the same. Moreover, both insert paths try to share as much code as possible to improve test coverage. The higher level tests, starting with `MutTxId`, now use the new path. Over time, we can replace the remaining old paths with the new and then move all tests to the new as well.
Reviewer notes
I would recommend reviewing in this order:
1. `spacetimedb_table`
2. `locking_tx_datastore`
3. `InstanceEnv`
4. The `estimation.rs` changes that were unfortunately necessary.
Review notes for Tyler
In `traits.rs`, the following changes:
- `get_next_sequence_value_mut_tx` is removed. It was unused.
- `insert_mut_tx` changes from: