Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datastore: insert via BSATN instead of via PV #2069

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

Centril
Copy link
Contributor

@Centril Centril commented Dec 18, 2024

Description of Changes

This PR changes the structure of the datastore to insert via BSATN rather than a PV. But this isn't entirely true, we just use the fast-path for BSATN -> BFLATN but in the general case we still do BSATN -> PV -> BFLATN. Later, I'll fully remove the temporary step. One of the other required changes here is that we write and generate sequence values directly in BFLATN.

Fixes #2017.

Perf numbers on master

2025-01-03T09:30:37.776998Z  INFO: : Timing span "update_positions_by_id": 1.969988054s
2025-01-03T09:30:43.829835Z  INFO: : Timing span "update_positions_by_collect": 1.518314839s

with this PR:

2025-01-03T09:35:54.415502Z  INFO: : Timing span "update_positions_by_id": 1.64947372s
2025-01-03T09:36:00.880068Z  INFO: : Timing span "update_positions_by_collect": 1.26073871s

That is, this removes ~0.4516s and 0.3887s from the benchmarks respectively.

Flamegraph of InstanceEnv::insert:
Screenshot 2025-01-03 at 10-53-47 Flame Graph

Full flamegraph: https://flamegraph.com/share/14b571da-c9ba-11ef-9832-26c3e5347170

The next step after this PR is to avoid the pointer map in case there is a unique index.
After that, the next step is to add an update ABI.

API and ABI breaking changes

None

Expected complexity level and risk

4 -- limited scope, but lots of unsafe code and complicated logic.

Testing

The internal spacetimedb_table tests still use the old table.insert, so the new path. To compensate for this, without duplicating tests, a new proptest insert_bsatn_same_as_pv is added asserting that the result and side-effects of inserting via PV and BSATN are the same. Moreover, both insert paths try to share as much code as possible to improve test coverage. The higher level tests, starting with MutTxId now use the new path. Over time, we can replace the remaining old paths with the new and then move all tests to the new as well.

Reviewer notes

I would recommend reviewing in this order:

  1. Start with spacetimedb_table
  2. Move on to locking_tx_datastore
  3. Move on to InstanceEnv
  4. Review all the other boring changes to files like estimation.rs that were unfortunately necessary.

Review notes for Tyler

In traits.rs the following changes:

  • get_next_sequence_value_mut_tx is removed. It was unused.
  • insert_mut_tx changes from:
    fn insert_mut_tx<'a>(..., row: ProductValue) -> Result<(AlgebraicValue, RowRef<'a>)>;
    to:
    fn insert_mut_tx<'a>(..., row: &[u8]) -> Result<(ColList, RowRef<'a>)>;

@Centril Centril marked this pull request as draft December 18, 2024 17:19
@Centril Centril force-pushed the centril/insert-bsatn branch 3 times, most recently from 5df3305 to 5dfebdd Compare December 18, 2024 17:45
@Centril Centril force-pushed the centril/insert-bsatn branch from 374c094 to 5df4fc2 Compare January 2, 2025 13:37
@Centril Centril force-pushed the centril/insert-bsatn branch from 8275279 to 4af0705 Compare January 3, 2025 09:32
which asserts that `table.insert(..)` does the same as using the bsatn path.
Sprinkles various `Debug, PartialEq, Eq` derives to achieve this.
Also uses `confirm_insertion` more to get that more under test.

2. Review and justify some unsafes.
@@ -1,4 +1,4 @@
#![forbid(unsafe_op_in_unsafe_fn)]
#![deny(unsafe_op_in_unsafe_fn)]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly caused by a rustc bug interacting with thread_local!.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL that deny and forbid are different lint levels. I have no clue which is worse.

@Centril Centril marked this pull request as ready for review January 3, 2025 12:15
@Centril Centril requested a review from gefjon January 3, 2025 12:15
Copy link
Contributor

@gefjon gefjon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have only reviewed the table and locking_tx_datastore changes with any particular care; everything else I glanced at, but didn't worry too much about, on the grounds that it's all safe code, it's barely changing and it's tested. The table and datastore parts I looked at quite carefully. I agree with your safety reasoning, and your benchmark results speak for themselves. I've left a few minor comment-related requests, like I always do. Great work on this! And thanks again for writing such a detailed PR description; it made review much smoother.

Comment on lines 98 to 99
/// A [`StaticLayout`] for fast BFLATN <-> BSATN conversion,
/// if the [`RowTypeLayout`] has a static BSATN length and layout.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you update this comment to also mention the StaticBsatnValidator?

/// This does not check for set semantic or unique constraints.
///
/// This is also useful when we need to insert a row temporarily to get back a `RowPointer`.
/// In this case, A call to this method should be followed by a call to [`delete_internal_skip_pointer_map`].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you expand this comment with a short paragraph about what will happen if this is called with a row that is not valid BSATN at the table's row type? Doesn't have to be super in-depth, but just to set our expectations between "Precisely detects all type errors" and "corrupts the table and burns down your house."

// as `row_ty` was derived from the same schema as `seq` is part of.
let elem_ty = unsafe { &row_ty.elements.get_unchecked(seq.col_pos.idx()) };
// SAFETY:
// - `elem_ty` appears as a column in th row type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// - `elem_ty` appears as a column in th row type.
// - `elem_ty` appears as a column in the row type.

/// # Safety
///
/// - `self.is_row_present(row)` must hold.
/// - `col_id` must be a valid column, with a primiive type, of the row type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// - `col_id` must be a valid column, with a primiive type, of the row type.
/// - `col_id` must be a valid column, with a primitive integer type, of the row type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling on "primitive." Being an integer is not a safety requirement as written, but I think it could be (see other comment). Whether or not we actually invoke UB if the column is non-integer, I think it's simplest in terms of docs to just say that we don't define behavior in that case, i.e. to make it a safety requirement as I'm doing here.

Comment on lines +448 to +450
PrimitiveType::Bool | PrimitiveType::F32 | PrimitiveType::F64 => {
panic!("`{:?}` is not a sequence integer type", &elem_ty.ty)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels mildly odd that we panic here, but unreachable_unchecked above. I don't feel strongly, but I might replace this with unreachable_unchecked just for consistency's sake.

}

/// Performs all the checks necessary after having fully decided on a rows contents.
///
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
///
///
/// This includes inserting the row into any applicable indices and/or the pointer map.
///

})
}

/// Insert a row, encoded in BSATN, into a table.
///
/// Requires:
/// - `TableId` must refer to a valid table for the database at `database_address`.
/// - `row` must be a valid row for the table at `table_id`.
///
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
///
///
/// Zero placeholders in auto-inc columns in the new row will be replaced with generated values
/// if and only if `GENERATE` is true.
/// This method is called with `GENERATE` false when updating the `st_sequence` system table.
///

@@ -1,4 +1,4 @@
#![forbid(unsafe_op_in_unsafe_fn)]
#![deny(unsafe_op_in_unsafe_fn)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL that deny and forbid are different lint levels. I have no clue which is worse.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

perf: Use StaticLayout to insert BSATN -> BFLATN
2 participants