fix: Drop table #3210
Conversation
I am somewhat unhappy with this solution. It seems simpler to keep the `PendingSchemaChange` machinery for handling `drop_table`, and to do one of:

- Require that the table be empty at the time of `MutTx::drop_table`.
- Delete all the rows during `MutTx::drop_table`, before making the change to the in-memory structures (see the sketch after this list).
- Visit the pending schema change in `CommittedState::merge` and delete all the rows then.
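For concreteness, here is a minimal, self-contained sketch of the second option. `Store`, `TableId`, and `RowId` are stand-in types for illustration, not the real datastore API:

```rust
use std::collections::{HashMap, HashSet};

type TableId = u32;
type RowId = u64;

#[derive(Default)]
struct Store {
    /// In-memory tables: table id -> live rows.
    tables: HashMap<TableId, HashSet<RowId>>,
    /// Rows deleted by the current transaction (what `TxData` would later report).
    deleted: Vec<(TableId, RowId)>,
}

impl Store {
    /// Option 2: delete every row first, *then* touch the in-memory structures.
    fn drop_table(&mut self, table_id: TableId) {
        if let Some(rows) = self.tables.get(&table_id) {
            // Record each deletion so it is visible to commit/replay logic
            // (and so blob-store bytes could be reclaimed).
            for &row in rows {
                self.deleted.push((table_id, row));
            }
        }
        // Only after the rows are gone is the table itself removed.
        self.tables.remove(&table_id);
    }
}

fn main() {
    let mut store = Store::default();
    store.tables.insert(7, HashSet::from([1, 2, 3]));
    store.drop_table(7);
    assert_eq!(store.deleted.len(), 3);
    assert!(!store.tables.contains_key(&7));
}
```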
```rust
// After applying `merge_apply_deletes` and `merge_apply_inserts`,
// any tables no longer referenced by `st_table` should be
// removed from the committed state.
let mut tables_to_drop = thin_vec::ThinVec::<TableId>::new();
```
Why use ThinVec here?
I'm guessing the idea is to keep the footprint small as it will be empty most of the time. However, as we never put this into the heap, it doesn't matter that we use 16 bytes more on the stack, so a regular Vec seems better.
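For reference, a quick way to see the footprint difference being weighed here, assuming the `thin_vec` crate is available: on a 64-bit target `Vec` is three words while `ThinVec` is a single pointer.

```rust
fn main() {
    // Vec<T> stores pointer + length + capacity inline: 24 bytes on 64-bit.
    println!("Vec<u32>:     {} bytes", std::mem::size_of::<Vec<u32>>());
    // ThinVec<T> is a single pointer to a heap-allocated header: 8 bytes on 64-bit.
    println!("ThinVec<u32>: {} bytes", std::mem::size_of::<thin_vec::ThinVec<u32>>());
}
```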
```diff
  // First, apply deletes. This will free up space in the committed tables.
- self.merge_apply_deletes(&mut tx_data, tx_state.delete_tables);
+ self.merge_apply_deletes(&mut tx_data, &mut tables_to_drop, tx_state.delete_tables);
```
Suggested change:

```diff
+ // Any rows deleted from `st_table` will be noted in `tables_to_drop`.
  self.merge_apply_deletes(&mut tx_data, &mut tables_to_drop, tx_state.delete_tables);
```
I don't actually know if this is true (this is the first part of my review), it's just my assumption about what's going on here.
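To make that assumption explicit, here is a small stand-in sketch of what "noted in `tables_to_drop`" could mean; `ST_TABLE_ID` and the `StTableRow` layout are hypothetical, not the real definitions:

```rust
type TableId = u32;

/// Hypothetical id of the `st_table` system table.
const ST_TABLE_ID: TableId = 1;

/// Stand-in for a decoded `st_table` row: it names the table it describes.
struct StTableRow {
    table_id: TableId,
}

/// While applying deletes, rows removed from `st_table` mark their tables
/// as candidates for being dropped from the committed state.
fn note_dropped_tables(
    deleted_from: TableId,
    deleted_rows: &[StTableRow],
    tables_to_drop: &mut Vec<TableId>,
) {
    if deleted_from == ST_TABLE_ID {
        tables_to_drop.extend(deleted_rows.iter().map(|row| row.table_id));
    }
}

fn main() {
    let mut tables_to_drop = Vec::new();
    note_dropped_tables(ST_TABLE_ID, &[StTableRow { table_id: 42 }], &mut tables_to_drop);
    assert_eq!(tables_to_drop, vec![42]);
}
```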
```diff
  // Then, apply inserts. This will re-fill the holes freed by deletions
  // before allocating new pages.
- self.merge_apply_inserts(&mut tx_data, tx_state.insert_tables, tx_state.blob_store);
+ self.merge_apply_inserts(
```
Suggested change:

```diff
+ // Any rows inserted into `st_table` will be removed from `tables_to_drop`.
+ // This allows us to avoid incorrectly dropping tables whose `st_table` rows have been "updated,"
+ // i.e. replaced with a new version.
  self.merge_apply_inserts(
```
I don't actually know if this is true (this is the first part of my review), it's just my assumption about what's going on here.
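And the matching sketch for the insert side, under the same assumption: a re-inserted `st_table` row cancels the pending drop, and whatever ids remain afterwards are removed from the committed state. Names and types are stand-ins, not the real code:

```rust
use std::collections::HashMap;

type TableId = u32;
struct Table; // stand-in for a committed in-memory table

fn finish_merge(
    st_table_inserts: &[TableId],              // table ids (re)inserted into `st_table`
    tables_to_drop: &mut Vec<TableId>,
    committed_tables: &mut HashMap<TableId, Table>,
) {
    // An insert for the same id means the row was "updated", not dropped.
    tables_to_drop.retain(|id| !st_table_inserts.contains(id));
    // Anything still marked after applying inserts is really gone.
    for id in tables_to_drop.drain(..) {
        committed_tables.remove(&id);
    }
}

fn main() {
    let mut committed = HashMap::from([(7, Table), (8, Table)]);
    let mut to_drop = vec![7, 8];
    finish_merge(&[8], &mut to_drop, &mut committed);
    assert!(!committed.contains_key(&7)); // table 7 was dropped
    assert!(committed.contains_key(&8));  // table 8 was only updated
}
```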
```rust
self.tx_state.delete_tables.remove(&table_id);
let commit_table = self
    .committed_state_write_lock
    .tables
    .remove(&table_id)
    .expect("there should be a schema in the committed state if we reach here");
self.push_schema_change(PendingSchemaChange::TableRemoved(table_id, commit_table));
```
If you're removing this code, you need to update the comment to explain that the commit table isn't eagerly deleted, which code does delete it, and how queries later in the transaction avoid reading from it.
> I feel [there is a] fundamental problem with removing the in-memory [commit table] …
This doesn't sound like a fundamental problem. In fact it sounds very solvable. You could either:
Yes, I think we can do this to make it work, but I can't say how this is better than not deleting the commit table during `MutTx::drop_table`.
Closing in favor of #3214.

> Description of Changes
>
> Alternative to and closes #3210. This version relies on `pending_schema_changes`. The first commit adds an efficient `clear_table` to the datastore, which can be exposed to the module ABI in a follow-up. The second commit fixes `drop_table`.
>
> API and ABI breaking changes
>
> None
>
> Expected complexity level and risk
>
> 3?
>
> Testing
>
> `test_drop_table_is_transactional` is amended to check `TxData`.
>
> Signed-off-by: Mazdak Farrokhzad <[email protected]>
> Co-authored-by: Shubham Mishra <[email protected]>
Description of Changes
The current implementation of `drop_table` has a few problems:

- `blob_store` bytes are never reclaimed.
- `TxData` does not include the deleted rows, so the commit log replay logic never drops the in-memory commit table.

Changes:

Make `MutTx::drop_table` not drop the commit table but only delete the rows inside it (using `delete_tables`). Later, let `CommittedState::merge` detect dropped tables by analyzing `st_table` row deletions and remove the commit tables accordingly (see the sketch below).

This change fixes the two problems above. However, we still need an additional change to drop empty tables during replay (WIP).
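As a rough, self-contained illustration of the `MutTx::drop_table` half of this change (the merge half is sketched in the review comments above); `ST_TABLE_ID`, `RowId`, and the field layout are assumptions for illustration, not the real datastore code:

```rust
use std::collections::{HashMap, HashSet};

type TableId = u32;
type RowId = u64;

/// Hypothetical id of the `st_table` system table.
const ST_TABLE_ID: TableId = 1;

#[derive(Default)]
struct MutTx {
    /// Committed state, only read here; the commit table is *not* removed.
    committed_rows: HashMap<TableId, HashSet<RowId>>,
    /// Tx-local delete set, applied later during merge.
    delete_tables: HashMap<TableId, HashSet<RowId>>,
}

impl MutTx {
    fn drop_table(&mut self, table_id: TableId, st_table_row: RowId) {
        // Mark every committed row of the table as deleted...
        let rows = self.committed_rows.get(&table_id).cloned().unwrap_or_default();
        self.delete_tables.entry(table_id).or_default().extend(rows);
        // ...and delete the table's own `st_table` row; merge uses this
        // deletion to detect the drop and remove the commit table itself.
        self.delete_tables.entry(ST_TABLE_ID).or_default().insert(st_table_row);
    }
}

fn main() {
    let mut tx = MutTx::default();
    tx.committed_rows.insert(7, HashSet::from([10, 11]));
    tx.drop_table(7, 100);
    assert_eq!(tx.delete_tables[&7].len(), 2);
    assert!(tx.delete_tables[&ST_TABLE_ID].contains(&100));
    assert!(tx.committed_rows.contains_key(&7)); // commit table kept until merge
}
```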
Expected complexity level and risk
3, a small change but it is in a hot path.
Testing