accounts-db: relax intrabatch account locks #4253
base: master
accounts-db/src/account_locks.rs
// HANA TODO the vec allocation here is unfortunate but hard to avoid
// we cannot do this in one closure because of borrow rules
// play around with alternate strategies, according to benches this may be up to
// 50% slower for small batches and few locks, but for large batches and many locks
bench using jemalloc? i'd think it would do a reasonable job of just keeping the mem in thread-local cache for re-use
 #[derive(Debug, Default)]
 pub struct AccountLocks {
-    write_locks: AHashSet<Pubkey>,
+    write_locks: AHashMap<Pubkey, u64>,
     readonly_locks: AHashMap<Pubkey, u64>,
 }
because the read- and write-lock hashmaps have the same type now, all the functions that change them are basically the same. we could use an enum or hashmap reference to discriminate and delete half of the functions, but i left it like this for your review before butchering it
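A minimal sketch of the enum-based consolidation suggested in this comment, not the code in the PR. `LockKind`, `locks_mut`, `lock`, `unlock`, and `is_locked` are hypothetical names, `Pubkey` is replaced by a byte-array stand-in, and `std::collections::HashMap` stands in for `AHashMap` so the example is self-contained:

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

// Stand-in for the real Pubkey type (assumption, to keep the sketch self-contained).
type Pubkey = [u8; 32];

// Hypothetical discriminator selecting which lock table to operate on.
#[derive(Clone, Copy)]
enum LockKind {
    Write,
    Readonly,
}

#[derive(Debug, Default)]
struct AccountLocks {
    write_locks: HashMap<Pubkey, u64>,
    readonly_locks: HashMap<Pubkey, u64>,
}

impl AccountLocks {
    // Both tables have the same type, so one accessor can serve both.
    fn locks_mut(&mut self, kind: LockKind) -> &mut HashMap<Pubkey, u64> {
        match kind {
            LockKind::Write => &mut self.write_locks,
            LockKind::Readonly => &mut self.readonly_locks,
        }
    }

    // Increment the reference count for `key` in the selected table.
    fn lock(&mut self, kind: LockKind, key: Pubkey) {
        *self.locks_mut(kind).entry(key).or_default() += 1;
    }

    // Decrement the reference count, removing the entry when it reaches zero.
    // Returns false if the key was not locked.
    fn unlock(&mut self, kind: LockKind, key: &Pubkey) -> bool {
        match self.locks_mut(kind).entry(*key) {
            Entry::Occupied(mut entry) => {
                if *entry.get() > 1 {
                    *entry.get_mut() -= 1;
                } else {
                    entry.remove();
                }
                true
            }
            Entry::Vacant(_) => false,
        }
    }

    fn is_locked(&self, kind: LockKind, key: &Pubkey) -> bool {
        match kind {
            LockKind::Write => self.write_locks.contains_key(key),
            LockKind::Readonly => self.readonly_locks.contains_key(key),
        }
    }
}
```

With this shape, each read/write pair of functions collapses into one function taking a `LockKind`; whether that reads better than two explicit function families is the style question left open in the comment.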
    relax_intrabatch_account_locks: bool,
) -> Vec<Result<()>> {
    // Validate the account locks, then get iterator if successful validation.
    let tx_account_locks_results: Vec<Result<_>> = txs
Accounts::lock_accounts() could be reimplemented as a wrapper on lock_accounts_with_results(), or possibly deleted; it isnt really required anymore since all batch-building needs to have results. but we could leave it as-is, or could do some kind of refactor with TransactionAccountLocksIterator
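A rough sketch of the wrapper idea, under heavy simplification. The signatures here are assumptions and do not match the real `Accounts` API: a transaction is modeled as the list of account keys it write-locks, the lock table is a plain `HashSet` behind `&mut self`, and only the non-relaxed (pre-simd83) locking rule is shown.

```rust
use std::collections::HashSet;
use std::iter;

#[derive(Debug, PartialEq)]
enum TransactionError {
    AccountInUse,
    TooManyAccountLocks,
}
type Result<T> = std::result::Result<T, TransactionError>;

// Simplified model: a transaction is just the account keys it write-locks.
type Tx = Vec<u8>;

struct Accounts {
    locked: HashSet<u8>,
}

impl Accounts {
    // Transactions whose incoming result is already an error are skipped and
    // keep that error; the rest try to take their locks in order.
    fn lock_accounts_with_results<'a>(
        &mut self,
        txs: impl Iterator<Item = &'a Tx>,
        results: impl Iterator<Item = Result<()>>,
    ) -> Vec<Result<()>> {
        txs.zip(results)
            .map(|(tx, result)| {
                result.and_then(|()| {
                    if tx.iter().any(|key| self.locked.contains(key)) {
                        return Err(TransactionError::AccountInUse);
                    }
                    self.locked.extend(tx.iter().copied());
                    Ok(())
                })
            })
            .collect()
    }

    // The plain variant is the same operation with every incoming result Ok.
    fn lock_accounts<'a>(&mut self, txs: impl Iterator<Item = &'a Tx>) -> Vec<Result<()>> {
        self.lock_accounts_with_results(txs, iter::repeat_with(|| Ok(())))
    }
}
```

Zipping the two iterators also sidesteps the "same length" assumption: the output is only as long as the shorter input.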
-    fn lock_accounts_inner(
+    fn lock_accounts_inner<'a>(
         &self,
-        tx_account_locks_results: Vec<Result<TransactionAccountLocksIterator<impl SVMMessage>>>,
+        tx_account_locks_results: impl Iterator<
+            Item = Result<TransactionAccountLocksIterator<'a, impl SVMMessage + 'a>>,
+        >,
         relax_intrabatch_account_locks: bool,
in general where possible i changed vecs to iters, we eliminate several uses of collect() and just pass around the closure chains. this makes the type signatures look kind of stupid tho and there are possibly things we could refactor to be better (like maybe combining transactions with the transaction results instead of taking for granted they always have the same length). im undecided about style tho
if relax_intrabatch_account_locks {
    let validated_batch_keys = tx_account_locks_results.map(|tx_account_locks_result| {
        tx_account_locks_result
            .map(|tx_account_locks| tx_account_locks.accounts_with_is_writable())
    });

    account_locks.try_lock_transaction_batch(validated_batch_keys)
} else {
    tx_account_locks_results
        .map(|tx_account_locks_result| match tx_account_locks_result {
            Ok(tx_account_locks) => account_locks
                .try_lock_accounts(tx_account_locks.accounts_with_is_writable()),
            Err(err) => Err(err),
        })
        .collect()
}
this is the main branch that the feature controls. we pass it in as a param so Accounts doesnt need FeatureSet, only Bank. the only other place we use the feature is to enable signature-based transaction deduplication
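To make the relaxed branch concrete, here is a simplified sketch of what a batch-level lock could look like, based only on the PR description ("only checks for locking conflicts with other batches, allowing any account overlap within the batch itself"). It is not the PR's implementation: the signature takes owned `Vec`s rather than iterators, `can_lock` is a hypothetical helper, and `Pubkey` and `HashMap` are stand-ins.

```rust
use std::collections::HashMap;

// Stand-in for the real Pubkey type (assumption).
type Pubkey = [u8; 32];

#[derive(Debug, PartialEq)]
enum TransactionError {
    AccountInUse,
}
type Result<T> = std::result::Result<T, TransactionError>;

#[derive(Default)]
struct AccountLocks {
    write_locks: HashMap<Pubkey, u64>,
    readonly_locks: HashMap<Pubkey, u64>,
}

impl AccountLocks {
    // A write lock conflicts with any existing lock; a read lock conflicts
    // only with an existing write lock.
    fn can_lock(&self, key: &Pubkey, writable: bool) -> bool {
        if writable {
            !self.write_locks.contains_key(key) && !self.readonly_locks.contains_key(key)
        } else {
            !self.write_locks.contains_key(key)
        }
    }

    fn try_lock_transaction_batch(
        &mut self,
        batch: Vec<Result<Vec<(Pubkey, bool)>>>,
    ) -> Vec<Result<()>> {
        // Phase 1: check every transaction against locks held before this
        // batch started, i.e. locks taken by other batches. Nothing is locked
        // yet, so transactions in this batch cannot conflict with each other.
        let checked: Vec<Result<Vec<(Pubkey, bool)>>> = batch
            .into_iter()
            .map(|tx| {
                tx.and_then(|keys| {
                    if keys.iter().all(|(key, writable)| self.can_lock(key, *writable)) {
                        Ok(keys)
                    } else {
                        Err(TransactionError::AccountInUse)
                    }
                })
            })
            .collect();

        // Phase 2: take the locks. Overlap within the batch just bumps the count.
        checked
            .into_iter()
            .map(|tx| {
                tx.map(|keys| {
                    for (key, writable) in keys {
                        let locks = if writable {
                            &mut self.write_locks
                        } else {
                            &mut self.readonly_locks
                        };
                        *locks.entry(key).or_default() += 1;
                    }
                })
            })
            .collect()
    }
}
```

The intermediate `Vec` between the two phases corresponds to the allocation the `HANA TODO` comment mentions: the check phase borrows `self` immutably and the lock phase borrows it mutably, so in this sketch they cannot share one closure.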
runtime/src/bank.rs
// with simd83 enabled, we must deduplicate transactions by signature
// previously, conflicting account locks would do it as a side effect
let mut batch_signatures = AHashSet::with_capacity(transactions.len());
let transaction_results =
    transaction_results
        .enumerate()
        .map(|(i, tx_result)| match tx_result {
            Ok(())
                if relax_intrabatch_account_locks
                    && !batch_signatures.insert(transactions[i].signature()) =>
            {
                Err(TransactionError::AccountInUse)
            }
            Ok(()) => Ok(()),
            Err(e) => Err(e),
        });
this is the dedupe step mentioned in a comment above. we could use a double for loop instead of a hashset but this seemed much more straightforward since the inner loop would have to abort based on the outer loop index
I'd guess it's almost certainly faster to just do a brute-force since our batches are small (replay will be size 1 for unified-scheduler); but might get complicated with the current iterator interface. Fine to leave it as is.
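For comparison, a sketch of the two dedupe strategies discussed here, written over plain slices rather than the PR's iterator chain. `Signature` is a `u64` stand-in, and `dedupe_hashset` and `dedupe_brute_force` are hypothetical names; neither function is taken from the PR.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, PartialEq)]
enum TransactionError {
    AccountInUse,
    AlreadyProcessed,
}
type Result<T> = std::result::Result<T, TransactionError>;

// Stand-in for a real signature or message hash (assumption).
type Signature = u64;

// Hash-set variant: the first Ok transaction with a given signature wins,
// later Ok copies are turned into AccountInUse. Failed transactions never
// enter the set.
fn dedupe_hashset(
    signatures: &[Signature],
    results: impl Iterator<Item = Result<()>>,
) -> Vec<Result<()>> {
    let mut seen = HashSet::with_capacity(signatures.len());
    results
        .enumerate()
        .map(|(i, result)| match result {
            Ok(()) if !seen.insert(signatures[i]) => Err(TransactionError::AccountInUse),
            other => other,
        })
        .collect()
}

// Brute-force variant: for each Ok transaction, scan the earlier entries that
// are still Ok for the same signature. Quadratic, but allocation-free apart
// from the output and likely fine for small batches.
fn dedupe_brute_force(
    signatures: &[Signature],
    results: impl Iterator<Item = Result<()>>,
) -> Vec<Result<()>> {
    let mut out: Vec<Result<()>> = Vec::with_capacity(signatures.len());
    for (i, result) in results.enumerate() {
        let duplicate = result.is_ok()
            && (0..i).any(|j| out[j].is_ok() && signatures[j] == signatures[i]);
        out.push(if duplicate {
            Err(TransactionError::AccountInUse)
        } else {
            result
        });
    }
    out
}
```

The brute-force version needs access to the results produced so far, which is why it is awkward to express as a lazy `map` over an iterator, as the comment above notes.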
recent sample of bench comparisons, master vs this branch with simd83 enabled
in general we perform slightly worse for tiny batches and as well or better for large batches. note these benches call code in
runtime/src/bank.rs
// with simd83 enabled, we must deduplicate transactions by signature
// previously, conflicting account locks would do it as a side effect
we must dedup because we check for already_processed in a batch, then process, then add to the status_cache. Is that a correct summary of why we need to do this now?
im not fully confident in my understanding of the status cache but i believe the way it works is you execute the batch and the signatures of processed (non-dropped) transactions go in the status cache only after theyve all been run. there are no checks in svm (nor should there be) for duplicate transactions within a batch
if replay is going to single-batch everything i guess it would enforce this as a side effect but it seemed good to do it here, since this code already did enforce this constraint (a malicious block that put the same transaction in one entry multiple times would fail locking in replay, without anything involving status cache, because the transactions would take the same write lock on the fee-payer)
Yeah we should enforce it for sure
runtime/src/bank.rs
// with simd83 enabled, we must deduplicate transactions by signature
// previously, conflicting account locks would do it as a side effect
let mut batch_signatures = AHashSet::with_capacity(transactions.len());
let's use message_hash here instead of signature.
status-cache uses both, but signature is only necessary for RPC operation, for fast signature lookup.
iirc the reason to use message hash is signature malleability.
i used it because message_hash isnt provided by SVMMessage or SVMTransaction. would you like me to add it to SVMTransaction? its available on SanitizedTransaction but providing it from SanitizedMessage would require us to add it to LegacyMessage and v0::LoadedMessage
We can probably just change the trait bound to TransactionWithMeta on this function, that trait should provide it, and I'm fairly certain the things we actually call this with impl it.
had to squash the past because rebasing was getting ugly but this is bae70c0
We have the function directly as part of TransactionWithMeta, just call message_hash() on the TransactionWithMeta tx.
as_sanitized_transaction (unless it IS a sanitized transaction) will create a sanitized transaction, and possibly do 100s of allocations.
That fn is only there because of legacy interfaces - we shouldn't use it anywhere it's not strictly necessary (should be only geyser rn)
gotcha, i see it now. i only looked at TransactionWithMeta rather than StaticMeta
There's a comment on that fn. I'll update it to be more clear that basically no one should be using that, except for the couple places we call into geyser.
Made comments on that trait better here - #4827
This PR contains changes to the solana sdk, which will be moved to a new repo within the next week, when v2.2 is branched from master. Please merge or close this PR as soon as possible, or re-create the sdk changes when the new repository is ready at https://github.com/anza-xyz/solana-sdk
@joncinque this is a core runtime pr, the only sdk change is adding the feature gate. i assume when the new repo is created the procedure is going to be like:
right?
Yep that sounds right. I'm also adding a new
Problem
simd83 proposes removing the constraint that transactions in the same entry cannot have read/write or write/write contention on any account. a previous pr modified svm to be able to carry over state changes between such transactions while processing a transaction batch. this pr modifies consensus to remove the account locking rules that prevent such batches from being created or replayed
Summary of Changes
add a new function to AccountLocks, try_lock_transaction_batch, which only checks for locking conflicts with other batches, allowing any account overlap within the batch itself. modify Accounts and Bank to use it instead when the feature gate relax_intrabatch_account_locks is activated. also modify prepare_sanitized_batch_with_results to deduplicate transactions within a batch by signature to prevent replay attacks, such that two instances of the same transaction cause the first to lock out the second, in a similar manner to the non-simd83 behavior for this special case
since transaction results are used more extensively than previously, some functions with *_with_results variants have been collapsed into wrappers around that variant. we also refactor several things to favor iterators over vectors, to avoid places where iters are collected and transformed back into iters
important code changes are confined to accounts-db/src/accounts.rs, accounts-db/src/account_locks.rs, and runtime/src/bank.rs. changes in core, ledger, and runtime transaction batch only affect tests. overall the large majority of changes are fixes or improvements to tests
Feature Gate Issue: https://github.com/anza-xyz/feature-gate-tracker/issues/76