RPC responses include a field context, which is documented as "An RpcResponseContext JSON structure including a slot field at which the operation was evaluated."
I thought that this meant that when I call getAccountInfo or getMultipleAccounts, I get the account as it was at the end of the block in slot context.slot, but this is not the case.
Below is a minimal reproducer based on the clock sysvar. I believe that the problem is in context.slot, and not in the clock sysvar, because I first discovered this bug when fetching vote accounts, where I observed the vote credits increasing by more than should have been possible given how many slots apart these observations were (according to context.slot). When I changed that program to fetch the vote accounts as well as the clock sysvar in a single getMultipleAccounts, I observed fewer impossible vote credit amounts (based on how many slots apart these observations were according to clock.slot).
I’ll omit the Cargo.lock here, as the exact versions of the client libraries are not really relevant: the problem is server-side, and I reproduced it against multiple RPCs.
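The plausibility check described above can be sketched as follows. This is only an illustration of the reasoning, not the program I actually ran: the Observation type, the is_plausible function, and the per-slot credit cap are all hypothetical names and values, and the cap is passed in as a parameter because the real limit depends on the cluster's vote credit rules.

```rust
/// One reading of a vote account, tagged with the slot we believe it was read at.
/// Hypothetical type, introduced only for this sketch.
#[derive(Debug, Clone, Copy)]
struct Observation {
    slot: u64,
    credits: u64,
}

/// Returns true if the credits gained between two observations could have been
/// earned in the number of slots that separate them.
/// `max_credits_per_slot` is an assumed upper bound supplied by the caller.
fn is_plausible(prev: Observation, next: Observation, max_credits_per_slot: u64) -> bool {
    // A later observation must not be at an earlier slot.
    let Some(slots) = next.slot.checked_sub(prev.slot) else {
        return false;
    };
    // Credits never decrease.
    let Some(earned) = next.credits.checked_sub(prev.credits) else {
        return false;
    };
    earned <= slots.saturating_mul(max_credits_per_slot)
}

fn main() {
    // Placeholder numbers, not real mainnet data.
    let prev = Observation { slot: 100, credits: 1_000 };
    let next = Observation { slot: 110, credits: 1_200 };
    let cap = 16; // assumed cap, for illustration only
    println!("plausible: {}", is_plausible(prev, next, cap));
}
```

If the slot attached to an observation is too low, as appears to happen with context.slot, the slot difference shrinks and a check like this starts reporting gains that look impossible.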
src/main.rs:
use solana_sdk::commitment_config::CommitmentConfig;
use solana_sdk::sysvar;
use solana_client::rpc_client::RpcClient;

fn main() {
    let client = RpcClient::new("https://api.mainnet-beta.solana.com");
    let commitment = CommitmentConfig::finalized();
    loop {
        let response = client
            .get_account_with_commitment(&sysvar::clock::ID, commitment)
            .expect("Failed to fetch clock account from RPC");
        let data = &response
            .value
            .expect("The clock sysvar exists.")
            .data[..];
        let clock: sysvar::clock::Clock =
            bincode::deserialize(data).expect("The clock sysvar is well-formed.");
        println!(
            "Context slot: {}, clock slot: {}, diff: {}",
            response.context.slot,
            clock.slot,
            response.context.slot as i64 - clock.slot as i64,
        );
    }
}
Note that not only is the clock sysvar ahead of the context slot returned by the RPC (which I think should not be possible if my understanding of the docs is correct, but that could be a simple off-by-one), the difference fluctuates, and it fluctuates by many slots.
In particular, I got two responses that claimed to be for slot 318679959, but one had the clock sysvar at 318679960, and the other at 318679961. If I read the same account multiple times at finalized commitment level, and the RPC says it read it at slot 318679959, then I expect the account data to be the same in all responses, but that is not the case here.
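The inconsistency described above can be detected mechanically. The sketch below is standalone and does not talk to an RPC: it takes (context slot, clock slot) pairs such as the ones printed by the reproducer and reports every context slot that was seen with more than one clock slot. The function name inconsistent_context_slots is made up for this example.

```rust
use std::collections::HashMap;

/// Given (context_slot, clock_slot) pairs, returns the context slots for which
/// more than one distinct clock slot was observed, in order of first detection.
fn inconsistent_context_slots(observations: &[(u64, u64)]) -> Vec<u64> {
    // First clock slot seen for each context slot.
    let mut first_seen: HashMap<u64, u64> = HashMap::new();
    let mut inconsistent: Vec<u64> = Vec::new();
    for &(context_slot, clock_slot) in observations {
        match first_seen.get(&context_slot) {
            Some(&first) if first != clock_slot => {
                if !inconsistent.contains(&context_slot) {
                    inconsistent.push(context_slot);
                }
            }
            Some(_) => {}
            None => {
                first_seen.insert(context_slot, clock_slot);
            }
        }
    }
    inconsistent
}

fn main() {
    // The two responses mentioned in the text.
    let observations: [(u64, u64); 2] = [
        (318679959, 318679960),
        (318679959, 318679961),
    ];
    for slot in inconsistent_context_slots(&observations) {
        println!("context slot {} was returned with different clock data", slot);
    }
}
```

At finalized commitment a check like this should never report anything if the context slot really identifies the state that was read.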
I can reproduce this behavior against internal RPC nodes at Chorus One running Agave 2.0.21 and 2.1.12, and against api.mainnet-beta.solana.com which at this time returns context.apiVersion: 2.0.21.
When using getAccountInfo or getMultipleAccounts to fetch the clock sysvar, the slot returned in the RPC context and the slot in the clock sysvar should be the same.
I skimmed through the implementation of those two calls and they are using the same bank for reading the account and providing the context slot, and getting the account eventually calls bank.get_account, which looks correct to me if that method returns the account as it was at the bank’s slot. That method has some comments that worry me though:
I dove into the accounts db code, but at this point it’s getting more subtle than what I can investigate in one evening. Probably somebody who is more familiar with the accounts db code can diagnose this faster.
After further testing, even getMultipleAccounts does not atomically fetch multiple accounts. Here’s a repro:
use solana_sdk::commitment_config::CommitmentConfig;
use solana_sdk::sysvar;
use solana_client::rpc_client::RpcClient;

fn main() {
    let client = RpcClient::new("https://api.mainnet-beta.solana.com");
    let commitment = CommitmentConfig::finalized();
    // Request the same account 100 times in a single call.
    let addrs = vec![sysvar::clock::ID; 100];
    loop {
        let response = client
            .get_multiple_accounts_with_commitment(&addrs[..], commitment)
            .expect("Failed to fetch clock accounts from RPC")
            .value;
        for account in &response[1..] {
            assert_eq!(response[0], *account);
        }
        println!("ok");
    }
}
Output:
ok
ok
ok
ok
ok
ok
ok
ok
ok
ok
thread 'main' panicked at src/main.rs:19:13:
assertion `left == right` failed
left: Some(Account { lamports: 1169280, data.len: 40, owner: Sysvar1111111111111111111111111111111111111, executable: false, rent_epoch: 18446744073709551615, data: ad950413000000001170a46700000000e202000000000000e3020000000000005df5a56700000000 })
right: Some(Account { lamports: 1169280, data.len: 40, owner: Sysvar1111111111111111111111111111111111111, executable: false, rent_epoch: 18446744073709551615, data: ae950413000000001170a46700000000e202000000000000e3020000000000005df5a56700000000 })
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
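To make the difference between the two accounts in the panic message easier to read, here is a small standalone decoder for the data field. It is a sketch that assumes the sysvar data is the bincode encoding of the Clock struct, that is, five little-endian 64-bit fields in declaration order; the local Clock struct and the helper functions are written for this example and do not come from the Solana SDK.

```rust
/// Local mirror of the clock sysvar fields, in the order they are serialized.
#[derive(Debug, PartialEq)]
struct Clock {
    slot: u64,
    epoch_start_timestamp: i64,
    epoch: u64,
    leader_schedule_epoch: u64,
    unix_timestamp: i64,
}

/// Account data copied from the panic output above.
const LEFT: &str =
    "ad950413000000001170a46700000000e202000000000000e3020000000000005df5a56700000000";
const RIGHT: &str =
    "ae950413000000001170a46700000000e202000000000000e3020000000000005df5a56700000000";

/// Decodes a hex string into bytes; returns None on odd length or invalid digits.
fn decode_hex(s: &str) -> Option<Vec<u8>> {
    if s.len() % 2 != 0 {
        return None;
    }
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(s.get(i..i + 2)?, 16).ok())
        .collect()
}

/// Parses 40 bytes of clock sysvar data; returns None if the length is wrong.
fn parse_clock(data: &[u8]) -> Option<Clock> {
    if data.len() != 40 {
        return None;
    }
    // Copies the i-th 8-byte word out of the buffer.
    let word = |i: usize| -> [u8; 8] {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(&data[i * 8..i * 8 + 8]);
        buf
    };
    Some(Clock {
        slot: u64::from_le_bytes(word(0)),
        epoch_start_timestamp: i64::from_le_bytes(word(1)),
        epoch: u64::from_le_bytes(word(2)),
        leader_schedule_epoch: u64::from_le_bytes(word(3)),
        unix_timestamp: i64::from_le_bytes(word(4)),
    })
}

fn main() {
    let left = parse_clock(&decode_hex(LEFT).expect("valid hex")).expect("40 bytes");
    let right = parse_clock(&decode_hex(RIGHT).expect("valid hex")).expect("40 bytes");
    println!("left:  {:?}", left);
    println!("right: {:?}", right);
    println!("slot difference: {}", right.slot - left.slot);
}
```

Under that layout assumption, the two accounts differ only in the first field, the slot, which differs by one. In other words, a single getMultipleAccounts call returned the clock sysvar as of two consecutive slots.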
ruuda changed the title from "RPC context slot does not match the actual observed slot" to "RPC context slot does not match the actual observed slot, getMultipleAccounts is not atomic" on Feb 7, 2025
This looks like a pretty serious issue to me, because it means that indexers and other tooling that were trusting the RPC’s context slot may all have bogus data. If the problem is in the accounts db, then there might be correctness issues in other places too. Update: The problem also surfaces as getMultipleAccounts returning different data for the same account in a single call.
The bank.get_account comments mentioned above are in agave/runtime/src/bank.rs, lines 4967 to 4984 in bd6e9f9.