feat(state-sync): sync to the current epoch instead of the previous #12102

Merged Oct 26, 2024

Changes from all commits (75 commits, all by marcelo-gonzalez):
ad4d1d1  rename epoch_tail_hash -> sync_hash  (Sep 11, 2024)
76f9350  change var names  (Sep 13, 2024)
a445393  organize catchup_state_syncs value into CatchupState  (Sep 13, 2024)
174b62f  use hashmap entry explicitly so i can use question mark  (Sep 13, 2024)
a59e594  rename sync_hash -> epoch_first_block  (Sep 16, 2024)
f8f6fb6  state sync to the current epoch during catchup  (Sep 17, 2024)
f4eba89  move get_epoch_start_sync_hash to Chain  (Sep 17, 2024)
53e933c  dump state for the current epoch  (Sep 17, 2024)
5c9055f  get rid of store validator check  (Sep 17, 2024)
281c156  add TODOs  (Sep 17, 2024)
386df33  remove test_mock_node_basic  (Sep 18, 2024)
df90eb0  Merge remote-tracking branch 'origin/master' into state-sync-epoch  (Sep 24, 2024)
2eda311  make it a nightly protocol feature  (Sep 24, 2024)
5900d8c  comments  (Sep 24, 2024)
8dea6ad  map epoch ID to NewChunkTracker  (Sep 24, 2024)
78610c7  remove set_state_sync_hash  (Sep 24, 2024)
e5d428e  add TODO  (Sep 24, 2024)
2379aec  rename sync_header -> next_header  (Sep 24, 2024)
5e2f654  add integration test  (Oct 1, 2024)
4e9195a  update the StateSyncInfo struct and make it a DB migration  (Oct 3, 2024)
78d6a75  update comment :)  (Oct 3, 2024)
f07977a  Merge remote-tracking branch 'origin/master' into state-sync-epoch  (Oct 3, 2024)
135921d  remove comment  (Oct 3, 2024)
15d707b  fix test  (Oct 3, 2024)
776de8f  nit  (Oct 3, 2024)
9ae019a  move find_sync_hash() to Client so we can call it in tests  (Oct 11, 2024)
e4649ec  refactor notify_start_sync calculation  (Oct 15, 2024)
7f42d30  sync the new way when state syncing due to being behind the chain rat…  (Oct 15, 2024)
abb174d  fix check_sync_hash_validity()  (Oct 15, 2024)
e8e9f35  fix test_catchup_gas_price_change  (Oct 15, 2024)
b45cff1  fix test_dump_epoch_missing_chunk_in_last_block  (Oct 15, 2024)
f6d860f  fix test_state_sync_headers  (Oct 16, 2024)
8c58ee5  fix test_state_sync_headers_no_tracked_shards  (Oct 16, 2024)
6fa6c3c  fix test_sync_and_call_cached_contract  (Oct 16, 2024)
7be8679  fix test_two_deployments  (Oct 16, 2024)
80174d5  fix test_sync_after_delete_account  (Oct 16, 2024)
97d3abc  fix test_state_sync_w_dumped_parts  (Oct 18, 2024)
3566cd0  use ProtocolFeature::enabled()  (Oct 19, 2024)
105257e  delete comment  (Oct 21, 2024)
c008b90  don't pass block header to should_make_or_delete_snapshot()  (Oct 21, 2024)
89bc547  version the state sync struct by enum instead of u32 field  (Oct 21, 2024)
1b81f8c  remove first block hash field of add_state_sync_infos again  (Oct 21, 2024)
1c10b15  use ShardId in state_downloads field  (Oct 21, 2024)
18d3623  rename block -> epoch_first_block in get_state_sync_info()  (Oct 21, 2024)
eb88ad3  comments  (Oct 21, 2024)
7be1c8a  simplify get_epoch_start_sync_hash()  (Oct 21, 2024)
7902bb7  make get_current_epoch_sync_hash() take a block hash argument like ge…  (Oct 21, 2024)
7dbf812  consolidate calls to get_current_epoch_sync_hash() and get_epoch_star…  (Oct 21, 2024)
741862b  rename get_epoch_start_sync_hash -> get_previous_epoch_sync_hash  (Oct 21, 2024)
3176a89  move find_sync_hash() back to the ClientActor  (Oct 21, 2024)
6c645ed  implement Store::iter_ser() and call it in the migration  (Oct 22, 2024)
f96f762  refactor: extract get_catchup_sync_hash() from run_catchup()  (Oct 22, 2024)
0628634  recator: use or_insert_with()  (Oct 22, 2024)
7d903ec  add another enum variant to StateSyncInfo for the new version  (Oct 22, 2024)
ba3397a  rename StateSyncInfo::block_hash() -> StateSyncInfo::epoch_first_block()  (Oct 22, 2024)
78497d4  add comments  (Oct 22, 2024)
98b6f6d  match nit  (Oct 22, 2024)
f4d4866  rename block -> prev_block  (Oct 22, 2024)
dc8e3db  save prev_prev_hash variable  (Oct 22, 2024)
0f79f7f  FIXME -> TODO  (Oct 22, 2024)
c48db12  match nit  (Oct 22, 2024)
3c575f7  Merge remote-tracking branch 'origin/master' into state-sync-epoch  (Oct 23, 2024)
720e2c5  nits  (Oct 23, 2024)
920156a  add TODO  (Oct 23, 2024)
ded0a8e  Merge remote-tracking branch 'origin/master' into state-sync-epoch  (Oct 23, 2024)
db3e2cb  migrate test_state_request to test loop  (Oct 24, 2024)
9991c89  return None in get_sync_hash() for the genesis block  (Oct 24, 2024)
99f98a5  Don't call check_sync_hash_validity() for the genesis block in test_s…  (Oct 24, 2024)
66f30f1  fix test_sync_hash_validity for nightly builds  (Oct 24, 2024)
902e90a  rm unused uses  (Oct 24, 2024)
752f9e8  fix test_process_block_after_state_sync  (Oct 25, 2024)
598f49c  don't return a sync hash for the first epoch in the old state sync (p…  (Oct 26, 2024)
5133b31  fix state_parts_dump_check.py  (Oct 26, 2024)
0572ada  Merge remote-tracking branch 'origin/master' into state-sync-epoch  (Oct 26, 2024)
286693b  fix test_sync_hash_validity again  (Oct 26, 2024)
Files changed:
chain/chain/src/block_processing_utils.rs (11 additions, 0 deletions)

@@ -17,6 +17,17 @@ pub(crate) const MAX_PROCESSING_BLOCKS: usize = 5;

/// Contains information from preprocessing a block
pub(crate) struct BlockPreprocessInfo {
/// This field has two related but actually different meanings. For the first block of an
/// epoch, this will be set to false if we need to download state for shards we'll track in
/// the future but don't track currently. This implies the first meaning, which is that if
/// this is true, then we are ready to apply all chunks and update flat state for shards
/// we'll track in this and the next epoch. This comes into play when we decide what ApplyChunksMode
/// to pass to Chain::apply_chunks_preprocessing().
/// The other meaning is that the catchup code should process this block. When the state sync sync_hash
/// is the first block of the epoch, these two meanings are the same. But if the sync_hash is moved forward
/// in order to sync the current epoch's state instead of last epoch's, this field being false no longer implies
/// that we want to apply this block during catchup, so some care is needed to ensure we start catchup at the right
/// point in Client::run_catchup()
pub(crate) is_caught_up: bool,
Review comment (Contributor):

is it possible to split this field into multiple fields (or enum) to differentiate these meanings? it feels like the field being false indicates both we want to apply the chunks and not apply the chunks based on other state such as sync_hash.

Reply (marcelo-gonzalez, author):

I think actually in both cases if this field is false, we don't want to apply the chunks for shards we don't currently track, and this logic should be the same:

fn get_should_apply_chunk(

I think we probably could split it, but it's a little bit tricky. Let me think about it actually... For now in this PR it is kept as is to not have to touch too many things and possibly break something. the tricky part is that right now we add the first block of the epoch to the BlocksToCatchup column based on this field, which is then read to see if we'll need to catch up the next block after this one as well:

Ok((self.prev_block_is_caught_up(&prev_prev_hash, &prev_hash)?, None))

I guess where that is called maybe we can just call get_state_sync_info() again, and also check if catchup is already done, but it requires some care

pub(crate) state_sync_info: Option<StateSyncInfo>,
pub(crate) incoming_receipts: HashMap<ShardId, Vec<ReceiptProof>>,
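The thread above leaves `is_caught_up` as a single bool with two related meanings. If it were split the way the reviewer suggests, one hypothetical shape (purely illustrative, not part of this PR; the type and variant names are made up) could be:

```rust
// Hypothetical alternative to the `is_caught_up: bool` field discussed above.
// Variant names and exact semantics are illustrative only.
pub(crate) enum CaughtUpStatus {
    /// All shards tracked in this and the next epoch can be applied now,
    /// so every chunk can be processed when this block is applied.
    CaughtUp,
    /// State for some shards we will track next epoch is still being
    /// downloaded; the catchup code must revisit this block later.
    NeedsCatchup,
}
```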
chain/chain/src/chain.rs (186 additions, 76 deletions)

Large diffs are not rendered by default.

chain/chain/src/garbage_collection.rs (1 addition, 0 deletions)

@@ -318,6 +318,7 @@ impl ChainStore {
let header = self.get_block_header(&sync_hash)?;
let prev_hash = *header.prev_hash();
let sync_height = header.height();
// TODO(current_epoch_state_sync): fix this when syncing to the current epoch's state
// The congestion control added a dependency on the prev block when
// applying chunks in a block. This means that we need to keep the
// blocks at sync hash, prev hash and prev prev hash. The heigh of that
chain/chain/src/store/mod.rs (1 addition, 1 deletion)

@@ -2561,7 +2561,7 @@ impl<'a> ChainStoreUpdate<'a> {
for state_sync_info in self.add_state_sync_infos.drain(..) {
store_update.set_ser(
DBCol::StateDlInfos,
state_sync_info.epoch_tail_hash.as_ref(),
state_sync_info.epoch_first_block().as_ref(),
&state_sync_info,
)?;
}
chain/chain/src/store_validator/validate.rs (2 additions, 2 deletions)

@@ -739,8 +739,8 @@ pub(crate) fn state_sync_info_valid(
state_sync_info: &StateSyncInfo,
) -> Result<(), StoreValidatorError> {
check_discrepancy!(
state_sync_info.epoch_tail_hash,
*block_hash,
state_sync_info.epoch_first_block(),
block_hash,
"Invalid StateSyncInfo stored"
);
Ok(())
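The two hunks above swap the old `epoch_tail_hash` field for accessor methods on a now-versioned `StateSyncInfo` (the client.rs diff below imports `StateSyncInfo` and `StateSyncInfoV1` from `near_primitives::sharding`). The sketch below is a rough reconstruction of that type's shape as it can be inferred from this page; struct layout and field details beyond what the diff shows are assumptions, not the PR's actual definition.

```rust
use near_primitives::hash::CryptoHash;
use near_primitives::types::ShardId;

/// Simplified stand-in for the per-shard entry; in the diff below its `.0`
/// element is the ShardId (see `shards().iter().map(|tuple| tuple.0)`).
pub struct ShardInfo(pub ShardId, pub CryptoHash);

/// Versioned state sync record stored under DBCol::StateDlInfos, keyed by the
/// epoch's first block hash.
pub enum StateSyncInfo {
    /// Old scheme: the sync hash is the epoch's first block, so it is known as
    /// soon as the record is created (state is synced from the previous epoch).
    V0(StateSyncInfoV0),
    /// New scheme: the sync hash points further into the current epoch and may
    /// not be known yet when the record is first written.
    V1(StateSyncInfoV1),
}

pub struct StateSyncInfoV0 {
    pub sync_hash: CryptoHash,
    pub shards: Vec<ShardInfo>,
}

pub struct StateSyncInfoV1 {
    pub epoch_first_block: CryptoHash,
    /// Filled in later by the client once a suitable sync hash is found.
    pub sync_hash: Option<CryptoHash>,
    pub shards: Vec<ShardInfo>,
}

impl StateSyncInfo {
    /// First block of the epoch this record belongs to (also the DB key).
    pub fn epoch_first_block(&self) -> &CryptoHash {
        match self {
            StateSyncInfo::V0(info) => &info.sync_hash,
            StateSyncInfo::V1(info) => &info.epoch_first_block,
        }
    }

    pub fn shards(&self) -> &[ShardInfo] {
        match self {
            StateSyncInfo::V0(info) => &info.shards,
            StateSyncInfo::V1(info) => &info.shards,
        }
    }
}
```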
chain/client/src/client.rs (194 additions, 21 deletions)

@@ -62,7 +62,8 @@ use near_primitives::merkle::{merklize, MerklePath, PartialMerkleTree};
use near_primitives::network::PeerId;
use near_primitives::receipt::Receipt;
use near_primitives::sharding::{
EncodedShardChunk, PartialEncodedChunk, ShardChunk, ShardChunkHeader,
EncodedShardChunk, PartialEncodedChunk, ShardChunk, ShardChunkHeader, StateSyncInfo,
StateSyncInfoV1,
};
use near_primitives::transaction::SignedTransaction;
use near_primitives::types::chunk_extra::ChunkExtra;
@@ -104,6 +105,17 @@ pub enum AdvProduceBlocksMode {
OnlyValid,
}

/// The state associated with downloading state for a shard this node will track in the
/// future but does not currently.
pub struct CatchupState {
/// Manages downloading the state.
pub state_sync: StateSync,
/// Keeps track of state downloads, and gets passed to `state_sync`.
pub sync_status: StateSyncStatus,
/// Manages going back to apply chunks after state has been downloaded.
pub catchup: BlocksCatchUpState,
}

pub struct Client {
/// Adversarial controls - should be enabled only to test disruptive
/// behaviour on chain.
@@ -139,9 +151,12 @@ pub struct Client {
/// Approvals for which we do not have the block yet
pub pending_approvals:
lru::LruCache<ApprovalInner, HashMap<AccountId, (Approval, ApprovalType)>>,
/// A mapping from an epoch that we know needs to be state synced for some shards
/// to a tracker that will find an appropriate sync_hash for state sync to that epoch
catchup_tracker: HashMap<EpochId, NewChunkTracker>,
Review comment (Contributor):

nit: new_chunk_trackers or similar
Out of curiosity, is there ever more than one epoch in that map? (same question for catchup_state_syncs)

Reply (marcelo-gonzalez, author):

You know, that is a good question... It seems this mapping from epoch ID to catchup status has been around since state sync was first implemented, and looking at the code it doesn't seem to me like it should ever have more than one... When we add one, it's because we see a block in a new epoch T, and we know we're going to be a chunk producer for T+1 in a new shard. The epoch ID for T+1 is the hash of the last block in epoch T-1, so if there's a fork at the beginning of the epoch, it will still have the same epoch ID. So that should mean that for any given epoch height, we only have one of these. And then if we ask if it's possible to have epoch T and T+1 in there at the same time, we can look at the fact that we remove the epoch info for epoch T in finish_catchup_blocks() when we call remove_state_sync_info(). If that hasn't happened by the time we get to the first block of epoch T+1, we will not add another catchup state keyed by the epoch ID of epoch T+1, because we'll call that block an orphan until we remove the last block of epoch T from the BlocksToCatchup column in finish_catchup_blocks() .

So idk how it is even possible to have two, but maybe I'm missing something and that logic was put there for a reason?

But now that I think about it, what happens if we apply the first block of a new epoch and then save StateSyncInfo for it, which on the DB is keyed by the hash of that first block, and then that block doesn't end up on the main chain, because there's a fork off the last block of epoch T-1? Is there some reason that won't happen for the first block of an epoch? I'm not sure if there is, but if so, then there's an implicit assumption here that at least should have gotten a comment explaining, and if not, there's a potential bug here. I think in general maybe we should only be working with final blocks when doing all this state sync stuff. If it is a bug, then for now it's a pretty unlikely one I guess, but it's worth investigating more. Shouldn't be too too hard to try causing a fork at the first block of an epoch in a testloop test and seeing what happens

Follow-up (marcelo-gonzalez, author):

ehh it's possible that last part is not a bug actually, since there's care put into only removing the particular sync hash from the BlocksToCatchup here, but it's a bit confusing and worth sanity checking

/// A mapping from a block for which a state sync is underway for the next epoch, and the object
/// storing the current status of the state sync and blocks catch up
pub catchup_state_syncs: HashMap<CryptoHash, (StateSync, StateSyncStatus, BlocksCatchUpState)>,
pub catchup_state_syncs: HashMap<CryptoHash, CatchupState>,
/// Keeps track of information needed to perform the initial Epoch Sync
pub epoch_sync: EpochSync,
/// Keeps track of syncing headers.
@@ -224,6 +239,102 @@ pub struct ProduceChunkResult {
pub transactions_storage_proof: Option<PartialState>,
}

/// This keeps track of the number of new chunks seen in each shard since the block that was passed to new()
/// This whole thing could be replaced with a much simpler function that just computes the number of new chunks
/// in each shard from scratch every time we call it, but in the unlikely and unfortunate case where a shard
/// hasn't had any chunks for a very long time, it would end up being a nontrivial inefficiency to do that
/// every time run_catchup() is called
pub struct NewChunkTracker {
last_checked_hash: CryptoHash,
last_checked_height: BlockHeight,
num_new_chunks: HashMap<ShardId, usize>,
sync_hash: Option<CryptoHash>,
}

impl NewChunkTracker {
fn new(
first_block_hash: CryptoHash,
first_block_height: BlockHeight,
shard_ids: &[ShardId],
) -> Self {
Self {
last_checked_hash: first_block_hash,
last_checked_height: first_block_height,
num_new_chunks: shard_ids.iter().map(|shard_id| (*shard_id, 0)).collect(),
sync_hash: None,
}
}

// TODO(current_epoch_sync_hash): refactor this and use the same logic in get_current_epoch_sync_hash(). Ideally
// that function should go away (at least as it is now) in favor of a more efficient approach that we can call on
// every new block application
fn record_new_chunks(
&mut self,
epoch_manager: &dyn EpochManagerAdapter,
header: &BlockHeader,
) -> Result<bool, Error> {
let shard_layout = epoch_manager.get_shard_layout(header.epoch_id())?;

let mut done = true;
for (shard_id, num_new_chunks) in self.num_new_chunks.iter_mut() {
let shard_index = shard_layout.get_shard_index(*shard_id);
let Some(included) = header.chunk_mask().get(shard_index) else {
return Err(Error::Other(format!(
"can't get shard {} in chunk mask for block {}",
shard_id,
header.hash()
)));
};
if *included {
*num_new_chunks += 1;
}
if *num_new_chunks < 2 {
done = false;
}
}
self.last_checked_hash = *header.hash();
self.last_checked_height = header.height();
Ok(done)
}

fn find_sync_hash(
&mut self,
chain: &Chain,
epoch_manager: &dyn EpochManagerAdapter,
) -> Result<Option<CryptoHash>, Error> {
if let Some(sync_hash) = self.sync_hash {
return Ok(Some(sync_hash));
}

let final_head = chain.final_head()?;

while self.last_checked_height < final_head.height {
let next_hash = match chain.chain_store().get_next_block_hash(&self.last_checked_hash) {
Ok(h) => h,
Err(near_chain_primitives::Error::DBNotFoundErr(_)) => {
return Err(Error::Other(format!(
"final head is #{} {} but get_next_block_hash(#{} {}) is not found",
final_head.height,
final_head.last_block_hash,
self.last_checked_height,
&self.last_checked_hash
)));
}
Err(e) => return Err(e.into()),
};
let next_header = chain.get_block_header(&next_hash)?;
let done = self.record_new_chunks(epoch_manager, &next_header)?;
if done {
// TODO(current_epoch_state_sync): check to make sure the epoch IDs are the same. If there are no new chunks in some shard in the epoch,
// this will be for an epoch ahead of this one
self.sync_hash = Some(next_hash);
break;
}
}
Ok(self.sync_hash)
}
}

impl Client {
pub fn new(
clock: Clock,
@@ -371,6 +482,7 @@ impl Client {
pending_approvals: lru::LruCache::new(
NonZeroUsize::new(num_block_producer_seats).unwrap(),
),
catchup_tracker: HashMap::new(),
catchup_state_syncs: HashMap::new(),
epoch_sync,
header_sync,
@@ -2458,6 +2570,57 @@ impl Client {
Ok(false)
}

/// Find the sync hash. Most of the time it will already be set in `state_sync_info`. If not, try to find it,
/// and set the corresponding field in `state_sync_info`.
fn get_catchup_sync_hash_v1(
&mut self,
state_sync_info: &mut StateSyncInfoV1,
epoch_first_block: &BlockHeader,
) -> Result<Option<CryptoHash>, Error> {
if state_sync_info.sync_hash.is_some() {
return Ok(state_sync_info.sync_hash);
}

let new_chunk_tracker = match self.catchup_tracker.entry(*epoch_first_block.epoch_id()) {
std::collections::hash_map::Entry::Occupied(e) => e.into_mut(),
std::collections::hash_map::Entry::Vacant(e) => {
let shard_ids = self.epoch_manager.shard_ids(epoch_first_block.epoch_id())?;
e.insert(NewChunkTracker::new(
*epoch_first_block.hash(),
epoch_first_block.height(),
&shard_ids,
))
}
};

if let Some(sync_hash) =
new_chunk_tracker.find_sync_hash(&self.chain, self.epoch_manager.as_ref())?
{
state_sync_info.sync_hash = Some(sync_hash);
let mut update = self.chain.mut_chain_store().store_update();
// note that iterate_state_sync_infos() collects everything into a Vec, so we're not
// actually writing to the DB while actively iterating this column
update.add_state_sync_info(StateSyncInfo::V1(state_sync_info.clone()));
// TODO: would be nice to be able to propagate context up the call stack so we can just log
// once at the top with all the info. Otherwise this error will look very cryptic
update.commit()?;
}
Ok(state_sync_info.sync_hash)
}

/// Find the sync hash. If syncing to the old epoch's state, it's always set. If syncing to
/// the current epoch's state, it might not yet be known, in which case we try to find it.
fn get_catchup_sync_hash(
&mut self,
state_sync_info: &mut StateSyncInfo,
epoch_first_block: &BlockHeader,
) -> Result<Option<CryptoHash>, Error> {
match state_sync_info {
StateSyncInfo::V0(info) => Ok(Some(info.sync_hash)),
StateSyncInfo::V1(info) => self.get_catchup_sync_hash_v1(info, epoch_first_block),
}
}

/// Walks through all the ongoing state syncs for future epochs and processes them
pub fn run_catchup(
&mut self,
@@ -2469,17 +2632,27 @@
let _span = debug_span!(target: "sync", "run_catchup").entered();
let me = signer.as_ref().map(|x| x.validator_id().clone());

for (sync_hash, state_sync_info) in self.chain.chain_store().iterate_state_sync_infos()? {
assert_eq!(sync_hash, state_sync_info.epoch_tail_hash);
for (epoch_first_block, mut state_sync_info) in
Review comment (Contributor):

this method is too large - can you split it down into smaller methods please?

Reply (marcelo-gonzalez, author):

extracted a get_catchup_sync_hash() function in f96f762

self.chain.chain_store().iterate_state_sync_infos()?
{
assert_eq!(&epoch_first_block, state_sync_info.epoch_first_block());

let state_sync_timeout = self.config.state_sync_timeout;
let block_header = self.chain.get_block(&sync_hash)?.header().clone();
let block_header = self.chain.get_block(&epoch_first_block)?.header().clone();
let epoch_id = block_header.epoch_id();

let (state_sync, status, blocks_catch_up_state) =
self.catchup_state_syncs.entry(sync_hash).or_insert_with(|| {
tracing::debug!(target: "client", ?sync_hash, "inserting new state sync");
(
StateSync::new(
let sync_hash = match self.get_catchup_sync_hash(&mut state_sync_info, &block_header)? {
Some(h) => h,
None => continue,
};

let CatchupState { state_sync, sync_status: status, catchup } = self
.catchup_state_syncs
.entry(sync_hash)
.or_insert_with(|| {
tracing::debug!(target: "client", ?epoch_first_block, ?sync_hash, "inserting new state sync");
CatchupState {
state_sync: StateSync::new(
self.clock.clone(),
self.runtime_adapter.store().clone(),
self.epoch_manager.clone(),
Expand All @@ -2492,21 +2665,20 @@ impl Client {
self.state_sync_future_spawner.clone(),
true,
),
StateSyncStatus {
sync_status: StateSyncStatus {
sync_hash,
sync_status: HashMap::new(),
download_tasks: Vec::new(),
computation_tasks: Vec::new(),
},
BlocksCatchUpState::new(sync_hash, *epoch_id),
)
catchup: BlocksCatchUpState::new(sync_hash, *epoch_id),
}
});

// For colour decorators to work, they need to printed directly. Otherwise the decorators get escaped, garble output and don't add colours.
debug!(target: "catchup", ?me, ?sync_hash, progress_per_shard = ?status.sync_status, "Catchup");

let tracking_shards: Vec<ShardId> =
state_sync_info.shards.iter().map(|tuple| tuple.0).collect();
state_sync_info.shards().iter().map(|tuple| tuple.0).collect();

// Initialize the new shard sync to contain the shards to split at
// first. It will get updated with the shard sync download status
@@ -2518,19 +2690,20 @@
self.chain.catchup_blocks_step(
&me,
&sync_hash,
blocks_catch_up_state,
catchup,
block_catch_up_task_scheduler,
)?;

if blocks_catch_up_state.is_finished() {
if catchup.is_finished() {
let mut block_processing_artifacts = BlockProcessingArtifact::default();

self.chain.finish_catchup_blocks(
&me,
&epoch_first_block,
&sync_hash,
&mut block_processing_artifacts,
apply_chunks_done_sender.clone(),
&blocks_catch_up_state.done_blocks,
&catchup.done_blocks,
)?;

self.process_block_processing_artifact(block_processing_artifacts, &signer);
@@ -2716,11 +2889,11 @@
impl Client {
pub fn get_catchup_status(&self) -> Result<Vec<CatchupStatusView>, near_chain::Error> {
let mut ret = vec![];
for (sync_hash, (_, shard_sync_state, block_catchup_state)) in
for (sync_hash, CatchupState { sync_status, catchup, .. }) in
self.catchup_state_syncs.iter()
{
let sync_block_height = self.chain.get_block_header(sync_hash)?.height();
let shard_sync_status: HashMap<_, _> = shard_sync_state
let shard_sync_status: HashMap<_, _> = sync_status
.sync_status
.iter()
.map(|(shard_id, state)| (*shard_id, state.to_string()))
Expand All @@ -2729,7 +2902,7 @@ impl Client {
sync_block_hash: *sync_hash,
sync_block_height,
shard_sync_status,
blocks_to_catchup: self.chain.get_block_catchup_status(block_catchup_state),
blocks_to_catchup: self.chain.get_block_catchup_status(catchup),
});
}
Ok(ret)
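Taken together, the client.rs changes above replace the old fixed sync hash (the epoch's first block, which points at the previous epoch's state) with one found on the fly: for a V1 record, `NewChunkTracker` walks forward from the epoch's first block and picks the first final block by which every shard has accumulated at least two new chunks in the epoch. As a summary only, here is a minimal standalone restatement of that selection rule; it is a hypothetical helper that ignores finality checks and shard-layout mapping, not the PR's code.

```rust
/// Simplified restatement of the rule implemented by NewChunkTracker above:
/// walk forward through the blocks after the epoch's first block, counting new
/// chunks per shard via each block's chunk mask, and return the first block
/// hash at which every shard has seen at least two new chunks.
/// `H` stands in for a block hash type such as CryptoHash.
fn pick_sync_hash<H: Copy>(blocks_after_epoch_start: &[(H, Vec<bool>)]) -> Option<H> {
    let num_shards = blocks_after_epoch_start.first()?.1.len();
    let mut new_chunks = vec![0usize; num_shards];
    for (hash, chunk_mask) in blocks_after_epoch_start {
        for (shard_index, included) in chunk_mask.iter().enumerate() {
            if *included {
                new_chunks[shard_index] += 1;
            }
        }
        if new_chunks.iter().all(|count| *count >= 2) {
            return Some(*hash);
        }
    }
    None
}
```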