Fast find ancestor #4397

jackzhhuang · 2025-02-14T02:26:37Z

Pull request type

Please check the type of change your PR introduces:

What is the current behavior?

Issue Number: N/A

What is the new behavior?

Other information

Summary by CodeRabbit

New Features
- Introduced range locate capabilities for blockchain queries, allowing users to request specific block ranges using start and optional end identifiers.
- Extended the RPC interface and chain services to support new range-based query operations.
- Added a new configuration option for enabling range locate in synchronization tasks.
Tests
- Enhanced test coverage with new tests verifying range locating, common header discovery, and chain ancestry validation.
Chores
- Integrated a new storage dependency to underpin the range locate functionality.

coderabbitai · 2025-02-14T02:26:47Z

Walkthrough

This pull request adds a new range location feature through multiple modules. A dependency on starcoin-storage is introduced, and a new public module range_locate is added to the API. Enum variants for range-based queries are integrated into request/response types, and corresponding functions to locate block ranges and common ancestors are implemented. New methods in chain services, sync tasks, and network RPC allow clients to query ranges. Additionally, the sync configuration is enhanced and tests are updated to validate the new functionalities.

Changes

File(s)	Change Summary
chain/api/*	- Added dependency in `Cargo.toml` (`starcoin-storage` with workspace flag). - Introduced public module `range_locate` in `lib.rs`. - Extended `ChainRequest` and `ChainResponse` enums with `GetRangeInLocation` variants in `message.rs`. - Added range location functions in `range_locate.rs` and new methods in `service.rs`.
chain/mock/src/mock_chain.rs	- Added methods `new_with_genesis_for_test` and `connect` to the `MockChain` struct. - Updated imports for `ExecutedBlock`.
chain/tests/test_range_locate.rs	- Added helper function `create_block` and tests: `test_range_locate`, `test_not_in_range_locate`, and `test_same_range_request`.
config/src/sync_config.rs	- Introduced new optional field `range_locate` in `SyncConfig` with associated getter and merge logic.
flexidag/src/blockdag.rs	- Added `check_ancestor_of_chain` method to verify chain ancestry.
network-rpc/api/src/lib.rs network-rpc/src/rpc.rs	- Added new RPC method `get_range_in_location` and associated data structures (`GetRangeInLocationRequest`, `RangeInLocation`, `GetRangeInLocationResponse`).
sync/*	- Updated `SyncService` to include a `range_locate` parameter in `check_and_start_sync`. - In tasks: added `FindRangeLocateTask`, `DagAncestorCollector`, implemented `BlockIdRangeFetcher` in mock and VerifiedRpcClient, and modified `full_sync_task` and related tests to pass the range locate flag.

Sequence Diagram(s)

sequenceDiagram
    participant C as Client
    participant NR as NetworkRpcImpl
    participant CRS as ChainReaderService
    participant RL as RangeLocate Module

    C->>NR: get_range_in_location(req)
    NR->>CRS: Forward GetRangeInLocation request
    CRS->>RL: Call get_range_in_location(start_id, end_id)
    RL-->>CRS: Return RangeInLocation result
    CRS-->>NR: Return ChainResponse::GetRangeInLocation
    NR-->>C: Return GetRangeInLocationResponse

sequenceDiagram
    participant SS as SyncService
    participant FST as full_sync_task
    participant GAT as generate_ancestor_task
    participant AT as Ancestor Task (RangeLocate/Standard)

    SS->>FST: start sync (with range_locate flag)
    FST->>GAT: generate ancestor task based on flag
    alt range locate enabled
      GAT->>AT: Create FindRangeLocateTask
    else
      GAT->>AT: Create FindAncestorTask
    end
    AT-->>GAT: Return ancestor result
    GAT-->>FST: Pass result
    FST-->>SS: Sync process complete

Suggested reviewers

nkysg
sanlee42
welbon

Poem

Oh, how my paws tap on code so neat,
In fields of ranges, where data and blocks meet.
I nibble on queries, swift and bright,
Hop along sync paths, through day and night.
With a twitch of my ear, I celebrate this feat,
Code carrots in abundance — truly a tasty treat!
(^_^)

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5fffd0c and 7781248.

📒 Files selected for processing (1)

sync/src/tasks/find_common_ancestor_task.rs (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

sync/src/tasks/find_common_ancestor_task.rs

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 5

🔭 Outside diff range comments (1)

sync/src/tasks/find_common_ancestor_task.rs (1)
179-233: ⚠️ Potential issue

Confirm reversed condition for checking a DAG connection.

In DagAncestorCollector::collect(), the code bails out when dag.has_block_connected(&block_header)? is true, reporting a failure. Typically, we would like to confirm the block is indeed in the DAG. Verify if this condition is reversed or if you intend to stop immediately upon finding a connected block as an error.
-if self.dag.has_block_connected(&block_header)? {
-    bail!(
-        "failed to check the found common ancestor in dag, id: {:?}",
-        item
-    );
-}
+if !self.dag.has_block_connected(&block_header)? {
+    bail!("Common ancestor is not connected in DAG, id: {:?}", item);
+}

🧹 Nitpick comments (6)

chain/api/src/range_locate.rs (2)

12-25: Consider clarifying or unifying error naming across enums.

While FindCommonHeaderError::InvalidRange and FindCommonHeaderError::RangeLen both refer to invalid range-related problems, the naming is partially separated. You might unify them for clarity, e.g., using a single variant with a descriptive message for length constraints.

86-135: Review exponential indexing for large chains.

The for-loop uses powers of two (2u64.pow(index)) up to 17, which can skip large sections of the chain. This design might be intentional for acceleration, but confirm it does not unintentionally skip relevant blocks or exceed the chain range in certain scenarios.

sync/src/tasks/find_common_ancestor_task.rs (1)

20-46: Adjust type fields for clarity in FindRangeLocateTask.

While the fields start_id and end_id are clear individually, consider documenting or renaming them to convey how they're updated during range location (e.g., “current_start” or “search_boundary”) to improve readability for future maintainers.

sync/src/tasks/mod.rs (2)

45-46: Consider doc-comment for SyncFetcher's extension of BlockIdRangeFetcher.

SyncFetcher now inherits BlockIdRangeFetcher, expanding its scope. A doc-comment explaining when to prefer range-based vs. single-block fetching would clarify usage for new contributors.

239-246: Ensure consistent error handling in BlockIdRangeFetcher trait implementations.

All three implementations use similar error mapping. Consider centralizing or ensuring they align with your broader sync error-handling strategy (e.g., distinguishing network vs. invalid request errors).

Also applies to: 275-285, 312-324
chain/tests/test_range_locate.rs (1)
77-102: Consider removing debug println statements.

These debug prints should be replaced with proper logging or removed before merging.

Apply this diff to use proper logging:
-        println!(
-            "remote_start_id: {:?}, remote_end_id: {:?}",
-            remote_start_id, remote_end_id
-        );
+        debug!(
+            "remote_start_id: {:?}, remote_end_id: {:?}",
+            remote_start_id, remote_end_id
+        );
-            println!(
-                "result block id: {:?}, number: {:?}",
-                header.id(),
-                header.number()
-            );
+            debug!(
+                "result block id: {:?}, number: {:?}",
+                header.id(),
+                header.number()
+            );

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5c5275c and ebf1826.

⛔ Files ignored due to path filters (1)

Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (21)

chain/api/Cargo.toml (1 hunks)
chain/api/src/lib.rs (1 hunks)
chain/api/src/message.rs (3 hunks)
chain/api/src/range_locate.rs (1 hunks)
chain/api/src/service.rs (4 hunks)
chain/mock/src/mock_chain.rs (3 hunks)
chain/service/src/chain_service.rs (3 hunks)
chain/tests/test_range_locate.rs (1 hunks)
config/src/sync_config.rs (3 hunks)
flexidag/src/blockdag.rs (1 hunks)
flexidag/src/prune/pruning_point_manager.rs (1 hunks)
network-rpc/api/src/lib.rs (2 hunks)
network-rpc/src/rpc.rs (2 hunks)
sync/src/sync.rs (2 hunks)
sync/src/tasks/find_common_ancestor_task.rs (1 hunks)
sync/src/tasks/mock.rs (2 hunks)
sync/src/tasks/mod.rs (11 hunks)
sync/src/tasks/test_tools.rs (2 hunks)
sync/src/tasks/tests.rs (8 hunks)
sync/src/tasks/tests_dag.rs (1 hunks)
sync/src/verified_rpc_client.rs (2 hunks)

✅ Files skipped from review due to trivial changes (1)

chain/api/src/lib.rs

⏰ Context from checks skipped due to timeout of 90000ms (1)

GitHub Check: Generate code coverage report

🔇 Additional comments (25)

chain/api/src/range_locate.rs (1)

33-84: Validate correctness of partial range checks.

The function checks pairs of hashes and returns early if a pair is not reachable. Ensure that this early-return logic aligns with your intended behavior for multi-segment validation, especially in edge cases where a single unreachable pair still permits other segments to proceed.

sync/src/tasks/find_common_ancestor_task.rs (1)

49-134: Check loop termination condition in new_sub_task.

The loop continues re-fetching ranges until finding an intersection or hitting the genesis block. Ensure there's no infinite loop possibility if start_id becomes stuck in an unrecognized state, e.g., if the DAG or storage has an inconsistency.

sync/src/tasks/mod.rs (1)

728-728: Re-verify full_sync_task fallback flows when range_locate is false.

The new parameter is introduced but the fallback logic might still be relevant for certain chain types. Please confirm that switching between the range-locate approach and the classic approach does not cause confusion or partial usage in the future.

Also applies to: 762-775

config/src/sync_config.rs (2)

44-48: LGTM! Well-documented configuration option.

The new range_locate field is properly implemented with:

Clear documentation explaining the purpose and complexity

Appropriate serde attributes for serialization

Command-line argument support

64-66: LGTM! Clean getter implementation.

The getter method follows the established pattern in the codebase, providing a default value of false when the option is not set.

chain/api/src/message.rs (2)

76-79: LGTM! Well-structured request variant.

The GetRangeInLocation request variant is properly designed with:

Required start_id parameter

Optional end_id parameter for flexible range queries

112-112: LGTM! Matching response variant.

The response variant correctly encapsulates the range data using the RangeInPruningPoint type.

flexidag/src/prune/pruning_point_manager.rs (1)

110-110: LGTM! Simplified return logic.

The code change improves readability by directly returning previous_pruning_point when the selected parents match.

sync/src/tasks/tests_dag.rs (1)

53-53: LGTM! Test updated for range_locate feature.

The full_sync_task call is properly updated with the new range_locate parameter set to false, consistent with the default configuration.

chain/tests/test_range_locate.rs (3)

14-28: LGTM! Well-structured helper function.

The function is well-implemented with proper error handling and clear block creation logic.

145-171: LGTM! Good negative test case.

The test properly verifies the behavior when a block is not in the selected chain.

173-214: LGTM! Good edge case coverage.

The test effectively verifies the behavior when requesting the same range multiple times.

sync/src/tasks/test_tools.rs (1)

161-161: Document the purpose of the new boolean parameter.

The new parameter added to full_sync_task calls is set to false, but its purpose is not documented. Please add documentation explaining what this parameter controls.

Also applies to: 194-194

chain/mock/src/mock_chain.rs (2)

58-77: LGTM!

The function follows the established pattern of other constructor functions in the file and correctly initializes the chain with a specified genesis block and parameters.

349-352: LGTM!

The function is a simple and straightforward wrapper around the connect method of the head field.

chain/service/src/chain_service.rs (2)

264-268: LGTM!

The match arm correctly handles the GetRangeInLocation request by delegating to the inner service and wrapping the response appropriately.

491-497: LGTM!

The function correctly delegates to the range_locate::get_range_in_location function with the appropriate parameters.

chain/api/src/service.rs (1)

82-86: LGTM!

The trait method signature is consistent with other methods in the trait and uses appropriate parameter and return types.

sync/src/verified_rpc_client.rs (2)

18-18: LGTM!

The import statement is correctly placed and follows the file's import organization pattern.

898-943: LGTM! The implementation follows established patterns.

The fetch_range_locate method is well-implemented with:

Proper error handling

Consistent retry mechanism

Clear parameter validation

Follows the same pattern as other RPC methods in the file

flexidag/src/blockdag.rs (1)

207-218: LGTM! The implementation is clean and follows the codebase patterns.

The check_ancestor_of_chain method is well-implemented:

Proper error handling with map_err

Consistent with existing check_ancestor_of method

Clear parameter naming

sync/src/tasks/tests.rs (1)

131-131: LGTM! Test updates are consistent.

The range_locate parameter is consistently set to false across all test functions, maintaining the original test behavior while accommodating the new feature.

Also applies to: 167-167, 219-219, 273-273, 313-313, 368-368, 861-861, 1013-1013

sync/src/sync.rs (1)

461-461: LGTM! Configuration integration is clean.

The range_locate configuration is properly:

Retrieved from the sync config

Passed to the full_sync_task function

Maintains consistency with existing configuration pattern

Also applies to: 499-499
chain/api/Cargo.toml (2)

21-21: LGTM!

The addition of starcoin-storage dependency follows the project's conventions by using workspace-level version management.

34-34:
❓ Verification inconclusive

Verify if package version needs to be updated.

Since this PR adds a new feature for range location, consider if the package version needs to be updated according to semantic versioning principles.

🏁 Script executed:
#!/bin/bash
# Description: Check current version and recent version changes

# Check version in the main branch for comparison
git fetch origin main
git show origin/main:chain/api/Cargo.toml | grep "^version ="

# Check recent version changes
git log -p --grep="version" chain/api/Cargo.toml
Length of output: 166
Action Required: Verify and Possibly Update the Cargo.toml Version

The file chain/api/Cargo.toml still indicates version = "2.0.2", but since this PR introduces a new feature for range location, semantic versioning principles suggest that this new functionality could warrant a version bump (typically a minor update, such as moving to 2.1.0, if the change is backward-compatible).

Next Steps:

Remote Branch Issue: The initial check failed because the script targeted origin/main, which doesn’t exist in this repository. Please manually verify the version from the actual default branch (likely origin/master).

Version Bump Consideration: Once you confirm that the default branch also uses version 2.0.2, decide if the addition of the range location feature merits an update—typically bumping the minor version to 2.1.0 (assuming it’s a backward-compatible enhancement).

Please ensure these points are reviewed before merging.

sync/src/tasks/mod.rs

network-rpc/api/src/lib.rs

network-rpc/src/rpc.rs

sync/src/tasks/mock.rs

chain/api/src/service.rs

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (4)

chain/api/src/range_locate.rs (2)

20-25: Consider adding more descriptive error messages.
While the existing variants communicate error categories, users may benefit from clarifying the context or recommended actions in each message.

86-135: Consider externalizing the skip exponent.
The loop enumerates up to index 17, using 2u64.pow(index) to skip through blocks. This may be satisfactory for typical chain sizes, but if expansions or modifications are needed, externalizing this exponent limit (or providing config) can offer future flexibility.

network-rpc/api/src/lib.rs (1)

325-329: Potential duplication of existing enum.
RangeInLocation appears both here and in chain/api/src/range_locate.rs. If feasible, consider consolidating or referencing a common definition to reduce fragmentation.
sync/src/tasks/mod.rs (1)
651-710: Consider improving the function's readability.

The generate_ancestor_task function is quite complex with many parameters. Consider breaking it down into smaller functions or using a builder pattern.

Consider refactoring to use a builder pattern:
pub struct AncestorTaskBuilder<F> where F: SyncFetcher + 'static {
    range_locate: bool,
    current_block_number: u64,
    target_block_number: u64,
    fetcher: Arc<F>,
    max_retry_times: u64,
    delay_milliseconds_on_error: u64,
    current_block_info: BlockInfo,
    storage: Arc<dyn Store>,
    event_handle: Arc<TaskEventCounterHandle>,
    ext_error_handle: Arc<ExtSyncTaskErrorHandle<F>>,
    dag: BlockDAG,
}

impl<F> AncestorTaskBuilder<F> where F: SyncFetcher + 'static {
    pub fn new(fetcher: Arc<F>) -> Self {
        // Initialize with default values
    }
    
    pub fn range_locate(mut self, value: bool) -> Self {
        self.range_locate = value;
        self
    }
    
    // Add other builder methods...
    
    pub fn build(self) -> (
        std::pin::Pin<Box<dyn Future<Output = std::result::Result<BlockIdAndNumber, TaskError>> + Send>>,
        TaskHandle,
    ) {
        // Current implementation
    }
}

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ebf1826 and a9c2ac5.

📒 Files selected for processing (10)

chain/api/src/message.rs (3 hunks)
chain/api/src/range_locate.rs (1 hunks)
chain/api/src/service.rs (4 hunks)
chain/service/src/chain_service.rs (3 hunks)
chain/tests/test_range_locate.rs (1 hunks)
network-rpc/api/src/lib.rs (2 hunks)
sync/src/tasks/find_common_ancestor_task.rs (1 hunks)
sync/src/tasks/mock.rs (2 hunks)
sync/src/tasks/mod.rs (11 hunks)
sync/src/verified_rpc_client.rs (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (3)

chain/api/src/message.rs
sync/src/tasks/mock.rs
sync/src/tasks/find_common_ancestor_task.rs

🔇 Additional comments (19)

chain/api/src/range_locate.rs (3)

12-19: Enums look good.
The FindCommonHeader variants clearly distinguish distinct scenarios.

27-31: Type clarity is consistent.
The RangeInLocation enum is understandable and reflects the chain membership status.

33-84: Verify short-circuit checks for partial coverage.
In find_common_header_in_range, if any (start, end) pair fails the “in-range” check, the function immediately returns without assessing remaining pairs. Confirm that this short-circuit approach is intended for partial or inconsistent ranges. Otherwise, consider collecting all out-of-range portions before returning.

chain/tests/test_range_locate.rs (4)

14-28: Good helper function.
create_block is straightforward, helps produce and connect blocks for testing.

30-144: Comprehensive test coverage for fork handling.
The test_range_locate function thoroughly checks remote and local chains after diverging. The debug prints are helpful for clarity, though consider switching to logs if noise becomes an issue.

146-171: Verifies chain membership properly.
test_not_in_range_locate confirms that the requested block is not in the selected chain, matching expected logic.

173-215: Ensures correctness of same-range requests.
test_same_range_request validates the edge case where the start and end are identical, confirming a zero-length range.

network-rpc/api/src/lib.rs (3)

303-307: Method signature aligns with new request/response.
get_range_in_location extends the NetworkRpc trait correctly, keeping with the established conventions.

319-323: Reintroduce RPC request validation.
Implementation of RpcRequest was previously recommended for GetRangeInLocationRequest to enforce request size constraints. Since it has not been added, consider adopting the same pattern as other request types to avoid potential misuse.

331-334: Sufficient for encapsulating range results.
GetRangeInLocationResponse neatly wraps the chain membership status and block list for RPC responses.

chain/service/src/chain_service.rs (3)

7-7: LGTM!

The import statement correctly brings in the required range locate module and types.

264-268: LGTM!

The implementation of the GetRangeInLocation request handler follows the established pattern in the codebase and correctly delegates to the inner service.

491-497: LGTM!

The get_range_in_location method implementation correctly delegates to the range_locate module while maintaining proper error handling.

chain/api/src/service.rs (3)

4-4: LGTM!

The imports are correctly organized and bring in the necessary types for range location functionality.

Also applies to: 10-10

239-246: LGTM!

The BlockIdRangeFetcher trait is well-designed:

Follows Rust trait best practices with Send + Sync bounds

Method signature is clear and consistent with other fetcher traits

82-86: LGTM!

The get_range_in_location method in ReadableChainService trait has a clear signature that matches the implementation.

sync/src/tasks/mod.rs (2)

44-46: LGTM!

The SyncFetcher trait correctly composes multiple fetcher traits using the trait composition pattern.

728-728: LGTM!

The range_locate parameter addition to full_sync_task is backward compatible and follows the function's parameter pattern.

sync/src/verified_rpc_client.rs (1)

18-18: LGTM!

The import statement correctly brings in the required types for range location functionality.

sync/src/verified_rpc_client.rs

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (12)

chain/api/src/range_locate.rs (3)
20-25: Potentially rename RangeLen.

While FindCommonHeaderError::RangeLen conveys the error meaning, consider something more descriptive like RangeLengthError or InsufficientRangeLength for improved clarity.
-enum FindCommonHeaderError {
-    InvalidRange(String),
-    CheckAncestor(String),
-    RangeLen,
-}
+enum FindCommonHeaderError {
+    InvalidRange(String),
+    CheckAncestor(String),
+    InsufficientRangeLength,
+}
33-84: Validate iteration logic for multi-range checks.

Function find_common_header_in_range iterates pairwise through the ranges. This is correct for a typical adjacency check, but if you anticipate non-adjacent checks or deeper validation, you might expand how these ranges are evaluated. Current approach is efficient for adjacency-based constraints.

86-135: Consider introducing a configurable exponent limit.

for index in 0..=17 with 2u64.pow(index) effectively checks up to 2^17 offsets. For some ramps, making this limit configurable or derived dynamically from chain height could improve flexibility.
-    for index in 0..=17 {
+    let max_exponent = 17; // or derive from chain size, e.g. log2 of chain.current_header().number()
+    for index in 0..=max_exponent {
         let block_number = start_block_header.number().saturating_add(2u64.pow(index));
         ...
     }
config/src/sync_config.rs (1)

83-85: Merging of range_locate is consistent with other fields.

In the future, consider a more descriptive doc comment for advanced users to know how to handle partial merges, but this is sufficient for now.

chain/tests/test_range_locate.rs (1)

30-143: Test coverage is thorough, but consider smaller logically focused tests.

test_range_locate covers end-to-end forking, block application, and partial range retrieval. Splitting it into multiple smaller tests could improve clarity and maintainability, though this is optional since it's still quite readable.
sync/src/tasks/find_common_ancestor_task.rs (2)
73-76: Consider adding error context for genesis retrieval failure.

The error message for genesis retrieval failure could be more descriptive.

Apply this diff to improve the error message:
-                                format_err!("faild to get the genesis in find range locate task")
+                                format_err!("Failed to get the genesis block while finding range location. This could indicate database corruption or initialization issues.")
52-134: Consider adding timeout for the infinite loop.

The new_sub_task method contains a potentially infinite loop that could hang if the network is unresponsive.

Consider adding a timeout mechanism or maximum iteration count to prevent potential hangs:
+    const MAX_ITERATIONS: u32 = 100; // Or any other reasonable number
     fn new_sub_task(self) -> BoxFuture<'static, Result<Vec<Self::Item>>> {
         async move {
             let mut start_id = self.start_id;
             let mut end_id = self.end_id;
             let mut found_common_header = None;
+            let mut iterations = 0;
             loop {
+                if iterations >= Self::MAX_ITERATIONS {
+                    bail!("Exceeded maximum iterations while finding common ancestor");
+                }
+                iterations += 1;
                 match self
                     .fetcher
                     .fetch_range_locate(None, start_id, end_id)
network-rpc/api/src/lib.rs (1)
326-329: Document the enum variants.

The RangeInLocation enum lacks documentation explaining the purpose and usage of each variant.

Apply this diff to add documentation:
 #[derive(Debug, Serialize, Deserialize, Clone)]
 pub enum RangeInLocation {
+    /// Indicates that the requested range is not in the selected chain
     NotInSelectedChain,
+    /// Contains the found hash value and a vector of hash values in the selected chain
+    /// First HashValue: The found hash
+    /// Vec<HashValue>: Additional hash values in the range
     InSelectedChain(HashValue, Vec<HashValue>),
 }
chain/mock/src/mock_chain.rs (1)
58-77: Add documentation for the new test method.

The new_with_genesis_for_test method lacks documentation explaining its purpose and parameters.

Add documentation to clarify the method's usage:
+/// Creates a new mock chain with a custom genesis block for testing purposes.
+/// 
+/// # Arguments
+/// * `net` - The chain network configuration
+/// * `genesis` - The custom genesis block
+/// * `k` - The k-value parameter for the DAG
+/// 
+/// # Returns
+/// A new MockChain instance initialized with the custom genesis block
 pub fn new_with_genesis_for_test(
     net: ChainNetwork,
     genesis: Genesis,
     k: KType,
 ) -> anyhow::Result<Self> {
chain/service/src/chain_service.rs (1)
491-497: Consider enhancing error handling.

The get_range_in_location method could benefit from more descriptive error messages to aid in debugging.

Consider wrapping the error with additional context:
-        range_locate::get_range_in_location(self.get_main(), self.storage.clone(), start_id, end_id)
+        range_locate::get_range_in_location(self.get_main(), self.storage.clone(), start_id, end_id)
+            .map_err(|e| format_err!("Failed to get range in location for start_id={}, end_id={:?}: {}", start_id, end_id, e))
sync/src/tasks/mod.rs (2)
651-710: Consider adding documentation for the generate_ancestor_task function.

The function has many parameters and complex logic but lacks documentation explaining its purpose and parameter requirements.

Add documentation to improve maintainability:
+/// Generates an ancestor task based on the specified parameters.
+///
+/// # Arguments
+/// * `range_locate` - Whether to use range location for finding ancestors
+/// * `current_block_number` - The current block number
+/// * `target_block_number` - The target block number
+/// * `fetcher` - The fetcher implementation for retrieving blocks
+/// * `max_retry_times` - Maximum number of retry attempts
+/// * `delay_milliseconds_on_error` - Delay between retries in milliseconds
+/// * `current_block_info` - Current block information
+/// * `storage` - Storage implementation
+/// * `event_handle` - Event counter handle
+/// * `ext_error_handle` - External error handler
+/// * `dag` - DAG implementation
+///
+/// # Returns
+/// A tuple containing the future task and its handle
 pub fn generate_ancestor_task<F>(
672-709: Consider extracting task generation logic.

The task generation logic in the if-else block is quite similar and could be extracted to reduce duplication.

Consider refactoring to:
+fn create_task_generator<T, F>(
+    task: T,
+    fetcher: Arc<F>,
+    max_retry_times: u64,
+    delay_milliseconds_on_error: u64,
+    collector: impl TaskCollector,
+    event_handle: Arc<TaskEventCounterHandle>,
+    ext_error_handle: Arc<ExtSyncTaskErrorHandle<F>>,
+) -> TaskGenerator<T, F>
+where
+    F: SyncFetcher + 'static,
+{
+    TaskGenerator::new(
+        task,
+        2,
+        max_retry_times,
+        delay_milliseconds_on_error,
+        collector,
+        event_handle,
+        ext_error_handle,
+    )
+}
+
 let sync_task = if range_locate {
-    TaskGenerator::new(
+    create_task_generator(
         FindRangeLocateTask::new(
             *current_block_info.block_id(),
             None,
             fetcher.clone(),
             storage.clone(),
             dag.clone(),
         ),
-        2,
-        max_retry_times,
-        delay_milliseconds_on_error,
         DagAncestorCollector::new(dag, storage),
-        event_handle.clone(),
-        ext_error_handle.clone(),
     )
-    .generate()
 } else {
-    TaskGenerator::new(
+    create_task_generator(
         FindAncestorTask::new(
             current_block_number,
             target_block_number,
             MAX_BLOCK_IDS_REQUEST_SIZE,
             fetcher.clone(),
         ),
-        2,
-        max_retry_times,
-        delay_milliseconds_on_error,
         AncestorCollector::new(Arc::new(MerkleAccumulator::new_with_info(
             current_block_info.block_accumulator_info.clone(),
             storage.get_accumulator_store(AccumulatorStoreType::Block),
         ))),
-        event_handle.clone(),
-        ext_error_handle.clone(),
     )
-    .generate()
 }
+.generate()

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a9c2ac5 and 5fffd0c.

⛔ Files ignored due to path filters (1)

Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (20)

chain/api/Cargo.toml (1 hunks)
chain/api/src/lib.rs (1 hunks)
chain/api/src/message.rs (3 hunks)
chain/api/src/range_locate.rs (1 hunks)
chain/api/src/service.rs (4 hunks)
chain/mock/src/mock_chain.rs (3 hunks)
chain/service/src/chain_service.rs (3 hunks)
chain/tests/test_range_locate.rs (1 hunks)
config/src/sync_config.rs (3 hunks)
flexidag/src/blockdag.rs (1 hunks)
network-rpc/api/src/lib.rs (2 hunks)
network-rpc/src/rpc.rs (2 hunks)
sync/src/sync.rs (2 hunks)
sync/src/tasks/find_common_ancestor_task.rs (1 hunks)
sync/src/tasks/mock.rs (2 hunks)
sync/src/tasks/mod.rs (11 hunks)
sync/src/tasks/test_tools.rs (2 hunks)
sync/src/tasks/tests.rs (8 hunks)
sync/src/tasks/tests_dag.rs (1 hunks)
sync/src/verified_rpc_client.rs (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (9)

chain/api/src/lib.rs
chain/api/Cargo.toml
sync/src/tasks/tests_dag.rs
sync/src/tasks/test_tools.rs
chain/api/src/message.rs
sync/src/tasks/mock.rs
chain/api/src/service.rs
sync/src/sync.rs
sync/src/tasks/tests.rs

🔇 Additional comments (16)

chain/api/src/range_locate.rs (2)

12-18: Enums naming looks clear and self-explanatory.

The FindCommonHeader variants are well-documented, reflecting distinct states effectively. No immediate changes needed here.

27-31: Usage of PartialEq and Eq is good.

RangeInLocation enabling equality checks is helpful for straightforward result comparisons in tests.

config/src/sync_config.rs (2)

44-48: Good addition of the range_locate field.

Declaring range_locate as an Option<bool> and skipping serialization if None allows for backward compatibility.

64-66: Method name is concise and consistent.

Defaulting range_locate to false when unset is a sensible approach. No changes needed here.

chain/tests/test_range_locate.rs (3)

1-28: Helper function looks solid.

create_block is a clear test utility, systematically creating, applying, and connecting blocks. This helps keep test logic concise.

145-171: Negative test scenario is well-handled.

test_not_in_range_locate ensures coverage for blocks not in the selected chain. This scenario is often overlooked; good job addressing it.

173-215: Verifies repeated requests correctly.

test_same_range_request adds a helpful edge case to confirm that multiple identical requests produce stable outcomes. This ensures correct handling of consecutive or redundant calls.

network-rpc/api/src/lib.rs (1)

319-323: Add size validation for request parameters.

The GetRangeInLocationRequest should implement the RpcRequest trait to enforce size limits on the request parameters, similar to other request types in this file.
chain/mock/src/mock_chain.rs (1)

349-352: Consider adding validation for block state.

The connect method should validate that the block's state matches the chain's state before connecting.

Add validation to ensure the block's state is consistent with the chain:
 pub fn connect(&mut self, block: ExecutedBlock) -> Result<()> {
+    // Verify block's parent exists and state is consistent
+    if !self.head.has_block(block.header().parent_hash())? {
+        bail!("Parent block not found");
+    }
     self.head.connect(block)?;
     Ok(())
 }
chain/service/src/chain_service.rs (2)

7-7: LGTM! Clean import organization.

The new import for range_locate is properly organized with the existing imports.

264-268: LGTM! Clean request handler implementation.

The GetRangeInLocation request handler follows the established pattern of other request handlers in the service.

sync/src/tasks/mod.rs (1)

239-246: LGTM! Well-designed trait interface.

The BlockIdRangeFetcher trait has a clean and focused interface with appropriate use of BoxFuture for async operations.

sync/src/verified_rpc_client.rs (1)

898-943: LGTM! Well-implemented RPC method.

The fetch_range_locate implementation follows the established patterns:

Consistent error handling with other RPC methods

Proper retry logic with G_RPC_RETRY_COUNT

Clean peer selection logic

flexidag/src/blockdag.rs (1)

207-219: LGTM! Clean and focused implementation.

The check_ancestor_of_chain method:

Has clear separation from check_ancestor_of

Uses appropriate error handling with map_err

Maintains consistent style with the rest of the codebase

network-rpc/src/rpc.rs (2)

14-17: LGTM!

The new imports for range location types are correctly placed and properly organized within the existing import group.

331-339: LGTM!

The implementation follows the established pattern for RPC methods and correctly forwards the request to the chain service. Since GetRangeInLocationRequest contains only scalar fields and validation is handled in the implementation layer, no additional validation is needed at the RPC layer.

sync/src/tasks/find_common_ancestor_task.rs

jackzhhuang requested review from sanlee42 and nkysg as code owners February 14, 2025 02:26

coderabbitai bot reviewed Feb 14, 2025

View reviewed changes

sync/src/tasks/mod.rs Show resolved Hide resolved

network-rpc/api/src/lib.rs Show resolved Hide resolved

network-rpc/src/rpc.rs Show resolved Hide resolved

sync/src/tasks/mock.rs Outdated Show resolved Hide resolved

chain/api/src/service.rs Outdated Show resolved Hide resolved

coderabbitai bot reviewed Feb 14, 2025

View reviewed changes

sync/src/verified_rpc_client.rs Outdated Show resolved Hide resolved

jackzhhuang added 15 commits February 14, 2025 11:26

fix the pruning point calc

a2e27a4

remove test log

d91f939

add locate range logic

06ff3fc

commit temporarily

dd9f24a

add test case

36371dd

add test case for location range

5f3114e

add first test case

76784a2

add 2 test case

ffbcdc1

add rpc code

25c7016

add dag ancestor task

7c5210f

no chain reader in dag ancestor task

0e33a3b

add sync using range locate

2509870

fix clippy

1bf43b1

change name from RangeInPruningPoint to RangeInLocation

9749b42

fix for the comments from the code rabbit

5fffd0c

jackzhhuang force-pushed the fast-find-ancestor branch from a9c2ac5 to 5fffd0c Compare February 14, 2025 03:27

coderabbitai bot reviewed Feb 14, 2025

View reviewed changes

sync/src/tasks/find_common_ancestor_task.rs Outdated Show resolved Hide resolved

fix according to the code rabbit comment

7781248

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fast find ancestor #4397

Fast find ancestor #4397

jackzhhuang commented Feb 14, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 14, 2025 •

edited

Loading

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot left a comment

Fast find ancestor #4397

Are you sure you want to change the base?

Fast find ancestor #4397

Conversation

jackzhhuang commented Feb 14, 2025 • edited by coderabbitai bot Loading

Pull request type

What is the current behavior?

What is the new behavior?

Other information

Summary by CodeRabbit

coderabbitai bot commented Feb 14, 2025 • edited Loading

Walkthrough

Changes

Sequence Diagram(s)

Suggested reviewers

Poem

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

jackzhhuang commented Feb 14, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 14, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)