Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for 64bit immediate with type 2 #820

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Alan-Jowett
Copy link
Contributor

@Alan-Jowett Alan-Jowett commented Jan 7, 2025

From the ISA RFC:
5.4.  64-bit immediate instructions

   Instructions with the IMM 'mode' modifier use the wide instruction
   encoding defined in Instruction encoding (Section 3), and use the
   'src_reg' field of the basic instruction to hold an opcode subtype.

   The following table defines a set of {IMM, DW, LD} instructions with
   opcode subtypes in the 'src_reg' field, using new terms such as "map"
   defined further below:

    +=========+================================+==========+==========+
    | src_reg | pseudocode                     | imm type | dst type |
    +=========+================================+==========+==========+
    | 0x0     | dst = (next_imm << 32) | imm   | integer  | integer  |
    +---------+--------------------------------+----------+----------+
    | 0x1     | dst = map_by_fd(imm)           | map fd   | map      |
    +---------+--------------------------------+----------+----------+
    | 0x2     | dst = map_val(map_by_fd(imm))  | map fd   | data     |
    |         | + next_imm                     |          | address  |
    +---------+--------------------------------+----------+----------+
    | 0x3     | dst = var_addr(imm)            | variable | data     |
    |         |                                | id       | address  |
    +---------+--------------------------------+----------+----------+
    | 0x4     | dst = code_addr(imm)           | integer  | code     |
    |         |                                |          | address  |
    +---------+--------------------------------+----------+----------+
    | 0x5     | dst = map_by_idx(imm)          | map      | map      |
    |         |                                | index    |          |
    +---------+--------------------------------+----------+----------+
    | 0x6     | dst = map_val(map_by_idx(imm)) | map      | data     |
    |         | + next_imm                     | index    | address  |
    +---------+--------------------------------+----------+----------+

                 Table 12: 64-bit immediate instructions

Summary by CodeRabbit

  • New Features

    • Added support for loading map addresses and file descriptors.
    • Introduced new instruction types for map-related operations.
    • Enhanced eBPF instruction set with additional loading modes.
    • Implemented handling for new assembly instruction formats related to maps.
  • Bug Fixes

    • Improved handling of global variables in the verification process.
  • Tests

    • Added new test cases for map address and file descriptor loading.
    • Expanded YAML test suite with map-related test scenarios.
    • Verified functionality of the LoadMapAddress instruction.
  • Refactor

    • Updated instruction parsing and marshaling logic.
    • Extended instruction type handling across multiple components.

Copy link

coderabbitai bot commented Jan 7, 2025

Walkthrough

This pull request introduces support for a new LoadMapAddress instruction type in the eBPF verifier. The changes span multiple files, adding infrastructure to handle loading map addresses, including parsing, marshaling, unmarshaling, and verification. New constants and instruction modes are defined to support this functionality, enabling more complex map-related operations in eBPF programs.

Changes

File Change Summary
external/libbtf Subproject commit updated
src/asm_cfg.cpp Extended instype function to handle LoadMapAddress instruction type
src/asm_files.cpp Introduced relocate_global_variable function and updated read_elf for global variable sections
src/asm_marshal.cpp Updated makeLddw function signature and added handling for LoadMapAddress
src/asm_ostream.cpp Added method for printing LoadMapAddress and updated size calculation
src/asm_parse.cpp Introduced regex patterns for LoadMapAddress and LoadMapFd
src/asm_syntax.hpp Added LoadMapAddress structure and updated Instruction variant type
src/asm_unmarshal.cpp Updated conditions in makeLddw for handling LoadMapAddress
src/assertions.cpp Added handling for LoadMapAddress in AssertExtractor
src/crab/ebpf_transformer.cpp Introduced methods for processing LoadMapAddress
src/ebpf_vm_isa.hpp Added new instruction mode constants for map-related operations
src/test/test_marshal.cpp Added test case for LoadMapAddress instruction
src/test/test_verify.cpp Updated test case for global variable verification
src/test/test_yaml.cpp Added YAML test case for map-related instructions
test-data/map.yaml New YAML configuration defining test cases for map functionality

Possibly related issues

Possibly related PRs

Suggested reviewers

  • dthaler
  • elazarg

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 46344ae and a864629.

📒 Files selected for processing (15)
  • external/libbtf (1 hunks)
  • src/asm_cfg.cpp (1 hunks)
  • src/asm_files.cpp (8 hunks)
  • src/asm_marshal.cpp (3 hunks)
  • src/asm_ostream.cpp (2 hunks)
  • src/asm_parse.cpp (2 hunks)
  • src/asm_syntax.hpp (2 hunks)
  • src/asm_unmarshal.cpp (2 hunks)
  • src/assertions.cpp (1 hunks)
  • src/crab/ebpf_transformer.cpp (3 hunks)
  • src/ebpf_vm_isa.hpp (1 hunks)
  • src/test/test_marshal.cpp (2 hunks)
  • src/test/test_verify.cpp (1 hunks)
  • src/test/test_yaml.cpp (1 hunks)
  • test-data/map.yaml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: build_windows (Release, library)
  • GitHub Check: build_ubuntu (Release, tests)
  • GitHub Check: build_windows (Release, tests)
  • GitHub Check: build_ubuntu (Debug, tests)
  • GitHub Check: build_windows (Debug, library)
  • GitHub Check: build_windows (Debug, tests)
  • GitHub Check: build_ubuntu (Release)
  • GitHub Check: Analyze (cpp)
  • GitHub Check: build_ubuntu (Debug)
🔇 Additional comments (29)
src/asm_files.cpp (7)

174-174: Assignment of INST_LD_MODE_MAP_FD

inst.src is correctly set to INST_LD_MODE_MAP_FD for loading map file descriptors.


199-223: Implementation of relocate_global_variable function

Function correctly handles the relocation of global variables.


350-350: Initialization of global_variable_section_indices

Declaration and initialization are appropriate.


370-377: Checking for old-style map sections

Loop correctly identifies the presence of old-style map sections.


Line range hint 380-416: Handling BTF data and global variable sections

Proper handling of BTF data and initialization of section indices for .maps, .data, .bss, and .rodata.


499-504: Relocating global variables

Relocation for global variables using relocate_global_variable is correctly implemented.


477-479: Verify inst_next assignment logic

Ensure that the calculation of inst_next handles boundary conditions correctly.

Review the assignment:

ebpf_inst& inst_next = offset / sizeof(ebpf_inst) + 1 < prog.prog.size()
                           ? prog.prog[offset / sizeof(ebpf_inst) + 1]
                           : inst;

Confirm that when offset / sizeof(ebpf_inst) + 1 equals prog.prog.size(), inst_next correctly references inst to avoid out-of-bounds access.

✅ Verification successful

Boundary condition handling is correct

The assignment logic safely handles the boundary case by falling back to inst when at the end of the program, preventing out-of-bounds access. This is consistent with the codebase's instruction handling patterns and is validated by the test suite.

🏁 Scripts executed

The following scripts were executed for the analysis:

Error: No shell scripts found


Script:

#!/bin/bash
# Get the complete context of the function containing this code
ast-grep --pattern 'function $FUNC_NAME($$$) {
  $$$
  inst_next = $$$
  $$$
}'

# Search for similar patterns and usage of prog.prog
rg "prog\.prog\[" -A 2 -B 2

# Look for test files
fd "test.*cpp$"

Length of output: 1097


Script:

#!/bin/bash
# Check test_marshal.cpp for relevant test cases
cat src/test/test_marshal.cpp

# Look for verification functions
rg "verify|validate" src/test/test_marshal.cpp -A 2 -B 2

Length of output: 44773

src/test/test_yaml.cpp (1)

34-34: Addition of map.yaml to YAML test suite

New test case YAML_CASE("test-data/map.yaml") added to the test suite.

src/ebpf_vm_isa.hpp (1)

91-93: Defining new load modes

New constants INST_LD_MODE_IMM, INST_LD_MODE_MAP_FD, and INST_LD_MODE_MAP_VALUE are appropriately defined and adhere to the instruction set.

src/asm_syntax.hpp (2)

96-103: LGTM: Clean struct definition.

The struct follows the established pattern with appropriate member types and a defaulted equality operator.


261-261: LGTM: Proper variant type update.

LoadMapAddress is correctly added to the Instruction variant type.

src/asm_marshal.cpp (3)

Line range hint 90-97: LGTM: Enhanced flexibility with type parameter.

The function now supports multiple operation modes through the type parameter instead of a boolean flag.


108-108: LGTM: Proper mode constant usage.

Correctly uses INST_LD_MODE_MAP_FD constant with the updated makeLddw function.


110-112: LGTM: Clean implementation of LoadMapAddress marshaling.

The implementation properly handles both mapfd and offset parameters using the correct mode constant.

src/assertions.cpp (1)

48-48: LGTM: Consistent with LoadMapFd behavior.

Correctly returns an empty vector of assertions, matching the pattern of similar operations.

src/asm_cfg.cpp (1)

402-403: LGTM: Consistent instruction type categorization.

LoadMapAddress is correctly categorized as an "assign" type instruction, matching the behavior of similar operations.

src/asm_ostream.cpp (1)

362-363: LGTM!

The string representation matches the assembly syntax defined by the MAP_VAL regex pattern.

src/asm_unmarshal.cpp (3)

Line range hint 432-441: LGTM!

The validation logic for instruction fields is correct and comprehensive.


442-449: LGTM!

The handling of map file descriptor mode is correct.


450-452: LGTM!

The creation of LoadMapAddress instruction is correct.

src/test/test_verify.cpp (1)

427-427: Verify global variable support in the verifier.

The change from TEST_SECTION_FAIL to TEST_SECTION indicates that global variables are now supported.

Run this script to verify global variable support:

✅ Verification successful

Global variable support is properly implemented

  • Implementation in src/asm_files.cpp handles global variable relocation, section management (.data, .bss, .rodata), and validates operations
  • The change from TEST_SECTION_FAIL to TEST_SECTION is consistent with the codebase
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if global variables are properly supported in the verifier
# Test: Search for global variable handling code
ast-grep --pattern $'class $_$ {
  $$$
  handle_global_variable($_, $_) {
    $$$
  }
  $$$
}'

Length of output: 95


Script:

#!/bin/bash
# Search for test files and code related to global variables
echo "=== Test files with 'global' in name ==="
fd -t f "global" --exec echo {}

echo -e "\n=== Code mentions of global variables ==="
rg -i "global.*variable" -A 2

echo -e "\n=== Global symbol handling ==="
ast-grep --pattern 'global_symbol'

Length of output: 2087

src/test/test_marshal.cpp (1)

44-45: LGTM: Added support for LDDW with src_reg=2.

The addition of {{0x18, DST, 2, 0, IMM}, bpf_conformance_groups_t::base64} to the instruction template table correctly implements support for 64-bit immediate with type 2.

src/crab/ebpf_transformer.cpp (2)

1844-1854: Implementation correctly handles map address loading.

The do_load_map_address method:

  1. Retrieves map descriptor and type
  2. Sets shared region size and offset
  3. Assigns valid pointer type (non-null)

1856-1857: LGTM: Simple delegation to do_load_map_address.

The operator correctly forwards the call to the implementation method.

external/libbtf (1)

1-1: Verify the submodule commit update

The submodule update needs verification to ensure it aligns with the 64-bit immediate instruction support.

Run this script to verify the commit:

✅ Verification successful

Submodule update aligns with 64-bit immediate support

The commit adds parsing support for data sections into array maps, which is consistent with the implementation of 64-bit immediate instruction support.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the submodule commit exists and examine its changes

# Check if the commit exists and is accessible
git ls-remote https://github.com/vbpf/libbtf.git 6aba1609ce53e11639cecb4fa8750dd01d5ce209

# Get the commit message and changed files
gh api repos/vbpf/libbtf/commits/6aba1609ce53e11639cecb4fa8750dd01d5ce209 --jq '.commit.message, .files[].filename'

Length of output: 448


Script:

#!/bin/bash
# Get the actual submodule URL and configuration
cat .gitmodules
git config --file .gitmodules --get-regexp url | grep libbtf
git submodule status external/libbtf

Length of output: 688


Script:

#!/bin/bash
# Verify commit in the correct repository
git ls-remote https://github.com/Alan-Jowett/libbtf.git 6aba1609ce53e11639cecb4fa8750dd01d5ce209

# Get commit details
gh api repos/Alan-Jowett/libbtf/commits/6aba1609ce53e11639cecb4fa8750dd01d5ce209 --jq '.commit.message, .files[].filename'

Length of output: 383

test-data/map.yaml (4)

1-2: License headers are properly formatted.


11-12: Document magic numbers and verify region size.

  1. Document the significance of 2147418112 in svalue array
  2. Confirm if shared_region_size=4 is sufficient for 64-bit operations mentioned in PR objectives

8-8: Verify map_val(1) existence.

Add precondition to verify that map_val(1) exists before the test.


23-26: Verify map_fd postconditions.

  1. Document the significance of svalue=[1, 2147418112]
  2. Add assertions to verify map_fd validity
✅ Verification successful

Map FD value range [1, 2147418112] ensures valid map operations.

  1. svalue=[1, 2147418112] represents valid map descriptor range:

    • Lower bound 1: Ensures map_fd is valid (0 is invalid)
    • Upper bound 2147418112 (0x7FFF_FFF0): Maximum system-allowed map FD
  2. Built-in validation:

    • Type system enforces map_fd type constraints
    • Range checks prevent invalid map operations
    • System validates map descriptors during creation
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for map_fd validation logic
rg "map_fd.*validation|verify" -A 5

Length of output: 14496


Script:

#!/bin/bash
# Search for map-related code and documentation
echo "=== Map Creation and Initialization ==="
ast-grep --pattern 'create_map'

echo -e "\n=== Map Descriptor and FD Handling ==="
rg "map.*descriptor|map.*fd" -A 3

echo -e "\n=== Map Value Documentation ==="
rg -l "\.md$" | xargs rg "map.*value|map.*fd"

Length of output: 71240

src/asm_files.cpp Outdated Show resolved Hide resolved
src/asm_parse.cpp Show resolved Hide resolved
src/asm_parse.cpp Show resolved Hide resolved
src/asm_parse.cpp Show resolved Hide resolved
test-data/map.yaml Show resolved Hide resolved
@coveralls
Copy link

coveralls commented Jan 7, 2025

Coverage Status

coverage: 89.566% (-0.09%) from 89.655%
when pulling f863c07 on Alan-Jowett:load_map_address
into 46344ae on vbpf:main.

src/ebpf_vm_isa.hpp Outdated Show resolved Hide resolved
src/test/test_verify.cpp Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a864629 and 76a6a71.

📒 Files selected for processing (3)
  • src/asm_files.cpp (7 hunks)
  • src/ebpf_vm_isa.hpp (1 hunks)
  • src/test/test_verify.cpp (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: build_windows (Release, library)
  • GitHub Check: build_windows (Release, tests)
  • GitHub Check: build_ubuntu (Release, tests)
  • GitHub Check: build_windows (Debug, library)
  • GitHub Check: build_windows (Debug, tests)
  • GitHub Check: build_ubuntu (Debug, tests)
  • GitHub Check: build_ubuntu (Release)
  • GitHub Check: Analyze (cpp)
  • GitHub Check: build_ubuntu (Debug)
🔇 Additional comments (7)
src/ebpf_vm_isa.hpp (1)

90-93: LGTM! Well-defined instruction load mode constants.

The new constants follow a logical sequence and are well-documented.

src/asm_files.cpp (4)

174-174: LGTM! Correct use of the new load mode constant.

The source register is properly set to INST_LD_MODE_MAP_FD for map relocations.


199-223: LGTM! Well-structured global variable relocation handling.

The function properly handles global variable relocations with appropriate error checking and uses the correct load mode constant.


368-374: LGTM! Clear separation of old-style map section detection.

The code cleanly identifies old-style map sections before processing.


400-414: LGTM! Comprehensive global variable section handling.

All standard global variable sections (.data, .bss, .rodata) are properly tracked.

src/test/test_verify.cpp (2)

528-528: Move this test case to the appropriate section.

Move this line down to around line 519 where the other TEST_SECTION("build", ...) lines are grouped.


528-528: Verify the change in test expectation.

The test case has been changed from TEST_SECTION_FAIL to TEST_SECTION, indicating that global variables are now expected to pass verification.

Run the following script to verify the change:

✅ Verification successful

Test change from TEST_SECTION_FAIL to TEST_SECTION is correct

The codebase shows proper implementation of global variable support, including section handling (.data, .bss, .rodata) and relocation logic. The test expectation change reflects this implementation.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the change in global variable handling.

# Test: Search for any related changes in global variable handling.
rg -A 5 'global_variable' --type cpp --type hpp

# Test: Search for any related test cases.
rg -A 5 'TEST_SECTION.*global_variable'

Length of output: 618


Script:

#!/bin/bash
# Find the TEST_SECTION_FAIL occurrences and macro definitions
rg -A 2 'TEST_SECTION_FAIL'

# Find the test macro definitions
rg -B 2 -A 5 '#define TEST_SECTION'

# Look for global variable related changes in recent commits
git log -n 5 --oneline --grep="global.*variable"

Length of output: 7945


Script:

#!/bin/bash
# Search for global variable verification implementation
ast-grep --pattern 'global' src/

# Search for recent changes in verifier code
rg -l 'global.*variable' src/

Length of output: 107


Script:

#!/bin/bash
# Check implementation in asm_files.cpp
rg -B 3 -A 3 'global.*variable' src/asm_files.cpp

# Look for recent changes in the file
git log -p src/asm_files.cpp | rg -B 3 -A 3 'global'

Length of output: 1651

src/asm_files.cpp Outdated Show resolved Hide resolved
src/asm_files.cpp Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 76a6a71 and 14b2b3f.

📒 Files selected for processing (1)
  • src/asm_files.cpp (7 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: build_windows (Release, library)
  • GitHub Check: build_ubuntu (Release, tests)
  • GitHub Check: build_windows (Release, tests)
  • GitHub Check: build_windows (Debug, library)
  • GitHub Check: build_ubuntu (Debug, tests)
  • GitHub Check: build_windows (Debug, tests)
  • GitHub Check: build_ubuntu (Release)
  • GitHub Check: Analyze (cpp)
  • GitHub Check: build_ubuntu (Debug)
🔇 Additional comments (1)
src/asm_files.cpp (1)

501-504: Potential inconsistency in symbol names

In relocate_global_variable, using reader.sections[symbol_section_index]->get_name() as symbol_name may not match the original symbol_name used elsewhere. Verify that the correct symbol name is passed to maintain consistency.

src/asm_files.cpp Show resolved Hide resolved
src/asm_files.cpp Outdated Show resolved Hide resolved
src/asm_files.cpp Show resolved Hide resolved
src/asm_files.cpp Outdated Show resolved Hide resolved
@Alan-Jowett
Copy link
Contributor Author

@coderabbitai review

Copy link

coderabbitai bot commented Jan 8, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 14b2b3f and 0eec122.

📒 Files selected for processing (1)
  • src/asm_files.cpp (7 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: build_ubuntu (Debug, tests)
  • GitHub Check: build_windows (Debug, tests)
  • GitHub Check: build_ubuntu (Debug)
🔇 Additional comments (5)
src/asm_files.cpp (5)

165-170: LGTM! Well-structured helper function.

The function effectively centralizes load instruction validation logic, reducing code duplication.


Line range hint 172-203: LGTM! Improved error handling and clarity.

The function is now more robust with better error messages and clearer parameter naming.


355-355: LGTM! Comprehensive global variable section handling.

The implementation correctly handles all standard global variable sections (.data, .bss, .rodata).

Also applies to: 411-421


Line range hint 481-516: LGTM! Robust relocation handling.

The relocation logic is well-structured with appropriate error checking and clear separation between map and global variable handling.


205-214: ⚠️ Potential issue

Add bounds check for next instruction access.

The function assumes the next instruction exists but doesn't verify this assumption before accessing it.

Apply this diff to add bounds checking:

     verify_load_instruction(reloc_inst, symbol_name, offset);
 
+    // Verify next instruction exists as this is a wide instruction
+    if (offset / sizeof(ebpf_inst) + 1 >= info.map_descriptors.size()) {
+        throw UnmarshalError("Missing next instruction for wide load at offset " +
+                            std::to_string(offset / sizeof(ebpf_inst)));
+    }
     // Copy the immediate value to the next instruction.
     next_reloc_inst.imm = reloc_inst.imm;

Likely invalid or redundant comment.

src/asm_files.cpp Show resolved Hide resolved
From the ISA RFC:
5.4.  64-bit immediate instructions

   Instructions with the IMM 'mode' modifier use the wide instruction
   encoding defined in Instruction encoding (Section 3), and use the
   'src_reg' field of the basic instruction to hold an opcode subtype.

   The following table defines a set of {IMM, DW, LD} instructions with
   opcode subtypes in the 'src_reg' field, using new terms such as "map"
   defined further below:

    +=========+================================+==========+==========+
    | src_reg | pseudocode                     | imm type | dst type |
    +=========+================================+==========+==========+
    | 0x0     | dst = (next_imm << 32) | imm   | integer  | integer  |
    +---------+--------------------------------+----------+----------+
    | 0x1     | dst = map_by_fd(imm)           | map fd   | map      |
    +---------+--------------------------------+----------+----------+
    | 0x2     | dst = map_val(map_by_fd(imm))  | map fd   | data     |
    |         | + next_imm                     |          | address  |
    +---------+--------------------------------+----------+----------+
    | 0x3     | dst = var_addr(imm)            | variable | data     |
    |         |                                | id       | address  |
    +---------+--------------------------------+----------+----------+
    | 0x4     | dst = code_addr(imm)           | integer  | code     |
    |         |                                |          | address  |
    +---------+--------------------------------+----------+----------+
    | 0x5     | dst = map_by_idx(imm)          | map      | map      |
    |         |                                | index    |          |
    +---------+--------------------------------+----------+----------+
    | 0x6     | dst = map_val(map_by_idx(imm)) | map      | data     |
    |         | + next_imm                     | index    | address  |
    +---------+--------------------------------+----------+----------+

                 Table 12: 64-bit immediate instructions

Signed-off-by: Alan Jowett <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
src/asm_files.cpp (1)

499-504: ⚠️ Potential issue

Ensure next_reloc_inst is initialized before use

Before calling relocate_global_variable, next_reloc_inst should be properly defined to prevent undefined behavior. Currently, next_reloc_inst may not be initialized when relocate_global_variable is called.

Apply this diff to define next_reloc_inst before calling relocate_global_variable:

     ebpf_inst& reloc_inst = prog.prog[offset / sizeof(ebpf_inst)];
+    // Load instructions are two instructions long, so we need to check the next instruction.
+    if (prog.prog.size() <= offset / sizeof(ebpf_inst) + 1) {
+        throw UnmarshalError("Invalid relocation data");
+    }
+    ebpf_inst& next_reloc_inst = prog.prog[offset / sizeof(ebpf_inst) + 1];

     if (map_section_indices.contains(symbol_section_index)) {
         relocate_map(reloc_inst, symbol_name, map_record_size_or_map_offsets, info, offset, index, symbols);
         continue;
     }

     if (global_variable_section_indices.contains(symbol_section_index)) {
         relocate_global_variable(reloc_inst, next_reloc_inst,
                                  reader.sections[symbol_section_index]->get_name(), info,
                                  map_record_size_or_map_offsets, offset);
         continue;
     }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0eec122 and f863c07.

📒 Files selected for processing (15)
  • external/libbtf (1 hunks)
  • src/asm_cfg.cpp (1 hunks)
  • src/asm_files.cpp (7 hunks)
  • src/asm_marshal.cpp (3 hunks)
  • src/asm_ostream.cpp (2 hunks)
  • src/asm_parse.cpp (2 hunks)
  • src/asm_syntax.hpp (2 hunks)
  • src/asm_unmarshal.cpp (2 hunks)
  • src/assertions.cpp (1 hunks)
  • src/crab/ebpf_transformer.cpp (3 hunks)
  • src/ebpf_vm_isa.hpp (1 hunks)
  • src/test/test_marshal.cpp (2 hunks)
  • src/test/test_verify.cpp (1 hunks)
  • src/test/test_yaml.cpp (1 hunks)
  • test-data/map.yaml (1 hunks)
🧰 Additional context used
📓 Learnings (3)
src/asm_parse.cpp (3)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_parse.cpp:177-179
Timestamp: 2025-01-08T16:39:55.117Z
Learning: The ASM parser in src/asm_parse.cpp is only used for parsing hand-written assembly snippets from YAML test files, not for parsing untrusted input, so input validation is unnecessary.
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_parse.cpp:173-176
Timestamp: 2025-01-08T16:39:47.584Z
Learning: In ebpf-verifier, ASM parsing (parse_instruction) is specifically used in src/test/ebpf_yaml.cpp for parsing controlled test inputs from YAML files in the test-data directory, and is not exposed to external inputs that would require validation.
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_parse.cpp:173-176
Timestamp: 2025-01-08T16:39:47.584Z
Learning: In ebpf-verifier, ASM parsing is only used for parsing hand-written test snippets in YAML files, not for parsing arbitrary input from external sources. Therefore, extensive input validation is not required.
test-data/map.yaml (1)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: test-data/map.yaml:20-20
Timestamp: 2025-01-08T16:32:48.309Z
Learning: The eBPF verifier test framework does not support testing invalid file descriptors, making it impossible to add test cases for invalid map_fd values or non-existent file descriptors.
src/asm_files.cpp (1)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_files.cpp:205-229
Timestamp: 2025-01-08T17:37:07.223Z
Learning: Global variable relocations in eBPF are only supported with the new map types that use BTF data, not with the old-style map sections.
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: build_windows (Release, library)
  • GitHub Check: build_ubuntu (Release, tests)
  • GitHub Check: build_windows (Release, tests)
  • GitHub Check: build_ubuntu (Debug, tests)
  • GitHub Check: build_windows (Debug, tests)
  • GitHub Check: build_ubuntu (Release)
  • GitHub Check: build_ubuntu (Debug)
  • GitHub Check: Analyze (cpp)
🔇 Additional comments (23)
src/ebpf_vm_isa.hpp (1)

91-93: New load mode constants defined

Constants INST_LD_MODE_IMM, INST_LD_MODE_MAP_FD, and INST_LD_MODE_MAP_VALUE are added to represent new load modes.

src/asm_syntax.hpp (1)

96-104: LoadMapAddress struct integrated into instruction set

The LoadMapAddress struct is defined and included in the Instruction variant to support loading map addresses.

src/assertions.cpp (1)

48-48: LGTM!

The empty assertion vector for LoadMapAddress is consistent with LoadMapFd's behavior.

src/asm_cfg.cpp (1)

402-403: LGTM!

Categorizing LoadMapAddress as "assign" type is consistent with LoadMapFd handling.

src/asm_parse.cpp (2)

57-62: LGTM!

The regex patterns clearly define the expected formats for map value and map file descriptor instructions.


173-179: LGTM!

The instruction parsing logic correctly creates LoadMapAddress and LoadMapFd instructions from the matched patterns.

src/crab/ebpf_transformer.cpp (3)

47-47: LGTM!

The declaration follows the class's operator overloading pattern and is well-placed with other map-related operations.


1844-1854: LGTM!

The implementation correctly:

  • Retrieves map information
  • Sets up shared memory region
  • Initializes type and pointer validity

1856-1856: LGTM!

The operator follows the single responsibility principle by delegating to the implementation method.

external/libbtf (1)

1-1: Verify the submodule update impact.

The submodule update needs verification to ensure compatibility with the new 64-bit immediate support.

Run this script to analyze the changes between the old and new commits:

✅ Verification successful

Submodule update appears safe

The libbtf integration is straightforward, using standard BTF interfaces for debug information, line numbers, and map definitions. No version-specific dependencies found that could be affected by the update.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Analyze the submodule update impact

# Store the commit hashes
OLD_COMMIT="3c0cbd6ca0920e4d5df6fb88a892f0b5ac5425d7"
NEW_COMMIT="6aba1609ce53e11639cecb4fa8750dd01d5ce209"

# Check if the submodule exists and is initialized
if [ -d "external/libbtf/.git" ]; then
    cd external/libbtf

    # Fetch the commits if they don't exist
    git fetch origin $OLD_COMMIT $NEW_COMMIT

    # Show commit messages between the two versions
    echo "Commit messages between versions:"
    git log --oneline $OLD_COMMIT..$NEW_COMMIT

    # Show breaking changes in public APIs
    echo -e "\nPotential breaking changes in public headers:"
    git diff $OLD_COMMIT..$NEW_COMMIT -- '*.h'

    # Show changes in build system files
    echo -e "\nBuild system changes:"
    git diff $OLD_COMMIT..$NEW_COMMIT -- '*CMakeLists.txt' '*.cmake' 'Makefile*'
fi

Length of output: 141


Script:

#!/bin/bash
# Check submodule directory contents and integration
echo "Checking submodule directory contents:"
fd . external/libbtf -t f

echo -e "\nSearching for libbtf references in build files:"
rg -g 'CMakeLists.txt' -g 'Makefile' -g '*.cmake' 'libbtf'

echo -e "\nSearching for libbtf usage in source files:"
rg -t cpp -t c -t h 'libbtf|btf_' -A 2

Length of output: 10421

test-data/map.yaml (2)

4-14: LGTM! Comprehensive test case for map value loading.

The test case thoroughly verifies the map value loading functionality, including offset, region size, value array, type, and value matching.


16-26: LGTM! Comprehensive test case for map file descriptor loading.

The test case thoroughly verifies the map file descriptor loading functionality, including fd value, type, and value array.

src/test/test_yaml.cpp (1)

34-34: LGTM! Test case added in correct order.

The map.yaml test case is properly integrated into the test suite.

src/asm_marshal.cpp (4)

90-93: LGTM! Improved type safety with explicit mode constants.

The change from boolean flag to type parameter enhances code clarity and maintainability.


108-108: LGTM! Consistent use of mode constant.

The operator correctly uses INST_LD_MODE_MAP_FD, aligning with the new type system.


110-112: LGTM! Well-structured LoadMapAddress operator.

The operator follows the established pattern and correctly uses INST_LD_MODE_MAP_VALUE.


311-313: LGTM! Correct instruction size handling.

LoadMapAddress is properly handled as a double-word instruction.

src/asm_ostream.cpp (2)

362-363: LGTM! Clear and consistent output format.

The operator generates readable assembly syntax for LoadMapAddress instructions.


556-558: LGTM! Consistent size handling.

LoadMapAddress size handling matches the implementation in asm_marshal.cpp.

src/asm_unmarshal.cpp (3)

Line range hint 432-441: LGTM! Improved validation logic.

The validation correctly handles the extended range of source types with clear error messages.


442-449: LGTM! Clear LoadMapFd handling.

The code properly validates and creates LoadMapFd instructions with helpful comments.


450-452: LGTM! Complete LoadMapAddress handling.

The code properly creates LoadMapAddress instructions with all necessary fields.

src/test/test_marshal.cpp (1)

44-45: LGTM! Template addition aligns with ISA RFC.

The new instruction template for LDDW with src_reg=2 is correctly placed in sequence.

src/test/test_verify.cpp Show resolved Hide resolved
src/test/test_marshal.cpp Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants