Skip to content

Update block along with the block cache #115

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

zyd2001
Copy link

@zyd2001 zyd2001 commented May 18, 2025

In pt_blk_proceed_no_event_fill_cache, if a valid block cache entry is found for the next instruction, the function only fill the cache without updating block. This let pt_blk_next sometimes return unexpected small block.
For example, I had instructions like this:
image
Instruction at 0x114c formed a block cache entry with size 2 in the first iteration. Second iteration starts with 0x1148, and the block should be 0x1148-0x1153 with 3 instructions. However, pt_blk_proceed_no_event_fill_cache only fills the block cache for 0x1148 without updating block, and pt_blk_next returns a block with only 1 instruction.
I just copied and modified the code in pt_blk_proceed_no_event_cached that updating block when cache entry is found. I don't know whether this is appropriate, but it seems to work fine.

I want to ask about how large block is handled? Since ninsn is only 16bits, if a block has more than 32768 instructions, what will pt_blk_next return? I don't see much information in the documentation.
I'm also curious about this part:

binsn = block->ninsn;
ninsn = binsn + (uint16_t) bce.ninsn;
if (ninsn < binsn)
return 0;

If cache is too large for current block, it just returns with success. Then how does the user know the block is smaller than expected? My fix also copied this, and I'm confused.

Copy link
Contributor

@markus-metzger markus-metzger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to ask about how large block is handled?

If a block gets too big, it is split. If that happens because of a big block cache entry, we don't try to fill the block completely, since that would require more decoding. Instead, we omit the entire block cache entry. On the next pt_blk_next() call, we start with that entry.

Do you have some trace and corresponding binaries plus a ptxed command-line you may share so I could debug this?

if (status < 0)
return status;

/* We need to update the block accordingly when we got a valid cache entry */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we simply return pt_blk_proceed_no_event_cached(...)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't try this. Since block is already proceeded for certain instructions, will this cause inconsistency?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean with 'proceeded for certain instructions'? We still need the if (valid_cache) guard, of course, since we can only proceed if we haven't done so already with the call to pt_blk_proceed_no_event_fill_cache() on L2461.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I had some misunderstanding of how pt_blk_proceed_no_event_cached works before. Now I feel we can just return it.

@zyd2001
Copy link
Author

zyd2001 commented May 20, 2025

files.zip
Here are the trace and binary. The ptxed command: ptxed -v --pt perf.data --elf a.out:0x555555554000 --block:show-blocks
You can see the first add gets its own block.
image

For the question about big block, is there any sign that the block is not finished? Do I need to manually check the last instruction?

Copy link
Contributor

@markus-metzger markus-metzger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the question about big block, is there any sign that the block is not finished? Do I need to manually check the last instruction?

Why would you care?

if (status < 0)
return status;

/* We need to update the block accordingly when we got a valid cache entry */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean with 'proceeded for certain instructions'? We still need the if (valid_cache) guard, of course, since we can only proceed if we haven't done so already with the call to pt_blk_proceed_no_event_fill_cache() on L2461.

@zyd2001
Copy link
Author

zyd2001 commented May 20, 2025

I'm using PT trace to do some basic block level profiling. Although I don't think I will have that big block, but I want to make sure I have some way to handle it.

Copy link
Contributor

@markus-metzger markus-metzger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please squash your patches and sign-off the remaining patch (git commit -s). The commit message summary is no longer accurate. Please also tag it with the affected component. E.g. 'libipt, block: proceed with cache when filling reaches an existing entry'.

@@ -2431,12 +2435,14 @@ pt_blk_proceed_no_event_fill_cache(struct pt_block_decoder *decoder,
if (status < 0)
return status;


int valid_cache = pt_bce_is_valid(bce);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please declare variables at the beginning of the function. We also need to find a better name, because valid_cache refers to an older cache entry that might have been updated when we use the variable. E.g. keep_filling, or just fill.


if (valid_cache)
return pt_blk_proceed_no_event_cached(decoder, block,
bcache, msec);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please indent until the opening parenthesis using tabs (tab width 8) as far as possible, then switching to spaces.

Let's also add an empty line between the two returns.

if (status < 0)
return status;

if (valid_cache)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd need some comment explaining that we can only proceed with the new cache entry if we have not filled the block.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants