Update block along with the block cache #115

zyd2001 · 2025-05-18T15:55:06Z

In pt_blk_proceed_no_event_fill_cache, if a valid block cache entry is found for the next instruction, the function only fills the cache without updating the block. As a result, pt_blk_next sometimes returns an unexpectedly small block.
For example, I had instructions like this:

The instruction at 0x114c formed a block cache entry with size 2 in the first iteration. The second iteration starts at 0x1148, and the block should be 0x1148-0x1153 with 3 instructions. However, pt_blk_proceed_no_event_fill_cache only fills the block cache for 0x1148 without updating the block, and pt_blk_next returns a block with only 1 instruction.
I copied and modified the code in pt_blk_proceed_no_event_cached that updates the block when a cache entry is found. I don't know whether this is appropriate, but it seems to work fine.
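The scenario can be reproduced with a small toy model. This is a sketch, not libipt code: it assumes three straight-line instructions indexed 0 to 2 (standing in for 0x1148, 0x114c, and the last instruction of the block), a cache that stores only an instruction count per index, and hypothetical names (`toy_decoder`, `toy_fill_cache`, `toy_decode_block`, and the `proceed_cached` switch that turns the proposed behavior on or off).

```c
#include <stdint.h>

#define TOY_NINSN 3	/* Hypothetical: three instructions, the last ends the block. */

struct toy_decoder {
	int ip;				/* Index of the next instruction to decode. */
	uint8_t cache[TOY_NINSN];	/* Instructions until block end; 0 means invalid. */
};

struct toy_block {
	uint16_t ninsn;
};

/* Decode one instruction and add a cache entry for it on the way back. */
static void toy_fill_cache(struct toy_decoder *d, struct toy_block *b,
			   int proceed_cached)
{
	int cur = d->ip, next = cur + 1;
	int fill;

	b->ninsn += 1;
	d->ip = next;

	/* The last instruction terminates the block. */
	if (next == TOY_NINSN) {
		d->cache[cur] = 1;
		return;
	}

	/* Keep filling only if the next instruction has no entry, yet. */
	fill = (d->cache[next] == 0);
	if (fill)
		toy_fill_cache(d, b, proceed_cached);

	d->cache[cur] = (uint8_t) (d->cache[next] + 1);

	/* Filling already extended the block; nothing left to do. */
	if (fill || !proceed_cached)
		return;

	/* We reached an existing entry: extend the block with it. */
	b->ninsn += d->cache[next];
	d->ip = TOY_NINSN;
}

/* Return the number of instructions in the block starting at @start. */
static unsigned toy_decode_block(struct toy_decoder *d, int start,
				 int proceed_cached)
{
	struct toy_block b = { 0 };

	d->ip = start;
	toy_fill_cache(d, &b, proceed_cached);

	return b.ninsn;
}
```

In this model, decoding from index 1 and then from index 0 gives blocks of 2 and 1 instructions with `proceed_cached` set to 0, and blocks of 2 and 3 instructions with it set to 1.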

I want to ask about how large block is handled? Since ninsn is only 16 bits, if a block has more than 32768 instructions, what will pt_blk_next return? I couldn't find much information in the documentation.
I'm also curious about this part:

libipt/libipt/src/pt_block_decoder.c

Lines 2656 to 2659 in 60dc2df

    
	binsn = block->ninsn;
	ninsn = binsn + (uint16_t) bce.ninsn;
	if (ninsn < binsn)
		return 0;

If the cache entry is too large for the current block, the function just returns success. How, then, does the user know that the block is smaller than expected? My fix copies this check as well, and I'm confused by it.
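For reference, the check in the snippet is the usual unsigned wraparound test. A minimal sketch, assuming the block's instruction counter is a `uint16_t` (as the cast suggests, so it can hold up to 65535) and using a hypothetical helper name `toy_ninsn_would_wrap`:

```c
#include <stdint.h>

/* Return 1 if adding @add instructions to a block that already holds
 * @binsn instructions would wrap a 16-bit counter, 0 otherwise.
 */
static int toy_ninsn_would_wrap(uint16_t binsn, uint16_t add)
{
	uint16_t ninsn;

	/* The sum is computed in int and truncated to 16 bits. */
	ninsn = (uint16_t) (binsn + add);

	/* A truncated sum is always smaller than the original count. */
	return ninsn < binsn;
}
```

With 65530 instructions already in the block, adding 10 more wraps to 4, which is smaller than 65530, so the helper reports a wrap. Adding 10 to 65525 gives exactly 65535 and does not wrap.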

markus-metzger

> I want to ask about how large block is handled?

If a block gets too big, it is split. If that happens because of a big block cache entry, we don't try to fill the block completely, since that would require more decoding. Instead, we omit the entire block cache entry. On the next pt_blk_next() call, we start with that entry.
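The splitting behavior described here can be illustrated with a toy model. This is only a sketch under stated assumptions, not the decoder's implementation: a long straight-line run is represented as a list of hypothetical cache entry sizes, and `toy_stream` and `toy_next_block` are made-up names.

```c
#include <stdint.h>

struct toy_stream {
	const uint16_t *entries;	/* Hypothetical cache entry sizes. */
	int nentries;
	int pos;			/* Next entry to consume. */
};

/* Fill one block from @s; return its instruction count, 0 at the end. */
static uint16_t toy_next_block(struct toy_stream *s)
{
	uint16_t binsn = 0;

	while (s->pos < s->nentries) {
		uint16_t ninsn = (uint16_t) (binsn + s->entries[s->pos]);

		/* The entry does not fit: omit it entirely and start
		 * the next block with it.
		 */
		if (ninsn < binsn)
			break;

		binsn = ninsn;
		s->pos += 1;
	}

	return binsn;
}
```

For entry sizes 60000, 5000, 1000, and 10, the first call returns 65000 because the 1000-instruction entry would wrap the counter, and the second call returns 1010. A consumer that sums the counts over all returned blocks gets 66010 either way, so the split does not change the total.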

Do you have some trace and corresponding binaries plus a ptxed command-line you may share so I could debug this?

markus-metzger · 2025-05-20T10:11:34Z

libipt/src/pt_block_decoder.c

+	if (status < 0)
+		return status;
+
+	/* We need to update the block accordingly when we got a valid cache entry */


Could we simply return pt_blk_proceed_no_event_cached(...)?

I didn't try this. Since block is already proceeded for certain instructions, will this cause inconsistency?

What do you mean with 'proceeded for certain instructions'? We still need the if (valid_cache) guard, of course, since we can only proceed if we haven't done so already with the call to pt_blk_proceed_no_event_fill_cache() on L2461.

I think I misunderstood how pt_blk_proceed_no_event_cached works. Now I think we can just return its result.

zyd2001 · 2025-05-20T10:45:31Z

files.zip
Here are the trace and binary. The ptxed command: ptxed -v --pt perf.data --elf a.out:0x555555554000 --block:show-blocks
You can see the first add gets its own block.

For the question about big block, is there any sign that the block is not finished? Do I need to manually check the last instruction?

markus-metzger

> For the question about big block, is there any sign that the block is not finished? Do I need to manually check the last instruction?

Why would you care?

markus-metzger · 2025-05-20T10:56:04Z

libipt/src/pt_block_decoder.c

+	if (status < 0)
+		return status;
+
+	/* We need to update the block accordingly when we got a valid cache entry */


What do you mean with 'proceeded for certain instructions'? We still need the if (valid_cache) guard, of course, since we can only proceed if we haven't done so already with the call to pt_blk_proceed_no_event_fill_cache() on L2461.

zyd2001 · 2025-05-20T11:17:48Z

I'm using PT traces for basic-block-level profiling. I don't expect to see blocks that big, but I want to make sure I have a way to handle them.

markus-metzger

Please squash your patches and sign-off the remaining patch (git commit -s). The commit message summary is no longer accurate. Please also tag it with the affected component. E.g. 'libipt, block: proceed with cache when filling reaches an existing entry'.

markus-metzger · 2025-05-21T06:32:28Z

libipt/src/pt_block_decoder.c

@@ -2431,12 +2435,14 @@ pt_blk_proceed_no_event_fill_cache(struct pt_block_decoder *decoder,
 	if (status < 0)
 		return status;

+
+	int valid_cache = pt_bce_is_valid(bce);


Please declare variables at the beginning of the function. We also need to find a better name, because valid_cache refers to an older cache entry that might have been updated when we use the variable. E.g. keep_filling, or just fill.

markus-metzger · 2025-05-21T06:34:46Z

libipt/src/pt_block_decoder.c

+
+	if (valid_cache)
+		return pt_blk_proceed_no_event_cached(decoder, block,
+						bcache, msec);


Please indent until the opening parenthesis using tabs (tab width 8) as far as possible, then switch to spaces.

Let's also add an empty line between the two returns.
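The indentation rule amounts to a division by the tab width. A minimal sketch, with a hypothetical helper `toy_continuation_indent` that writes the whitespace needed to reach a given 0-based column:

```c
#include <stddef.h>
#include <string.h>

/* Write the whitespace that reaches @column into @buf: as many tabs
 * (width 8) as fit, then spaces for the remainder.
 *
 * Returns 0 on success, -1 if @buf is too small.
 */
static int toy_continuation_indent(char *buf, size_t size, unsigned column)
{
	unsigned tabs = column / 8u;
	unsigned spaces = column % 8u;

	if (size < (size_t) tabs + spaces + 1u)
		return -1;

	memset(buf, '\t', tabs);
	memset(buf + tabs, ' ', spaces);
	buf[tabs + spaces] = '\0';

	return 0;
}
```

Assuming the call sits behind two tabs and `return `, the first argument of pt_blk_proceed_no_event_cached( starts at column 54, which works out to six tabs followed by six spaces.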

markus-metzger · 2025-05-21T06:38:52Z

libipt/src/pt_block_decoder.c

+	if (status < 0)
+		return status;
+
+	if (valid_cache)


We'd need some comment explaining that we can only proceed with the new cache entry if we have not filled the block.

zyd2001 · 2025-06-02T22:03:57Z

I updated and signed-off the patch.

markus-metzger · 2025-06-03T08:05:35Z

libipt/src/pt_block_decoder.c

@@ -2431,6 +2436,8 @@ pt_blk_proceed_no_event_fill_cache(struct pt_block_decoder *decoder,
 	if (status < 0)
 		return status;

+
+	fill_block = pt_bce_is_valid(bce);


Let's use that local variable also in the if expression below and let's invert it and rename to fill_cache. This better matches the use in the if expression.

markus-metzger · 2025-06-03T08:05:47Z

libipt/src/pt_block_decoder.c

+static int pt_blk_proceed_no_event_cached(struct pt_block_decoder *decoder,
+					  struct pt_block *block,
+					  struct pt_block_cache *bcache,
+					  const struct pt_mapped_section *msec);


Please leave an empty line to the next function.

Thanks. I haven't noticed before that the line got a bit too long. If you removed the parameter names, like in the other forward declaration above, it would fit.

markus-metzger · 2025-06-03T08:10:28Z

libipt/src/pt_block_decoder.c

+	 */
+	if (fill_block)
+		return pt_blk_proceed_no_event_cached(decoder, block,
+							    bcache, msec);


Indentation looks off. If you handled the !fill_block case first, there'd be enough space to fit the call onto a single line.

markus-metzger · 2025-06-03T12:51:35Z

libipt/src/pt_block_decoder.c

@@ -2431,6 +2436,8 @@ pt_blk_proceed_no_event_fill_cache(struct pt_block_decoder *decoder,
 	if (status < 0)
 		return status;

+


Please remove any trailing whitespace.

markus-metzger · 2025-06-04T06:49:37Z

libipt/src/pt_block_decoder.c

+	if (status < 0)
+		return status;
+
+	/* After we fill a new cache entry, we also need to update the block 


The comment isn't quite right. We fill the cache in both cases. With the inverted check, how about something like:

	/* Filling the cache may have extended @block, so we cannot proceed. */
	if (fill_cache)
		return status;

	return pt_blk_proceed_no_event_cached(decoder, block, bcache, msec);

zyd2001 · 2025-06-05T05:09:40Z

I updated the patch.

markus-metzger

One last round and we're done.

markus-metzger · 2025-06-05T05:19:38Z

libipt/src/pt_block_decoder.c

+static int pt_blk_proceed_no_event_cached(struct pt_block_decoder *decoder,
+					  struct pt_block *block,
+					  struct pt_block_cache *bcache,
+					  const struct pt_mapped_section *msec);


Thanks. I haven't noticed before that the line got a bit too long. If you removed the parameter names, like in the other forward declaration above, it would fit.

markus-metzger reviewed May 20, 2025

markus-metzger reviewed May 21, 2025

zyd2001 force-pushed the master branch from e0b029a to 9725471 Compare June 2, 2025 22:01

markus-metzger reviewed Jun 4, 2025

zyd2001 force-pushed the master branch from 9725471 to 8354920 Compare June 5, 2025 05:08

markus-metzger reviewed Jun 5, 2025

libipt, block: proceed with cache when filling reaches an existing entry

1a33cd6

Signed-off-by: Yida Zhang <[email protected]>

zyd2001 force-pushed the master branch from 8354920 to 1a33cd6 Compare June 5, 2025 05:47

markus-metzger merged commit 62aa406 into intel:master Jun 5, 2025
