Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorporate page state into ETag computation #1722

Merged

Conversation

ShyamGadde
Copy link
Contributor

@ShyamGadde ShyamGadde commented Dec 5, 2024

Summary

Fixes #1466

Relevant technical choices

Factored the following into the ETag computation:

  • The queried object
  • The post IDs and last modified of posts in The Loop
  • The active theme (stylesheet, template, and their versions)
  • The current theme template, with the base name of of the template PHP file (e.g. single.php for classic themes, or else the WP_Block_Template attributes in case of block themes.

static function ( WP_Post $post ): array {
return array(
'ID' => $post->ID,
'last_modified' => $post->post_modified_gmt,
Copy link
Contributor Author

@ShyamGadde ShyamGadde Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider a homepage that shows 10 posts. None of the posts have featured images. URL metrics have been fully populated for the page in this state. Then someone publishes a new post that includes a featured image. All of a sudden there is a new image on the page which may be LCP, and yet the URL metrics have no record of this element.

Continuing the analogy from the issue description, I thought it might be helpful to include the last_modified for posts from The Loop as well. This would address cases where a post is updated (e.g., adding a featured image) rather than just when new posts are added.

Comment on lines 179 to 182
'template' => get_template(),
'template_version' => wp_get_theme( get_template() )->get( 'Version' ),
'stylesheet' => get_stylesheet(),
'stylesheet_version' => wp_get_theme()->get( 'Version' ),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Including the theme's version here seemed relevant since a theme update could potentially change the template content, making the URL metrics stale.

Copy link
Contributor Author

@ShyamGadde ShyamGadde Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inclusion of template_version was considered useful to account for cases where the current template is provided by the parent theme. Any updates to the parent theme could potentially alter the template content, making the URL metrics stale. Including the version seemed like a reasonable way to address this dependency.

$data = array(
'tag_visitors' => array_keys( iterator_to_array( $tag_visitor_registry ) ),
'tag_visitors' => array_keys( iterator_to_array( $tag_visitor_registry ) ),
'queried_posts' => $queried_posts,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered using separate keys like queried_object_id for singular pages and queried_post_ids for non-singular ones. However, since the queried post is always included in the $GLOBALS['wp_the_query']->posts array (even for singular pages), using a single queried_posts key felt simpler and more consistent.

Would separate keys be more appropriate instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The queried object ID may not be a post at all, however. It may be a WP_Term, WP_User, or even a WP_Post_Type (for post type archives).

Additionally, in Reading settings when you select a static homepage and designate a page as the "posts page" I don't believe this page is included in the posts loop array.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the approach to collecting IDs, I referenced the issue description, which suggested something like this:

if ( is_singular() ) {
    $normalized_query_vars['queried_object_id'] = get_queried_object_id();
} else {
    $normalized_query_vars['queried_post_ids'] = join( ',', wp_list_pluck( $GLOBALS['wp_query']->posts, 'ID' ) );
}

Based on this, my understanding was that whenever is_singular() is true, get_queried_object_id() always returns a post (of some type). I might be mistaken, but I couldn’t identify cases where queried_object_id would return anything other than a post in this scenario.

In the else case, I assumed these scenarios were primarily archives, where the layout or template focuses on displaying posts matching the criteria (e.g., posts in a term archive or author/user page). Consequently, I considered storing only the post IDs and their modification dates as sufficient for ETag purposes, as the queried object itself (e.g., a term or user) didn’t seem to significantly influence the layout or behavior. That said, I’m happy to revisit this assumption if there are aspects I might have missed.

Regarding static homepages, I observed the following:

  1. For the "Homepage," the page does appear in the posts array.
  2. For the "Posts Page," the page itself does not appear in the posts array—only the displayed posts are included.

Given this, I didn’t see a need to include the static page’s ID or modification date in the ETag data for the "Posts Page" since its content is not rendered and is essentially overridden by the home template displaying the posts.

Are there additional considerations or scenarios I may not have accounted for?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2. For the "Posts Page," the page itself does not appear in the posts array—only the displayed posts are included.

Given this, I didn’t see a need to include the static page’s ID or modification date in the ETag data for the "Posts Page" since its content is not rendered and is essentially overridden by the home template displaying the posts.

Actually, why wouldn't the Posts Page content not be rendered? I mean, some aspect of the posts page will be rendered. At the very least, the title of the Posts Page is displayed: https://stimulating-mosquito-343fdf.instawp.xyz/the-bloooooooooog/

With Full Site Editing, the home template would be expected to be much more modified.

I think there should be two keys then, one for queried_object (which should include the type, id, etc) and another for queried_posts which would be information from the main loop.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, and even a classic theme template may decide to show the featured image for the Posts Page. For example, Twenty Seventeen has this header logic:

	if ( twentyseventeen_should_show_featured_image() ) :
		echo '<div class="single-featured-image-header">';
		echo get_the_post_thumbnail( get_queried_object_id(), 'twentyseventeen-featured-image' );
		echo '</div><!-- .single-featured-image-header -->';
	endif;

And you can opt-in to showing the featured image with plugin code like this:

add_filter( 'twentyseventeen_should_show_featured_image', '__return_false' );

Example: https://stimulating-mosquito-343fdf.instawp.xyz/the-bloooooooooog/

But any other theme may decide to show the featured image by default in this case. There's no telling with WordPress 😄

So yeah, I think all of the information about the queried object should be stored in addition to whatever is stored for the posts in the loop, even perhaps if this means some data is duplicated. It won't matter since it all ends up as a hash anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, some aspect of the posts page will be rendered. At the very least, the title of the Posts Page is displayed: https://stimulating-mosquito-343fdf.instawp.xyz/the-bloooooooooog/

Ah, that's a great point! I was testing with the Twenty Twenty-Five theme without much modification, so none of the page content seemed to be used—clearly, that was a gap in my testing. 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a key for queried_object in 8651d05.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With Full Site Editing, the home template would be much more modified.

Would it make sense to also account for the template content when generating the ETag? While global styles might only have an indirect impact on URL metrics, the template itself seems directly relevant since it determines the layout and rendered content.

For block themes, instead of just considering the template slug, perhaps we should also include the template's modified date?

One way to implement this might be:

$data['current_template'] = get_block_template( $GLOBALS['_wp_current_template_id'], 'wp_template' );

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Including the modified date makes sense to me.

@westonruter westonruter added [Type] Enhancement A suggestion for improvement of an existing feature [Plugin] Optimization Detective Issues for the Optimization Detective plugin labels Dec 6, 2024
@ShyamGadde
Copy link
Contributor Author

@westonruter Testing this function has been challenging, mainly due to its reliance on global variables. To address this, I introduced dependency injection for od_get_current_url_metrics_etag() in 56cbe8e, adding $wp_query and $current_template as parameters to avoid directly accessing globals like $GLOBALS['wp_the_query'] or $GLOBALS['_wp_current_template_id'].

Another challenge lies in testing global functions like get_stylesheet and get_template. Verifying how changes in their return values impact the ETag doesn’t seem feasible without external libraries such as WP_Mock.

@westonruter
Copy link
Member

Another challenge lies in testing global functions like get_stylesheet and get_template. Verifying how changes in their return values impact the ETag doesn’t seem feasible without external libraries such as WP_Mock.

What about adding filters for stylesheet and template?

@ShyamGadde ShyamGadde marked this pull request as ready for review December 9, 2024 20:35
Copy link

github-actions bot commented Dec 9, 2024

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Co-authored-by: ShyamGadde <[email protected]>
Co-authored-by: westonruter <[email protected]>
Co-authored-by: adamsilverstein <[email protected]>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

Copy link
Member

@adamsilverstein adamsilverstein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good!


if ( $queried_object instanceof WP_Post ) {
$queried_object_data['id'] = $queried_object->ID;
$queried_object_data['type'] = 'post';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be the post type?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hummm. Then when WP_Term is the queried object, would the type be the taxonomy name? I think actually this isn't needed because all that is needed here is to uniquely identify the queried object with a hash of this data here. The point of the type of post is to differentiate it from a type of term since it is entirely possible for a post and term to have the same ID.

@westonruter westonruter force-pushed the update/incorporate-page-state-for-etag branch from 3032e75 to 18e0476 Compare December 15, 2024 04:24
Copy link
Member

@westonruter westonruter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great. I'm approving now and if any of the additional changes look amiss, please flag them. Otherwise, I'll merge tomorrow (likely morning Pacific time).

@westonruter westonruter force-pushed the update/incorporate-page-state-for-etag branch from f2cff28 to 1fc20af Compare December 15, 2024 05:26
@westonruter westonruter force-pushed the update/incorporate-page-state-for-etag branch from e941ceb to b083a29 Compare December 15, 2024 05:41
@westonruter westonruter force-pushed the update/incorporate-page-state-for-etag branch from b8cdf88 to b083a29 Compare December 15, 2024 05:48
@westonruter
Copy link
Member

The linting error is unrelated to the changes in this PR. It seems to be a problem that just popped up in shivammathur/setup-php: shivammathur/setup-php#893.

@westonruter westonruter merged commit 2359b9a into WordPress:trunk Dec 15, 2024
27 of 28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Plugin] Optimization Detective Issues for the Optimization Detective plugin [Type] Enhancement A suggestion for improvement of an existing feature
Projects
None yet
3 participants