Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement garbage collection of URL metrics including at uninstallation #1055

Conversation

westonruter
Copy link
Member

@westonruter westonruter commented Mar 14, 2024

Summary

This PR implements two things:

  • Garbage collection of URL Metrics posts which haven't been modified in a month.
  • Introduction of uninstall.php which deletes all of the URL Metrics and unschedules the garbage collection event.

Fixes #895

Relevant technical choices

Note that the first commit in this PR (21a5cec) refactors the post type logic from a set of functions to a single class OD_URL_Metrics_Post_Type. To facilitate review, it may be helpful to look at that commit separately from the ones that follow: 21a5cec...1a9532c.

This implements garbage collection of URL Metrics posts which haven't been modified in a month. By default, a URL Metric has a freshness TTL of 1 day. This means that after 1 day, the detection logic will attempt to gather a new URL Metric to refresh the one that is stale. There is also a sample size of 3 for each URL Metrics group (divided by the breakpoints), so if there are 3 breakpoints there would be 4 groups each of which seeks to have a total of 3 fresh URL Metrics. So there could be a maximum of 12 updates to a given URL Metrics post on a given day. If there are no visitors to a given URL for an entire month, and thus the URL Metrics post for that URL hasn't been modified in a month, then it is eligible to be deleted. The garbage collection process is configured to run on a daily basis.

An uninstall.php is also added which deletes all URL Metrics posts from the current site and other sites on a multisite network (up to 100). The post deletion here is done in batch using $wpdb->delete() for the sake of speed. We used the same approach in the AMP plugin.

@westonruter westonruter added [Type] Enhancement A suggestion for improvement of an existing feature [Focus] Images no milestone PRs that do not have a defined milestone for release [Plugin] Optimization Detective Issues for the Optimization Detective plugin labels Mar 14, 2024
@westonruter westonruter force-pushed the add/optimization-detective-garbage-collection branch from 3b9f27c to a52e0ce Compare March 18, 2024 23:09
@westonruter westonruter marked this pull request as ready for review March 19, 2024 00:31
@westonruter westonruter requested a review from felixarntz as a code owner March 19, 2024 00:31
Copy link

github-actions bot commented Mar 19, 2024

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Co-authored-by: westonruter <[email protected]>
Co-authored-by: felixarntz <[email protected]>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@westonruter westonruter requested a review from thelovekesh March 19, 2024 00:34
Comment on lines 250 to 265
public static function schedule_garbage_collection() {
if ( ! is_user_logged_in() ) {
return;
}

// Unschedule any existing event which had a differing recurrence.
$scheduled_event = wp_get_scheduled_event( self::GC_CRON_EVENT_NAME );
if ( $scheduled_event && self::GC_CRON_RECURRENCE !== $scheduled_event->schedule ) {
wp_unschedule_event( $scheduled_event->timestamp, self::GC_CRON_EVENT_NAME );
$scheduled_event = null;
}

if ( ! $scheduled_event ) {
wp_schedule_event( time(), self::GC_CRON_RECURRENCE, self::GC_CRON_EVENT_NAME );
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 68 to 69
add_action( 'admin_init', array( __CLASS__, 'schedule_garbage_collection' ) );
add_action( self::GC_CRON_EVENT_NAME, array( __CLASS__, 'delete_stale_posts' ) );
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 300 to 328
public static function delete_all_posts() {
global $wpdb;

// phpcs:disable WordPress.DB.DirectDatabaseQuery.DirectQuery, WordPress.DB.DirectDatabaseQuery.NoCaching

// Delete all related post meta for URL Metrics posts.
$wpdb->query(
$wpdb->prepare(
"
DELETE meta
FROM $wpdb->postmeta AS meta
INNER JOIN $wpdb->posts AS posts
ON posts.ID = meta.post_id
WHERE posts.post_type = %s;
",
self::SLUG
)
);

// Delete all URL Metrics posts.
$wpdb->delete(
$wpdb->posts,
array(
'post_type' => self::SLUG,
)
);

// phpcs:enable WordPress.DB.DirectDatabaseQuery.DirectQuery, WordPress.DB.DirectDatabaseQuery.NoCaching
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 286 to 290
foreach ( $query->posts as $post ) {
if ( self::SLUG === $post->post_type ) { // Sanity check.
wp_delete_post( $post->ID, true );
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how costly this is, given it can be 100 posts. It may be more efficient to run custom DB queries like you do in the delete_all_posts() method, using ID IN (... ), rather than lots of individual queries plus all those hooks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case I was thinking it is worthwhile to run the hooks since the plugin code is not uninstalled and there should be potential side effects.

Comment on lines 68 to 69
add_action( 'admin_init', array( __CLASS__, 'schedule_garbage_collection' ) );
add_action( self::GC_CRON_EVENT_NAME, array( __CLASS__, 'delete_stale_posts' ) );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit odd adding general hooks in the same method where the post type is registered. Two alternative approaches I think would be cleaner:

  • Either break out those two hook additions into a separate method, e.g. add_hooks().
  • Or use this method as the single "add hook" entry point (could be called add_hooks(), or you stick with register()), and move the register_post_type() call into a distinct register_post_type() method, which this entry point method then adds as an init action callback.

I think I prefer the second approach. This way you have one dedicated method to register the post type, but the only method that needs to be called to bootstrap what the class does is the one entry point method which adds the three hooks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in f6e0683

Base automatically changed from update/url-handling to feature/image-loading-optimization March 19, 2024 20:46
@felixarntz
Copy link
Member

@westonruter Pending #1055 (comment), this LGTM. :)

@westonruter westonruter requested a review from felixarntz March 19, 2024 23:27
Copy link
Member

@felixarntz felixarntz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@westonruter Looks great, just one small detail.

plugins/optimization-detective/load.php Outdated Show resolved Hide resolved
@westonruter westonruter requested a review from felixarntz March 19, 2024 23:49
Comment on lines 31 to 35
require_once __DIR__ . '/class-od-data-validation-exception.php';
require_once __DIR__ . '/class-od-url-metric.php';
require_once __DIR__ . '/class-od-url-metrics-group.php';
require_once __DIR__ . '/class-od-url-metrics-group-collection.php';
require_once __DIR__ . '/class-od-storage-lock.php';
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess all of these should go into storage then too.

Copy link
Member Author

@westonruter westonruter Mar 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved in b5623fd 814f57d

Copy link
Member

@felixarntz felixarntz Mar 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@westonruter Maybe. OD_Storage_Lock probably yes, for the URL metric classes not necessarily - though I don't feel strongly about it. Those are the central infrastructure classes used for the entire functionality, IMO they aren't just storage related as e.g. the post type class is.

I'm okay with either, though please align the test files structure with whatever you go with here as well (e.g. right now the post type class tests are not in the storage folder even though that class is in the storage folder). Nit-picking... :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 33b0272

@westonruter westonruter force-pushed the add/optimization-detective-garbage-collection branch from b5623fd to 814f57d Compare March 20, 2024 00:04
@westonruter westonruter merged commit 25bebc1 into feature/image-loading-optimization Mar 20, 2024
37 checks passed
@westonruter westonruter deleted the add/optimization-detective-garbage-collection branch March 20, 2024 18:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no milestone PRs that do not have a defined milestone for release [Plugin] Optimization Detective Issues for the Optimization Detective plugin [Type] Enhancement A suggestion for improvement of an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants