Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEV-1203: Additional ingest performance monitoring (collate/storage) #128

Merged
merged 1 commit into from
Jul 25, 2024

Conversation

mwarin
Copy link
Contributor

@mwarin mwarin commented Jun 10, 2024

This PR does the following changes to JobMetrics:

  • Changed metric naming conventions
  • Added labels
  • Expanded coverage
  • TIDYing of some prominent classes
  • Split new class into JobMetrics and JobMetricsCLI, to simplify main class and not pollute it with CLI stuff.

In most cases where monitoring is added, I've added some variation on:

my $start_time = $self->{job_metrics}->time;
# ...
# ... the existing piece of code doing the work that we want to measure
# ...
my $end_time   = $self->{job_metrics}->time;
my $delta_time = $end_time - $start_time;
my $labels = {foo => 'bar'}; # something about the operation being measured
$self->{job_metrics}->add("ingest_xxx_bytes_r_total", -s $infile, $labels); # if we did any r/w op
$self->{job_metrics}->add("ingest_xxx_bytes_w_total", -s $outfile, $labels); # if we did any r/w op
$self->{job_metrics}->add("ingest_xxx_seconds_total", $delta_time, $labels); # the time it took
$self->{job_metrics}->inc("ingest_xxx_items_total", $labels); # if we did anything with an item
$self->{job_metrics}->inc("ingest_xxx_images_total", $labels); # if we did anything with an image

... plus a matching metric declaration in HTFeed::JobMetrics.

I've also added 2 convenience subs to HTFeed::JobMetrics:

  • time for providing high resolution time
  • dir_size for calculating the size of a directory

... which are both data points that are often needed for measuring.

Reviewer info

I originally tried to separate and label commits thematically, but this started falling apart as the PR grew too large. There are some TIDY commits, a big FUNC commit with mostly additions (that nevertheless has some tidyings in it, discipline slipped), and a TEST commit. I will squash them all before merging.

Please check:

  • the new tests in t/job_metrics.t
  • the changes to JobMetrics::_setup_metrics
  • that the TIDY commits aren't going too far
  • that I'm measuring what I say I'm measuring (i.e. not using the wrong metric in the wrong place)

Please be on the lookout for any log warnings about WARN: invalid metric name "ingest_xxx_yyy", this is a sign that we are trying to record data for a metric that has not been declared in HTFeed::JobMetrics.

@mwarin mwarin force-pushed the DEV-1203 branch 3 times, most recently from f92443a to dc7507a Compare June 20, 2024 19:38
@mwarin mwarin force-pushed the DEV-1203 branch 3 times, most recently from 9b4e556 to f7d2463 Compare June 20, 2024 21:35
@mwarin mwarin marked this pull request as ready for review June 20, 2024 22:11
@mwarin mwarin requested review from aelkiss and moseshll June 20, 2024 22:13
Copy link
Contributor

@moseshll moseshll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See two comments, one on naming conventions and the other on tabs. Don't feel like making either an impediment to merging, though I think the naming convention is worth further thought/brainstorming.

lib/HTFeed/JobMetrics.pm Outdated Show resolved Hide resolved
t/job_metrics.t Outdated Show resolved Hide resolved
@aelkiss
Copy link
Member

aelkiss commented Jun 24, 2024

@mwarin Have we looked at perltidy https://metacpan.org/pod/Perl::Tidy recently? It might be able to do some of these tidyings/reformattings automatically, although it looks like it has quite a few options...

Copy link
Member

@aelkiss aelkiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I think this is looking like a good approach. There are a couple things I'd like to see:

  • labels for the storage name
  • labels for the specific image remediate operation
  • consider extracting the metrics recording stuff to a method to reduce duplication

lib/HTFeed/Stage/ImageRemediate.pm Outdated Show resolved Hide resolved
lib/HTFeed/Stage/ImageRemediate.pm Outdated Show resolved Hide resolved
lib/HTFeed/Storage.pm Outdated Show resolved Hide resolved
lib/HTFeed/Storage.pm Show resolved Hide resolved
t/job_metrics.t Outdated Show resolved Hide resolved
lib/HTFeed/QueueRunner.pm Outdated Show resolved Hide resolved
Copy link
Member

@aelkiss aelkiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking in pretty good shape; I don't see anything I think needs to block merge, although there are a couple small things to potentially consider now and a few things I think we should note as issues for future work.

I did want to verify that although this PR adds support for labels on metrics, there aren't yet places we're using those labels -- correct?

lib/HTFeed/JobMetrics.pm Outdated Show resolved Hide resolved
lib/HTFeed/JobMetrics.pm Show resolved Hide resolved
lib/HTFeed/JobMetrics.pm Show resolved Hide resolved
lib/HTFeed/JobMetricsCLI.pm Show resolved Hide resolved
lib/HTFeed/Stage/Download.pm Show resolved Hide resolved
lib/HTFeed/Stage/Pack.pm Show resolved Hide resolved
lib/HTFeed/Stage/Unpack.pm Show resolved Hide resolved
@mwarin mwarin merged commit 175ef5b into main Jul 25, 2024
1 check passed
@mwarin mwarin deleted the DEV-1203 branch July 25, 2024 14:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants