Skip to content
This repository has been archived by the owner on Nov 1, 2024. It is now read-only.

Add support for running perftools to use hardware performance counters when benchmarking. #98

Merged
merged 14 commits into from
Apr 18, 2024

Conversation

whesse
Copy link
Contributor

@whesse whesse commented Jan 12, 2024

To use hardware performance counters to measure benchmark performance, this PR adds hooks (methods that can be overridden in subclasses) to the base benchmark harness that run immediately before and after the main benchmarking loop.

It also exposes the total count of benchmark iterations to the call that creates measurements, so this count can be made visible in the output.

The rest of the implementation, using environment variables to control optionally signalling perftools using named pipes, could be located in this package or put in a different package.

@whesse whesse requested a review from sortie January 12, 2024 13:38
@whesse
Copy link
Contributor Author

whesse commented Jan 31, 2024

@jensjoha You might be interested in this change

@jensjoha
Copy link

jensjoha commented Feb 1, 2024

Do you have numbers for how stable it is when it's started and stopped in the middle of a run?
My immediate thought is that it isn't --- which would make it not useful --- so I'm hoping to be proven wrong :)

lib/src/benchmark_base_perf.dart Outdated Show resolved Hide resolved
lib/src/benchmark_base_perf.dart Outdated Show resolved Hide resolved
When PerfBenchmarkBase also starts, attaches, and stops the
"perf stat" subprocess, measure() and report() will need
to be async functions, so add measurePerf() and reportPerf() functions.

This version still requires "perf stat" to be wrapped around the
benchmark command.
@whesse
Copy link
Contributor Author

whesse commented Feb 8, 2024

The benchmark harness is now starting up the "perf stat" subprocess and attaching it to the benchmark process.
Example output is:

dart run BinaryTrees/dart/BinaryTrees.dart 
BinaryTrees(MajorPageFaults)(RunTime): 587.8266666666667 us.
BinaryTrees(CpuCycles)(RunTime): 250666196.78666666 us.
BinaryTrees.totalIterations(RunTime): 75.0 us.
BinaryTrees(RunTime): 60710.38636363636 us.

Remaining TODO: remove the extra (RunTime) from the output, and create and destroy the fifo named pipes in the harness.

Create the temporary pipes used to communicate with perf
in a temporary directory in the system temp directory, and
clean them up afterwards.

Remove the conditional code based on environment variables,
and use the perf stat wrapper around the benchmark unconditionally.
@whesse
Copy link
Contributor Author

whesse commented Feb 20, 2024

This PR is now ready for review. It has been restructured to move the "perf stat" subprocess into the PerfBenchmarkBase class, and to create the named pipes needed for that class in the system temp directory.

An integration test is added, that will only pass when run on a machine with the "mkfifo" and "perf" commands.

integration_test/perf_benchmark_test.dart Show resolved Hide resolved
lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved
lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved
lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved
lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved
lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved
lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved
lib/src/perf_benchmark_base_stub.dart Show resolved Hide resolved
lib/src/perf_benchmark_base_stub.dart Show resolved Hide resolved
lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved
integration_test/perf_benchmark_test.dart Show resolved Hide resolved
lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved
lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved
lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved
lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved
lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved
@whesse whesse requested a review from sortie April 16, 2024 14:19
lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved
lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved
lib/src/perf_benchmark_base.dart Show resolved Hide resolved
lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved
lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved
lib/src/perf_benchmark_base.dart Show resolved Hide resolved
}

Future<void> _startPerfStat() async {
await _createFifos();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason this is its own method? Perhaps it might be nicer with a utility function to make a singular fifo

lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved
'fifo:$perfControlFifo,$perfControlAck',
'-j',
'-p',
'$pid',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's nicer to use the = syntax so each option is on its own line

E.g. --delay=-1 --control=fifo:... --pid=$pid

lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved
lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved
lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved
lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved

void _waitForAck() {
// Perf writes 'ack\n\x00' to the acknowledgement fifo.
const ackLength = 'ack\n\x00'.length;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to check if this message is actually written and throw if not?

lib/src/perf_benchmark_base.dart Outdated Show resolved Hide resolved
import 'benchmark_base.dart';
import 'score_emitter.dart';

class PerfBenchmarkBase extends BenchmarkBase {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And you need to write a CHANGELOG entry about this new feature and bump the minor version number.

The json output format is only in the most recent versions of perf stat.
Use the separated value output format instead.

Address review comments, ensure perf process is terminated if
failures occur.
@whesse whesse merged commit aa139fd into dart-archive:master Apr 18, 2024
6 checks passed
copybara-service bot pushed a commit to dart-lang/sdk that referenced this pull request Apr 22, 2024
Revisions updated by `dart tools/rev_sdk_deps.dart`.

benchmark_harness (https://github.com/dart-lang/benchmark_harness/compare/d23112a..aa139fd):
  aa139fd  2024-04-18  William Hesse  Add support for running perftools to use hardware performance counters when benchmarking. (dart-archive/benchmark_harness#98)

dartdoc (https://github.com/dart-lang/dartdoc/compare/592694e..1ae5b4c):
  1ae5b4c0  2024-04-22  dependabot[bot]  Bump actions/upload-artifact from 4.3.1 to 4.3.3 (dart-lang/dartdoc#3755)
  5934d2d6  2024-04-22  dependabot[bot]  Bump actions/checkout from 4.1.2 to 4.1.3 (dart-lang/dartdoc#3754)
  a9500f0d  2024-04-22  Sam Rawlins  Replace hasPrivateName/hasPublicName with one extension getter (dart-lang/dartdoc#3752)

tools (https://github.com/dart-lang/tools/compare/d86ea23..3e21ae9):
  3e21ae9  2024-04-22  Elias Yishak  `Event` Constructor added for devtools event (dart-lang/tools#258)

webdev (https://github.com/dart-lang/webdev/compare/3a10b76..d4f9f67):
  d4f9f672  2024-04-22  Devon Carew  remove the dep on package:uuid (dart-lang/webdev#2415)
  78555cdd  2024-04-18  Kevin Moore  FE server common: move from pkg:usage to pkg:uuid (dart-lang/webdev#2412)
  16dce1f2  2024-04-18  Elliott Brooks  Pin DCM to latest version (dart-lang/webdev#2410)
  b0494fb5  2024-04-18  Elliott Brooks  Catch errors when calculating frames in the stack trace (dart-lang/webdev#2408)

Change-Id: I1d206665e33e5936f4a5bf93e029a7290803895d
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/364021
Commit-Queue: Devon Carew <[email protected]>
Reviewed-by: Konstantin Shcheglov <[email protected]>
@@ -3,14 +3,16 @@
// BSD-style license that can be found in the LICENSE file.

abstract class ScoreEmitter {
void emit(String testName, double value);
void emit(String testName, double value,
{String metric = 'RunTime', String unit});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good to note that this is a breaking change to anyone who implements ScoreEmitter.

https://github.com/search?q=%22implements+ScoreEmitter%22&type=code

Is the breaking change worth it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is worth it, but it should be made in a major release of benchmark_harness. I'm reverting this interface change, using a new final subclass instead, and filing an issue to make the change in a major release.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you

mosuem pushed a commit to dart-lang/tools that referenced this pull request Oct 25, 2024
…s when benchmarking. (dart-archive/benchmark_harness#98)

Add a new class PerfBenchmarkBase that extends BenchmarkBase, with a new asynchronous reportPerf() method that runs the benchmark while attached to a "perf stat" process that measures performance with CPU hardware performance counters.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants