Use WaitingTaskHolder to signal doneWaiting() instead of WaitingTaskWithArenaHolder in framework #47029

Open: @makortel wants to merge 1 commit into master

@makortel
Contributor

makortel commented Dec 27, 2024

PR description:

In the framework the doneWaiting() is always called from within the main arena in the TBB thread pool, and therefore using
WaitingTaskHolder is safe (WaitingTaskWithArenaHolder is needed only when doneWaiting() is called outside of the TBB arena).

Avoiding WaitingTaskWithArenaHolder also avoids the enqueue() operation when the doneWaiting() calls in the framework are the ones that decrease the task reference count to 0.
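
To illustrate the difference described above, here is a minimal single-threaded toy model. None of these classes are the CMSSW or TBB ones: `ToyTask`, `ToyArena`, `ToyHolder`, `ToyArenaHolder` and `runDemo()` are hypothetical names, and the assumption that the arena-aware holder hands its release over to the arena via `enqueue()` is a simplification made for this sketch. It only shows that the plain holder can run the task in place once the count reaches 0, whereas the arena-aware one defers the work through a queue.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Toy stand-in for a reference-counted waiting task (not the CMSSW class).
struct ToyTask {
  explicit ToyTask(std::function<void()> w) : work(std::move(w)) {}
  std::function<void()> work;
  int refCount = 0;  // single-threaded toy, so a plain int is enough
};

// Toy stand-in for an arena: enqueue() only stores the work for a later drain().
struct ToyArena {
  std::deque<std::function<void()>> queue;
  int enqueueCalls = 0;
  void enqueue(std::function<void()> f) {
    ++enqueueCalls;
    queue.push_back(std::move(f));
  }
  void drain() {
    while (!queue.empty()) {
      auto f = std::move(queue.front());
      queue.pop_front();
      f();
    }
  }
};

// Plain holder: assumes the caller already runs inside the arena,
// so the last doneWaiting() runs the task in place.
class ToyHolder {
public:
  explicit ToyHolder(ToyTask& t) : task_(&t) { ++task_->refCount; }
  ToyHolder(const ToyHolder&) = delete;
  ToyHolder& operator=(const ToyHolder&) = delete;
  void doneWaiting() {
    ToyTask* t = std::exchange(task_, nullptr);
    if (t && --t->refCount == 0) t->work();
  }

private:
  ToyTask* task_;
};

// Arena-aware holder (assumption of this sketch): the release is always
// handed to the arena, because the caller may be outside of it.
class ToyArenaHolder {
public:
  ToyArenaHolder(ToyTask& t, ToyArena& a) : task_(&t), arena_(&a) { ++task_->refCount; }
  ToyArenaHolder(const ToyArenaHolder&) = delete;
  ToyArenaHolder& operator=(const ToyArenaHolder&) = delete;
  void doneWaiting() {
    ToyTask* t = std::exchange(task_, nullptr);
    if (t) arena_->enqueue([t] { if (--t->refCount == 0) t->work(); });
  }

private:
  ToyTask* task_;
  ToyArena* arena_;
};

struct DemoResult {
  bool plainRanInPlace;
  bool arenaVersionDeferred;
  int enqueueCalls;
};

DemoResult runDemo() {
  ToyArena arena;
  bool ranPlain = false;
  bool ranArena = false;
  ToyTask plainTask([&ranPlain] { ranPlain = true; });
  ToyTask arenaTask([&ranArena] { ranArena = true; });

  ToyHolder plain(plainTask);
  plain.doneWaiting();  // count 1 -> 0, task runs right here
  const bool plainRanInPlace = ranPlain;

  ToyArenaHolder withArena(arenaTask, arena);
  withArena.doneWaiting();  // only queues the release
  const bool deferred = !ranArena;
  arena.drain();  // the queued release runs the task now
  assert(ranArena);

  return {plainRanInPlace, deferred, arena.enqueueCalls};
}
```

Calling `runDemo()` from a `main()` reports that the plain holder ran its task in place, while the arena-aware holder needed one `enqueue()` and a later drain. In the real framework the deferred path is what can wake up an additional TBB thread.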

Resolves cms-sw/framework-team#1125

PR validation:

Checked with gdb that a single-threaded test configuration no longer ends up with two TBB threads as a consequence of task_arena::enqueue(). The configuration has a module that uses ExternalWork (an Alpaka-based module on the CPU serial backend, to be exact) whose acquire() does not leave any WaitingTaskWithArenaHolder alive (i.e. the acquire() does not really launch any asynchronous work).

@cmsbuild
Contributor

cmsbuild commented Dec 27, 2024

cms-bot internal usage

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47029/43145

@cmsbuild
Contributor

A new Pull Request was created by @makortel for master.

It involves the following packages:

  • FWCore/Concurrency (core)
  • FWCore/Framework (core)

@Dr15Jones, @cmsbuild, @makortel, @smuzaffar can you please review it and eventually sign? Thanks.
@missirol, @wddgit this is something you requested to watch as well.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@makortel
Contributor Author

And I forgot the EventSetup side

  ModuleContextSentry moduleContextSentry(&moduleCallingContext_, parentContext);
  try {
-   convertException::wrap([&]() { this->implDoAcquire(info, &moduleCallingContext_, holder); });
+   WaitingTaskWithArenaHolder holderWithArena{holder};
+   convertException::wrap([&]() { this->implDoAcquire(info, &moduleCallingContext_, holderWithArena); });
Contributor Author

The runAcquire() is called by runAcquireAfterAsyncPrefetch(), which owns the holder beyond the scope of the runAcquire() function. Therefore neither the destructor nor doneWaiting() of this holderWithArena should cause the task reference count to drop to 0.

It could be cleaner to percolate the holder as WaitingTaskHolder through the implDoAcquire() functions, but that seemed a little bit tedious. I could do it nevertheless, but I wanted to get feedback on this PR in general first.
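
The ownership argument above can be sketched with a toy reference-counted holder. This is not the framework code: `ToyTask`, `ToyHolder`, `toyRunAcquire()` and `taskRanOnlyAfterOuterRelease()` are hypothetical names, and the holder is a simplified single-threaded model. It shows only that a scoped copy made inside the callee raises the count from 1 to 2 and drops it back to 1, so the task cannot start until the caller releases its own holder.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Toy reference-counted task (not the CMSSW class).
struct ToyTask {
  explicit ToyTask(std::function<void()> w) : work(std::move(w)) {}
  std::function<void()> work;
  int refCount = 0;
};

// Toy copyable holder: each live copy contributes one reference, and the
// task runs when the last reference is released.
class ToyHolder {
public:
  explicit ToyHolder(ToyTask& t) : task_(&t) { ++task_->refCount; }
  ToyHolder(const ToyHolder& other) : task_(other.task_) {
    if (task_) ++task_->refCount;
  }
  ToyHolder& operator=(const ToyHolder&) = delete;
  ~ToyHolder() { release(); }
  void doneWaiting() { release(); }

private:
  void release() {
    ToyTask* t = std::exchange(task_, nullptr);
    if (t && --t->refCount == 0) t->work();
  }
  ToyTask* task_;
};

// Hypothetical stand-in for runAcquire(): it makes a scoped copy of the
// caller's holder, similar in spirit to holderWithArena in the diff.
void toyRunAcquire(const ToyHolder& holder) {
  ToyHolder scoped{holder};  // count goes 1 -> 2
  (void)scoped;
  // ... acquire-like work that does not keep `scoped` alive ...
}  // `scoped` is destroyed here: count goes 2 -> 1, never 0

bool taskRanOnlyAfterOuterRelease() {
  bool ran = false;
  ToyTask task([&ran] { ran = true; });
  bool ranInsideCallee = false;
  {
    ToyHolder outer{task};  // the caller owns this beyond the callee's scope
    toyRunAcquire(outer);
    ranInsideCallee = ran;  // still false: the outer holder keeps the count at 1
  }  // the outer holder is released here: count goes 1 -> 0 and the task runs
  return !ranInsideCallee && ran;
}
```

In this model `taskRanOnlyAfterOuterRelease()` returns true, which mirrors the claim that the final decrement happens in the caller and not in the scoped holder.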

@@ -42,7 +42,7 @@ namespace edm {

// Takes ownership of the underlying task and uses the current
// arena.
-  explicit WaitingTaskWithArenaHolder(WaitingTaskHolder&& iTask);
+  explicit WaitingTaskWithArenaHolder(WaitingTaskHolder iTask);
Contributor Author

This change allows a WaitingTaskHolder to be "implicitly copied" into WaitingTaskWithArenaHolder, i.e.

WaitingTaskHolder h;
WaitingTaskWithArenaHolder wta{h};
// instead of
WaitingTaskHolder h;
WaitingTaskWithArenaHolder wta{WaitingTaskHolder{h}};

The pattern of WaitingTaskHolder being moved should work as before.
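
As a generic C++ illustration of why the by-value signature accepts both patterns, here is a small sketch with toy types. `ToyHolder`, `RvalueOnly`, `ByValue` and the counters are hypothetical names used only for this example, not the CMSSW classes. A constructor taking `T&&` rejects lvalues, whereas a constructor taking `T` by value copies an lvalue argument and moves from an rvalue argument.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Global counters to observe which special member functions are used.
int g_copies = 0;
int g_moves = 0;

// Toy copyable and movable holder (not the CMSSW class).
struct ToyHolder {
  ToyHolder() = default;
  ToyHolder(const ToyHolder&) { ++g_copies; }
  ToyHolder(ToyHolder&&) noexcept { ++g_moves; }
};

// Old-style signature: only rvalues are accepted.
struct RvalueOnly {
  explicit RvalueOnly(ToyHolder&& h) : held(std::move(h)) {}
  ToyHolder held;
};

// New-style signature: lvalues are copied into the parameter, rvalues are moved.
struct ByValue {
  explicit ByValue(ToyHolder h) : held(std::move(h)) {}
  ToyHolder held;
};

static_assert(!std::is_constructible<RvalueOnly, ToyHolder&>::value,
              "an lvalue cannot bind to the rvalue-reference parameter");
static_assert(std::is_constructible<ByValue, ToyHolder&>::value,
              "an lvalue is copied into the by-value parameter");

// Passing an lvalue: one copy into the parameter, then a move into the member.
int copiesWhenPassingLvalue() {
  g_copies = 0;
  g_moves = 0;
  ToyHolder h;
  ByValue w{h};
  (void)w;
  return g_copies;
}

// Passing an rvalue: no copy at all, only moves.
int copiesWhenMoving() {
  g_copies = 0;
  g_moves = 0;
  ToyHolder h;
  ByValue w{std::move(h)};
  (void)w;
  return g_copies;
}
```

The trade-off of the by-value form is one extra move compared to the `&&` overload, which is normally cheap for a holder-like type.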

…ithArenalHolder in framework

In the framework the doneWaiting() is always called from within the
main arena in the TBB thread pool, and therefore using
WaitingTaskHolder is safe (WaitingTaskWithArenaHolder is needed only
when doneWaiting() is called outside of the TBB arena).

Avoiding WaitingTaskWithArenaHolder allows to avoid enqueue()
operation when the doneWaiting() calls in the framework are the ones
that decrease the task reference count to 0.
@makortel makortel force-pushed the waitingTaskWithArenaHolder branch from 1d9168d to 16a69d8 Compare December 27, 2024 22:00
@makortel
Contributor Author

> And I forgot the EventSetup side

Now the EventSetup side should be included as well

@makortel
Contributor Author

enable gpu, threading

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47029/43146

@cmsbuild
Contributor

Pull request #47029 was updated. @Dr15Jones, @cmsbuild, @makortel, @smuzaffar can you please check and sign again.

@makortel
Contributor Author

@cmsbuild, please test

@cmsbuild
Contributor

-1

Failed Tests: GpuUnitTests
Size: This PR adds an extra 52KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-450e38/43604/summary.html
COMMIT: 16a69d8
CMSSW: CMSSW_15_0_X_2024-12-27-1100/el8_amd64_gcc12
Additional Tests: GPU,THREADING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/47029/43604/install.sh to create a dev area with all the needed externals and cmssw changes.

GPU Unit Tests

I found 2 errors in the following unit tests:

---> test testCudaDeviceAdditionWrapper had ERRORS
---> test testCudaDeviceAdditionKernel had ERRORS

Comparison Summary

Summary:

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 24 differences found in the comparisons
  • DQMHistoTests: Total files compared: 7
  • DQMHistoTests: Total histograms compared: 53071
  • DQMHistoTests: Total failures: 898
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 52173
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 6 files compared)
  • Checked 24 log files, 30 edm output root files, 7 DQM output files
  • TriggerResults: no differences found

@makortel
Contributor Author

> ---> test testCudaDeviceAdditionWrapper had ERRORS
> ---> test testCudaDeviceAdditionKernel had ERRORS

AFAICT these tests are independent of the changes in this PR

@makortel
Contributor Author

@cmsbuild, please test

@cmsbuild
Contributor

-1

Failed Tests: GpuUnitTests
Size: This PR adds an extra 16KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-450e38/43608/summary.html
COMMIT: 16a69d8
CMSSW: CMSSW_15_0_X_2024-12-30-1100/el8_amd64_gcc12
Additional Tests: GPU,THREADING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/47029/43608/install.sh to create a dev area with all the needed externals and cmssw changes.

GPU Unit Tests

I found 2 errors in the following unit tests:

---> test testCudaDeviceAdditionKernel had ERRORS
---> test testCudaDeviceAdditionWrapper had ERRORS

Comparison Summary

Summary:

  • You potentially removed 1 lines from the logs
  • Reco comparison results: 2 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3818730
  • DQMHistoTests: Total failures: 457
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3818253
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 214 log files, 184 edm output root files, 49 DQM output files
  • TriggerResults: found differences in 1 / 47 workflows

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 24 differences found in the comparisons
  • DQMHistoTests: Total files compared: 7
  • DQMHistoTests: Total histograms compared: 53071
  • DQMHistoTests: Total failures: 874
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 52197
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 6 files compared)
  • Checked 24 log files, 30 edm output root files, 7 DQM output files
  • TriggerResults: no differences found

@makortel
Contributor Author

makortel commented Dec 30, 2024

Ah, the tests actually fail in the IBs (#46864)
