Multiprocessing #486

Draft
wants to merge 169 commits into
base: master

nkrah
Collaborator

@nkrah nkrah commented Oct 9, 2024

Enable GATE 10 to split a simulation into multiple parallel processes.
THIS IS WORK IN PROGRESS

First implemented items:

  • split run timing intervals
  • adapt dynamic objects (run-based)
  • spawn processes via Pool
  • write output into a separate subfolder per process

Still missing:

  • merge actor output from different processes
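
The implemented items above could be sketched roughly as follows. This is a minimal illustration built on Python's standard `multiprocessing.Pool`, not the code in this PR: `run_one_process`, `process_output_dir`, `run_in_processes`, and the `process_<i>` folder naming are all hypothetical, and the worker only creates its output folder instead of running a simulation.

```python
import multiprocessing
from pathlib import Path


def process_output_dir(base_dir, process_index):
    """Return the (hypothetical) output subfolder for one process."""
    return Path(base_dir) / f"process_{process_index}"


def run_one_process(args):
    """Worker: stand-in for running one part of the simulation.

    A real worker would configure and run the simulation for its share of
    the run timing intervals; here it only records them in its own folder.
    """
    base_dir, process_index, intervals = args
    out_dir = process_output_dir(base_dir, process_index)
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "intervals.txt").write_text(repr(intervals))
    return process_index, str(out_dir)


def run_in_processes(base_dir, intervals_per_process):
    """Spawn one worker per entry of intervals_per_process via a Pool."""
    jobs = [
        (str(base_dir), i, intervals)
        for i, intervals in enumerate(intervals_per_process)
    ]
    with multiprocessing.Pool(processes=len(jobs)) as pool:
        return pool.map(run_one_process, jobs)
```

When calling `run_in_processes(...)` from a script, wrap the call in `if __name__ == "__main__":` so that it also works with the `spawn` start method.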

        output = se.run_engine()
        return output

    def run(self, start_new_process=False):
    def generate_run_timing_interval_map(self, number_of_processes):
        if number_of_processes % len(self.run_timing_intervals) != 0:
Contributor

Why? I thought we just divide all time intervals by number_of_processes.

Collaborator Author

Yes, but letting the user define the total number of processes, rather than the number of processes per run, is more intuitive and will not require an API change if we implement a more advanced splitting scheme in the future. So I think it's better this way.
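
To illustrate the point under discussion, here is a rough sketch of one way a total `number_of_processes` could be distributed over the run timing intervals. It assumes the total must be a multiple of the number of intervals (as the `%` check in the diff suggests) and that each interval is cut into equal sub-intervals. The function body, the standalone signature, and the returned dictionary layout are assumptions, not the PR's implementation.

```python
def generate_run_timing_interval_map(run_timing_intervals, number_of_processes):
    """Map each process index to one sub-interval (assumed scheme).

    run_timing_intervals: list of (start, end) pairs.
    number_of_processes: total number of processes, chosen by the user.
    """
    n_runs = len(run_timing_intervals)
    if n_runs == 0:
        raise ValueError("at least one run timing interval is required")
    if number_of_processes % n_runs != 0:
        raise ValueError(
            "number_of_processes must be a multiple of the number of "
            "run timing intervals"
        )
    processes_per_run = number_of_processes // n_runs

    interval_map = {}
    process_index = 0
    for run_index, (start, end) in enumerate(run_timing_intervals):
        step = (end - start) / processes_per_run
        for k in range(processes_per_run):
            sub_start = start + k * step
            # Use the exact end for the last piece to avoid rounding gaps.
            if k == processes_per_run - 1:
                sub_end = end
            else:
                sub_end = start + (k + 1) * step
            interval_map[process_index] = {
                "run_index": run_index,
                "interval": (sub_start, sub_end),
            }
            process_index += 1
    return interval_map
```

With two intervals and `number_of_processes=4`, each run is split into two sub-intervals; with `number_of_processes=3`, the sketch raises a `ValueError`. A more advanced scheme (for example, weighting by interval length) could replace the loop without changing the user-facing parameter.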

@nkrah
Collaborator Author

nkrah commented Oct 11, 2024 via email

@nkrah
Collaborator Author

nkrah commented Oct 11, 2024

I figured out a flexible mechanism to merge data back into a single actor output, provided the data is mergeable (this is the case for images, but not yet for ROOT output).
We will need a new type of method, common to all actors, namely FinalizeSimulation(), which the Simulation triggers after all processes have finished. Writing the combined output (from all processes) to disk will be done in FinalizeSimulation(). EndOfSimulation(), where writing currently takes place, is called inside the process and therefore before the output is combined. We can also add an option not to store intermediate (i.e. per-process) output on disk if it is not needed. For example, images are directly accessible in memory and can be merged that way, with no need to read data from disk.

Note: FinalizeSimulation() will not have access to engines because they no longer exist outside of the subprocess.
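
The lifecycle described in this comment might look roughly like the sketch below. The class `ImageActorSketch`, its attributes, and the `merge_outputs` helper are hypothetical stand-ins; only the method names `EndOfSimulation()` and `FinalizeSimulation()` come from the comment. Images are modelled as flat lists of voxel values that are summed element-wise, and "writing to disk" is simulated by setting an attribute.

```python
class ImageActorSketch:
    """Hypothetical actor holding an image as a flat list of voxel values."""

    def __init__(self, image, write_per_process_output=False):
        self.image = list(image)
        self.write_per_process_output = write_per_process_output
        self.written = None  # stands in for a file on disk

    def EndOfSimulation(self):
        # Called inside each subprocess, before any merging can happen.
        # Intermediate (per-process) output is optional.
        if self.write_per_process_output:
            self.written = list(self.image)

    def FinalizeSimulation(self):
        # Called in the main process after all subprocesses have finished.
        # No engines are available here; only the actor's own data is used.
        self.written = list(self.image)


def merge_outputs(per_process_actors):
    """Sum the in-memory images of all per-process actors element-wise."""
    images = [actor.image for actor in per_process_actors]
    merged = [sum(voxels) for voxels in zip(*images)]
    return ImageActorSketch(merged)


# Simulate two finished subprocesses, then merge and finalize.
actors = [ImageActorSketch([1.0, 2.0]), ImageActorSketch([0.5, 0.5])]
for actor in actors:
    actor.EndOfSimulation()
combined = merge_outputs(actors)
combined.FinalizeSimulation()
```

In this sketch the per-process actors never write anything, and only the combined actor produces output, which mirrors the option of skipping intermediate files.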

@nkrah
Collaborator Author

nkrah commented Oct 28, 2024

New:
The following actors now work in multiprocessing (local machine):

  • SimulationStatisticsActor: data is merged in memory and accessible after the simulation; written to disk if requested

  • Actors with ROOT output: the ROOT files (from the per-process subdirectories) are merged into a new ROOT file in the main output folder structure. Event IDs are automatically incremented. Run IDs are recreated as per the original simulation.

This works with test019_phsp_actor, for which I created a new variant of the test.

I still need to create variants of the other tests that use ROOT output to check them.
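
The event ID handling can be illustrated with a small sketch that does not use ROOT at all. It assumes that each process numbers its events from 0 and that merged IDs are kept unique by adding a running offset. The data layout (lists of dicts with an `EventID` key) and the function name `merge_event_records` are hypothetical; the actual merging in this PR operates on ROOT files and may differ in detail.

```python
def merge_event_records(per_process_records):
    """Concatenate per-process records, offsetting EventID to keep it unique.

    per_process_records: list (one entry per process) of lists of dicts,
    each dict having an integer "EventID" that starts at 0 in every process.
    """
    merged = []
    offset = 0
    for records in per_process_records:
        max_id = -1
        for record in records:
            shifted = dict(record)  # copy so the input is left untouched
            shifted["EventID"] = record["EventID"] + offset
            merged.append(shifted)
            max_id = max(max_id, record["EventID"])
        # The next process starts after the highest ID used so far.
        offset += max_id + 1
    return merged
```

Several records may share one event ID (for example, multiple hits from the same event), which is why the offset is derived from the highest ID rather than from the number of records.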
