Hpc rework #289

Merged · 26 commits · Jan 31, 2024
Commits
6200ab7
Added newest campaigns, functionality for running more S3 files per s…
Gregtom3 Dec 6, 2023
3ff3b8f
brycecanyon --> craterlake
Gregtom3 Dec 6, 2023
ed95cf8
Changed campaign names
Gregtom3 Dec 6, 2023
f11d136
.out and .err from slurm jobs sent to LogSubDir
Gregtom3 Dec 6, 2023
39e6046
rm rf the clean directories
Gregtom3 Dec 6, 2023
892d088
changes to file
Gregtom3 Dec 6, 2023
7cb9189
Speed up by reading the TemplateFile only once
Gregtom3 Dec 6, 2023
4bd7535
Added capability to manually set Q2weight for each range from the con…
Gregtom3 Dec 8, 2023
dbb2a75
Manual weight and total events
Gregtom3 Dec 8, 2023
be80aee
Increment total number of RECO events
Gregtom3 Dec 8, 2023
4436b36
Overhaul of prepare-multi-roots. Added local database lookup for the …
Gregtom3 Dec 8, 2023
e6345c3
Added pipeline script for running across multiple campaigns, energies…
Gregtom3 Dec 8, 2023
805bdd3
Modified prefix for slurm
Gregtom3 Dec 11, 2023
92208a7
Added a count_events.py script which will run over the desired campai…
Gregtom3 Dec 11, 2023
0614bbd
nevents_databases updated for 23.11.0 and 23.10.0 campaigns
Gregtom3 Dec 12, 2023
ec9f8e1
Merge script float --> double , also increased MEM size for merge script
Gregtom3 Dec 12, 2023
d199318
README and project_scripts dir
Gregtom3 Dec 13, 2023
31cdbea
More memory for larger sim batches and refactoring of the job name
Gregtom3 Dec 13, 2023
4424812
Refactoring of the Weight parameter to decimal form (not scientific n…
Gregtom3 Dec 13, 2023
a75eb65
Required change to Analysis.cxx such that the Q2min bug (mentioned in…
Gregtom3 Dec 13, 2023
989143c
Memory allocation issues mentioned in README
Gregtom3 Dec 13, 2023
6c5cbad
Merge script now avoids combining all cycles of a TTree
Gregtom3 Dec 13, 2023
b8ef6b6
README update and default parameters
Gregtom3 Dec 19, 2023
7dd2076
Merge branch 'main' into hpc_rework
Gregtom3 Jan 12, 2024
f6caf9d
Update ci.yml
Gregtom3 Jan 12, 2024
ba1bdc4
Update ci.yml
Gregtom3 Jan 12, 2024
4 changes: 2 additions & 2 deletions .github/workflows/ci.yml
@@ -140,8 +140,8 @@ jobs:
       fail-fast: true
       matrix:
         include:
-          - { id: epic_latest, options: --version epic.latest --limit 20 --detector brycecanyon --no-radcor }
-          - { id: epic_previous, options: --version epic.previous --limit 20 --detector brycecanyon --no-radcor }
+          - { id: epic_latest, options: --version epic.latest --limit 20 --detector craterlake --no-radcor }
+          - { id: epic_previous, options: --version epic.previous --limit 20 --detector craterlake --no-radcor }
           - { id: ecce, options: --version ecce.22.1 --limit 40 }
           - { id: athena, options: --version athena.deathvalley-v1.0 --limit 20 }
       steps:
33 changes: 33 additions & 0 deletions hpc/README.md
@@ -12,6 +12,30 @@ issues; you are welcome to contribute your own scripts to support your preferred
It is highly recommended to test jobs with small samples before launching a full-scale analysis
on all available data.

# Pipeline Automation

The `hpc` toolkit includes a built-in pipeline script for streamlining analysis across many campaigns, Q2 ranges, energy configurations, and even detector setups. The pipeline's aim is to automate the following steps entirely on Jefferson Lab's slurm system (a sketch of the equivalent manual steps follows the list):

1. Creation of the main `s3` `.config` file (typically stored in `datarec/`).
2. Calculation of the number of events stored in each `s3` file's TTree (used to compute event-by-event weights); these counts are also cached in `hpc/nevents_databases` to speed up future pipelines.
3. Splitting of the main `s3` `.config` file into batches (for parallel computing).
4. Execution of the analysis macro on each batched `.config` file.
5. Merging of the output analysis `.root` files into a single `analysis.root` file.
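
For orientation, here is a minimal sketch of these stages as calls to the repository's scripts. It is illustrative only: the `<...>` placeholders are stand-ins, not real arguments, and the merge stage is named generically since the pipeline performs it internally.

```ruby
# Illustrative sketch of the stages automated by run-local-slurm-pipeline.rb;
# <...> placeholders are stand-ins, not the scripts' real arguments.
steps = [
  "hpc/prepare.rb <s3-config-args>",              # 1. build the main s3 .config in datarec/
  "hpc/count_events.py <campaign-args>",          # 2. count events per s3 TTree (cached in hpc/nevents_databases)
  "hpc/prepare-multi-roots.rb <config> <out-dir> <n-root-files>", # 3. split into batched .config files
  "hpc/run-local-slurm.rb <batched-config-args>", # 4. run the analysis macro on each batch
  "<merge outputs into analysis.root>",           # 5. final merge step
]
steps.each do |cmd|
  puts "[pipeline] #{cmd}"
  # system(cmd) or abort "step failed: #{cmd}"    # the real pipeline submits slurm jobs instead
end
```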

The script that handles the pipeline is `run-local-slurm-pipeline.rb`. Edit this script to set the desired configuration: the campaigns, the energies of interest within those campaigns, the detector configuration, the number of `s3` files to analyze per Q2 range, and the number of ROOT files analyzed per slurm job. By default, several of these parameters are left unset and will trip the error handler until the user sets them.
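
As a rough illustration, the user-editable block at the top of `run-local-slurm-pipeline.rb` might resemble the following sketch; only `NROOT_FILES_PER_JOB` appears elsewhere in this README, and the other constant names and values are hypothetical.

```ruby
# Hypothetical sketch of the user-editable settings in run-local-slurm-pipeline.rb;
# only NROOT_FILES_PER_JOB is named elsewhere in this README. Unset (nil) values
# deliberately trip the error handler until the user fills them in.
CAMPAIGNS           = ["23.10.0", "23.11.0"]        # simulation campaigns
ENERGIES            = ["5x41", "10x100", "18x275"]  # beam-energy configurations
DETECTOR            = "epic_craterlake"             # detector configuration
NFILES_PER_Q2_RANGE = nil                           # s3 files to analyze per Q2 range
NROOT_FILES_PER_JOB = nil                           # ROOT files analyzed per slurm job
```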

We note that calculating `nevents` for each `s3` TTree, albeit time-consuming, is essential for our parallel computing needs: the event-by-event Q2 weights depend on the total number of events simulated for each Q2 range. Since the main `s3` `.config` is batched into smaller chunks, this information would be lost unless the events are counted before running the analysis. The resulting event counts are used to set manual weights in the batched `.config` files.
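
A minimal sketch of totaling the cached counts for one Q2 range is shown below; the `data.csv` column layout is an assumption here (last column taken as the per-file event count), as is the absence of a header row.

```ruby
require "csv"

# Sum the cached per-file RECO event counts for one Q2 range, so that the manual
# weight written into each batched .config reflects the full sample rather than
# a single batch. Assumption: the last column of data.csv is the event count.
db = "hpc/nevents_databases/RECO/23.11.0/epic_craterlake/DIS/NC/18x275/minQ2=1/data.csv"
total_events = CSV.read(db).sum { |row| row.last.to_i }
puts "minQ2=1 total RECO events: #{total_events}"
```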

To run the pipeline:

```
hpc/run-local-slurm-pipeline.rb
```

Optionally, use the `--overwrite` flag to skip the prompt asking whether to delete pre-existing project files.
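
For example:

```
hpc/run-local-slurm-pipeline.rb --overwrite
```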

There are several known issues pertaining to memory usage. If `NROOT_FILES_PER_JOB` is too large, the per-job memory allocation listed in `run-local-slurm.rb` may be too small for the analysis macro to create its ROOT TTrees. Additionally, merging all of the ROOT TFiles into one may run out of memory; this is limited by the memory allocation listed in the pipeline job created by `run-local-slurm-pipeline.rb`, which is currently set to `4000mb` and should be sufficient.
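
To make these knobs concrete, a hypothetical slurm header of the kind such a submission script emits is sketched below; the actual directives generated by `run-local-slurm.rb` and `run-local-slurm-pipeline.rb` may differ.

```ruby
# Hypothetical sketch only: the kind of slurm header a submission script emits.
# 4000mb is the pipeline-job allocation quoted above; raising it is the fix if
# the final merge runs out of memory.
job_mem = "4000mb"
puts <<~SLURM
  #!/bin/bash
  #SBATCH --mem=#{job_mem}
  #SBATCH --job-name=analysis_merge
SLURM
```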

## 1. Preparation
```bash
hpc/prepare.rb
@@ -21,6 +45,15 @@ file into one `config` file per `ROOT` file; these `config` files can each be fed to the analysis
macro. Total yields per Q2 bin are automatically obtained and stored in all `config` files, to make
sure the resulting Q2 weights are correct for the combined set of files.

Alternatively, one can split the starting `config` file into multiple `config` files, where the user specifies the number of `ROOT` files per `config` as a third argument. To do so, use the following script:

```bash
hpc/prepare-multi-roots.rb
```
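
For example, to place 5 `ROOT` files in each output `config` (the first two arguments are placeholders here; only the third, the number of `ROOT` files per `config`, is described above):

```bash
hpc/prepare-multi-roots.rb <config-file> <output-dir> 5
```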

## 2. Run Jobs
This step depends on where you want to run jobs. In general, output ROOT files will be written
to a user-specified subdirectory of `out/`.

10,080 changes: 10,080 additions & 0 deletions hpc/nevents_databases/RECO/23.10.0/epic_craterlake/DIS/NC/18x275/minQ2=1/data.csv
10,755 changes: 10,755 additions & 0 deletions hpc/nevents_databases/RECO/23.10.0/epic_craterlake/DIS/NC/18x275/minQ2=10/data.csv
12,790 changes: 12,790 additions & 0 deletions hpc/nevents_databases/RECO/23.10.0/epic_craterlake/DIS/NC/18x275/minQ2=100/data.csv
14,975 changes: 14,975 additions & 0 deletions hpc/nevents_databases/RECO/23.10.0/epic_craterlake/DIS/NC/18x275/minQ2=1000/data.csv
2,800 changes: 2,800 additions & 0 deletions hpc/nevents_databases/RECO/23.10.0/epic_craterlake/DIS/NC/5x41/minQ2=1/data.csv
10,861 changes: 10,861 additions & 0 deletions hpc/nevents_databases/RECO/23.11.0/epic_craterlake/DIS/NC/10x100/minQ2=1000/data.csv
12,685 changes: 12,685 additions & 0 deletions hpc/nevents_databases/RECO/23.11.0/epic_craterlake/DIS/NC/18x275/minQ2=1/data.csv
13,590 changes: 13,590 additions & 0 deletions hpc/nevents_databases/RECO/23.11.0/epic_craterlake/DIS/NC/18x275/minQ2=10/data.csv
15,925 changes: 15,925 additions & 0 deletions hpc/nevents_databases/RECO/23.11.0/epic_craterlake/DIS/NC/18x275/minQ2=100/data.csv
19,764 changes: 19,764 additions & 0 deletions hpc/nevents_databases/RECO/23.11.0/epic_craterlake/DIS/NC/18x275/minQ2=1000/data.csv
3,385 changes: 3,385 additions & 0 deletions hpc/nevents_databases/RECO/23.11.0/epic_craterlake/DIS/NC/5x41/minQ2=1/data.csv
(additional large `data.csv` diffs were not rendered)