CMS
- ACAT 2022:
  - CERN-CMS-DP-2022-061: http://cds.cern.ch/record/2842375
- ACAT 2021:
  - J. Phys. Conf. Ser. 2438 012100: http://dx.doi.org/10.1088/1742-6596/2438/1/012100
  - CERN-CMS-DP-2021-030: https://cds.cern.ch/record/2792320
---
config:
markdownAutoWrap: false
---
graph TD;
subgraph genjob [genjob_pu55to75.sh, genjob_nopu.sh]
Samples(TTbar_14TeV_TuneCUETP8M1_cfi.py)-->|cmsDriver.py| gensim(standard GEN-SIM-RECO)
gensim -->|PFAnalysisNtuplizer.cc| pfntuple(PFElements, CaloParticles, SimClusters: flat *.root)
end
subgraph dataprep [Dataset preprocessing]
pfntuple-->|postprocessing2.py| postprocessing(MLPF inputs and targets: *.pkl.bz2);
postprocessing -->|tfds build heptfds/cms_pf/ttbar.py| tfds(ML dataset splits 1-10: *.tfrecords)
end
pfntuple -->|plots_cms.py| dataset_plots(Dataset plots: *.pkl)
postprocessing -->|plots_cms.py| dataset_plots
subgraph ml [ML training & eval]
tfds -->|mlpf/pipeline.py --train ...| checkpoints(checkpoint-epoch-loss.pth)
checkpoints -->|mlpf/pipeline.py --load checkpoint.pth --test ... | predictions(Predictions: *.parquet)
checkpoints -->|cms-validate-onnx.ipynb| onnx(ONNX model: *.onnx)
predictions -->|mlpf/pipeline.py --load checkpoint.pth --make-plots | eval_plots(Validation plots: *.png)
end
The current model outputs from CMSSW can be copied with:
gfal-copy -r root://xrootd.hep.kbfi.ee:1094//store/user/jpata/mlpf/results/cms/CMSSW_14_1_0_pre3 ./
See below for the steps to reproduce this using CMSSW.
The resulting DQM plots can be found at:
https://jpata.web.cern.ch/jpata/mlpf/cms/results/acat2022_20221004_model40M_revalidation20240523/
https://jpata.web.cern.ch/jpata/mlpf/cms/results/acat2022_20221004_model40M_revalidation_CMSSW14_20240527/
See below for the steps to produce these plots.
The following should work on lxplus:
#ensure proxy is set
voms-proxy-init -voms cms -valid 192:00
voms-proxy-info
#Initialize EL8
cmssw-el8
export SCRAM_ARCH=el8_amd64_gcc12
cmsrel CMSSW_14_1_0_pre3
cd CMSSW_14_1_0_pre3/src
cmsenv
git cms-init
#set the directories we want to check out
echo "/Configuration/Generator/" >> .git/info/sparse-checkout
echo "/IOMC/ParticleGuns/" >> .git/info/sparse-checkout
echo "/RecoParticleFlow/PFProducer/" >> .git/info/sparse-checkout
echo "/Validation/RecoParticleFlow/" >> .git/info/sparse-checkout
#checkout the CMSSW code
git remote add jpata https://github.com/jpata/cmssw.git
git fetch -a jpata
git checkout pfanalysis_caloparticle_CMSSW_14_1_0_pre3_acat2022
#compile
scram b -j4
#download the latest MLPF model
mkdir -p RecoParticleFlow/PFProducer/data/mlpf/
wget "https://huggingface.co/jpata/particleflow/resolve/main/cms/2022_10_04_gnnlsh_model40M_acat2022/dev.onnx?download=true" -O RecoParticleFlow/PFProducer/data/mlpf/dev.onnx
# must be b786aa6de49b51f703c87533a66326d6
md5sum RecoParticleFlow/PFProducer/data/mlpf/dev.onnx
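The checksum comparison above can also be scripted so that a corrupted download is caught automatically. The following is a minimal sketch, assuming GNU `md5sum` and `awk` are available; the function name `verify_md5` is hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the particleflow repository):
# compare the MD5 checksum of a file against an expected value.
verify_md5() {
    local file="$1" expected="$2" actual
    if [ ! -f "$file" ]; then
        echo "missing file: $file" >&2
        return 2
    fi
    # md5sum prints "<hash>  <filename>"; keep only the hash
    actual="$(md5sum "$file" | awk '{print $1}')"
    if [ "$actual" != "$expected" ]; then
        echo "checksum mismatch for $file: got $actual, expected $expected" >&2
        return 1
    fi
    echo "checksum OK: $file"
}
```

For the model above, this could be called as `verify_md5 RecoParticleFlow/PFProducer/data/mlpf/dev.onnx b786aa6de49b51f703c87533a66326d6`, which returns a non-zero status if the file is missing or the checksum differs.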
We use the following datasets for rerunning reconstruction and PF:
/RelValQCD_FlatPt_15_3000HS_14/CMSSW_14_1_0_pre3-PU_140X_mcRun3_2024_realistic_v8_STD_2024_PU-v2/GEN-SIM-DIGI-RAW
/RelValTTbar_14TeV/CMSSW_14_1_0_pre3-PU_140X_mcRun3_2024_realistic_v8_STD_2024_PU-v2/GEN-SIM-DIGI-RAW
/RelValQQToHToTauTau_14TeV/CMSSW_14_1_0_pre3-PU_140X_mcRun3_2024_realistic_v8_STD_2024_PU-v2/GEN-SIM-DIGI-RAW
/RelValSingleEFlatPt2To100/CMSSW_14_1_0_pre3-PU_140X_mcRun3_2024_realistic_v8_STD_2024_PU-v2/GEN-SIM-DIGI-RAW
/RelValSingleGammaFlatPt8To150/CMSSW_14_1_0_pre3-PU_140X_mcRun3_2024_realistic_v8_STD_2024_PU-v2/GEN-SIM-DIGI-RAW
/RelValSinglePiFlatPt0p7To10/CMSSW_14_1_0_pre3-PU_140X_mcRun3_2024_realistic_v8_STD_2024_PU-v2/GEN-SIM-DIGI-RAW
Since we need to rerun reconstruction, the datasets need to be in the GEN-SIM-DIGI-RAW tier. Currently, only RelVal datasets are available at this tier. These datasets have been copied to disk at T2_EE_Estonia to ensure access.
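Because the data tier is the last component of a CMS dataset name (`/<primary>/<processed>/<tier>`), the tier requirement can be checked before submitting any jobs. The sketch below is illustrative only; the function names `dataset_tier` and `has_raw_tier` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the data tier of a CMS dataset name,
# i.e. the last field of /<primary>/<processed>/<tier>.
dataset_tier() {
    local name="$1"
    # strip everything up to and including the last slash
    echo "${name##*/}"
}

# Return 0 only if the dataset is in the GEN-SIM-DIGI-RAW tier,
# which is what rerunning reconstruction requires here.
has_raw_tier() {
    [ "$(dataset_tier "$1")" = "GEN-SIM-DIGI-RAW" ]
}
```

For example, `has_raw_tier /RelValTTbar_14TeV/CMSSW_14_1_0_pre3-PU_140X_mcRun3_2024_realistic_v8_STD_2024_PU-v2/GEN-SIM-DIGI-RAW` succeeds, while a dataset name ending in a different tier would fail the check.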
The PF validation workflows can be run with the scripts/cmssw/validation_job.sh script from the particleflow repository:
cd particleflow
#the final argument (1) is the row index in the input file list, i.e. which filename to process
./scripts/cmssw/validation_job.sh mlpf scripts/cmssw/qcd_pu.txt QCD_PU 1
./scripts/cmssw/validation_job.sh pf scripts/cmssw/qcd_pu.txt QCD_PU 1
The MINIAOD output will be in $CMSSW_BASE/out/QCD_PU_mlpf and $CMSSW_BASE/out/QCD_PU_pf.
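To process every file in an input list rather than a single row, the commands above can be generated in a loop. The following is a dry-run sketch that only prints the commands; it assumes row indices are 1-based (as the example with index 1 suggests) and that the file list has one filename per line with no blank lines. The function name `print_validation_jobs` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: print one validation_job.sh command per
# row of the input file list, for both the mlpf and pf modes.
# Assumes 1-based row indices and one filename per line.
print_validation_jobs() {
    local filelist="$1" sample="$2" nrows row mode
    # count lines, including a last line without a trailing newline
    nrows="$(grep -c '' "$filelist" || true)"
    for mode in mlpf pf; do
        for row in $(seq 1 "$nrows"); do
            echo "./scripts/cmssw/validation_job.sh $mode $filelist $sample $row"
        done
    done
}
```

Running `print_validation_jobs scripts/cmssw/qcd_pu.txt QCD_PU` lists the commands so they can be reviewed before any of them is executed.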
To regenerate the ML training samples from scratch with CMSSW, see the scripts:
mlpf/data_cms/genjob_nopu.sh
mlpf/data_cms/genjob_pu55to75.sh
Copy the datasets from EOS:
rsync -r --progress lxplus.cern.ch:/eos/user/j/jpata/mlpf/tensorflow_datasets/cms ./tensorflow_datasets
Download the PyTorch Singularity image:
wget https://jpata.web.cern.ch/jpata/pytorch.simg
On a machine with a single GPU, the following is a quick test of the training workflow:
singularity exec --env CUDA_VISIBLE_DEVICES=0 -B /scratch/persistent --nv \
--env PYTHONPATH=`pwd` \
--env KERAS_BACKEND=torch \
pytorch.simg python3.10 mlpf/pipeline.py --dataset cms --gpus 1 \
--data-dir ./tensorflow_datasets --config parameters/pytorch/pyg-cms.yaml \
--train --test --make-plots --conv-type attention --num-epochs 10 --gpu-batch-multiplier 1 \
--num-workers 4 --prefetch-factor 100 --checkpoint-freq 1 --ntrain 1000 --ntest 1000 --nvalid 1000
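Since the training command depends on the container image, the dataset directory, and the config file all being in place, a quick pre-flight check can save a failed container start. This is a minimal sketch; the function name `check_training_inputs` is hypothetical and the check only tests that the paths exist, not that their contents are valid.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: confirm the inputs used by the training
# command exist before starting the container.
check_training_inputs() {
    local image="$1" data_dir="$2" config="$3" status=0
    [ -f "$image" ]    || { echo "missing container image: $image" >&2; status=1; }
    [ -d "$data_dir" ] || { echo "missing dataset directory: $data_dir" >&2; status=1; }
    [ -f "$config" ]   || { echo "missing config file: $config" >&2; status=1; }
    return "$status"
}
```

With the paths used above, the call would be `check_training_inputs pytorch.simg ./tensorflow_datasets parameters/pytorch/pyg-cms.yaml`.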