-
Notifications
You must be signed in to change notification settings - Fork 21
/
Copy pathREADME_juskas_batch_recipe
83 lines (64 loc) · 5.67 KB
/
README_juskas_batch_recipe
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
## (These instructions assume that you have EOS filesystem mounted.
## if you dont, do this:
[juska@lxplus0023 thesearch]$ cd
[juska@lxplus0023 thesearch]$ mkdir eos
[juska@lxplus0023 thesearch]$ eosmount eos
## I also assume that everything's just as I want in the Git repository
## codes. Of cource in practice I check multiple things like JEC files
## and cuts in cutFile_mainDijetAnalysis.txt etc.)
## Setup a fresh analysis directory
[juska@lxplus0023 thesearch]$ scram p -n oct15_CMSSW743 CMSSW CMSSW_7_4_3
[juska@lxplus0023 thesearch]$ cd oct15_CMSSW743/src/
[juska@lxplus0023 src]$ git clone https://github.com/CMSDIJET/DijetRootTreeAnalyzer.git CMSDIJET/DijetRootTreeAnalyzer
[juska@lxplus0023 DijetRootTreeAnalyzer]$ ./scripts/make_rootNtupleClass.sh -f ~/dijet_eos/santanas/crab_JetHT__Run2015D-PromptReco-v3__MINIAOD__151012_071719/JetHT__Run2015D-PromptReco-v3__MINIAOD_1.root -t dijets/events
mv: overwrite `include/rootNtupleClass.h'? yes
mv: overwrite `src/rootNtupleClass.C'? yes
[juska@lxplus0023 DijetRootTreeAnalyzer]$ ln -sf analysisClass_mainDijetSelection.C src/analysisClass.C
[juska@lxplus0023 DijetRootTreeAnalyzer]$ make clean
[juska@lxplus0023 DijetRootTreeAnalyzer]$ cmsenv
[juska@lxplus0023 DijetRootTreeAnalyzer]$ make
## Create file lists
[juska@lxplus0023 DijetRootTreeAnalyzer]$ mkdir lists/2015D_456
[juska@lxplus0023 DijetRootTreeAnalyzer]$ touch lists/2015D_456/2015D_part1.txt
[juska@lxplus0023 DijetRootTreeAnalyzer]$ touch lists/2015D_456/2015D_part2.txt
[juska@lxplus0023 DijetRootTreeAnalyzer]$ touch lists/2015D_456/2015D_part3.txt
# With this cryptic command I write a sorted lists of absolute file paths
[juska@lxplus0023 DijetRootTreeAnalyzer]$ ls -d -1 ~/eos/cms/store/group/phys_exotica/dijet/Dijet13TeV/santanas/crab_JetHT__Run2015D-PromptReco-v3__MINIAOD__150930_155913/*.root | sort -V >> lists/2015D_456/2015D_part1.txt
[juska@lxplus0023 DijetRootTreeAnalyzer]$ ls -d -1 ~/eos/cms/store/group/phys_exotica/dijet/Dijet13TeV/santanas/crab_JetHT__Run2015D-PromptReco-v3__MINIAOD__151007_090551/*.root | sort -V >> lists/2015D_456/2015D_part2.txt
[juska@lxplus0023 DijetRootTreeAnalyzer]$ ls -d -1 ~/eos/cms/store/group/phys_exotica/dijet/Dijet13TeV/santanas/crab_JetHT__Run2015D-PromptReco-v3__MINIAOD__151012_071719/*.root | sort -V >> lists/2015D_456/2015D_part3.txt
# The above lists would work just fine with a single analysis job if EOS
# is mounted, but in batch processing the paths have to start with
# 'root://eoscms//eos/cms/store/', so lets search and replace:
# (I don't know if there's a script for this)
[juska@lxplus0023 DijetRootTreeAnalyzer]$ sed -i 's/\/afs\/cern.ch\/user\/j\/juska\/eos\/cms\/store/root:\/\/eoscms\/\/eos\/cms\/store/g' lists/2015D_456/2015D_part1.txt
[juska@lxplus0023 DijetRootTreeAnalyzer]$ sed -i 's/\/afs\/cern.ch\/user\/j\/juska\/eos\/cms\/store/root:\/\/eoscms\/\/eos\/cms\/store/g' lists/2015D_456/2015D_part2.txt
[juska@lxplus0023 DijetRootTreeAnalyzer]$ sed -i 's/\/afs\/cern.ch\/user\/j\/juska\/eos\/cms\/store/root:\/\/eoscms\/\/eos\/cms\/store/g' lists/2015D_456/2015D_part3.txt
## Download fresh cert JSON and change the path accordingly in cutfile
[juska@lxplus0023 DijetRootTreeAnalyzer]$ wget https://cms-service-dqm.web.cern.ch/cms-service-dqm/CAF/certification/Collisions15/13TeV/Cert_246908-258159_13TeV_PromptReco_Collisions15_25ns_JSON_v3.txt -P data/json/
# Replace the beginning of the cutfile with path to new JSON. Use your favourite editor.
[juska@lxplus0023 DijetRootTreeAnalyzer]$ vim config/cutFile_mainDijetSelection.txt
## Create necessary directories and start batch analysis (I have symbolic link in 'dijet_eos')
[juska@lxplus0022 DijetRootTreeAnalyzer]$ mkdir ~/dijet_eos/juska/reduced_skims_2015D_round3
[juska@lxplus0022 DijetRootTreeAnalyzer]$ mkdir batch
# Just before starting you may want to do a test job with something like this:
$ ./main file.txt config/cutFile_mainDijetSelection.txt dijets/events test test
# where file.txt contains a path to a big ntuple file. Now to full production:
[juska@lxplus0023 DijetRootTreeAnalyzer]$ python scripts/submit_batch_EOS_split.py -i lists/2015D_456/ -o /eos/cms/store/group/phys_exotica/dijet/Dijet13TeV/juska/reduced_skims_2015D_round3 -q 1nh --tag Run2015D_round3 --split 5 --cut config/cutFile_mainDijetSelection.txt
# Note that the path given for the '-o' handle is not a real path at all at lxplus,
# but something that the script will work it's magic with. Things go bad if
# you try to give a real path to the output handle.
## Monitor runs with batch tools and by checking the data output directory
[juska@lxplus0023 DijetRootTreeAnalyzer]$ bjobs
[juska@lxplus0023 DijetRootTreeAnalyzer]$ ls batch/Run2015D_round3_20151013_172845/*.src | wc -l
# Seems like I sent 197 jobs, so I expect to get 197 result files once the production is done.
# While waiting, let's monitor how many jobs are still running (or queued) in batch:
[juska@lxplus0023 DijetRootTreeAnalyzer]$ bjobs | grep 'juska' | wc -l
# When I get less than 197 with the above command, I can start monitoring the output dir:
[juska@lxplus0023 DijetRootTreeAnalyzer]$ ls ~/dijet_eos/juska/reduced_skims_2015D_round3_20151013_172845/*skim.root | wc -l
# When this command gives 197, all files are there. It is not likely to happen in one try.
## Once you have all the files processed, you can hadd them together:
# (before this I really combined files from two different run result folders
# to complete the dataset. My trick to survive with random batch crashes is duplicate jobs.)
[juska@lxplus0022 DijetRootTreeAnalyzer]$ cd ~/dijet_eos/juska/reduced_skims_2015D_round3/
[juska@lxplus0022 reduced_skims_2015D_round3]$ hadd rootfile_2015D_Run2015D_round3_20151013.root *.root
# There is also a scrip for hadding files. I prefer to do it by hand.