Jet Veto Map Selector #5

Open
wants to merge 68 commits into
base: master
665afc4
Add tasks for the combine workflow to perform fits and plot impacts.
jomatthi Jun 6, 2024
01555d9
Fix broken physics model.
jomatthi Jun 6, 2024
9885c7c
Add predefined extra commands to confirm_and_run script.
jomatthi Jun 11, 2024
678423f
Add wp parameter to topsf.CreateDatacards task that now filters the c…
jomatthi Jun 11, 2024
18ae12a
Add CombineBaseTask that defines common parameters, sandbox, requirem…
jomatthi Jun 11, 2024
21f83e3
Add CreateWorkspaceV2 that runs combine's text2workspace.py given an …
jomatthi Jun 11, 2024
8543358
Add RunCommand task defining common requirements and parameters used …
jomatthi Jun 11, 2024
967ec90
Add FitFixin defining the parameters specifying the fit.
jomatthi Jun 11, 2024
f0c28df
Add GenToysV2 generating toys from workspace.
jomatthi Jun 11, 2024
824d85a
Add MultiDimFitV2 performing expected fit first without then with fro…
jomatthi Jun 11, 2024
2bcb230
Add PostFitShapesFromWorkspaceV2 calculating and storing both pre- an…
jomatthi Jun 11, 2024
20e9ffd
Add PlotShapesV2 plotting pre- and postfit shapes, missing: stacked p…
jomatthi Jun 11, 2024
2a5030e
Add ImpactsV2 performing initial fit followed by calculating impacts o…
jomatthi Jun 11, 2024
45e8708
Add PlotImpactsV2 plotting the impacts given a json file.
jomatthi Jun 11, 2024
65fcd22
Add V2 tasks to law config.
jomatthi Jun 11, 2024
22eaa78
Init file for inference tasks V2.
jomatthi Jun 11, 2024
7edf2a6
Add definition of colors to confirm_and_run script.
jomatthi Jul 31, 2024
ac0bed3
Minor fix of law config.
jomatthi Jul 31, 2024
a0d9f3d
Fix scram arch, cmssw and combine versions to adapt to NAF migration …
jomatthi Jul 31, 2024
f580008
Include wp name in cat_label and change position of label in plot.
jomatthi Jul 31, 2024
c130797
Update config for 22preEE analysis, including dataset names and clean…
jomatthi Aug 1, 2024
87411ad
Update wp config for 22preEE analysis, including dataset names and cl…
jomatthi Aug 1, 2024
e340f41
Fix broken increment_stats function.
jomatthi Aug 1, 2024
79ba342
Fix postfit shapes task to adapt to change in combine syntax.
jomatthi Aug 1, 2024
b5e9d51
Fix physics model to write out correct (Anti)SF name.
jomatthi Aug 1, 2024
4d8739c
Move fit setup parameter and process rates to config for easier access.
jomatthi Aug 1, 2024
8fbcb02
Further small adaptations and typo fixes.
jomatthi Aug 1, 2024
8e7f9be
Add CombineHarvester to cmssw_combine sandbox setup script.
jomatthi Aug 1, 2024
7c46170
Add probejet_tau2 and probejet_tau3 variables.
jomatthi Aug 1, 2024
23cf0a4
Include updated cmsdb.
jomatthi Aug 1, 2024
f881c37
Add possibility to produce plots with a cleaned up legend.
jomatthi Aug 15, 2024
4aa786b
Remove fit mode (exp, obs) from CombineBaseTask.
jomatthi Aug 16, 2024
3e110f0
Small change of workspace name to include hashed physics model instea…
jomatthi Aug 16, 2024
a94fb11
Use fit mode param as job_name, adapt reqs in exp fit mode.
jomatthi Aug 16, 2024
7b22376
Add fit mode (exp, obs) param to various tasks and adapt the combine …
jomatthi Aug 16, 2024
a55c15c
Merge branch 'master' into feature/fit_tasks_v2
jomatthi Aug 22, 2024
753aa42
Remove property definition of parameters previously ending with _inst.
jomatthi Aug 22, 2024
b62cbbd
Change default analysis to also be Run 3 SF analysis to match default…
jomatthi Aug 22, 2024
8042b4d
Change dy_lep to dy to match cmsdb convention.
jomatthi Aug 22, 2024
cb33d19
Added comment about not needing minbias_xs any more after move to cor…
jomatthi Aug 22, 2024
b0fe448
Rephrased description of analysis_id in Run 3 to include campaign.
jomatthi Aug 22, 2024
82e3b36
Small adaptations of the Run 3 config.
jomatthi Aug 22, 2024
a693b4a
Rename base inference and combine tasks.
jomatthi Aug 23, 2024
fc8506a
Rename combine verbosity and help parameter.
jomatthi Aug 23, 2024
caf8d77
Use shorter idiom for touching output directory.
jomatthi Aug 23, 2024
d7e546f
Renamed names and cleaned up variable duplicate.
jomatthi Aug 23, 2024
2fe8856
Reverted significance of per_catergory parameter to be True.
jomatthi Aug 23, 2024
a6cca62
Use SettingsParameter to resolve fit_modes.
jomatthi Aug 23, 2024
a153afb
Log files now also include the run combine command and the cwd inform…
jomatthi Aug 23, 2024
558943d
Simplify axes transformation.
jomatthi Aug 23, 2024
fc6ffd5
Renamed files to match name change of base classes.
jomatthi Aug 23, 2024
9ed30c4
Simplify workflow requirement implementation.
jomatthi Aug 23, 2024
5cc280e
Restructured inference tasks to use mixins for parameter definitions.
jomatthi Aug 29, 2024
032c4f1
Fixed output path of topsf.CreateDatacards.
jomatthi Aug 29, 2024
cd644e5
Fix plot name.
jomatthi Aug 29, 2024
e7a42d4
Small addition to 'confirm and run' script.
jomatthi Dec 3, 2024
49e1056
Add PSWeights producer and add FSR/ISR as unc. shifts to configs.
jomatthi Dec 3, 2024
5d3e762
Register jec/jer as uncertainty shifts.
jomatthi Dec 3, 2024
7c5a4eb
Small changes and adaptations of fit setup and configs.
jomatthi Dec 3, 2024
3bd89d1
Linting.
jomatthi Dec 3, 2024
040b285
Update columnflow.
jomatthi Dec 3, 2024
12ba4e7
Uncomment datasets and small fixes to configs after testing.
jomatthi Dec 6, 2024
594fc8d
Columnflow patch to prevent committing too many jobs.
jomatthi Dec 6, 2024
761b84b
Physics Model to rerun with Christopher's old datacards/templates.
jomatthi Dec 6, 2024
6a3a364
Small update to cmsdb.
jomatthi Dec 6, 2024
625ecb3
Adapt template comparison script to changes.
jomatthi Dec 6, 2024
a768cbc
Missing import.
jomatthi Dec 6, 2024
062be1c
Add jet veto map selector step.
jomatthi Dec 10, 2024
49 changes: 49 additions & 0 deletions confirm_and_run.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
# Prompt the user before running a command; the default is to skip it.
# If the user types n or no, the command is executed.
# If the user types anything else (or nothing), the command is skipped.
# After typing n or no, the command can also be extended with extra parameters.

# Define color variables
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
RED='\033[0;31m'
NC='\033[0m' # No Color

confirm_and_run() {
    cmd="$1"
    echo "________________________________________________________"
    echo -e "${YELLOW}$cmd${NC}"
    echo -e -n " ${RED}Skip?${NC} (y/n): "
    read response
    case "$response" in
        [nN][oO]|[nN])
            echo -e " ${RED}Any additional parameters?${NC} ('-ps' -> '--print-status 2,0', '-ro' -> '--remove-output 0,a,y' predefined, others possible)"
            read -e -i " " extra_params

            # Add another case statement to handle the expansion
            case "$extra_params" in
                -ps)
                    extra_params="--print-status 2,0"
                    echo -e " ${GREEN}Printing status...${NC}"
                    ;;
                -ro)
                    extra_params="--remove-output 0,a,y"
                    echo -e " ${RED}Removing output...${NC}"
                    ;;
            esac
            echo -e " ${GREEN}Running...${NC}"
            eval "$cmd $extra_params"
            ;;
        *)
            echo -e " ${GREEN}Skipped!${NC}"
            ;;
    esac
}

no_confirm() {
    cmd="$1"
    echo "________________________________________________________"
    echo -e "${YELLOW}$cmd${NC}"
    echo -e " ${GREEN}Running...${NC}"
    eval "$cmd"
}
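The decision logic of confirm_and_run above (run only on n/no, expand the two predefined shorthands) can be restated as a small Python sketch so that it can be checked in isolation. This is an illustration only and not part of the PR; the names `SHORTHANDS`, `should_run` and `build_command` are hypothetical.

```python
# Hypothetical Python restatement of the prompt logic in confirm_and_run.sh.
# Only the shorthand table is taken from the script; all names are illustrative.
SHORTHANDS = {
    "-ps": "--print-status 2,0",
    "-ro": "--remove-output 0,a,y",
}


def should_run(response: str) -> bool:
    """Return True only for 'n' or 'no' (any capitalisation); anything else skips."""
    return response.strip().lower() in {"n", "no"}


def build_command(cmd: str, extra_params: str = "") -> str:
    """Append extra parameters to cmd, expanding the predefined shorthands."""
    extra = extra_params.strip()
    extra = SHORTHANDS.get(extra, extra)
    return f"{cmd} {extra}".strip()


if should_run("no"):
    print(build_command("law run cf.SelectEvents", "-ps"))
```

An empty answer to the "Skip?" prompt maps to skipping, which matches the script's default branch.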
43 changes: 31 additions & 12 deletions law.cfg
Expand Up @@ -11,6 +11,17 @@ columnflow.tasks.cms.external
topsf.tasks.plotting
topsf.tasks.wp.efficiency
topsf.tasks.inference
topsf.tasks.inference_tasks.create_workspace
topsf.tasks.inference_tasks.combine_task
topsf.tasks.inference_tasks.postfitshapes
topsf.tasks.inference_tasks.impacts
topsf.tasks.inference_v2.workspace
topsf.tasks.inference_v2.gen_toys
topsf.tasks.inference_v2.multi_dim_fit
topsf.tasks.inference_v2.post_fit_shapes
topsf.tasks.inference_v2.impacts
topsf.tasks.inference_v2.plot_impacts
topsf.tasks.inference_v2.plot_shapes

[logging]

Expand All @@ -22,19 +33,26 @@ columnflow.columnar_util-perf: INFO

[analysis]

default_analysis: topsf.config.run2.analysis_sf.analysis_sf
default_config: run2_sf_2017_nano_v9_limited
default_analysis: topsf.config.run3.analysis_sf.analysis_sf
default_config: run3_sf_2022_postEE_nano_v12
default_dataset: tt_fh_powheg
run3_analysis: topsf.config.run3.analysis_sf.analysis_sf
run3_config: run3_sf_2022_preEE_nano_v12

default_keep_reduced_events: True

production_modules: columnflow.production.{categories,normalization,mc_weight,pileup,processes,seeds}, columnflow.production.cms.{btag,electron,mc_weight,muon,pdf,pileup,scale,seeds}, topsf.production.{default,gen_top}
calibration_modules: columnflow.calibration.cms.{jets,met}, topsf.calibration.default
calibration_modules: columnflow.calibration.cms.{jets,met}, topsf.calibration.{default,skip_jec}
selection_modules: columnflow.selection.cms.{json_filter,met_filters}, topsf.selection.{default,categories,jet,bjet,fatjet,lepton,wp}
ml_modules: columnflow.ml
inference_modules: columnflow.inference, topsf.inference.{default,uhh2}

# namespace of all columnflow tasks
cf_task_namespace: cf

# sandbox for working with combine tasks
combine_sandbox: bash::$TOPSF_BASE/sandboxes/combine_cmssw.sh

# whether or not the ensure_proxy decorator should be skipped, even if used by task's run methods
skip_ensure_proxy: False

Expand All @@ -47,26 +65,27 @@ slurm_partition: $CF_SLURM_PARTITION
# ChunkedIOHandler defaults
chunked_io_chunk_size: 100000
chunked_io_pool_size: 2
chunked_io_debug: False
chunked_io_debug: True

# csv list of task families that inherit from ChunkedReaderMixin and whose output arrays should be
# checked for non-finite values before saving them to disk (right now, supported tasks are
# cf.CalibrateEvents, cf.SelectEvents, cf.ProduceColumns, cf.PrepareMLEvents, cf.MLEvaluation,
# cf.UniteColumns)
check_finite_output: cf.CalibrateEvents, cf.SelectEvents, cf.ProduceColumns
# check_finite_output: cf.CalibrateEvents, cf.SelectEvents, cf.ProduceColumns
check_finite_output: None

# whether to log runtimes of array functions by default
log_array_function_runtime: False


[outputs]
[outputs]

# list of all used file systems
wlcg_file_systems: wlcg_fs, wlcg_fs_infn_redirector, wlcg_fs_global_redirector

# list of file systems used by columnflow.tasks.external.GetDatasetLFNs.iter_nano_files to
# look for the correct fs per nano input file (in that order)
lfn_sources: local_desy_dcache,wlcg_fs_infn_redirector, wlcg_fs_global_redirector
lfn_sources: local_desy_dcache, wlcg_fs_infn_redirector, wlcg_fs_global_redirector

# output locations per task family
# for local targets : "local[, LOCAL_FS_NAME or STORE_PATH]"
Expand All @@ -78,12 +97,12 @@ cf.BundleBashSandbox: local
cf.BundleCMSSWSandbox: local
cf.BundleExternalFiles: local
# GetDatasetLFNs requires a Grid certificate -> use a common space to store the output
cf.GetDatasetLFNs: local, /nfs/dust/cms/user/dsavoiu/store/mttbar/data
cf.GetDatasetLFNs: local
cf.CalibrateEvents: wlcg
cf.SelectEvents: wlcg
cf.CreateCutflowHistograms: wlcg
cf.PlotCutflow: wlcg
cf.PlotCutflowVariables: wlcg
cf.PlotCutflow: local
cf.PlotCutflowVariables: local
cf.ReduceEvents: wlcg
cf.MergeReducedEvents: wlcg
cf.ProduceColumns: wlcg
Expand All @@ -94,7 +113,7 @@ cf.MLEvaluation: wlcg
cf.CreateHistograms: local
cf.MergeHistograms: local
cf.MergeShiftedHistograms: local
cf.PlotVariables: local
cf.PlotVariables1D: local
cf.PlotShiftedVariables: local
cf.CreateDatacards: local

Expand Down Expand Up @@ -131,7 +150,7 @@ cache_max_size: 50GB

xrootd_base: root://dcache-cms-xrootd.desy.de:1094/pnfs/desy.de/cms/tier2/store/user/$CF_CERN_USER/$CF_STORE_NAME
gsiftp_base: gsiftp://dcache-door-cms04.desy.de:2811/pnfs/desy.de/cms/tier2/store/user/$CF_CERN_USER/$CF_STORE_NAME
base: &::gsiftp_base
base: &::xrootd_base

[wlcg_fs_infn_redirector]

Expand Down
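The module lists in the [analysis] section above (for example the new calibration_modules value) use brace shorthand to name several modules at once. As a rough illustration of how such a value expands into individual module paths, here is a pure-Python sketch. It is not law's or columnflow's implementation, and the helper names are hypothetical.

```python
import re


def split_csv_outside_braces(value):
    """Split a comma-separated value, ignoring commas inside {...} groups."""
    parts, depth, current = [], 0, []
    for ch in value:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return [p for p in parts if p]


def expand_braces(entry):
    """Expand the first {a,b,...} group and recurse on each result."""
    match = re.search(r"\{([^{}]*)\}", entry)
    if match is None:
        return [entry]
    head, tail = entry[:match.start()], entry[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        expanded.extend(expand_braces(head + option.strip() + tail))
    return expanded


def expand_module_list(value):
    """Turn a config value with brace shorthand into a flat list of module paths."""
    modules = []
    for entry in split_csv_outside_braces(value):
        modules.extend(expand_braces(entry))
    return modules


calibration_modules = expand_module_list(
    "columnflow.calibration.cms.{jets,met}, topsf.calibration.{default,skip_jec}"
)
print(calibration_modules)
```

Under this reading, the change from topsf.calibration.default to topsf.calibration.{default,skip_jec} adds exactly one module, topsf.calibration.skip_jec, to the list.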
2 changes: 1 addition & 1 deletion modules/cmsdb
Submodule cmsdb updated 98 files
+7 −12 .github/workflows/lint_and_test.yaml
+19 −1 README.md
+50 −0 cmsdb/campaigns/run2_2016_HIPM_nano_uhh_v12/__init__.py
+340 −0 cmsdb/campaigns/run2_2016_HIPM_nano_uhh_v12/data.py
+484 −0 cmsdb/campaigns/run2_2016_HIPM_nano_uhh_v12/ewk.py
+1,404 −0 cmsdb/campaigns/run2_2016_HIPM_nano_uhh_v12/hh2bbtautau.py
+106 −0 cmsdb/campaigns/run2_2016_HIPM_nano_uhh_v12/higgs.py
+126 −0 cmsdb/campaigns/run2_2016_HIPM_nano_uhh_v12/st.py
+258 −0 cmsdb/campaigns/run2_2016_HIPM_nano_uhh_v12/ttbar.py
+50 −0 cmsdb/campaigns/run2_2016_nano_uhh_v12/__init__.py
+220 −0 cmsdb/campaigns/run2_2016_nano_uhh_v12/data.py
+614 −0 cmsdb/campaigns/run2_2016_nano_uhh_v12/ewk.py
+1,278 −0 cmsdb/campaigns/run2_2016_nano_uhh_v12/hh2bbtautau.py
+162 −0 cmsdb/campaigns/run2_2016_nano_uhh_v12/higgs.py
+461 −0 cmsdb/campaigns/run2_2016_nano_uhh_v12/top.py
+0 −1 cmsdb/campaigns/run2_2016_nano_v9/__init__.py
+34 −0 cmsdb/campaigns/run2_2017_JMEnano_v9/__init__.py
+88 −0 cmsdb/campaigns/run2_2017_JMEnano_v9/data.py
+117 −0 cmsdb/campaigns/run2_2017_JMEnano_v9/qcd.py
+2 −0 cmsdb/campaigns/run2_2017_nano_uhh_v11/__init__.py
+84 −80 cmsdb/campaigns/run2_2017_nano_uhh_v11/ewk.py
+226 −226 cmsdb/campaigns/run2_2017_nano_uhh_v11/hh2bbtautau.py
+22 −22 cmsdb/campaigns/run2_2017_nano_uhh_v11/higgs.py
+2 −2 cmsdb/campaigns/run2_2017_nano_uhh_v11/qcd.py
+8 −8 cmsdb/campaigns/run2_2017_nano_uhh_v11/top.py
+3 −0 cmsdb/campaigns/run2_2017_nano_v9/__init__.py
+2,895 −0 cmsdb/campaigns/run2_2017_nano_v9/azh.py
+81 −1 cmsdb/campaigns/run2_2017_nano_v9/data.py
+71 −124 cmsdb/campaigns/run2_2017_nano_v9/ewk.py
+226 −226 cmsdb/campaigns/run2_2017_nano_v9/hh2bbtautau.py
+729 −24 cmsdb/campaigns/run2_2017_nano_v9/hh2bbww.py
+26 −26 cmsdb/campaigns/run2_2017_nano_v9/higgs.py
+10 −10 cmsdb/campaigns/run2_2017_nano_v9/qcd.py
+10 −10 cmsdb/campaigns/run2_2017_nano_v9/top.py
+34 −0 cmsdb/campaigns/run2_2018_JMEnano_v9/__init__.py
+73 −0 cmsdb/campaigns/run2_2018_JMEnano_v9/data.py
+114 −0 cmsdb/campaigns/run2_2018_JMEnano_v9/qcd.py
+2 −0 cmsdb/campaigns/run2_2018_nano_uhh_v11/__init__.py
+49 −0 cmsdb/campaigns/run2_2018_nano_uhh_v12/__init__.py
+269 −0 cmsdb/campaigns/run2_2018_nano_uhh_v12/data.py
+792 −0 cmsdb/campaigns/run2_2018_nano_uhh_v12/ewk.py
+1,261 −0 cmsdb/campaigns/run2_2018_nano_uhh_v12/hh2bbtautau.py
+138 −0 cmsdb/campaigns/run2_2018_nano_uhh_v12/higgs.py
+406 −0 cmsdb/campaigns/run2_2018_nano_uhh_v12/top.py
+14 −8 cmsdb/campaigns/run2_2018_nano_v9/__init__.py
+44 −0 cmsdb/campaigns/run3_2022_postEE_nano_v11/__init__.py
+107 −0 cmsdb/campaigns/run3_2022_postEE_nano_v11/data.py
+130 −0 cmsdb/campaigns/run3_2022_postEE_nano_v11/ewk.py
+397 −0 cmsdb/campaigns/run3_2022_postEE_nano_v11/qcd.py
+575 −0 cmsdb/campaigns/run3_2022_postEE_nano_v11/top.py
+47 −0 cmsdb/campaigns/run3_2022_postEE_nano_v12/__init__.py
+161 −0 cmsdb/campaigns/run3_2022_postEE_nano_v12/data.py
+963 −0 cmsdb/campaigns/run3_2022_postEE_nano_v12/ewk.py
+57 −0 cmsdb/campaigns/run3_2022_postEE_nano_v12/hh2bbww.py
+15 −0 cmsdb/campaigns/run3_2022_postEE_nano_v12/hhh4b2tau.py
+1,207 −0 cmsdb/campaigns/run3_2022_postEE_nano_v12/higgs.py
+392 −0 cmsdb/campaigns/run3_2022_postEE_nano_v12/qcd.py
+1,074 −0 cmsdb/campaigns/run3_2022_postEE_nano_v12/top.py
+45 −0 cmsdb/campaigns/run3_2022_preEE_nano_uhh_v12/__init__.py
+181 −0 cmsdb/campaigns/run3_2022_preEE_nano_uhh_v12/data.py
+430 −0 cmsdb/campaigns/run3_2022_preEE_nano_uhh_v12/ewk.py
+1,070 −0 cmsdb/campaigns/run3_2022_preEE_nano_uhh_v12/hh2bbtautau.py
+220 −0 cmsdb/campaigns/run3_2022_preEE_nano_uhh_v12/higgs.py
+739 −0 cmsdb/campaigns/run3_2022_preEE_nano_uhh_v12/top.py
+44 −0 cmsdb/campaigns/run3_2022_preEE_nano_v11/__init__.py
+77 −0 cmsdb/campaigns/run3_2022_preEE_nano_v11/data.py
+129 −0 cmsdb/campaigns/run3_2022_preEE_nano_v11/ewk.py
+394 −0 cmsdb/campaigns/run3_2022_preEE_nano_v11/qcd.py
+565 −0 cmsdb/campaigns/run3_2022_preEE_nano_v11/top.py
+47 −0 cmsdb/campaigns/run3_2022_preEE_nano_v12/__init__.py
+176 −0 cmsdb/campaigns/run3_2022_preEE_nano_v12/data.py
+951 −0 cmsdb/campaigns/run3_2022_preEE_nano_v12/ewk.py
+56 −0 cmsdb/campaigns/run3_2022_preEE_nano_v12/hh2bbww.py
+15 −0 cmsdb/campaigns/run3_2022_preEE_nano_v12/hhh4b2tau.py
+1,246 −0 cmsdb/campaigns/run3_2022_preEE_nano_v12/higgs.py
+411 −0 cmsdb/campaigns/run3_2022_preEE_nano_v12/qcd.py
+1,132 −0 cmsdb/campaigns/run3_2022_preEE_nano_v12/top.py
+61 −8 cmsdb/constants/__init__.py
+4 −1 cmsdb/processes/__init__.py
+2,129 −0 cmsdb/processes/azh.py
+21 −1 cmsdb/processes/data.py
+1,193 −168 cmsdb/processes/ewk.py
+433 −0 cmsdb/processes/hh.py
+1,180 −424 cmsdb/processes/hh2bbtautau.py
+713 −0 cmsdb/processes/hh2bbvv.py
+0 −151 cmsdb/processes/hh2bbww.py
+39 −0 cmsdb/processes/hhh.py
+1,366 −238 cmsdb/processes/higgs.py
+369 −22 cmsdb/processes/qcd.py
+254 −87 cmsdb/processes/top.py
+108 −1 cmsdb/util.py
+187 −0 cmsdb/xsec_bsm_nodes.py
+2 −2 requirements.txt
+1 −2 requirements_dev.txt
+269 −0 scripts/get_das_info.py
+13 −0 tests/__init__.py
+153 −0 tests/test_campaigns.py
+34 −0 tests/test_processes.py
2 changes: 1 addition & 1 deletion modules/columnflow
Submodule columnflow updated 216 files
16 changes: 16 additions & 0 deletions sandboxes/_setup_combine.sh
Expand Up @@ -252,6 +252,22 @@ setup_combine() {
return "3002"
}

# clone the combine harvester repo
cd ${CMSSW_BASE}/src
git clone "${CF_COMBINE_HARVESTER_GIT_URL}" CombineHarvester || {
>&2 echo "failed to clone CombineHarvester git repository from URL ${CF_COMBINE_HARVESTER_GIT_URL}"
clear_pending
return "3003"
}

# check out the specified combine harvester version
cd CombineHarvester
git checkout "${CF_COMBINE_HARVESTER_VERSION}" || {
>&2 echo "failed to check out revision ${CF_COMBINE_HARVESTER_VERSION} from git repository"
clear_pending
return "3004"
}

# compile
cd ${CMSSW_BASE}
scram b -j4
Expand Down
8 changes: 5 additions & 3 deletions sandboxes/combine_cmssw.sh
Expand Up @@ -11,9 +11,11 @@ action() {
# set variables and source the combine setup
export CF_SANDBOX_FILE="${CF_SANDBOX_FILE:-${this_file}}"
export CF_COMBINE_GIT_URL="${CF_COMBINE_GIT_URL:-https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit.git}"
export CF_COMBINE_SCRAM_ARCH="$( [ "${os_version}" = "8" ] && echo "el8" || echo "slc7" )_amd64_gcc10"
export CF_COMBINE_CMSSW_VERSION="CMSSW_12_6_2"
export CF_COMBINE_VERSION="${CF_COMBINE_VERSION:-v9.1.0}"
export CF_COMBINE_HARVESTER_GIT_URL="${CF_COMBINE_HARVESTER_GIT_URL:-https://github.com/cms-analysis/CombineHarvester.git}"
export CF_COMBINE_SCRAM_ARCH="el9_amd64_gcc12"
export CF_COMBINE_CMSSW_VERSION="CMSSW_14_1_0_pre5" # from combine docu 23.07.24
export CF_COMBINE_HARVESTER_VERSION="${CF_COMBINE_HARVESTER_VERSION:-v3.0.0-pre1}"
export CF_COMBINE_VERSION="${CF_COMBINE_VERSION:-v10.0.1}" # from combine docu 23.07.24
export CF_COMBINE_ENV_NAME="$( basename "${this_file%.sh}" )"
export CF_COMBINE_FLAG="1" # increment when content changed

Expand Down
15 changes: 12 additions & 3 deletions test_config.py
Expand Up @@ -16,6 +16,12 @@

analysis_inst = ana = AnalysisTask.get_analysis_inst(run3_analysis)
config_inst = cfg = ana.get_config(run3_config)
# ana = AnalysisTask.get_analysis_inst(default_analysis)
# config_inst = cfg = ana.get_config(default_config)

print(f"================= Analysis: {ana.name} ======================")
print(f"ID: {ana.id}")
print(f"Config: {cfg.name}")

print(" ================ Processes ======================")
process_insts = cfg.processes
Expand All @@ -33,7 +39,8 @@
dataset_insts = cfg.datasets
for data_inst in dataset_insts:
print(f"{data_inst.name}; N_events: {data_inst.n_events:,}")
print(f"Sum of all mc events: {sum([data_inst.n_events for data_inst in dataset_insts]):,}")
print(cfg.datasets.get(data_inst.name).processes.values()[0].xsecs)
print(f"Sum of all mc events: {sum([data_inst.n_events for data_inst in dataset_insts if not data_inst.name.startswith('data')]):,}")
print(f"Number of datasets: {len(dataset_insts)}")
print(f"Number of dataset files: {sum([data_inst.n_files for data_inst in dataset_insts])}")

Expand Down Expand Up @@ -61,8 +68,10 @@
print("================= Auxiliary ======================")
aux = cfg.aux
for key, value in aux.items():
print(key)
print(cfg.tags)
print(f"aux entry {key}")
print(f"tags: {cfg.tags}")
if cfg.has_tag("is_top_sf"):
print("This is a top SF config")

# print some features of an exemplary process inst
proc_inst = cfg.get_process("tt")
Expand Down
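The changed summary line in test_config.py above counts only simulated events by skipping datasets whose name starts with "data". A minimal sketch of that filter, using a hypothetical `FakeDataset` stand-in and made-up event counts instead of real config objects:

```python
from dataclasses import dataclass


@dataclass
class FakeDataset:
    """Hypothetical stand-in for a dataset object; only the fields used below."""
    name: str
    n_events: int
    n_files: int


# made-up example values, not taken from any campaign
dataset_insts = [
    FakeDataset("data_mu_c", 1_000, 2),
    FakeDataset("tt_sl_powheg", 5_000, 4),
    FakeDataset("st_tchannel_t", 2_500, 1),
]

# same name-based filter as in the diff: everything not starting with "data" counts as MC
n_mc_events = sum(d.n_events for d in dataset_insts if not d.name.startswith("data"))
n_files = sum(d.n_files for d in dataset_insts)

print(f"Sum of all mc events: {n_mc_events:,}")
print(f"Number of datasets: {len(dataset_insts)}")
print(f"Number of dataset files: {n_files}")
```

The filter relies on the naming convention alone, so a simulated dataset whose name happened to start with "data" would be dropped from the sum.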
18 changes: 18 additions & 0 deletions topsf/columnflow_patches.py
Expand Up @@ -8,11 +8,28 @@

import law
from columnflow.util import memoize
import getpass


logger = law.logger.get_logger(__name__)


@memoize
def patch_htcondor_workflow_naf_resources():
    """
    Patches the HTCondorWorkflow task to declare user-specific resources when running on the NAF.
    """
    from columnflow.tasks.framework.remote import HTCondorWorkflow

    def htcondor_job_resources(self, job_num, branches):
        # one "naf_<username>" resource per job, independent of the number of branches in the job
        return {f"naf_{getpass.getuser()}": 1}

    HTCondorWorkflow.htcondor_job_resources = htcondor_job_resources

    logger.debug(f"patched htcondor_job_resources of {HTCondorWorkflow.task_family}")


@memoize
def patch_bundle_repo_exclude_files():
    from columnflow.tasks.framework.remote import BundleRepo
Expand All @@ -37,3 +54,4 @@ def patch_bundle_repo_exclude_files():
@memoize
def patch_all():
    patch_bundle_repo_exclude_files()
    patch_htcondor_workflow_naf_resources()
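The patch above replaces a method on the workflow class so that every job declares one user-specific resource. The pattern can be tried out without columnflow using the sketch below. `DummyWorkflow` is a hypothetical stand-in for HTCondorWorkflow, and the `get_user` argument is added here only so the example does not depend on the local account name.

```python
import getpass


class DummyWorkflow:
    """Hypothetical stand-in for the patched workflow class (not columnflow's)."""

    task_family = "dummy.Workflow"

    def htcondor_job_resources(self, job_num, branches):
        # unpatched behaviour in this sketch: no resources declared
        return {}


def patch_job_resources(cls, get_user=getpass.getuser):
    """Replace cls.htcondor_job_resources with a per-user resource declaration."""

    def htcondor_job_resources(self, job_num, branches):
        # one "naf_<username>" resource per job, independent of the number of branches
        return {f"naf_{get_user()}": 1}

    cls.htcondor_job_resources = htcondor_job_resources


before = DummyWorkflow().htcondor_job_resources(0, [0, 1, 2])
patch_job_resources(DummyWorkflow, get_user=lambda: "alice")  # "alice" is a made-up user
after = DummyWorkflow().htcondor_job_resources(0, [0, 1, 2])

print(before, after)
```

Because the assignment happens on the class, instances created before and after the patch both pick up the new method.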
4 changes: 2 additions & 2 deletions topsf/config/categories.py
Expand Up @@ -219,7 +219,7 @@ def sel_tau32(
False,
)

assert cat_idx < 10**cat_idx_ndigits - 1, "no space for category, ID reassignement necessary"
assert cat_idx < 10**cat_idx_ndigits - 1, "no space for category, ID reassignment necessary"
cat = config.add_category(
name=cat_name,
id=int(10**cat_idx_lsd * ((cat_idx + 1) + 10 * 3)),
Expand All @@ -237,7 +237,7 @@ def sel_tau32(
("pass", "<", slice(None, cat_idx + 1)),
("fail", ">", slice(cat_idx + 1, None)),
]):
cat_label = rf"$\tau_{{3}}/\tau_{{2}}$ {comp_symbol} {tau32_val} ({pass_fail})"
cat_label = rf"{tau32_wp} wp: $\tau_{{3}}/\tau_{{2}}$ {comp_symbol} {tau32_val} ({pass_fail})"

cat_name = f"tau32_wp_{tau32_wp}_{pass_fail}"
sel_name = f"sel_{cat_name}"
Expand Down
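The category ID in the hunk above is built from the running index cat_idx, a power-of-ten shift cat_idx_lsd, and a fixed offset (10 * 3 here, 300 in categories_wp.py). The values of cat_idx_lsd and cat_idx_ndigits are not visible in this diff, so the numbers in the following worked sketch are assumptions chosen for illustration, and `category_id` is a hypothetical helper.

```python
def category_id(cat_idx, cat_idx_lsd, cat_idx_ndigits, offset=10 * 3):
    """Compute a category ID the way the diff does, with an explicit offset argument."""
    # same guard as in the diff: the index must fit into the reserved digits
    assert cat_idx < 10**cat_idx_ndigits - 1, "no space for category, ID reassignment necessary"
    return int(10**cat_idx_lsd * ((cat_idx + 1) + offset))


# assumed values: least significant digit position 2, one digit reserved for the index
print(category_id(0, cat_idx_lsd=2, cat_idx_ndigits=1))  # 100 * (1 + 30)
print(category_id(4, cat_idx_lsd=2, cat_idx_ndigits=1))  # 100 * (5 + 30)

# the categories_wp.py variant uses an offset of 300, here with two reserved digits
print(category_id(0, cat_idx_lsd=2, cat_idx_ndigits=2, offset=300))  # 100 * (1 + 300)
```

With one reserved digit, the guard rejects cat_idx = 9, since cat_idx + 1 would then spill into the digit holding the offset.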
2 changes: 0 additions & 2 deletions topsf/config/categories_wp.py
Expand Up @@ -41,7 +41,6 @@
from columnflow.categorization import Categorizer, categorizer

from topsf.config.util import create_category_combinations
from topsf.production.probe_jet import probe_jet

np = maybe_import("numpy")
ak = maybe_import("awkward")
Expand Down Expand Up @@ -153,7 +152,6 @@ def sel_pt_init(self: Categorizer) -> None:
column = self.cfg.get("column", "FatJet")
self.uses.add(f"{column}.pt")


assert cat_idx < 10**cat_idx_ndigits - 1, "no space for category, ID reassignment necessary"
cat_id = int(10**cat_idx_lsd * ((cat_idx + 1) + 300))
print(f"{cat_name = }, {cat_id = }, {cat_idx_lsd = }")
Expand Down