Running Tasks

Yannik Rath edited this page Jan 22, 2020 · 1 revision

The typical commands for running the full scale factor (SF) calculation are listed below. Later commands can also be triggered directly, but running them in the order given allows a step-by-step production.

Initial SF calculation

1. Write- and MergeTrees

law run WriteTreesWrapper --version prod1 --WriteTrees-transfer-logs --WriteTrees-threads 2 --WriteTrees-poll-interval 5 --workers <X>

Due to the high number of data files, it may make sense to first run only the MC samples (--skip-datasets "data_*") and then process the data with multiple files per job (--WriteTrees-tasks-per-job <X>).

law run MergeTreesWrapper --version prod1 --MergeTrees-transfer-logs --MergeTrees-threads 2 --MergeTrees-poll-interval 5 --workers <X>

2. Write- and MergeHistograms

law run WriteHistogramsWrapper --version prod1 --WriteHistograms-variable-tag measurement --WriteHistograms-b-tagger deepcsv --MergeTrees-n-cascade-leaves 300 --WriteHistograms-transfer-logs --WriteHistograms-poll-interval 5 --WriteHistograms-threads 2 --workers <X>

This task (and all following tasks) should be run for all considered b-taggers (currently deepcsv and deepjet) separately.

Merging histograms is done simply with

law run MergeHistograms --version prod1 --b-tagger deepcsv
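
Since every task from this step onward has to be run once per b-tagger, a small loop can save some typing. The following is only a dry-run sketch: it prints the merge command for the two taggers named above instead of executing it, and the function name print_merge_commands is made up for illustration. Replacing the echo with the command itself would run it for real, assuming law is set up in the current shell.

```shell
# Dry-run sketch: print one MergeHistograms command per b-tagger.
# print_merge_commands is a hypothetical helper name, not part of the framework.
print_merge_commands() {
    version="$1"
    for tagger in deepcsv deepjet; do
        echo "law run MergeHistograms --version $version --b-tagger $tagger"
    done
}

print_merge_commands prod1
```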

Optimized Binning

To run with optimized binning, add the flag --WriteHistograms-optimize-binning. This requires first calculating an optimized binning. For this, write and merge histograms specifying the binning with --binning "<n_bins>,<min>,<max>". Make sure to require the correct versions of Write- and MergeHistograms. The OptimizeBinning task can be run as

law run OptimizeBinning --version prod1 --MergeHistograms-version prod1_binning --b-tagger deepcsv --is-configured

3. Measure- and FitScaleFactors

law run FitScaleFactorsWrapper --version prod1 --FitScaleFactors-b-tagger deepcsv --skip-shifts "c_stats*" --workers <X>

Iterative procedure

The scale factor calculation is repeated, using the initial SFs to improve the modeling of the contamination. All tasks starting from WriteHistograms need to be re-run with --iteration 1 as an additional argument, and then with --iteration 2 (for WrapperTasks, use --<task-name>-iteration X).
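
The re-run pattern described above can be sketched as a loop over the iteration number. This is a dry run that only prints commands. It reuses the flags from the initial calculation and adds the iteration argument in the form described above (--<task-name>-iteration for the wrapper tasks, plain --iteration for MergeHistograms). Options such as --workers, --MergeTrees-n-cascade-leaves, the logging flags, and --WriteHistograms-optimize-binning are left out for brevity and would need to be added as in the initial run. The helper name print_iteration_commands is hypothetical.

```shell
# Dry-run sketch: print the per-iteration command chain for one b-tagger.
# print_iteration_commands is a hypothetical helper name.
print_iteration_commands() {
    tagger="$1"
    for it in 1 2; do
        # Wrapper tasks take the iteration as --<task-name>-iteration.
        echo "law run WriteHistogramsWrapper --version prod1 --WriteHistograms-variable-tag measurement --WriteHistograms-b-tagger $tagger --WriteHistograms-iteration $it"
        # Non-wrapper tasks take it as plain --iteration.
        echo "law run MergeHistograms --version prod1 --b-tagger $tagger --iteration $it"
        echo "law run FitScaleFactorsWrapper --version prod1 --FitScaleFactors-b-tagger $tagger --FitScaleFactors-iteration $it --skip-shifts \"c_stats*\""
    done
}

print_iteration_commands deepcsv
```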

At the end of the final iteration, the scale factors are renormalized. The required weights for this are calculated with

law run GetScaleFactorWeightsWrapper --skip-datasets "data*" --GetScaleFactorWeights-iteration 2 --version prod1 --MergeTrees-n-cascade-leaves 300 --GetScaleFactorWeights-b-tagger deepcsv --GetScaleFactorWeights-optimize-binning --workers <X>

and then merged

law run MergeScaleFactorWeights --version prod1 --b-tagger deepcsv --optimize-binning --iteration 2

Finally, the new scale factors are calculated by adding --FitScaleFactors-fix-normalization to the FitScaleFactorsWrapper.

With the normalized scale factors, we can now also calculate the systematics for c-jets (which are built from the b-jet systematics). For this, run the FitScaleFactorsWrapper without --skip-shifts "c_stats*" (and also without --FitScaleFactors-fix-normalization).

Now we also need to normalize the c scale factors by running GetScaleFactorWeights and MergeScaleFactorWeights with --normalize-cerrs, and then run FitScaleFactorsWrapper one final time without --skip-shifts and with --FitScaleFactors-fix-normalization.
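
The three FitScaleFactorsWrapper passes of the final iteration differ only in two flags, which are easy to mix up. The dry-run sketch below prints them for deepcsv in the order described above. It assumes that the wrapper takes the iteration as --FitScaleFactors-iteration (following the --<task-name>-iteration rule) and leaves out --workers and any binning flags. The GetScaleFactorWeights and MergeScaleFactorWeights runs with --normalize-cerrs belong between the second and third pass and are only marked by a comment. The function name print_fit_passes is hypothetical.

```shell
# Dry-run sketch: print the three FitScaleFactorsWrapper passes of the final iteration.
# print_fit_passes is a hypothetical helper name; --FitScaleFactors-iteration is assumed
# from the --<task-name>-iteration rule for wrapper tasks.
print_fit_passes() {
    base="law run FitScaleFactorsWrapper --version prod1 --FitScaleFactors-b-tagger deepcsv --FitScaleFactors-iteration 2"
    # Pass 1: renormalized scale factors, c-jet shifts still skipped.
    echo "$base --skip-shifts \"c_stats*\" --FitScaleFactors-fix-normalization"
    # Pass 2: include the c-jet systematics, without the normalization fix.
    echo "$base"
    # (GetScaleFactorWeights and MergeScaleFactorWeights with --normalize-cerrs run here.)
    # Pass 3: all shifts, with the normalization fix.
    echo "$base --FitScaleFactors-fix-normalization"
}

print_fit_passes
```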

The results in BTV-ready format can be created with

law run CreateScaleFactorResults --iteration 2 --version prod1 --b-tagger deepcsv --optimize-binning

Plotting

The scale factors at any point in the calculation can be plotted with

law run PlotScaleFactor --version prod1 --b-taggers "deepcsv" --versions "prod1" --iterations 2 --FitScaleFactors-optimize-binning --MeasureScaleFactors-optimize-binning --fix-normalization --shifts <X>

--b-taggers, --versions, and --iterations each accept multiple comma-separated values, in which case the different SF results are plotted for comparison. --shifts also accepts multiple values and draws the systematics envelope if more than one is provided. It defaults to all shifts.
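
As an illustration of the comma-separated form, the following dry-run sketch builds a PlotScaleFactor command that compares both b-taggers. Only the --b-taggers value differs from the command above. The join_by_comma helper is a hypothetical convenience function, and --shifts is left out so that the default (all shifts) applies.

```shell
# join_by_comma is a hypothetical helper: it joins its arguments with commas.
join_by_comma() {
    joined=""
    for item in "$@"; do
        if [ -z "$joined" ]; then
            joined="$item"
        else
            joined="$joined,$item"
        fi
    done
    echo "$joined"
}

taggers="$(join_by_comma deepcsv deepjet)"

# Dry run: print the comparison command instead of executing it.
echo "law run PlotScaleFactor --version prod1 --b-taggers \"$taggers\" --versions \"prod1\" --iterations 2 --FitScaleFactors-optimize-binning --MeasureScaleFactors-optimize-binning --fix-normalization"
```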

Variables can be plotted with

law run PlotVariable --b-tagger deepcsv --category-tag inclusive --variable "jet{i_probe_jet}_{b_tag_var}_{region}_nominal" --iteration 2 --mc-split process --version prod1 --draw-stacked --truncate

The variable supports templating, with arguments taken from the category information. The MC sample is either divided into processes or flavors. --truncate should be used with b-tag variables, as the first bin goes from -2 to 0.

To plot variables with all scale factors combined, one last histogram version should be created with --iteration 3 --final-it. This is both to apply the final scale factors, and because only the contamination will be scaled in the measurement regions otherwise.

Plots for Analysis Note

(data - cont) vs 'signal'

law run PlotContaminationSubtracted --b-tagger deepjet --category-tag merged --iteration 2 --normalize --truncate --suffix bdiff --x-title "probe jet {b_tag_var}" --version test1

data vs MC directly before final scale factor calculation

law run PlotVariable --b-tagger deepjet --category-tag merged --iteration 2 --normalize --truncate --draw-stacked --mc-split flavor --suffix astack --x-title "probe jet {b_tag_var}" --version test1

final scale factors (no shifts)

law run PlotScaleFactor --iterations 2 --FitScaleFactors-optimize-binning --MeasureScaleFactors-optimize-binning --fix-normalization --b-taggers deepjet --shifts "NONE" --suffix sf --x-title "{b_tag_var} discriminator" --version test1

SF Validation

probe jet distributions before corrections

law run PlotVariable --b-tagger deepjet --category-tag combined --normalize --truncate --draw-stacked --mc-split process --x-title "probe jet {b_tag_var}" --suffix "probe_jet" --version test1

probe jet distributions after corrections

law run PlotVariable --b-tagger deepjet --category-tag combined --iteration 3 --final-it --normalize --truncate --draw-stacked --mc-split process --x-title "probe jet {b_tag_var}" --suffix "probe_jet" --version test1

For the n-tag distributions, repeat the same commands, but with --category-tag inclusive, --variable "n_tags_{b_tag_var}_{shift}", --x-title "numTag ({b_tag_var})", and --suffix "n_tags".

For the semileptonic region, repeat the commands with --category-tag sl, --suffix "sl_disc", --x-title "{b_tag_var} Discriminator", and --variable "jet{i_flavor_jet}_{b_tag_var}_hf_{shift}".
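
Spelled out, the substitutions above give the following commands. This is a dry-run sketch that prints them rather than running them. It starts from the two probe jet commands of this section (before and after corrections) and swaps in the category tag, variable, x-title, and suffix listed above. The helper name print_validation_commands is hypothetical.

```shell
# Dry-run sketch: print the before/after-correction PlotVariable commands
# for a given category tag, variable, x-title, and suffix.
# print_validation_commands is a hypothetical helper name.
print_validation_commands() {
    category="$1"; variable="$2"; xtitle="$3"; suffix="$4"
    # First without iteration flags (before corrections),
    # then with --iteration 3 --final-it (after corrections).
    for extra in "" "--iteration 3 --final-it "; do
        echo "law run PlotVariable --b-tagger deepjet --category-tag $category --variable \"$variable\" ${extra}--normalize --truncate --draw-stacked --mc-split process --x-title \"$xtitle\" --suffix \"$suffix\" --version test1"
    done
}

# n-tag distributions
print_validation_commands inclusive 'n_tags_{b_tag_var}_{shift}' 'numTag ({b_tag_var})' n_tags
# semileptonic region
print_validation_commands sl 'jet{i_flavor_jet}_{b_tag_var}_hf_{shift}' '{b_tag_var} Discriminator' sl_disc
```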