The AnalyticAnomalousCoupling provides a Combine-based model for EFT fits overcoming the issues of negative templates that may arise from interference terms.
Suppose to expand the SM Lagrangian with the inclusion of a dimension 6 operator (
Then the scattering amplitude would be written as
Where
As a consequence, the expected number of events in a given phase-space region scales with the Wilson coefficient
In order to fit the parameter
This package provides a workaround to this problem by rewriting the formula in terms of positive definite quantities (amplitude squared). For the simple example of the one-operator case the formule reads as:
By renaming
Sm =
Lin
Quad
Where the component Sm is the Standard Model amplitude squared, Quad is the dimmension-6 amplitude squared and Sm + Lin + Quad is given by the square of the sum of the SM and dimension-6 amplitudes
This model generalises the previous strategy to an arbitrary number of operators where the expected nuber of events in a given phase-space can be now written as:
And the rewriting in terms of positive defined quantities reads as:
Where
Sm =
Quad
Sm + Lin
Sm + Lin
cmsrel CMSSW_10_2_13
cd CMSSW_10_2_13/src
cmsenv
git clone https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit.git HiggsAnalysis/CombinedLimit
cd HiggsAnalysis/CombinedLimit
git fetch origin
git checkout v8.2.0
cd ..
git clone [email protected]:amassiro/AnalyticAnomalousCoupling.git
scramv1 b clean; scramv1 b # always make a clean build
cmsrel CMSSW_14_1_0_pre4
cd CMSSW_14_1_0_pre4/src
cmsenv
git clone https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit.git HiggsAnalysis/CombinedLimit
cd HiggsAnalysis/CombinedLimit
git fetch origin
git checkout v10.0.1
cd ..
git clone [email protected]:amassiro/AnalyticAnomalousCoupling.git
cd AnalyticAnomalousCoupling; git checkout el9-cmssw; cd -
scramv1 b clean; scramv1 b # always make a clean build
AnomalousCouplingEFTNegative
So you might ask yourself how to build templates and datacards in a way compatible with the combine model.
The model AnomalousCouplingEFTNegative
accepts the following inputs positively defined:
- SM
- SM + Lin
$_{\alpha}$ + Quad$_{\alpha}$ - Quad
$_{\alpha}$ - SM + Lin
$_{\alpha}$ + Quad$_{\alpha}$ + Lin$_{\beta}$ + Quad$_{\beta}$ + 2$\cdot$ Mix$_{\alpha, \beta}$
In the case of one operator, only the first three templates are expected, otherwise the last component has to be produced for each operator pair (
From MC tools one can usually retrieve the single shapes for SM, Lin, Quad components. For more explanation on how to generate these shapes with amplitude decomposition and reweighting methods we refer to the chapter Generating the shapes below in this guide or to the overwhelming literature.
Likely templates for a real-life situation are shown in figure as an example. Note how the linear component is negative and is not allowed in combine
The templates for SM
and Quad
are already what you need for the AnomalousCouplingEFTNegative
model. The linear shape should be substituted with SM+Lin+Quad
simply by summing all the shapes in the previous figure. An example of the result is reported in the following image
The datacard that can be built with those shapes reads as:
## Shape input card
imax 1 number of channels
jmax * number of background
kmax * number of nuisance parameters
----------------------------------------------------------------------------------------------------
bin inWW_cW
observation 0
shapes * * shapes/histos_inWW_cW.root histo_$PROCESS histo_$PROCESS_$SYSTEMATIC
shapes data_obs * shapes/histos_inWW_cW.root histo_Data
bin inWW_cW inWW_cW inWW_cW
process sm sm_lin_quad_cW quad_cW
process 1 2 3
rate 30611.7690 34426.6029 3957.9833
----------------------------------------------------------------------------------------------------
lumi lnN 1.02 1.02 1.02
----------------------------------------------------------------------------------------------------
The datacard name for the processes field should follow a precise convention in order for the model to build the pdfs correctly. This table summarises a 2 operator case (
Template | Datacard Name | Expression combine model |
---|---|---|
SM | sm |
func_sm("@0*(1-(@1+@2-@1*@2))",r,k_cW, k_cHW) |
Quad |
quad_cW |
func_quadratic_cW("@0*(@1*@1-@1)",r,k_cW) |
SM + Lin |
sm_lin_quad_cW |
func_sm_linear_quadratic_cW("@0*(@1 * (1-(@2) ))",r,k_cW, k_cHW) |
Quad |
quad_cHW |
func_quadratic_cHW("@0*(@1*@1-@1)",r,k_cHW) |
SM + Lin |
quad_cHW |
func_sm_linear_quadratic_cHW("@0*(@1 * (1-(@2) ))",r,k_cHW, k_cW) |
SM + Lin |
sm_lin_quad_mixed_cW_cHW |
func_sm_linear_quadratic_mixed_cW_cHW("@0*@1*@2",r,k_cW,k_cHW) |
cd test
text2workspace.py \
datacard1opNew.txt \
-P HiggsAnalysis.AnalyticAnomalousCoupling.AnomalousCouplingEFTNegative:analiticAnomalousCouplingEFTNegative \
-o model_test.root --X-allow-no-signal \
--PO eftOperators=cG
combine -M MultiDimFit model_test.root --algo=grid --points 2000 -m 125 -t -1 \
--redefineSignalPOIs k_cG \
--freezeParameters r \
--setParameters r=1 --setParameterRanges k_cG=-10,10 \
--verbose -1
r99t higgsCombineTest.MultiDimFit.mH125.root higgsCombineTest.MultiDimFit.mH125.root draw.cxx\(\"k_cG\"\)
Remember you need to specify the operators you are considering, because the more operators you want to consider the more inputs you need to provide in the datacard.
You can also add the dim8 operators by
--PO addDim8
but you can also just define the new operators by
--PO eftOperators=cS0,cS1,cT0
A model similar to AnomalousCouplingEFTNegative support different kind of EFT inputs: AnaliticAnomalousCouplingEFTNegativeExtended. In particular: EFT2Obs EFT analyses use JSON containing the bin-by-bin EFT parametrization from fiducial bins as defined in reco-level datacards. On the other hand, old EFT analyses were based on the [aTGCRooStat] (https://twiki.cern.ch/twiki/bin/viewauth/CMS/ATGCRooStats) framework where the EFT parametrization is stored in root files as histograms (TH*F) or functions (TF).
For the first kind of datacards, the combination works out of the box, one just needs to specify an additional command line argument while running text2workspace.py
so that the model can parse the EFT2Obs json file with the parametrization:
text2workspace.py combined.txt -P HiggsAnalysis.AnalyticAnomalousCoupling.AnaliticAnomalousCouplingEFTNegativeExtended:analiticAnomalousCouplingEFTNegativeExtended --X-allow-no-signal -o combined.root --PO eftOperators=cW --X-allow-no-background --PO EFTJsonMap=converted_aTGC_EFT2Obs_Wg_SMEFT_cx_into_cw.json
For the second type of analyses, exploiting aTGCRooStat, the problem is more subtle. The main idea is to convert the information inside the root files with the signal parametrization into a json file such as the one from EFT2Obs so that the combine model can readily digest them. The signal parametrization should be provided for each bin of the fitted distribution and each bin should be defined as a different process in the datacard (this should always be the case for aTGCRooStat datacards). The downside of this method is that one needs not only the datacards and the shapes but also the signal parametrization files. A script is provided to convert the parametrization from ROOT files to JSON files: convertATGCRooStatToJson.py
.
This script makes some assumptions: You should already have aTGCRooStat datacards and all root files should be available, those containing the EFT parametrization as TF(1,2,3) and those of the backgrounds. This scripts reads the parametrization files and converts it into a json file that can be read by AnalyticAnomalousCoupling EFT model.
The EFT parametrization file names should follow this convention:
<whatever>_<name>.root
Where <name>
is the one appearing in the datacard except for an arbitrary prefix that can be added
to all samples by specifying the c.l.a. -pp (or --processprefix) so that the datacard name reads as
<processprefix>_<name>
The parametrization should be know and should be the same for all the root input files. For example the files could contain a TF3 function with following content:
[0]+[1]*x+[2]*y+[3]*z+[4]*x*y+[5]*x*z+[6]*y*z+[7]*x*x+[8]*y*y+[9]*z*z
We do not know what the x
coeff., namely [1]
, stands for so it should be provided as an input to this script.
For example, we have multiple files containing the parametrization for each bin as:
signal_proc_ZVBF_ptZ_mu_3D_binx0.root, signal_proc_ZVBF_ptZ_mu_3D_binx1.root
signal_proc_ZVBF_ptZ_mu_3D_binx2.root, signal_proc_ZVBF_ptZ_mu_3D_binx3.root, ...
And atGCRooStat
compliant datacards with process names:
anoCoupl_process_ZVBF_ptZ_mu_3D_binx0, anoCoupl_process_ZVBF_ptZ_mu_3D_binx1
anoCoupl_process_ZVBF_ptZ_mu_3D_binx2, anoCoupl_process_ZVBF_ptZ_mu_3D_binx3, ...
The EFT parametrization in the signal_proc_*
files reads as (from a priori knowledge from analysts that developed the parametrization):
sm + x*lin_cx + y*lin_cw + z*lin_cb + x*y*mix_cx_cw + x*z*mix_cx_cb + y*z*mix_cw_cb +
+ x*x*quad_cx + y*y*quad_cw + z*z*quad_cb
Therefore this script can be called as:
python convertToJson.py -fg signal_proc_*.root -pp anoCoupl_process -fp signal_proc
-c sm:0,lin_cx:1,lin_cw:2,lin_cb:3,mix_cx_cw:4,mix_cx_cb:5,mix_cw_cb:6,quad_cx:7,quad_cw:8,quad_cb:9
-mp 8 -o converted_aTGC_EFT2Obs.json
Both of the methods have been cross-checked with two analysis examples: SMP-20-005 for the EFT2Obs part and SMP-16-018 for the aTGCRooStat
. A perfect agreement is found on the likelihood profiles therefore fully validating the method. The datacards for both analyses can respectively be found here-SMP-20-005 and here SMP-16-018.
Datacard combination works out of the box by specifying more than one json file in the --PO EFTJsonMap
separated by a comma, for example:
text2workspace.py combined.txt -P HiggsAnalysis.AnalyticAnomalousCoupling2.AnaliticAnomalousCouplingEFTNegativeExtended:analiticAnomalousCouplingEFTNegativeExtended --X-allow-no-signal -o combined.root --PO eftOperators=cW --X-allow-no-background --PO EFTJsonMap=thestand/VBFdatacards/converted_aTGC_EFT2Obs_VBFZ_SMEFT_cx_into_cw.json,SMP-20-005/puppi_phi_f_binned_AAC/converted_aTGC_EFT2Obs_VBFZ_SMEFT_cx_into_cw.json
This frameworks comes with plotting tools to support analyst with fast instruments to draw results of fits. For obvious reasons plots are available for 1D and 2D scans on the parameters of interests.
For plots of likelihood profiles scripts/mkEFTScan.py
supports both:
./scripts/mkEFTScan.py oned.root -p k_cqq3 -maxNLL 10 -lumi 138 -cms -preliminary -xlabel "c_{qq}^{(3)} [TeV^{-2}]"
./scripts/mkEFTScan.py twod.root -p k_cqq3 k_cqq31 -maxNLL 10 -lumi 138 -cms -preliminary -xlabel "c_{qq}^{(3)} [TeV^{-2}]" -ylabel "c_{qq}^{(3,1)} [TeV^{-2}]"
For profiled fits one usually scans one/two parameter of interest while leaving the others floating and free to maximse the likelihood at a specific POI. A debugging plotting tool can be useful to print the template (full BSM or single templates) at the fixed POI values on the scan. For plotting options can be concatenated to create a gif: scan
will just draw the scan as a gif, overall
will read the datacard and retrieve all templates for SM background and BSM EFT (as many operators as you want) and display the stacked distribution (this function also provides support for rateParameters). signal
does the same without but backgrounds (only sm
template from datacard). Lastly templates
method will draw signal
plus all the single templates that scale as a function of the various parameters.
The process is repeated for all regions of the datacards.
An example is provided in the following gif:
./scripts/mkEFTGifs.py -d datacard.txt -s higgsCombineTest.MultiDimFit.mH125.root -op k_cqq3 -rp top:CMS_hww_Topnorm2j WW:CMS_hww_WWnorm2j DY_hardJets:CMS_hww_DYnorm2j_hardJets DY_PUJets:CMS_hww_DYnorm2j_PUJets_2016 --frequency 2 -t scan overall signal templates --variables ewkz_2016_zjj_specified:"m_{jj} [GeV]" ewkz_2016_dycr:"m_{jj} [GeV]" --logy -drawSigma -lumi 138
It may happen that the expected yield (SM+EFT) in a bin evaluates negative. Combine will complain and return the maximum FCN value up to that point in the minimization to force MIGRAD to back out of the region. If one want to disable such behaviour and ignore the negative bin (setting its content to zero) add to the combine command the following run-time arguments
--X-rtd SIMNLL_NO_LEE --X-rtd NO_ADDNLL_FASTEXIT
The partial sums will be set to one (so log is zero) and no error will be propagated to RooFit:
The templates needed can be generated in two complementary ways either by amplitude decomposition or with reweighting.
With amplitude decomposition the single components (SM, Lin
SM:
import model SMEFTsim_U35_MwScheme_UFO-SMlimit_massless
generate p p > e+ ve mu- vm~
output WW_SM
Lin:
import model SMEFTsim_U35_MwScheme_UFO-cW_massless
generate p p > e+ ve mu- vm~ NP=1 NP^2==1
output WW_LI
Quad:
import model SMEFTsim_U35_MwScheme_UFO-cW_massless
generate p p > e+ ve mu- vm~ NP=1 NP^2==2
output WW_QU
Mix:
import model SMEFTsim_U35_MwScheme_UFO-cW_cHW_massless
generate p p > e+ ve mu- vm~ NP=1 NP^2==1
output WW_Mix
With the reweighting method one can generatefrom the full Lagrangian + one operator and extract the components thanks to a reweighting procedure.
Suppose we start our generation by including all terms, setting the coupling of
SM+Lin+Quad:
import model SMEFTsim_U35_MwScheme_UFO-cW_massless
generate p p > e+ ve mu- vm~ NP=1
output WW_Reweight
We can now change at reweighting the coupling in order to generate different components
SM:
change helicity False
change rwgt_dir rwgt
# SM: rwgt_1
launch
set SMEFT 2 0
SM-Lin+Quad:
# SM - Lin + Quad: rwgt_2
launch
set SMEFT 2 -1
Each event will now have two new reweighting - weights corresponding to the hypothesis
For more operators one can generate the components for a single operator by setting all the others at 0 and scanning the values -1,0,1 as described above. For the mixed term instead one should also generate the weight:
# SM + Lin + Quad + Lin + Quad + Mix: rwgt_3
launch
set SMEFT 2 1
set SMEFT 7 1
And the algebra on the weights reads as follows
We stress that these weights should be computed on an event-by-event basis.
Run:
get the file:
wget https://www.dropbox.com/s/37auaiyq4qw4o3j/histo_0p3.root
sm->Integral(-1, sm->GetNbinsX())
linear->Integral(-1, linear->GetNbinsX())
quadratic->Integral(-1, quadratic->GetNbinsX())
histo_sm->Integral(-1, histo_sm->GetNbinsX())
histo_linear->Integral(-1, histo_linear->GetNbinsX())
histo_quadratic->Integral(-1, histo_quadratic->GetNbinsX())
folder file.py object defined in the file.py
text2workspace.py datacard.txt -P HiggsAnalysis.AnalyticAnomalousCoupling.AnomalousCoupling:analiticAnomalousCoupling --PO=k_my,r -o model_test.root
combine -M MultiDimFit model_test.root --algo=grid --points 120 -m 125 -t -1 --expectSignal=1 \
--redefineSignalPOIs k_my --freezeParameters r --setParameters r=1 --setParameterRanges k_my=-20,20 \
--verbose -1
combine -M MultiDimFit model_test.root --algo=grid --points 120 -m 125 -t -1 --expectSignal=1 \
--redefineSignalPOIs k_my --freezeParameters r --setParameters r=1 --setParameterRanges k_my=-20,20
text2workspace.py datacard2op.txt -P HiggsAnalysis.AnalyticAnomalousCoupling.AnomalousCoupling:analiticAnomalousCoupling -o --numOperators=2 model_test.root
combine -M MultiDimFit model_test.root --algo=grid --points 120 -m 125 -t -1 --expectSignal=1 \
--redefineSignalPOIs k_my_1 --freezeParameters r,k_my_2 --setParameters r=1,k_my_1=0,k_my_2=0 --setParameterRanges k_my_1=-20,20 \
--verbose -1
text2workspace.py datacard3op.txt -P HiggsAnalysis.AnalyticAnomalousCoupling.AnomalousCouplingEFT:analiticAnomalousCouplingEFT -o model_test.root
combine -M MultiDimFit model_test.root --algo=grid --points 240 -m 125 -t -1 --expectSignal=1 \
--redefineSignalPOIs k_cG \
--freezeParameters r,k_cGtil,k_cH,k_cHB,k_cHBtil,k_cHDD,k_cHG,k_cHGtil,k_cHW,k_cHWB,k_cHWBtil,k_cHWtil,k_cHbox,k_cHd,k_cHe,k_cHl1,k_cHl3,k_cHq1,k_cHq3,k_cHu,k_cHudAbs,k_cHudPh,k_cW,k_cWtil,k_cdBAbs,k_cdBPh,k_cdGAbs,k_cdGPh,k_cdHAbs,k_cdHPh,k_cdWAbs,k_cdWPh,k_cdd,k_cdd1,k_ceBAbs,k_ceBPh,k_ceHAbs,k_ceHPh,k_ceWAbs,k_ceWPh,k_ced,k_cee,k_ceu,k_cld,k_cle,k_cledqAbs,k_cledqPh,k_clequ1Abs,k_clequ1Ph,k_clequ3Abs,k_clequ3Ph,k_cll,k_cll1,k_clq1,k_clq3,k_clu,k_cqd1,k_cqd8,k_cqe,k_cqq1,k_cqq11,k_cqq3,k_cqq31,k_cqu1,k_cqu8,k_cquqd1Abs,k_cquqd1Ph,k_cquqd8Abs,k_cquqd8Ph,k_cuBAbs,k_cuBPh,k_cuGAbs,k_cuGPh,k_cuHAbs,k_cuHPh,k_cuWAbs,k_cuWPh,k_cud1,k_cud8,k_cuu,k_cuu1 \
--setParameters r=1 --setParameterRanges k_cG=-10,10 \
--verbose -1
combine -M MultiDimFit model_test.root --algo=grid --points 400 -m 125 -t -1 --expectSignal=1 \
--redefineSignalPOIs k_cG,k_cGtil \
--freezeParameters r,k_cH,k_cHB,k_cHBtil,k_cHDD,k_cHG,k_cHGtil,k_cHW,k_cHWB,k_cHWBtil,k_cHWtil,k_cHbox,k_cHd,k_cHe,k_cHl1,k_cHl3,k_cHq1,k_cHq3,k_cHu,k_cHudAbs,k_cHudPh,k_cW,k_cWtil,k_cdBAbs,k_cdBPh,k_cdGAbs,k_cdGPh,k_cdHAbs,k_cdHPh,k_cdWAbs,k_cdWPh,k_cdd,k_cdd1,k_ceBAbs,k_ceBPh,k_ceHAbs,k_ceHPh,k_ceWAbs,k_ceWPh,k_ced,k_cee,k_ceu,k_cld,k_cle,k_cledqAbs,k_cledqPh,k_clequ1Abs,k_clequ1Ph,k_clequ3Abs,k_clequ3Ph,k_cll,k_cll1,k_clq1,k_clq3,k_clu,k_cqd1,k_cqd8,k_cqe,k_cqq1,k_cqq11,k_cqq3,k_cqq31,k_cqu1,k_cqu8,k_cquqd1Abs,k_cquqd1Ph,k_cquqd8Abs,k_cquqd8Ph,k_cuBAbs,k_cuBPh,k_cuGAbs,k_cuGPh,k_cuHAbs,k_cuHPh,k_cuWAbs,k_cuWPh,k_cud1,k_cud8,k_cuu,k_cuu1 \
--setParameters r=1 --setParameterRanges k_cG=-10,10:k_cGtil=-10,10
,k_cG,k_cGtil,k_cH,
To simulate "sm" part, k_my == 0 :
--setParameters r=1,k_my=0
combine -M MultiDimFit model_test.root --algo=grid --points 120 -m 125 -t -1 --expectSignal=1 \
--redefineSignalPOIs k_my --freezeParameters r --setParameters r=1,k_my=0 --setParameterRanges k_my=-20,20
Plot:
r99t higgsCombineTest.MultiDimFit.mH125.root higgsCombineTest.MultiDimFit.mH125.root draw.cxx
r99t higgsCombineTest.MultiDimFit.mH125.root higgsCombineTest.MultiDimFit.mH125.root draw.cxx\(\"k_cG\"\)
r00t higgsCombineTest.MultiDimFit.mH125.root draw2D.cxx\(\"cG\",\"Gtil\",\"k_cG\",\"k_cGtil\"\)