diff_introduction.tex
diff --git a/introduction.tex b/introduction.tex
index 240a7ad..c83d59c 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -1,46 +1,39 @@
\section{Introduction} % 1 page
\label{sec:introduction}
-%A common problem in data analysis is to determine the parameters of
-%a narrow signal which occurs on a wide, smoothly varying background.
-%In high energy physics, this is often achieved using a maximum likelihood technique~\cite{ref:Fisher01011922}
-%in which separate parametric models for the signal and background processes are constructed and
-%``fit'' to the data.
-%If, however, the shape of the background is not known {\it a priori}, then there will be
-%some uncertainty in the signal parameters resulting from the uncertainty in
-%the function used. This issue is exacerbated in the case of a small signal to
-%background ratio.
-
-A common problem in data analysis is that the underlying model, or components of it, which is used to describe a dataset
-is not always known. In high energy physics, determination of signal parameters is
-often achieved using a maximum likelihood technique~\cite{ref:Fisher01011922}
-in which parametric models for the signal and background processes are constructed and
-``fit'' to the data. However in certain circumstances the exact parametrisation, not just the parameter
-values, of the underlying models is not {\it a priori} known. Consequently, there is some uncertainty in the parameters which results from the uncertainty in the function used.
-
-A common approach to assess this systematic is to fit various different plausible functions and
-determine the spread of the parameters of interest when using these functions.
+A very common problem in data analysis is to determine the parameters of
+a narrow signal which occurs on a wide, smoothly varying background.
+In high energy physics, this is often achieved using a maximum likelihood technique~\cite{ref:Fisher01011922}
+in which separate parametric models for the signal and background processes are constructed and
+``fit'' to the data.
+If, however, the shape of the background is not known {\it a priori}, then there will be
+some uncertainty in the signal parameters resulting from the uncertainty in
+the function used. This issue is exacerbated when the signal-to-background
+ratio is small.
+
+A common approach is to try several plausible functions and
+determine the spread of the signal parameters when using these functions.
However, these methods tend to have some degree of arbitrariness and so
a new approach is discussed in this paper.
This new method was developed as part of the analysis of data at the CMS experiment
following the discovery of the Higgs
boson~\cite{ref:introduction:atlasdis,ref:introduction:cmsdis}.
-It was applied to the analysis of Higgs decays to two photons, which
-results in a narrow signal on a large
+It was applied to the case where the Higgs decays to two photons, which
+results in a narrow signal on a large
background~\cite{ref:introduction:legacy}.
-The method presented tries to be less
+This method tries to be less
arbitrary and treats the uncertainty associated with the
background parameterisation in a way
-which is more comparable with the treatment of other
-uncertainties associated with the measurement; the choice of background
+which is more comparable with the treatment of other
+uncertainties associated with the measurement; the choice of background
function results in a systematic uncertainty
which is handled as a nuisance parameter~\cite{ref:intro:nusiances}.
-There are two major new components to this approach, namely the method for
-treating the choice of function as a nuisance parameter, and how to compare
+There are two major new parts to this approach, namely the method for
+treating the choice of function as a nuisance parameter, and how to compare
functions with different numbers of parameters.
-The concept is described in Section~\ref{sec:concept}.
+The concept of this approach is described in Section~\ref{sec:concept}.
The application of the method to functions with the same number of parameters
is described in Section~\ref{sec:functions} and to functions with different
numbers of parameters in Section~\ref{sec:correction}. Further discussion on
@@ -48,18 +41,18 @@ the method, namely its practical application to the real-world problem of
the Higgs measurements, is given in Section~\ref{sec:discussion}.
Within this paper, twice the negative of the logarithm of the likelihood
-function is denoted by \nll. The data are binned and the
+function is denoted by \nll. The fits discussed are binned and the
likelihood used for each bin is the Poisson likelihood ratio to the best
possible likelihood given the observed data, i.e.
\begin{equation}
%{\rm \nll}_i = \nu_i - n_i + n_i \ln\left(\frac{n_i}{\nu_i}\right),
-\nll = 2\cdot\sum_{i} \nu_i - n_i + n_i \ln\left(\frac{n_i}{\nu_i}\right),
+\nll = 2\sum_{i} \left[ \nu_i - n_i + n_i \ln\left(\frac{n_i}{\nu_i}\right) \right],
\label{eqn:introduction:def2NLL}
\end{equation}
-where $n_{i}$ is the observed and $\nu_{i}$ is the expected number of events
-given a particular background model in the $i^{\rm th}$ bin.
+where $n_{i}$ is the observed and $\nu_{i}$ is the expected number of events
+given a particular background model in the $i^{\rm th}$ bin.
-For the purposes of fitting and generating datasets, the statistical package
-``RooFit'' is used throughout this paper~\cite{ref:roofit}.
+For the purposes of fitting and generating datasets, the statistical package
+``RooFit'' is used throughout this paper~\cite{ref:roofit}.
%FREQUENTIST UNLESS OTHERWISE STATED (ALSO SEE DISCUSSION)?
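
As an illustration of the quantity defined in the introduction, the binned Poisson likelihood ratio can be sketched in a few lines of Python. This is a minimal sketch and not the RooFit implementation used in the paper: the function name `two_nll` and its arguments are hypothetical, the factor of two is taken from the prose definition ("twice the negative of the logarithm of the likelihood"), and the term $n_i \ln(n_i/\nu_i)$ is assumed to vanish for empty bins, which is its limit as $n_i \to 0$.

```python
import math


def two_nll(observed, expected):
    """Twice the negative log of the binned Poisson likelihood ratio.

    observed: per-bin observed counts n_i (hypothetical argument name)
    expected: per-bin expected counts nu_i from a background model
    Returns 2 * sum_i [nu_i - n_i + n_i * ln(n_i / nu_i)].
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of bins")

    total = 0.0
    for n, nu in zip(observed, expected):
        if nu <= 0:
            raise ValueError("expected counts must be strictly positive")
        term = nu - n
        # Assumption: n * ln(n / nu) is taken as 0 for an empty bin (its limit as n -> 0).
        if n > 0:
            term += n * math.log(n / nu)
        total += term
    return 2.0 * total
```

The value is zero when the expectation matches the observation in every bin and grows as the model departs from the data, so different background functions fitted to the same binned data can be compared through this single number.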