We are going to use this repository as a starting point for the PHYS584 course. In this mini-stage you are going to work on beam test data for the High Granularity Calorimeter for the CMS experiment upgrade.
Content of this README:
- The High Granularity Calorimeter (HGCAL)
- Setting up the working environment
- Learning the data analysis tools
- First look into the data
- Description of the data content
- References
To have a better understanding of calorimetry, I highly suggest this set of slides as an entry point.
Remember: Always feel free to ask questions, either for curiosity or for a better understanding of what you are doing!
There are naive questions, tedious questions, ill-phrased questions, questions put after inadequate self-criticism. But every question is a cry to understand the world. There is no such thing as a dumb question.
The Large Hadron Collider (LHC) is intended to accumulate 300 fb^{-1} by the end of Run 3 (2023). After this data-taking phase, the third long shutdown (LS3) is planned, to allow the upgrade operations necessary for the High-Luminosity phase (HL-LHC). The HL-LHC is expected to reach an integrated luminosity of some 3000 fb^{-1} by the mid-2030s. In addition, the HL-LHC is expected to deliver much higher pileup, up to 140-200 events per bunch crossing. The detectors in their current state would not cope with such a harsh environment, and are therefore undergoing an upgrade program.
As part of this upgrade program, the CMS Collaboration is proposing to build a high granularity calorimeter (HGCAL) to replace the current endcap calorimeters, which were designed to cope with an integrated luminosity of 500 fb^{-1}.
The HGCAL will be a sampling calorimeter with a silicon-based electromagnetic compartment and a mixture of silicon sensors and scintillator tiles for the hadronic compartment. It will be the first large-scale calorimeter to use silicon as active material, hence beam tests are fundamental as a proof of concept of the HGCAL design.
In October 2018 a beam test of the first large-scale prototype of HGCAL was conducted at CERN, using electrons and pions over a wide energy range (20-300 GeV) to fully assess the detector's performance. In this mini-stage you are going to use these data to learn how electromagnetic showers develop in such a calorimeter, to understand the main differences between electron and pion showers, and to make comparisons with theoretical models and Monte Carlo simulations.
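To give a taste of what a comparison with theoretical models can look like, here is a short numpy sketch of the classic gamma-distribution parameterization of the longitudinal profile of an electromagnetic shower (see e.g. the PDG review of the passage of particles through matter). The parameter values below are purely illustrative, not fitted to HGCAL data:

```python
import math
import numpy as np

# Longitudinal shower profile, gamma-distribution parameterization:
#     (1/E0) dE/dt = b * (b*t)**(a-1) * exp(-b*t) / Gamma(a)
# with t the depth in radiation lengths. The values a = 4, b = 0.5 are
# illustrative; in practice a and b are fitted to data or simulation.
def shower_profile(t, a=4.0, b=0.5):
    """Fraction of the shower energy deposited per radiation length at depth t."""
    return b * (b * t) ** (a - 1) * np.exp(-b * t) / math.gamma(a)

t = np.linspace(0.0, 40.0, 2001)
profile = shower_profile(t)

# The profile peaks at the shower maximum, t_max = (a - 1)/b radiation
# lengths, and integrates to 1 (the full shower energy).
t_max = t[np.argmax(profile)]
total = profile.sum() * (t[1] - t[0])
```

Plotting `profile` against `t` (e.g. with matplotlib) and overlaying the measured energy deposits per layer is one of the comparisons you will be able to make with the beam test data.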
More information about HGCAL can be found in the Technical Design Report (TDR) [1]. For this mini-stage it might be useful to have a look at the "Introduction and overview" and the "Reconstruction and detector performance" chapters.
In this section you can find some useful material to start getting familiar with HGCAL and the structure of the data we are going to analyze in this mini-stage.
- Set of slides by Artur: introduction to HGCAL: https://github.com/bonanomi/PHYS584/blob/master/slides/ALobanov_HGCAL_ICHEP2018.pdf
- Set of slides by Artur: HGCAL beam test summary: https://github.com/bonanomi/PHYS584/blob/master/slides/HGC_TBOct2018_Summary.pdf
- Set of slides by Thorben, describing the structure of the TTrees we are going to use in this analysis: https://github.com/bonanomi/PHYS584/blob/master/slides/Reco_ntuples_06Nov2018.pdf
- Set of slides on 2018 beam test results: https://github.com/bonanomi/PHYS584/blob/master/slides/171009_HGCAL_Bonanomi_IPRD19.pdf
Do not hesitate to ask me any question or for additional material.
Setting up the working environment [1]
You should first set up your laptop so that you can connect to the LLR servers and use the Jupyter notebook from the outside. For this, you need to create an SSH key and upload it to any LLR server. If you are using an operating system of the Microsoft Windows family, you can install Ubuntu via the Windows Subsystem for Linux (WSL) and start a terminal emulator this way.
Once you are in your terminal, create the key with the following command (just hit enter all the time to accept the defaults):
ssh-keygen
Next, you copy the key over to an LLR server:
ssh-copy-id -i ~/.ssh/id_rsa.pub [email protected]
You will need a password this one time; just ask me for it. Now it's time to configure your SSH client to connect to the LLR servers via the correct proxy server from the outside. Just download the following config file and move it to the right directory:
mv ~/.ssh/config ~/.ssh/config.bak
wget -O ~/.ssh/config https://mbonanom.web.cern.ch/mbonanom/res/ssh.config
You should now be able to connect to any of the LLR interactive servers as follows, even from the outside:
ssh polui06
If this particular machine is not available, try any of polui01, polui03, polui04, polui06 or polui07.
To analyze the test beam data, we use the Jupyter Notebook.
From within LLR, which includes the LLR WiFi, it is enough to connect to an interactive machine, preferably polui04:
ssh polui04
Start the notebook with:
notebook
You should now see a link that you can paste in your browser to access the Jupyter notebook. From outside LLR, one first has to open an SSH tunnel to the LLR network:
ssh -C -D 8888 llrgate02
You can now start the notebook from another terminal while keeping the proxy open.
For access to the notebook from the outside, you have to configure the proxy in your browser: in your browser's network settings, configure the proxy manually, with 127.0.0.1 as SOCKS host and 8888 as port. Pick SOCKS v5 when in doubt. When you have finished your session and want to use your browser without the tunnel, disable the manual proxy again.
In this ministage we are going to use the python environment for our analysis. Have a look at the tutorials folder to find some notebooks with quick examples on numpy and matplotlib. The data we are going to use come in the .root format, a very common one in HEP, as it was born together with the ROOT framework, one of the most used frameworks for data analysis in high energy physics. Here we are going to use uproot, a reader and writer of the ROOT file format using only python and numpy, to open our ROOT files as pandas data frames.
The entry point for the work we are going to do is the tutorials folder. The best strategy for you to work is to clone this repository, add a new folder (e.g. work) and develop your notebooks starting from the questions and the exercises you have under the tutorials folder.
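As a taste of the kind of operations you will be doing in your notebooks: once a TTree is loaded as a pandas DataFrame, event-level quantities are one groupby away. The column names ("event", "rechit_energy") below are hypothetical:

```python
import pandas as pd

# Toy hit-level table: each row is one reconstructed hit; the column names
# are illustrative, not the actual branch names of the beam test ntuples.
df = pd.DataFrame({
    "event": [0, 0, 0, 1, 1],
    "rechit_energy": [1.0, 2.5, 0.5, 3.0, 1.0],
})

# Total reconstructed energy per event:
energy_sum = df.groupby("event")["rechit_energy"].sum()
print(energy_sum)
```

Histogramming `energy_sum` with matplotlib is then a natural first step towards an energy-response measurement.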
Do not forget your best friend when working with the code: the internet! The internet is an extremely powerful resource for python-related questions. Check references such as Stack Overflow, Jake VanderPlas' website and his open-source book.
Have a look at this article: "Everything you wanted to know about Data Analysis and Fitting but were afraid to ask".
[1] The Phase-2 Upgrade of the CMS Endcap Calorimeter: https://cds.cern.ch/record/2293646
Footnotes

[1] From Jonas' ministage repository.