NB: this repository is archived, as its developments, together with additional updates, have been merged into the core TauAlgo group framework.
This repository is an area for R&D in ML-based tau lepton identification, aiming to improve upon the existing baseline of the DeepTau architecture. It consists of standalone modules that expect as input files preprocessed within the core TauAlgo group framework. The modules are:
- `create_dataset.py` -> preprocessing of the input ROOT files directly into TensorFlow ragged arrays with `awkward`, optimised with `numba`.
- `train.py` -> on-the-fly composition of the training dataset via sampling across classes, optimised with unified dynamic batching. Experiment tracking with `mlflow`.
- `models/`: adaptation of Transformer, ParticleNet, and Particle Convolutions, with a custom embedding layer to handle heterogeneous input collections in a unified way.
- `predict.py` -> model inference script.
- `plot_roc.py` -> plotting of per-class ROC curves with statistical uncertainty.
- `visualization_notebook.ipynb` + `utils/visualize.py` -> visualisation of self-attention weights and corresponding particle interactions as a `plotly` widget.
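
To illustrate what "ROC curves with statistical uncertainty" can mean in practice, here is a minimal standard-library sketch. It is not the code in `plot_roc.py`: the function name `roc_points`, the toy scores, and the choice of binomial standard errors on the efficiencies are all assumptions made for this example.

```python
import math
import random


def roc_points(scores, labels, thresholds):
    """Return one (threshold, tpr, tpr_err, fpr, fpr_err) tuple per threshold.

    labels: True for the class under study, False for the class it is
    compared against. Errors are binomial standard errors,
    sqrt(eff * (1 - eff) / N), which is an assumption of this sketch.
    """
    sig = [s for s, y in zip(scores, labels) if y]
    bkg = [s for s, y in zip(scores, labels) if not y]
    if not sig or not bkg:
        raise ValueError("need at least one event of each kind")

    def efficiency(values, threshold):
        # Fraction of events with score >= threshold, and its binomial error.
        eff = sum(v >= threshold for v in values) / len(values)
        return eff, math.sqrt(eff * (1.0 - eff) / len(values))

    points = []
    for t in thresholds:
        tpr, tpr_err = efficiency(sig, t)
        fpr, fpr_err = efficiency(bkg, t)
        points.append((t, tpr, tpr_err, fpr, fpr_err))
    return points


# Synthetic toy scores, for illustration only (not real tau data).
rng = random.Random(0)
toy_scores = [rng.betavariate(5, 2) for _ in range(2000)]
toy_scores += [rng.betavariate(2, 5) for _ in range(2000)]
toy_labels = [True] * 2000 + [False] * 2000
toy_thresholds = [i / 20 for i in range(21)]

for t, tpr, tpr_err, fpr, fpr_err in roc_points(toy_scores, toy_labels, toy_thresholds)[::5]:
    print(f"thr={t:.2f}  TPR={tpr:.3f}±{tpr_err:.3f}  FPR={fpr:.3f}±{fpr_err:.3f}")
```

The binomial standard error collapses to zero when an efficiency is exactly 0 or 1, so an interval such as Clopper-Pearson may be a better fit for the low-misidentification region that tau identification typically cares about.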