Add datatypes for storing generator (meta)data in a structured and defined way #310

Merged
merged 36 commits into from
Jun 18, 2024
Changes from 34 commits
Commits
a2a9bda
add GeneratorInformation definition
hegner May 22, 2024
cfcfbdd
remove typo
hegner May 22, 2024
3b7ae97
add better member documentation
hegner May 22, 2024
4338c77
cut of GeneratorPDFInfo
hegner May 22, 2024
9dd2355
add more explicit docs; fix types
hegner May 22, 2024
4d896d8
fix URL
hegner May 22, 2024
47e08e9
add new types to documentation and tests
hegner May 23, 2024
dd08dfb
add new generator data and metadata to the write example
hegner May 23, 2024
32c31ae
Update edm4hep.yaml
hegner May 23, 2024
bf612a5
include clang-format suggestions
hegner May 24, 2024
b4e56bf
Update include/edm4hep/GenToolInfo.h
hegner May 27, 2024
b44db85
Update include/edm4hep/GenToolInfo.h
hegner May 27, 2024
f992f4c
Update edm4hep.yaml
hegner May 27, 2024
c4fe0c8
Update include/edm4hep/GenToolInfo.h
hegner May 27, 2024
1f7ae9b
Update include/edm4hep/GenToolInfo.h
hegner May 27, 2024
4873b6f
include PR comments; improve test coverage
hegner May 28, 2024
34f1ddf
fix renamed class in readme file
hegner May 28, 2024
3067af3
move signalVertex member to generatorEventParameters
hegner Jun 4, 2024
be27423
add doc for generator information
hegner Jun 4, 2024
e329148
Merge branch 'main' into geninfo
hegner Jun 4, 2024
0e823ea
rename generator related members to fit naming conventions
hegner Jun 4, 2024
763bcd8
fix line numbers in README
hegner Jun 4, 2024
8823133
fix issues of pre-commit
hegner Jun 4, 2024
3ddd0c3
Update include/edm4hep/Constants.h
hegner Jun 11, 2024
0364ea6
Update include/edm4hep/GenToolInfo.h
hegner Jun 11, 2024
e9154e4
address PR comments about util namespaces and string naming convention
hegner Jun 11, 2024
a4bd341
parameters are now optional
hegner Jun 11, 2024
cc21262
Update test/read_events.h
hegner Jun 12, 2024
c7d5662
Update include/edm4hep/Constants.h
hegner Jun 12, 2024
1bd26b3
Merge branch 'main' into geninfo
tmadlener Jun 17, 2024
36726ec
Update all label spellings to compile again
tmadlener Jun 17, 2024
6985838
Rename GeneratorToolInfo for more consistency
tmadlener Jun 17, 2024
3aa87f3
Add documentation to generator tool info handling utilities
tmadlener Jun 17, 2024
921c5bd
Rename header for consistency and fix occurences
tmadlener Jun 17, 2024
8d33ff2
Pluralize labels
tmadlener Jun 17, 2024
4a9d409
Pluralize all usages
tmadlener Jun 17, 2024
10 changes: 8 additions & 2 deletions README.md
@@ -39,12 +39,18 @@ A generic event data model for future HEP collider experiments.
| [MCRecoTrackerHitPlaneAssociation](https://github.com/key4hep/EDM4hep/blob/main/edm4hep.yaml#L644) | [MCRecoCaloParticleAssociation](https://github.com/key4hep/EDM4hep/blob/main/edm4hep.yaml#L653) | [MCRecoClusterParticleAssociation](https://github.com/key4hep/EDM4hep/blob/main/edm4hep.yaml#L662) |
| [MCRecoTrackParticleAssociation](https://github.com/key4hep/EDM4hep/blob/main/edm4hep.yaml#L671) | [RecoParticleVertexAssociation](https://github.com/key4hep/EDM4hep/blob/main/edm4hep.yaml#L680) | |

**Interfaces**
**Generator related (meta-)data**

| | | |
|-|-|-|
| [TrackerHit](https://github.com/key4hep/EDM4hep/blob/main/edm4hep.yaml#L787) | | |
| [GeneratorEventParameters](https://github.com/key4hep/EDM4hep/blob/main/edm4hep.yaml#L790) | | |
| [GeneratorPdfInfo](https://github.com/key4hep/EDM4hep/blob/main/edm4hep.yaml#L807) | | |

**Interfaces**

| | | |
|-|-|-|
| [TrackerHit](https://github.com/key4hep/EDM4hep/blob/main/edm4hep.yaml#L818) | | |

The tests and examples in the `tests` directory show how to read, write, and use these types in your code.

35 changes: 35 additions & 0 deletions doc/GeneratorInfo.md
@@ -0,0 +1,35 @@
# Dealing with Generator (meta-)data

EDM4hep provides data types and helper functions for storing generator metadata at both run level and event level.

## Storing and retrieving run level information
At run level, information about the generator tools (name, version, description) is stored. It can be written via:

```cpp
#include "edm4hep/GeneratorToolInfo.h"
// ...
// write some generator tool info into the run (`run` is a podio::Frame)
auto toolInfo = edm4hep::GeneratorToolInfo();
auto toolInfos = std::vector<edm4hep::GeneratorToolInfo>();
toolInfo.name = "something";
toolInfo.version = "v1";
toolInfo.description = "some tool";
toolInfos.emplace_back(std::move(toolInfo));

edm4hep::utils::putGenToolInfos(run, toolInfos);
```

and read back via:

```cpp
#include "edm4hep/GeneratorToolInfo.h"
// ...
// returns an empty vector if no generator tool info was stored
auto toolInfos = edm4hep::utils::getGenToolInfos(run);
```

## Storing and retrieving event-specific generator parameters and PDF information

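Event-level generator parameters are stored in the type `GeneratorEventParameters`.
Information about parton distribution functions (PDFs) is stored in the type `GeneratorPdfInfo`.
The tests store the corresponding collections in the event under the labels `edm4hep::labels::GeneratorEventParameters` and `edm4hep::labels::GeneratorPdfInfo`.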
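
As a rough picture of what these two types hold, the sketch below mirrors their members with plain structs. `EventParametersSketch`, `PdfInfoSketch`, `hasPairedErrors` and `makeExampleParameters` are hypothetical names used only for illustration and are not part of EDM4hep; the real classes are generated by podio and are filled through setters on collection elements. The `signalVertex` relation is left out, and the pairing of cross sections with errors by index is an assumption.

```cpp
#include <array>
#include <vector>

// Hypothetical stand-in mirroring the members of edm4hep::GeneratorEventParameters
struct EventParametersSketch {
  double eventScale = 0.0;
  double alphaQED = 0.0;
  double alphaQCD = 0.0;
  int signalProcessId = 0;
  double sqrts = 0.0;                     // [GeV]
  std::vector<double> crossSections;      // [pb]
  std::vector<double> crossSectionErrors; // [pb]
};

// Hypothetical stand-in mirroring the members of edm4hep::GeneratorPdfInfo.
// Assumption: the two array slots describe the two incoming partons.
struct PdfInfoSketch {
  std::array<int, 2> partonId{};
  std::array<int, 2> lhapdfId{};
  std::array<double, 2> x{};
  std::array<double, 2> xf{};
  double scale = 0.0; // [GeV]
};

// Assumption: each cross section is paired with the error at the same index,
// so both lists should have the same length
bool hasPairedErrors(const EventParametersSketch& params) {
  return params.crossSections.size() == params.crossSectionErrors.size();
}

// Fill the sketch with illustrative values
EventParametersSketch makeExampleParameters() {
  EventParametersSketch params;
  params.eventScale = 23;
  params.alphaQED = 1.0 / 127;
  params.alphaQCD = 0.1;
  params.signalProcessId = 42;
  params.sqrts = 90;
  params.crossSections.push_back(10);
  params.crossSectionErrors.push_back(3);
  return params;
}
```

With the real types, the same values would go through the generated setters of an element created from the respective collection.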
31 changes: 31 additions & 0 deletions edm4hep.yaml
@@ -783,6 +783,37 @@ datatypes:
    OneToOneRelations:
      - edm4hep::Track track // the corresponding track


  #===== Generator related data =====

  #---------- GeneratorEventParameters
  edm4hep::GeneratorEventParameters:
    Description: "Generator event parameters"
    Author: "EDM4hep authors"
    Members:
      - double eventScale // event scale
      - double alphaQED // alpha_QED
      - double alphaQCD // alpha_QCD
      - int signalProcessId // id of signal process
      - double sqrts [GeV] // sqrt(s)
    VectorMembers:
      - double crossSections [pb] // list of cross sections
      - double crossSectionErrors [pb] // list of cross section errors
    OneToManyRelations:
      - edm4hep::MCParticle signalVertex // List of initial state MCParticle that are the source of the hard interaction


  #---------- GeneratorPdfInfo
  edm4hep::GeneratorPdfInfo:
    Description: "Generator pdf information"
    Author: "EDM4hep authors"
    Members:
      - std::array<int, 2> partonId // Parton PDG id
      - std::array<int, 2> lhapdfId // LHAPDF PDF id (see https://lhapdf.hepforge.org/pdfsets.html)
      - std::array<double, 2> x // Parton momentum fraction
      - std::array<double, 2> xf // PDF value
      - double scale [GeV] // Factorisation scale

interfaces:
  edm4hep::TrackerHit:
    Description: "Tracker hit interface class"
8 changes: 8 additions & 0 deletions include/edm4hep/Constants.h
@@ -37,6 +37,14 @@ namespace labels {
static constexpr const char* PIDParameterNames = "ParameterNames";
static constexpr const char* PIDAlgoName = "AlgoName";
static constexpr const char* PIDAlgoType = "AlgoType";

// Parameter names for Generator level metadata
static constexpr const char* GeneratorToolVersion = "GeneratorToolVersion";
static constexpr const char* GeneratorToolName = "GeneratorToolName";
static constexpr const char* GeneratorToolDescription = "GeneratorToolDescription";
static constexpr const char* GeneratorEventParameters = "GeneratorEventParameters";
static constexpr const char* GeneratorPdfInfo = "GeneratorPdfInfo";
static constexpr const char* GeneratorWeightNames = "GeneratorWeightNames";
} // namespace labels

DEPRECATED_LABEL(CellIDEncoding, CellIDEncoding);
88 changes: 88 additions & 0 deletions include/edm4hep/GeneratorToolInfo.h
@@ -0,0 +1,88 @@
#ifndef EDM4HEP_GENERATORTOOLINFO_H
#define EDM4HEP_GENERATORTOOLINFO_H

#include "edm4hep/Constants.h"
#include "podio/Frame.h"
#include <string>
#include <vector>

namespace edm4hep {

/// Meta information class to group information about the used generator (tools)
/// to create a file
///
/// @note Since this is all rather loosely coupled and stored via podio Frame
/// parameters, use of the @ref getGenToolInfos and @ref putGenToolInfos utility
/// functions for retrieval, resp. storage of this information is crucial to
/// ensure consistent information.
struct GeneratorToolInfo {
std::string name{}; ///< The name of the tool
std::string version{}; ///< The version of the tool
std::string description{}; ///< A brief description of the tool

/// Construct a generator tool info object with all empty fields
GeneratorToolInfo() = default;

/// Construct a complete tool info object from all ingredients
///
/// @param name The name of the tool
/// @param version The version of the tool
/// @param description The brief description of the tool
GeneratorToolInfo(const std::string& name, const std::string& version, const std::string& description) :
name(name), version(version), description(description){};
};

namespace utils {

/// Get all the generator tool infos that are available from the passed
/// (metadata) frame.
///
/// Tries to retrieve all meta information that are available and that have
/// been stored via the @ref putGenToolInfos function.
///
/// @param frame The (metadata) frame that should be queried for the
/// information
///
/// @returns The GeneratorToolInfo that were found in the Frame. If none are
/// found an empty vector will be returned.
inline std::vector<GeneratorToolInfo> getGenToolInfos(const podio::Frame& frame) {
using namespace edm4hep::labels;
auto toolInfos = std::vector<GeneratorToolInfo>();
const auto names =
frame.getParameter<std::vector<std::string>>(GeneratorToolName).value_or(std::vector<std::string>{});
const auto versions =
frame.getParameter<std::vector<std::string>>(GeneratorToolVersion).value_or(std::vector<std::string>{});
const auto descriptions =
frame.getParameter<std::vector<std::string>>(GeneratorToolDescription).value_or(std::vector<std::string>{});
// Only combine entries that are present in all three lists, so that a
// partially filled frame cannot cause an out-of-range access
for (std::size_t i = 0; i < names.size() && i < versions.size() && i < descriptions.size(); ++i) {
toolInfos.emplace_back(names[i], versions[i], descriptions[i]);
}
return toolInfos;
};

/// Put the generator tool meta information into a (metadata) frame
///
/// In order to guarantee consistent storage and retrieval of these metadata
/// it is necessary to use this utility function.
///
/// @param frame The (metadata) Frame into which the generator tool info should go
/// @param toolInfos The generator tool infos that should be stored
void inline putGenToolInfos(podio::Frame& frame, const std::vector<GeneratorToolInfo>& toolInfos) {
auto names = std::vector<std::string>();
auto versions = std::vector<std::string>();
auto descriptions = std::vector<std::string>();
for (auto& toolInfo : toolInfos) {
names.push_back(toolInfo.name);
versions.push_back(toolInfo.version);
descriptions.push_back(toolInfo.description);
}

using namespace edm4hep::labels;
frame.putParameter(GeneratorToolName, std::move(names));
frame.putParameter(GeneratorToolVersion, std::move(versions));
frame.putParameter(GeneratorToolDescription, std::move(descriptions));
};
} // namespace utils
} // namespace edm4hep

#endif // EDM4HEP_GENERATORTOOLINFO_H
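
The storage scheme used by these two helpers (three parallel string lists under fixed keys, reassembled by index on read) can be sketched without podio. `MockFrame`, `ToolInfo`, `putToolInfos` and `getToolInfos` are hypothetical stand-ins written for this illustration; only the key strings and the optional-returning lookup are taken from the header above, and the sketch needs C++17 for `std::optional`.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the string-list parameter storage of a frame
class MockFrame {
public:
  void putParameter(const std::string& key, std::vector<std::string> values) {
    m_params[key] = std::move(values);
  }

  // Returns std::nullopt if nothing was stored under the key
  std::optional<std::vector<std::string>> getParameter(const std::string& key) const {
    const auto it = m_params.find(key);
    if (it == m_params.end()) {
      return std::nullopt;
    }
    return it->second;
  }

private:
  std::map<std::string, std::vector<std::string>> m_params;
};

// Hypothetical plain aggregate with the same three fields as GeneratorToolInfo
struct ToolInfo {
  std::string name;
  std::string version;
  std::string description;
};

// Split the structs into three parallel lists and store each under its own key
void putToolInfos(MockFrame& frame, const std::vector<ToolInfo>& infos) {
  std::vector<std::string> names, versions, descriptions;
  for (const auto& info : infos) {
    names.push_back(info.name);
    versions.push_back(info.version);
    descriptions.push_back(info.description);
  }
  frame.putParameter("GeneratorToolName", std::move(names));
  frame.putParameter("GeneratorToolVersion", std::move(versions));
  frame.putParameter("GeneratorToolDescription", std::move(descriptions));
}

// Reassemble the structs by index; missing keys yield an empty result
std::vector<ToolInfo> getToolInfos(const MockFrame& frame) {
  const auto empty = std::vector<std::string>{};
  const auto names = frame.getParameter("GeneratorToolName").value_or(empty);
  const auto versions = frame.getParameter("GeneratorToolVersion").value_or(empty);
  const auto descriptions = frame.getParameter("GeneratorToolDescription").value_or(empty);

  std::vector<ToolInfo> infos;
  for (std::size_t i = 0; i < names.size() && i < versions.size() && i < descriptions.size(); ++i) {
    infos.push_back(ToolInfo{names[i], versions[i], descriptions[i]});
  }
  return infos;
}
```

Because the three lists are only linked by position, writing them through a single function keeps them the same length, which is presumably why the header insists on the utility functions.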
38 changes: 38 additions & 0 deletions test/read_events.h
@@ -3,6 +3,9 @@

// test data model
#include "edm4hep/CaloHitContributionCollection.h"
#include "edm4hep/GeneratorEventParametersCollection.h"
#include "edm4hep/GeneratorPdfInfoCollection.h"
#include "edm4hep/GeneratorToolInfo.h"
#include "edm4hep/MCParticleCollection.h"
#include "edm4hep/RawTimeSeriesCollection.h"
#include "edm4hep/SimCalorimeterHitCollection.h"
@@ -16,6 +19,27 @@
// STL
#include <iostream>

void processRun(const podio::Frame& run) {
//===============================================================================
// get generator tool info from the run
auto toolInfos = edm4hep::utils::getGenToolInfos(run);
if (toolInfos.empty())
throw std::runtime_error("no generator tool info found in the run");
auto toolinfo = toolInfos[0];
if (toolinfo.name != "something")
throw std::runtime_error("toolinfo.name != 'something'");
if (toolinfo.version != "v1")
throw std::runtime_error("toolinfo.version != 'v1'");
if (toolinfo.description != "some tool")
throw std::runtime_error("toolinfo.description != 'some tool'");

//===============================================================================
// get generator weight names
auto weightNames = run.getParameter<std::vector<std::string>>(edm4hep::labels::GeneratorWeightNames).value();
if (weightNames[0] != "oneWeight")
throw std::runtime_error("weightNames[0] != 'oneWeight'");
if (weightNames[1] != "anotherWeight")
throw std::runtime_error("weightNames[1] != 'anotherWeight'");
}

void processEvent(const podio::Frame& event) {
auto& mcps = event.get<edm4hep::MCParticleCollection>("MCParticles");
auto& sths = event.get<edm4hep::SimTrackerHitCollection>("SimTrackerHits");
@@ -232,6 +256,17 @@ void processEvent(const podio::Frame& event) {
throw std::runtime_error("Collection 'TrackerHitPlanes' should be present");
}

//===============================================================================
// check the generator meta data
auto& genParametersCollection =
event.get<edm4hep::GeneratorEventParametersCollection>(edm4hep::labels::GeneratorEventParameters);
auto genParam = genParametersCollection[0];
if (genParam.getEventScale() != 23)
throw std::runtime_error("Event_scale != 23");

auto& generatorPdfInfoCollection = event.get<edm4hep::GeneratorPdfInfoCollection>(edm4hep::labels::GeneratorPdfInfo);
auto genPdfInfo = generatorPdfInfoCollection[0];

// //===============================================================================
// if( sccons.isValid() ){
// } else {
@@ -251,6 +286,9 @@ void read_events(const std::string& filename) {
ReaderT reader;
reader.openFile(filename);

const auto run = podio::Frame(reader.readNextEntry("runs"));
processRun(run);

unsigned nEvents = reader.getEntries("events");
for (unsigned i = 0; i < nEvents; ++i) {
std::cout << "reading event " << i << std::endl;
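
Reading code of this kind takes a list out of an optional and then indexes into it. A defensive variant can be sketched with the standard library alone; `requireAt` is a hypothetical helper introduced here for illustration and is not part of EDM4hep or podio. It needs C++17 for `std::optional`.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: return element `index` of an optional list of strings and
// throw a descriptive error if the parameter is missing or has too few entries
std::string requireAt(const std::optional<std::vector<std::string>>& values, std::size_t index,
                      const std::string& label) {
  if (!values.has_value()) {
    throw std::runtime_error("parameter '" + label + "' is not present");
  }
  if (index >= values->size()) {
    throw std::runtime_error("parameter '" + label + "' has too few entries");
  }
  return (*values)[index];
}
```

Compared with calling `.value()` and indexing directly, this reports which parameter was missing or too short instead of failing with `std::bad_optional_access` or undefined behaviour.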
47 changes: 47 additions & 0 deletions test/write_events.h
@@ -2,7 +2,11 @@
#define EDM4HEP_TEST_WRITE_EVENTS_H

// Data model

#include "edm4hep/CaloHitContributionCollection.h"
#include "edm4hep/GeneratorEventParametersCollection.h"
#include "edm4hep/GeneratorPdfInfoCollection.h"
#include "edm4hep/GeneratorToolInfo.h"
#include "edm4hep/MCParticleCollection.h"
#include "edm4hep/RawTimeSeriesCollection.h"
#include "edm4hep/SimCalorimeterHitCollection.h"
@@ -28,6 +32,7 @@ void write(std::string outfilename) {
for (unsigned i = 0; i < nevents; ++i) {
std::cout << " --- processing event " << i << std::endl;
auto event = podio::Frame();
auto run = podio::Frame();

// place the following generator event to the MCParticle collection
//
@@ -108,6 +113,47 @@ void write(std::string outfilename) {
}
}

//===============================================================================
// write some generator event data
auto genParametersCollection = edm4hep::GeneratorEventParametersCollection();
auto genParam = genParametersCollection.create();
genParam.setEventScale(23);
genParam.setAlphaQED(1.0 / 127); // floating-point division; the integer expression 1 / 127 evaluates to 0
genParam.setAlphaQCD(0.1);
genParam.setSignalProcessId(42);
genParam.setSqrts(90);
genParam.addToCrossSections(10);
genParam.addToCrossSectionErrors(3);
genParam.addToSignalVertex(mcp1);
genParam.addToSignalVertex(mcp2);
event.put(std::move(genParametersCollection), edm4hep::labels::GeneratorEventParameters);

auto genPdfInfoCollection = edm4hep::GeneratorPdfInfoCollection();
auto genPdfInfo = genPdfInfoCollection.create();
genPdfInfo.setPartonId({1, 2});
genPdfInfo.setLhapdfId({20, 20});
genPdfInfo.setX({0.5, 0.5});
genPdfInfo.setXf({0.5, 0.5});
genPdfInfo.setScale(23);
event.put(std::move(genPdfInfoCollection), edm4hep::labels::GeneratorPdfInfo);

//===============================================================================
// write some generator tool info into the run
auto toolInfos = std::vector<edm4hep::GeneratorToolInfo>();
auto toolInfo = edm4hep::GeneratorToolInfo();
toolInfo.name = "something";
toolInfo.version = "v1";
toolInfo.description = "some tool";
toolInfos.emplace_back(std::move(toolInfo));
edm4hep::utils::putGenToolInfos(run, toolInfos);

//===============================================================================
// write some generator weightname info into the run
auto weightNames = std::vector<std::string>();
weightNames.emplace_back("oneWeight");
weightNames.emplace_back("anotherWeight");
run.putParameter(edm4hep::labels::GeneratorWeightNames, std::move(weightNames));

// fixme: should this become a utility function ?
//-------------------------------------------------------------

@@ -237,6 +283,7 @@ void write(std::string outfilename) {
event.putParameter("EventType", "test");

writer.writeFrame(event, "events");
writer.writeFrame(run, "runs");
}

writer.finish();
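
When a coupling such as alpha_QED is written as a fraction in test values, the literal types matter: C++ evaluates a division of two `int` literals as integer division before converting the result to `double`. A minimal sketch of the difference, with illustrative constant names that are not part of the test code:

```cpp
// int / int truncates towards zero, so the stored double is exactly 0.0
constexpr double truncated = 1 / 127;

// with one floating-point operand the division is done in floating point
constexpr double intended = 1.0 / 127;

static_assert(truncated == 0.0, "integer division yields zero");
static_assert(intended > 0.0078 && intended < 0.0079, "1/127 is roughly 0.00787");
```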
8 changes: 8 additions & 0 deletions tools/include/edm4hep2json.hxx
@@ -6,6 +6,8 @@
#include "edm4hep/CalorimeterHitCollection.h"
#include "edm4hep/ClusterCollection.h"
#include "edm4hep/EventHeaderCollection.h"
#include "edm4hep/GeneratorEventParametersCollection.h"
#include "edm4hep/GeneratorPdfInfoCollection.h"
#include "edm4hep/MCParticleCollection.h"
#include "edm4hep/ParticleIDCollection.h"
#include "edm4hep/RawCalorimeterHitCollection.h"
@@ -154,6 +156,12 @@ nlohmann::json processEvent(const podio::Frame& frame, std::vector<std::string>&
insertIntoJson<podio::UserDataCollection<uint32_t>>(jsonDict, coll, collList[i]);
} else if (coll->getTypeName() == "podio::UserDataCollection<uint64_t>") {
insertIntoJson<podio::UserDataCollection<uint64_t>>(jsonDict, coll, collList[i]);
}
// Generator (meta-)data
else if (coll->getTypeName() == "edm4hep::GeneratorEventParametersCollection") {
insertIntoJson<edm4hep::GeneratorEventParametersCollection>(jsonDict, coll, collList[i]);
} else if (coll->getTypeName() == "edm4hep::GeneratorPdfInfoCollection") {
insertIntoJson<edm4hep::GeneratorPdfInfoCollection>(jsonDict, coll, collList[i]);
} else {
std::cout << "WARNING: Collection type not recognized!\n"
<< " " << coll->getTypeName() << "\n";
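
A string-based dispatch like this chain only matches when the literal is character-for-character equal to the type name reported by the collection; anything else falls through to the warning branch. The sketch below shows the effect with a small lookup table. `dispatch` is a hypothetical function, and the assumption that the type names follow the pattern `edm4hep::<Type>Collection` is made for illustration.

```cpp
#include <map>
#include <string>

// Hypothetical dispatch: map a collection type name to the name of its handler,
// or to "unrecognized" if there is no exact match
std::string dispatch(const std::string& typeName) {
  static const std::map<std::string, std::string> handlers = {
      {"edm4hep::GeneratorEventParametersCollection", "GeneratorEventParameters"},
      {"edm4hep::GeneratorPdfInfoCollection", "GeneratorPdfInfo"},
  };
  const auto it = handlers.find(typeName);
  return it != handlers.end() ? it->second : "unrecognized";
}
```

A name with the wrong namespace prefix or a shortened class name is treated exactly like an unknown type, so such mismatches only show up as a warning at run time.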