Add datatypes for storing generator (meta)data in a structured and defined way #310

Merged
merged 36 commits into from
Jun 18, 2024
Changes from 23 commits
a2a9bda
add GeneratorInformation definition
hegner May 22, 2024
cfcfbdd
remove typo
hegner May 22, 2024
3b7ae97
add better member documentation
hegner May 22, 2024
4338c77
cut of GeneratorPDFInfo
hegner May 22, 2024
9dd2355
add more explicit docs; fix types
hegner May 22, 2024
4d896d8
fix URL
hegner May 22, 2024
47e08e9
add new types to documentation and tests
hegner May 23, 2024
dd08dfb
add new generator data and metadata to the write example
hegner May 23, 2024
32c31ae
Update edm4hep.yaml
hegner May 23, 2024
bf612a5
include clang-format suggestions
hegner May 24, 2024
b4e56bf
Update include/edm4hep/GenToolInfo.h
hegner May 27, 2024
b44db85
Update include/edm4hep/GenToolInfo.h
hegner May 27, 2024
f992f4c
Update edm4hep.yaml
hegner May 27, 2024
c4fe0c8
Update include/edm4hep/GenToolInfo.h
hegner May 27, 2024
1f7ae9b
Update include/edm4hep/GenToolInfo.h
hegner May 27, 2024
4873b6f
include PR comments; improve test coverage
hegner May 28, 2024
34f1ddf
fix renamed class in readme file
hegner May 28, 2024
3067af3
move signalVertex member to generatorEventParameters
hegner Jun 4, 2024
be27423
add doc for generator information
hegner Jun 4, 2024
e329148
Merge branch 'main' into geninfo
hegner Jun 4, 2024
0e823ea
rename generator related members to fit naming conventions
hegner Jun 4, 2024
763bcd8
fix line numbers in README
hegner Jun 4, 2024
8823133
fix issues of pre-commit
hegner Jun 4, 2024
3ddd0c3
Update include/edm4hep/Constants.h
hegner Jun 11, 2024
0364ea6
Update include/edm4hep/GenToolInfo.h
hegner Jun 11, 2024
e9154e4
address PR comments about util namespaces and string naming convention
hegner Jun 11, 2024
a4bd341
parameters are now optional
hegner Jun 11, 2024
cc21262
Update test/read_events.h
hegner Jun 12, 2024
c7d5662
Update include/edm4hep/Constants.h
hegner Jun 12, 2024
1bd26b3
Merge branch 'main' into geninfo
tmadlener Jun 17, 2024
36726ec
Update all label spellings to compile again
tmadlener Jun 17, 2024
6985838
Rename GeneratorToolInfo for more consistency
tmadlener Jun 17, 2024
3aa87f3
Add documentation to generator tool info handling utilities
tmadlener Jun 17, 2024
921c5bd
Rename header for consistency and fix occurences
tmadlener Jun 17, 2024
8d33ff2
Pluralize labels
tmadlener Jun 17, 2024
4a9d409
Pluralize all usages
tmadlener Jun 17, 2024
10 changes: 8 additions & 2 deletions README.md
@@ -39,12 +39,18 @@ A generic event data model for future HEP collider experiments.
| [MCRecoTrackerHitPlaneAssociation](https://github.com/key4hep/EDM4hep/blob/main/edm4hep.yaml#L644) | [MCRecoCaloParticleAssociation](https://github.com/key4hep/EDM4hep/blob/main/edm4hep.yaml#L653) | [MCRecoClusterParticleAssociation](https://github.com/key4hep/EDM4hep/blob/main/edm4hep.yaml#L662) |
| [MCRecoTrackParticleAssociation](https://github.com/key4hep/EDM4hep/blob/main/edm4hep.yaml#L671) | [RecoParticleVertexAssociation](https://github.com/key4hep/EDM4hep/blob/main/edm4hep.yaml#L680) | |

**Interfaces**
**Generator related (meta-)data**

| | | |
|-|-|-|
| [TrackerHit](https://github.com/key4hep/EDM4hep/blob/main/edm4hep.yaml#L787) | | |
| [GeneratorEventParameters](https://github.com/key4hep/EDM4hep/blob/main/edm4hep.yaml#L790) | | |
| [GeneratorPdfInfo](https://github.com/key4hep/EDM4hep/blob/main/edm4hep.yaml#L807) | | |

**Interfaces**

| | | |
|-|-|-|
| [TrackerHit](https://github.com/key4hep/EDM4hep/blob/main/edm4hep.yaml#L818) | | |

The tests and examples in the `tests` directory show how to read, write, and use these types in your code.

35 changes: 35 additions & 0 deletions doc/GeneratorInfo.md
@@ -0,0 +1,35 @@
# Dealing with Generator (meta-)data

EDM4hep provides data types and helper functions to store generator metadata at both run and event level.

## Storing and retrieving run level information
At run level, information about the generator tools (name, version, description) is stored. It can be written via


```cpp
#include "edm4hep/GenToolInfo.h"
...
// write some generator tool info into the run
auto toolInfo = edm4hep::GenToolInfo();
auto toolInfos = std::vector<edm4hep::GenToolInfo>();
toolInfo.name = "something";
toolInfo.version = "v1";
toolInfo.description = "some tool";
toolInfos.emplace_back(std::move(toolInfo));

edm4hep::putGenToolInfos(run, toolInfos);
```

and read back via:

```cpp
#include "edm4hep/GenToolInfo.h"
...
auto toolInfos = edm4hep::getGenToolInfos(run);

```
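
The helpers above store the tool information as three parallel string vectors, one per field, under the parameter names defined in `edm4hep/Constants.h`. The following is a minimal, self-contained sketch of that round trip. `MockFrame` is a hypothetical stand-in for `podio::Frame` so that the example compiles without podio, and `ToolInfo`, `putToolInfos` and `getToolInfos` are illustrative names, not the EDM4hep API. The key strings are copied from the constants shown in this pull request.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for podio::Frame: it only stores string-vector parameters.
struct MockFrame {
  std::map<std::string, std::vector<std::string>> params;
};

// Illustrative mirror of the name/version/description triple.
struct ToolInfo {
  std::string name;
  std::string version;
  std::string description;
};

// Split the structs into three parallel vectors, one per field.
inline void putToolInfos(MockFrame& frame, const std::vector<ToolInfo>& infos) {
  std::vector<std::string> names, versions, descriptions;
  for (const auto& info : infos) {
    names.push_back(info.name);
    versions.push_back(info.version);
    descriptions.push_back(info.description);
  }
  frame.params["generatorToolName"] = std::move(names);
  frame.params["generatorToolVersion"] = std::move(versions);
  frame.params["generatorToolDescription"] = std::move(descriptions);
}

// Recombine the parallel vectors; a missing parameter is treated as empty.
inline std::vector<ToolInfo> getToolInfos(const MockFrame& frame) {
  static const std::vector<std::string> empty;
  auto lookup = [&](const std::string& key) -> const std::vector<std::string>& {
    const auto it = frame.params.find(key);
    return it != frame.params.end() ? it->second : empty;
  };
  const auto& names = lookup("generatorToolName");
  const auto& versions = lookup("generatorToolVersion");
  const auto& descriptions = lookup("generatorToolDescription");

  std::vector<ToolInfo> infos;
  // Stop at the shortest vector so that no index is out of range.
  for (std::size_t i = 0; i < names.size() && i < versions.size() && i < descriptions.size(); ++i) {
    infos.push_back({names[i], versions[i], descriptions[i]});
  }
  return infos;
}
```

With this layout the element at index `i` of each vector belongs to the same tool, so the order in which tools are written is the order in which they are read back.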

## Storing and retrieving event-specific generator parameters and PDF information

To store event-level generator parameters, use the type `GeneratorEventParameters`.
To store information about parton distribution functions (PDFs), use the type `GeneratorPdfInfo`.
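
As a rough illustration of what these two types hold, the sketch below mirrors the members listed in `edm4hep.yaml` with plain structs and fills them with values loosely based on `test/write_events.h`. The names `EventParametersSketch`, `PdfInfoSketch`, `makeExampleParameters` and `makeExamplePdfInfo` are hypothetical. Real code would use the podio-generated `edm4hep::GeneratorEventParameters` and `edm4hep::GeneratorPdfInfo` classes and their collections, and the `signalVertex` relation to `MCParticle` is left out here.

```cpp
#include <array>
#include <vector>

// Plain-struct mirror of the GeneratorEventParameters members (illustrative only).
struct EventParametersSketch {
  double eventScale{};
  double alphaQED{};
  double alphaQCD{};
  int signalProcessId{};
  double sqrts{};                         // [GeV]
  std::vector<double> crossSections;      // [pb]
  std::vector<double> crossSectionErrors; // [pb]
};

// Plain-struct mirror of the GeneratorPdfInfo members (illustrative only).
// Each two-element array holds one entry per incoming parton.
struct PdfInfoSketch {
  std::array<int, 2> partonId{};
  std::array<int, 2> lhapdfId{};
  std::array<double, 2> x{};
  std::array<double, 2> xf{};
  double scale{}; // [GeV]
};

inline EventParametersSketch makeExampleParameters() {
  EventParametersSketch p;
  p.eventScale = 23;
  p.alphaQED = 1.0 / 127; // floating-point division; 1 / 127 would be integer division and give 0
  p.alphaQCD = 0.1;
  p.signalProcessId = 42;
  p.sqrts = 90;
  p.crossSections.push_back(10);
  p.crossSectionErrors.push_back(3);
  return p;
}

inline PdfInfoSketch makeExamplePdfInfo() {
  PdfInfoSketch pdf;
  pdf.partonId = {1, 2};
  pdf.lhapdfId = {20, 20};
  pdf.x = {0.5, 0.5};
  pdf.xf = {0.5, 0.5};
  pdf.scale = 23;
  return pdf;
}
```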
31 changes: 31 additions & 0 deletions edm4hep.yaml
@@ -783,6 +783,37 @@ datatypes:
    OneToOneRelations:
      - edm4hep::Track track // the corresponding track


  #===== Generator related data =====

  #---------- GeneratorEventParameters
  edm4hep::GeneratorEventParameters:
    Description: "Generator event parameters"
    Author: "EDM4hep authors"
    Members:
      - double eventScale // event scale
      - double alphaQED // alpha_QED
      - double alphaQCD // alpha_QCD
      - int signalProcessId // id of signal process
      - double sqrts [GeV] // sqrt(s)
    VectorMembers:
      - double crossSections [pb] // list of cross sections
      - double crossSectionErrors [pb] // list of cross section errors
    OneToManyRelations:
      - edm4hep::MCParticle signalVertex // List of initial state MCParticle that are the source of the hard interaction


  #---------- GeneratorPdfInfo
  edm4hep::GeneratorPdfInfo:
    Description: "Generator pdf information"
    Author: "EDM4hep authors"
    Members:
      - std::array<int, 2> partonId // Parton PDG id
      - std::array<int, 2> lhapdfId // LHAPDF PDF id (see https://lhapdf.hepforge.org/pdfsets.html)
      - std::array<double, 2> x // Parton momentum fraction
      - std::array<double, 2> xf // PDF value
      - double scale [GeV] // Factorisation scale

interfaces:
  edm4hep::TrackerHit:
    Description: "Tracker hit interface class"
8 changes: 8 additions & 0 deletions include/edm4hep/Constants.h
@@ -50,6 +50,14 @@ static constexpr const char* pidParameterNames = "ParameterNames";
static constexpr const char* pidAlgoName = "AlgoName";
static constexpr const char* pidAlgoType = "AlgoType";

// Parameter names for Generator level metadata
static constexpr const char* generatorToolVersionLabel = "generatorToolVersion";
static constexpr const char* generatorToolNameLabel = "generatorToolName";
static constexpr const char* generatorToolDescriptionLabel = "generatorToolDescription";
static constexpr const char* generatorEventParametersLabel = "generatorEventParameters";
static constexpr const char* generatorPdfInfoLabel = "generatorPdfInfo";
static constexpr const char* generatorWeightNamesLabel = "generatorWeightNames";

} // namespace edm4hep

#endif // EDM4HEP_CONSTANTS_H
48 changes: 48 additions & 0 deletions include/edm4hep/GenToolInfo.h
@@ -0,0 +1,48 @@
#ifndef EDM4HEP_GENTOOLINFO_H
#define EDM4HEP_GENTOOLINFO_H

#include "edm4hep/Constants.h"
#include "podio/Frame.h"
#include <string>
#include <vector>

namespace edm4hep {

struct GenToolInfo {
Review comment (Contributor): The classes we define in the yaml file are all prefixed Generator, but this one is Gen. For consistency reasons, I think they should both be the same. The choice is also related to how we want to name the constants.

std::string name;
std::string version;
std::string description;

  GenToolInfo() = default;
  GenToolInfo(const std::string& name, const std::string& version, const std::string& description) :
      name(name), version(version), description(description) {}
};

// Read the generator tool infos that are stored as three parallel string vectors in the frame.
// The function is defined in a header, so it is marked inline to avoid
// multiple-definition errors when the header is included in several translation units.
inline std::vector<GenToolInfo> getGenToolInfos(const podio::Frame& frame) {
  auto toolInfos = std::vector<GenToolInfo>();
  const auto names = frame.getParameter<std::vector<std::string>>(generatorToolNameLabel);
  const auto versions = frame.getParameter<std::vector<std::string>>(generatorToolVersionLabel);
  const auto descriptions = frame.getParameter<std::vector<std::string>>(generatorToolDescriptionLabel);
  for (unsigned int i = 0; i < names.size(); ++i) {
    toolInfos.emplace_back(names[i], versions[i], descriptions[i]);
  }
  return toolInfos;
}

// Store the generator tool infos in the frame as three parallel string vectors.
// Marked inline because the definition lives in a header; the input is only read, so it is taken by const reference.
inline void putGenToolInfos(podio::Frame& frame, const std::vector<GenToolInfo>& toolInfos) {
  auto names = std::vector<std::string>();
  auto versions = std::vector<std::string>();
  auto descriptions = std::vector<std::string>();
  for (const auto& toolInfo : toolInfos) {
    names.push_back(toolInfo.name);
    versions.push_back(toolInfo.version);
    descriptions.push_back(toolInfo.description);
  }
  frame.putParameter(generatorToolNameLabel, std::move(names));
  frame.putParameter(generatorToolVersionLabel, std::move(versions));
  frame.putParameter(generatorToolDescriptionLabel, std::move(descriptions));
}

} // namespace edm4hep

#endif // EDM4HEP_GENTOOLINFO_H
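
`getGenToolInfos` indexes `versions` and `descriptions` with the size of `names`, so it relies on the three parameters having equal length. A defensive variant could check this before recombining them. The sketch below is self-contained and hypothetical: `ToolInfoSketch` and `zipToolInfos` are illustrative names that are not part of EDM4hep, and throwing `std::runtime_error` is only one possible way to report the mismatch.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Illustrative mirror of the name/version/description triple.
struct ToolInfoSketch {
  std::string name;
  std::string version;
  std::string description;
};

// Recombine three parallel vectors, refusing inputs of different lengths.
inline std::vector<ToolInfoSketch> zipToolInfos(const std::vector<std::string>& names,
                                                const std::vector<std::string>& versions,
                                                const std::vector<std::string>& descriptions) {
  if (names.size() != versions.size() || names.size() != descriptions.size()) {
    throw std::runtime_error("generator tool parameters have inconsistent lengths");
  }
  std::vector<ToolInfoSketch> infos;
  infos.reserve(names.size());
  for (std::size_t i = 0; i < names.size(); ++i) {
    infos.push_back({names[i], versions[i], descriptions[i]});
  }
  return infos;
}
```

Failing loudly here trades convenience for safety: a file in which only some of the three parameters were written is reported instead of being read out of bounds.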
38 changes: 38 additions & 0 deletions test/read_events.h
@@ -3,6 +3,9 @@

// test data model
#include "edm4hep/CaloHitContributionCollection.h"
#include "edm4hep/GenToolInfo.h"
#include "edm4hep/GeneratorEventParametersCollection.h"
#include "edm4hep/GeneratorPdfInfoCollection.h"
#include "edm4hep/MCParticleCollection.h"
#include "edm4hep/RawTimeSeriesCollection.h"
#include "edm4hep/SimCalorimeterHitCollection.h"
@@ -16,6 +19,27 @@
// STL
#include <iostream>

void processRun(const podio::Frame& run) {
//===============================================================================
// get generator tool info from the run
auto toolInfos = edm4hep::getGenToolInfos(run);
auto toolinfo = toolInfos[0];
if (toolinfo.name != "something")
throw std::runtime_error("toolinfo.name != 'something'");
if (toolinfo.version != "v1")
throw std::runtime_error("toolinfo.version != 'v1'");
if (toolinfo.description != "some tool")
throw std::runtime_error("toolinfo.description != 'some tool'");

//===============================================================================
// get generator weight names
auto weightNames = run.getParameter<std::vector<std::string>>(edm4hep::generatorWeightNamesLabel);
if (weightNames[0] != "oneWeight")
throw std::runtime_error("weightNames[0] != 'oneWeight'");
if (weightNames[1] != "anotherWeight")
throw std::runtime_error("weightNames[1] != 'anotherWeight'");
}

void processEvent(const podio::Frame& event) {
auto& mcps = event.get<edm4hep::MCParticleCollection>("MCParticles");
auto& sths = event.get<edm4hep::SimTrackerHitCollection>("SimTrackerHits");
@@ -232,6 +256,17 @@ void processEvent(const podio::Frame& event) {
throw std::runtime_error("Collection 'TrackerHitPlanes' should be present");
}

//===============================================================================
// check the generator meta data
auto& genParametersCollection =
event.get<edm4hep::GeneratorEventParametersCollection>(edm4hep::generatorEventParametersLabel);
auto genParam = genParametersCollection[0];
if (genParam.getEventScale() != 23)
throw std::runtime_error("Event_scale != 23");

auto& generatorPdfInfoCollection = event.get<edm4hep::GeneratorPdfInfoCollection>(edm4hep::generatorPdfInfoLabel);
auto genPdfInfo = generatorPdfInfoCollection[0];

// //===============================================================================
// if( sccons.isValid() ){
// } else {
Expand All @@ -251,6 +286,9 @@ void read_events(const std::string& filename) {
ReaderT reader;
reader.openFile(filename);

const auto run = podio::Frame(reader.readNextEntry("runs"));
processRun(run);

unsigned nEvents = reader.getEntries("events");
for (unsigned i = 0; i < nEvents; ++i) {
std::cout << "reading event " << i << std::endl;
47 changes: 47 additions & 0 deletions test/write_events.h
@@ -2,7 +2,11 @@
#define EDM4HEP_TEST_WRITE_EVENTS_H

// Data model

#include "edm4hep/CaloHitContributionCollection.h"
#include "edm4hep/GenToolInfo.h"
#include "edm4hep/GeneratorEventParametersCollection.h"
#include "edm4hep/GeneratorPdfInfoCollection.h"
#include "edm4hep/MCParticleCollection.h"
#include "edm4hep/RawTimeSeriesCollection.h"
#include "edm4hep/SimCalorimeterHitCollection.h"
@@ -28,6 +32,7 @@ void write(std::string outfilename) {
for (unsigned i = 0; i < nevents; ++i) {
std::cout << " --- processing event " << i << std::endl;
auto event = podio::Frame();
auto run = podio::Frame();

// place the following generator event to the MCParticle collection
//
@@ -108,6 +113,47 @@ void write(std::string outfilename) {
}
}

//===============================================================================
// write some generator event data
auto genParametersCollection = edm4hep::GeneratorEventParametersCollection();
auto genParam = genParametersCollection.create();
genParam.setEventScale(23);
genParam.setAlphaQED(1. / 127); // floating-point division; 1 / 127 is integer division and yields 0
genParam.setAlphaQCD(0.1);
genParam.setSignalProcessId(42);
genParam.setSqrts(90);
genParam.addToCrossSections(10);
genParam.addToCrossSectionErrors(3);
genParam.addToSignalVertex(mcp1);
genParam.addToSignalVertex(mcp2);
event.put(std::move(genParametersCollection), edm4hep::generatorEventParametersLabel);

auto genPdfInfoCollection = edm4hep::GeneratorPdfInfoCollection();
auto genPdfInfo = genPdfInfoCollection.create();
genPdfInfo.setPartonId(1, 2);
genPdfInfo.setLhapdfId({20, 20});
genPdfInfo.setX({0.5, 0.5});
genPdfInfo.setXf({0.5, 0.5});
genPdfInfo.setScale(23);
event.put(std::move(genPdfInfoCollection), edm4hep::generatorPdfInfoLabel);

//===============================================================================
// write some generator tool info into the run
auto toolInfos = std::vector<edm4hep::GenToolInfo>();
auto toolInfo = edm4hep::GenToolInfo();
toolInfo.name = "something";
toolInfo.version = "v1";
toolInfo.description = "some tool";
toolInfos.emplace_back(std::move(toolInfo));
edm4hep::putGenToolInfos(run, toolInfos);

//===============================================================================
// write some generator weightname info into the run
auto weightNames = std::vector<std::string>();
weightNames.emplace_back("oneWeight");
weightNames.emplace_back("anotherWeight");
run.putParameter(edm4hep::generatorWeightNamesLabel, std::move(weightNames));

// fixme: should this become a utility function ?
//-------------------------------------------------------------

@@ -237,6 +283,7 @@ void write(std::string outfilename) {
event.putParameter("EventType", "test");

writer.writeFrame(event, "events");
writer.writeFrame(run, "runs");
}

writer.finish();
8 changes: 8 additions & 0 deletions tools/include/edm4hep2json.hxx
@@ -6,6 +6,8 @@
#include "edm4hep/CalorimeterHitCollection.h"
#include "edm4hep/ClusterCollection.h"
#include "edm4hep/EventHeaderCollection.h"
#include "edm4hep/GeneratorEventParametersCollection.h"
#include "edm4hep/GeneratorPdfInfoCollection.h"
#include "edm4hep/MCParticleCollection.h"
#include "edm4hep/ParticleIDCollection.h"
#include "edm4hep/RawCalorimeterHitCollection.h"
@@ -154,6 +156,12 @@ nlohmann::json processEvent(const podio::Frame& frame, std::vector<std::string>&
insertIntoJson<podio::UserDataCollection<uint32_t>>(jsonDict, coll, collList[i]);
} else if (coll->getTypeName() == "podio::UserDataCollection<uint64_t>") {
insertIntoJson<podio::UserDataCollection<uint64_t>>(jsonDict, coll, collList[i]);
}
// Generator (meta-)data
else if (coll->getTypeName() == "edm4hep::GeneratorEventParametersCollection") {
insertIntoJson<edm4hep::GeneratorEventParametersCollection>(jsonDict, coll, collList[i]);
} else if (coll->getTypeName() == "edm4hep::GeneratorPdfInfoCollection") {
insertIntoJson<edm4hep::GeneratorPdfInfoCollection>(jsonDict, coll, collList[i]);
} else {
std::cout << "WARNING: Collection type not recognized!\n"
<< " " << coll->getTypeName() << "\n";