Error When Loading Column-Trimmed Root Files Using NanoEvents #1147

Open
kapsiak opened this issue Aug 4, 2024 · 0 comments
Labels
bug Something isn't working

kapsiak commented Aug 4, 2024

Describe the bug

ROOT files saved with column filtering, following the prescription in #735 (comment), cannot be loaded by NanoEventsFactory using NanoAODSchema. Depending on which fields were saved, the following exception is raised:

TypeError: do not try to convert low-level layouts (Content subclasses) into NumPy arrays; put them in ak.highlevel.Array

The stack trace is quite deep; it can be generated with the example below.

To Reproduce

Below is a small example script that demonstrates the behavior: an exception is raised when trying to load the column-trimmed file written by the last call.

import awkward as ak
import uproot
from coffea.nanoevents import NanoEventsFactory

def is_rootcompat(a):
    """Is it a flat or 1-d jagged array?"""
    t = ak.type(a)
    if isinstance(t, ak.types.ArrayType):
        if isinstance(t.content, ak.types.NumpyType):
            return True
        if isinstance(t.content, ak.types.ListType) and isinstance(t.content.content, ak.types.NumpyType):
            return True
    return False

def uproot_writeable(events):
    """Restrict to columns that uproot can write compactly"""
    out = {}
    for bname in events.fields:
        if events[bname].fields:
            # Collection: keep only the sub-fields that uproot can write
            out[bname] = ak.zip(
                {
                    n: ak.to_packed(ak.without_parameters(events[bname][n]))
                    for n in events[bname].fields
                    if is_rootcompat(events[bname][n])
                }
            )
        else:
            # Top-level branch without sub-fields: write it directly
            out[bname] = ak.to_packed(ak.without_parameters(events[bname]))
    return out


filename = "https://raw.githubusercontent.com/CoffeaTeam/coffea/master/tests/samples/nano_dy.root"
events = NanoEventsFactory.from_root({filename: "Events"}, delayed=False).events()

def saveAndLoad(events, columns):  
    filtered_events = events[columns]
    with uproot.recreate("skimmedevents.root") as fout:
        fout["Events"] = uproot_writeable(filtered_events)
    reloaded = NanoEventsFactory.from_root({"skimmedevents.root": "Events"}, delayed=False).events()
    print(reloaded.fields)

# This works
saveAndLoad(events, events.fields)
# This works
saveAndLoad(events, ["event", "run", "luminosityBlock"])
# This works
saveAndLoad(events, ["event", "run", "luminosityBlock", "Electron", "Muon", "HLT"])
# This does not work
saveAndLoad(events, ["event", "run", "luminosityBlock", "Electron", "Jet"])
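
Because the failure depends on which collections are kept, it can help to scan several column selections in one run and record which ones fail. The following is a minimal sketch and not part of the original report: `scan_column_sets` is a hypothetical helper that assumes a function like `saveAndLoad` above, which raises on failure. The stand-in `fake_save_and_load` only imitates the reported pattern so that the block runs without any input file; it says nothing about the actual cause.

```python
import traceback


def scan_column_sets(save_and_load, events, column_sets):
    """Run save_and_load for each column selection and record the outcome.

    Returns a list of (columns, error, trace) tuples, where error is None
    on success and otherwise a "Type: message" summary of the exception.
    """
    results = []
    for columns in column_sets:
        try:
            save_and_load(events, columns)
        except Exception as exc:  # keep going so every selection is tested
            summary = f"{type(exc).__name__}: {exc}"
            results.append((columns, summary, traceback.format_exc()))
        else:
            results.append((columns, None, None))
    return results


# Hypothetical stand-in: fails only for the selection reported as failing.
FAILING = {("event", "run", "luminosityBlock", "Electron", "Jet")}


def fake_save_and_load(events, columns):
    if tuple(columns) in FAILING:
        raise TypeError("do not try to convert low-level layouts")


column_sets = [
    ["event", "run", "luminosityBlock"],
    ["event", "run", "luminosityBlock", "Electron", "Jet"],
]
results = scan_column_sets(fake_save_and_load, None, column_sets)
for columns, error, _trace in results:
    print(columns, "->", error or "ok")
```

With the real script, passing `saveAndLoad` and `events` in place of the stand-ins would keep the full trace of each failing selection in the third tuple element, ready to attach to a report.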

Expected behavior

Column-trimmed files should load with NanoEventsFactory and retain the same content and cross-references where possible.
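
One way to make this expectation checkable is to compare the field layout of the original and reloaded events. The sketch below is an illustration under stated assumptions, not coffea API: `field_map` and `missing_fields` are hypothetical helpers that rely only on objects exposing a `.fields` list and `[]` item access, as the arrays in the script above do. The `FakeRecord` class and the field names used with it are stand-ins so that the block runs without any input file.

```python
def field_map(events, columns):
    """Map each selected top-level column to a sorted list of its sub-fields."""
    return {name: sorted(events[name].fields) for name in columns}


def missing_fields(original, reloaded):
    """Return {column: [sub-fields present in original but not in reloaded]}.

    A column that is absent from `reloaded` altogether maps to None.
    """
    report = {}
    for name, fields in original.items():
        if name not in reloaded:
            report[name] = None
            continue
        lost = sorted(set(fields) - set(reloaded[name]))
        if lost:
            report[name] = lost
    return report


class FakeRecord:
    """Hypothetical stand-in exposing `.fields` and item access."""

    def __init__(self, children=None):
        self._children = children or {}

    @property
    def fields(self):
        return list(self._children)

    def __getitem__(self, name):
        return self._children[name]


# Illustrative layouts only; not taken from the sample file.
original = FakeRecord({
    "Electron": FakeRecord({"pt": FakeRecord(), "jetIdx": FakeRecord()}),
    "run": FakeRecord(),
})
reloaded = FakeRecord({
    "Electron": FakeRecord({"pt": FakeRecord()}),
    "run": FakeRecord(),
})

columns = ["Electron", "run"]
report = missing_fields(field_map(original, columns), field_map(reloaded, columns))
print(report)
```

Some differences are expected even in a working round trip, since `uproot_writeable` drops sub-fields that are not flat or singly jagged; the report is only a starting point for deciding which losses are avoidable.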

Version Info

coffea: 2024.6.1
awkward: 2.6.5
uproot: 5.3.10
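
For anyone trying to reproduce this in another environment, the installed versions can be gathered in the same format with the standard library. This is a small convenience sketch; `collect_versions` is a hypothetical helper name, and packages that are not installed are reported as such rather than raising.

```python
from importlib.metadata import PackageNotFoundError, version


def collect_versions(packages=("coffea", "awkward", "uproot")):
    """Return {package: installed version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in collect_versions().items():
    print(f"{name}: {ver if ver is not None else 'not installed'}")
```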
kapsiak added the bug label on Aug 4, 2024