Error When Loading Column-Trimmed Root Files Using NanoEvents #1147

kapsiak · 2024-08-04T14:35:15Z

Describe the bug

ROOT files saved with column filtering, following the prescription in #735 (comment), cannot be loaded by NanoEventsFactory using NanoAODSchema. Depending on which fields were saved, the following exception is thrown:

TypeError: do not try to convert low-level layouts (Content subclasses) into NumPy arrays; put them in ak.highlevel.Array

The stack trace is quite deep; it can be reproduced with the example below.

To Reproduce

Below is a small example script that demonstrates the behavior: for some column selections, an exception is raised when trying to load the column-trimmed file.

import awkward as ak
import uproot
from coffea.nanoevents import NanoEventsFactory

def is_rootcompat(a):
    """Is it a flat or 1-d jagged array?"""
    t = ak.type(a)
    if isinstance(t, ak.types.ArrayType):
        if isinstance(t.content, ak.types.NumpyType):
            return True
        if isinstance(t.content, ak.types.ListType) and isinstance(t.content.content, ak.types.NumpyType):
            return True
    return False

def uproot_writeable(events):
    """Restrict to columns that uproot can write compactly"""
    out = {}
    for bname in events.fields:
        if events[bname].fields:
            out[bname] = ak.zip(
                {
                    n: ak.to_packed(ak.without_parameters(events[bname][n]))
                    for n in events[bname].fields
                    if is_rootcompat(events[bname][n])
                }
            )
        else:
            out[bname] = ak.to_packed(ak.without_parameters(events[bname]))
    return out


filename = "https://raw.githubusercontent.com/CoffeaTeam/coffea/master/tests/samples/nano_dy.root"
events = NanoEventsFactory.from_root({filename: "Events"}, delayed=False).events()

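def saveAndLoad(events, columns):
    """Write only the selected columns to a new file, then reload it with NanoEvents."""
    filtered_events = events[columns]
    with uproot.recreate("skimmedevents.root") as fout:
        fout["Events"] = uproot_writeable(filtered_events)
    # Reload the skimmed file; this is the step that fails for some column selections
    reloaded = NanoEventsFactory.from_root({"skimmedevents.root": "Events"}, delayed=False).events()
    print(reloaded.fields)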

# This works
saveAndLoad(events, events.fields)
# This works
saveAndLoad(events, ["event", "run", "luminosityBlock"])
# This works
saveAndLoad(events, ["event", "run", "luminosityBlock", "Electron", "Muon", "HLT"])
# This does not work
saveAndLoad(events, ["event", "run", "luminosityBlock", "Electron", "Jet"])
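
Because the stack trace is deep, it may help to run each column selection through a small wrapper that records the full traceback instead of stopping at the first failure. The following is a minimal sketch using only the standard library; `run_and_report` is a hypothetical helper name introduced here for illustration and is not part of coffea, awkward, or uproot.

```python
import traceback


def run_and_report(label, func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (succeeded, traceback_text) instead of raising."""
    try:
        func(*args, **kwargs)
    except Exception:
        # format_exc() returns the full stack trace of the exception being handled
        tb = traceback.format_exc()
        print(f"[FAIL] {label}\n{tb}")
        return False, tb
    print(f"[ OK ] {label}")
    return True, ""
```

With the script above, a call such as `run_and_report("Electron + Jet", saveAndLoad, events, ["event", "run", "luminosityBlock", "Electron", "Jet"])` would print the complete traceback for the failing selection while still allowing the remaining selections to run.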

Expected behavior

Column-trimmed files should load with NanoEventsFactory and retain the same content and cross-references where possible.
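
One way to check "same content" once a reload succeeds is to compare the field layout before and after the round trip. The sketch below is an assumption about how such a check could look, not an existing coffea utility: `field_map` and `diff_field_maps` are hypothetical names, and `field_map` relies only on the `.fields` attribute already used in the script above. Some differences are expected, since `uproot_writeable` drops sub-fields that are not flat or singly jagged.

```python
def field_map(events):
    """Map each top-level field to the sorted list of its sub-fields (empty for flat branches)."""
    return {name: sorted(events[name].fields) for name in events.fields}


def diff_field_maps(original, reloaded):
    """Return {collection: missing sub-fields}; None marks a collection that is absent entirely."""
    missing = {}
    for name, subfields in original.items():
        if name not in reloaded:
            missing[name] = None
            continue
        lost = sorted(set(subfields) - set(reloaded[name]))
        if lost:
            missing[name] = lost
    return missing
```

For example, `diff_field_maps(field_map(events[columns]), field_map(reloaded))` would list what was lost for a given column selection.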

Version Info

coffea: 2024.6.1
awkward: 2.6.5
uproot: 5.3.10
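
For anyone reproducing this in a different environment, the installed versions can be collected with the standard library. This is a small sketch; `package_versions` is a hypothetical helper name, and it assumes the distributions are installed under the names `coffea`, `awkward`, and `uproot`.

```python
from importlib.metadata import PackageNotFoundError, version


def package_versions(names=("coffea", "awkward", "uproot")):
    """Return {package: version string, or None if the package is not installed}."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in package_versions().items():
    print(f"{name}: {ver}")
```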

kapsiak added the "bug" label (Something isn't working) on Aug 4, 2024