Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feat] need better to_layout and from_iter design #35

Open
Moelf opened this issue Oct 13, 2023 · 2 comments
Open

[feat] need better to_layout and from_iter design #35

Moelf opened this issue Oct 13, 2023 · 2 comments

Comments

@Moelf
Copy link
Member

Moelf commented Oct 13, 2023

I'm test driving using a function from AK as a sink to UnROOT:

julia> using UnROOT, AwkwardArray

julia> t = LazyTree(UnROOT.samplefile("NanoAODv5_sample.root"), "Events", r"Muon_(pt|eta|phi)$");

julia> t[1:10]
 Row │ Muon_phi         Muon_pt          Muon_eta
     │ Vector{Float32}  Vector{Float32}  Vector{Float32}
─────┼───────────────────────────────────────────────────
 1   │ []               []               []
 2   │ [-0.305, 0.99]   [19.9, 15.3]     [0.53, 0.229]
 3   │ []               []               []
 4   │ []               []               []
 5   │ []               []               []
 6   │ []               []               []
 7   │ [2.71, 1.37]     [22.2, 4.43]     [-1.13, 1.98]
 8   │ []               []               []
 9   │ []               []               []
 10  │ []               []               []


julia> AwkwardArray.from_iter([collect(t[i]) for i =1:10])
10-element AwkwardArray.RecordArray{(:Muon_phi, :Muon_pt, :Muon_eta), Tuple{AwkwardArray.ListOffsetArray{Vector{Int64}, AwkwardArray.PrimitiveArray{Float32, Vector{Float32}, :default}, :default}, AwkwardArray.ListOffsetArray{Vector{Int64}, AwkwardArray.PrimitiveArray{Float32, Vector{Float32}, :default}, :default}, AwkwardArray.ListOffsetArray{Vector{Int64}, AwkwardArray.PrimitiveArray{Float32, Vector{Float32}, :default}, :default}}, :default}:
 {Muon_phi: [], Muon_pt: [], Muon_eta: []}
 {Muon_phi: [-0.30541992f0, ...], Muon_pt: [...], Muon_eta: [...]}
 {Muon_phi: [], Muon_pt: [], Muon_eta: []}
 {Muon_phi: [], Muon_pt: [], Muon_eta: []}
 {Muon_phi: [], Muon_pt: [], Muon_eta: []}
 {Muon_phi: [], Muon_pt: [], Muon_eta: []}
 {Muon_phi: [2.713379f0, 1.3676758f0], Muon_pt: [...], Muon_eta: [...]}
 {Muon_phi: [], Muon_pt: [], Muon_eta: []}
 {Muon_phi: [], Muon_pt: [], Muon_eta: []}
 {Muon_phi: [], Muon_pt: [], Muon_eta: []}

This works but is not very ergonomic and also involves copying everything twice.


Because LazyTree is already iterable, I tried to get the following to work

julia> AwkwardArray.from_iter(t)
# errors because `to_layout` doesn't understand `LazyEvent`

# so I implement this to say "LazyEvent is same as NamedTuple"
julia> function AwkwardArray.layout_for(x::Type{UnROOT.LazyEvent{T}}) where T
           AwkwardArray.layout_for(T)
       end

then I hit a infinite recursion:

ERROR: StackOverflowError:
Stacktrace:
      [1] push!(a::AwkwardArray.RecordArray{(:Muon_phi, :Muon_pt, :Muon_eta), Tuple{…}, :default}, iter::UnROOT.LazyEvent)
        @ Base ./array.jl:1186
      [2] append!
        @ AwkwardArray ~/.julia/dev/AwkwardArray/src/AwkwardArray.jl:150 [inlined]
--- the last 2 lines are repeated 52332 more times ---
 [104667] push!(a::AwkwardArray.RecordArray{(:Muon_phi, :Muon_pt, :Muon_eta), Tuple{…}, :default}, iter::UnROOT.LazyEvent)
        @ Base ./array.jl:1186
 [104668] from_iter(input::LazyTree with 3 branches:
Muon_phi, Muon_pt, Muon_eta
)
        @ AwkwardArray ~/.julia/dev/AwkwardArray/src/AwkwardArray.jl:2358
Some type information was truncated. Use `show(err)` to see complete types.

to me it all just means that:

  • we can probably use something that doesn't involve for ... push!() end
  • we need to change design of to_layout and from_iter a little bit to allow easy custom type

What I really want to do here is to tell AwkwardArray:

  • when you see evt::LazyEvent, you can just call collect(evt) first and you will get a NamedTuple which you know how to handle
@Moelf Moelf changed the title [feat] need Tables.jl interface compatible constructor [feat] need better to_layout and from_iter design Oct 13, 2023
@Moelf
Copy link
Member Author

Moelf commented Oct 13, 2023

A possible implementation is to define

# need a better name
AwkwardArray.akconvert(x) = x #default no-op

and then define from_iter to something like:

function from_iter(input)
    ItemType = eltype(input)
    AwkwardType = layout_for(ItemType)
    out = AwkwardType()
    for item in input
        push!(out, Awkward.akconvert(item))
    end
    out
end

and the user-facing interface/contract for custom type is: if you have a custom type, you need to specialize two functions:

  1. AwkwardArray.layout_for(::Type{MyType}) which must return a type that is (eventually) recognized by Awkward
  2. AwkwardArray.akconvert(x::MyType) which must convert to the same type as returned by 1

@jpivarski
Copy link
Member

I'm open to any such changes, even if they change the interface. Nothing is firmly established yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants