feat: support categorical axes on boost histograms #764

Merged on Nov 2, 2022 (124 commits)
Changes from 7 commits

Commits
5c90b48
fix: initialize empty `TObject` members on `to_TObjString`
lobis Oct 24, 2022
a3d9c6a
add test for serialization of `TObjString`
lobis Oct 24, 2022
1419d06
remove unused dependency on test
lobis Oct 24, 2022
149a27b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 24, 2022
8209f72
add `tojson` method to `TObjString`
lobis Oct 24, 2022
a30e8e1
add additional check to `TObjString` write test
lobis Oct 24, 2022
4c828c8
fix bad field in `TList` tojson conversion
lobis Oct 24, 2022
5c95c4b
add inexpensive `assert` to `TList` serialization
lobis Oct 24, 2022
163c8a2
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 24, 2022
18bbc41
add `to_THashList` method
lobis Oct 24, 2022
1a32a9e
add aux method to create `THashList` from categorical axis
lobis Oct 24, 2022
c237928
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 24, 2022
98d42ac
Update src/uproot/writing/identify.py
lobis Oct 25, 2022
abaa1a1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 25, 2022
4da653f
fix bad serialization of non-empty TList due to options (https://gith…
lobis Oct 25, 2022
db35954
add tests for TList serialization
lobis Oct 25, 2022
5fff9dd
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 25, 2022
bc95096
fixed bad `__repr__` for `TObject`
lobis Oct 25, 2022
6697fa0
add serialization of `fUniqueID` to `TObject`
lobis Oct 25, 2022
c6fe49e
Merge remote-tracking branch 'origin/to_TObjString-fix' into hist-cat…
lobis Oct 25, 2022
94e93a8
add `empty` method to `TObject`
lobis Oct 25, 2022
46424e1
remove redundant `TObject` member initialization
lobis Oct 25, 2022
50c94a7
Merge branch 'to_TObjString-fix' of github.com:lobis/uproot5 into to_…
lobis Oct 25, 2022
6b1f188
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 25, 2022
0f9b2bc
add serialization for `THashList`
lobis Oct 25, 2022
3433563
made `THashList` writable
lobis Oct 25, 2022
5dfbb33
add test to check serialization of `TList` vs `THashList` (which shou…
lobis Oct 25, 2022
cf0d3e1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 25, 2022
a32744d
initialize boost histogram axis as `IntCategory` if labels can be con…
lobis Oct 25, 2022
303514d
Update src/uproot/writing/identify.py
lobis Oct 25, 2022
d6e1543
moved `TList` serialization list to `serialize` method
lobis Oct 25, 2022
0621c67
add helper serialization method `bytestring` as suggested in https://…
lobis Oct 25, 2022
8e6ad2e
keep `TList` `_options` as python `bytes` and update serialization to…
lobis Oct 25, 2022
897972f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 25, 2022
bd34567
Merge branch 'hist-categorical' of github.com:lobis/uproot5 into hist…
lobis Oct 26, 2022
d6598ef
Merge remote-tracking branch 'origin/to_TObjString-fix' into hist-cat…
lobis Oct 26, 2022
00ba8a0
add bin index as `TObject.fUniqueID` when setting hist labels as done…
lobis Oct 26, 2022
b9fb72e
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 26, 2022
9a500c1
reset `serialization.py` to `main` branch status
lobis Oct 26, 2022
884ec4d
Revert "keep `TList` `_options` as python `bytes` and update serializ…
lobis Oct 26, 2022
3c8b85b
Revert "[pre-commit.ci] auto fixes from pre-commit.com hooks"
lobis Oct 26, 2022
45720ff
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 26, 2022
6dde481
Revert "reset `serialization.py` to `main` branch status"
lobis Oct 26, 2022
3eac7ca
Revert "Revert "keep `TList` `_options` as python `bytes` and update …
lobis Oct 26, 2022
84f180f
Merge remote-tracking branch 'origin/to_TObjString-fix' into hist-cat…
lobis Oct 26, 2022
b8d5207
Merge branch 'main' into to_TObjString-fix
lobis Oct 26, 2022
ee0b5ba
Merge branch 'main' into to_TObjString-fix
lobis Oct 26, 2022
697edb1
Merge remote-tracking branch 'origin/to_TObjString-fix' into hist-cat…
lobis Oct 26, 2022
683c6ff
Merge branch 'main' into to_TObjString-fix
lobis Oct 27, 2022
4e5a8ed
Merge remote-tracking branch 'origin/to_TObjString-fix' into hist-cat…
lobis Oct 27, 2022
e4d205c
Apply suggestions from code review
lobis Oct 27, 2022
28928c9
initialize `opt` with different list initialization for consistent style
lobis Oct 27, 2022
fae79b1
add back two spaces before imports (this is not handled by the pre-co…
lobis Oct 27, 2022
cb6deb7
Update src/uproot/models/TObject.py
lobis Oct 27, 2022
2d69174
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 27, 2022
9281aae
Merge remote-tracking branch 'origin/to_TObjString-fix' into hist-cat…
lobis Oct 27, 2022
57a5640
fix bad label placement
lobis Oct 28, 2022
d22fa31
Merge branch 'main' into hist-categorical
lobis Oct 29, 2022
0c74354
Merge remote-tracking branch 'upstream/main' into hist-categorical
lobis Oct 30, 2022
43affea
axis `fName` will always be one of the default values (`xaxis`, `yaxi…
lobis Oct 30, 2022
e4a43fb
add test for categorical histogram
lobis Oct 30, 2022
49c38c7
hist categorical test: check for correctly overwritten axis name
lobis Oct 30, 2022
f28885c
add missing assert to test
lobis Oct 30, 2022
70e7008
updated tests for new axis naming behaviour
lobis Oct 30, 2022
d8189b4
remove unused dependency from test
lobis Oct 30, 2022
e516817
do not used mutable default arguments
lobis Oct 30, 2022
777ecee
fix `to_hist` not using options from function arguments (was using de…
lobis Oct 30, 2022
22cdbf7
fix not checking for all categories
lobis Oct 30, 2022
2f821c4
add test for TH1 -> hist/boost conversion
lobis Oct 30, 2022
b3781b7
TH1 to boost/hist do not use weights (temporary solution)
lobis Oct 30, 2022
dce551b
build action: move checkout right before install
lobis Oct 31, 2022
80e148b
add debug step to pipeline
lobis Oct 31, 2022
0dcfe65
fix test not being updated to new functionality
lobis Oct 31, 2022
c58b688
reverted build CI to original state
lobis Oct 31, 2022
1c71083
updated test to check for working `to_hist` conversion
lobis Oct 31, 2022
2ee4129
simplified test
lobis Oct 31, 2022
9d0dede
add check for boost histogram conversion
lobis Oct 31, 2022
f4d97a4
check also for intcategory to resize values
lobis Oct 31, 2022
af17fc3
add test for TH3 histogram
lobis Oct 31, 2022
1045428
remove redundant test
lobis Oct 31, 2022
59121f2
add link to issue on test
lobis Oct 31, 2022
d56eba3
add temporary fix to bad `to_boost` conversion for categorical histog…
lobis Oct 31, 2022
5370ac9
add tmp fix
lobis Oct 31, 2022
c652f4f
reduced repeated code for slicing values when categorical axis via he…
lobis Oct 31, 2022
b0c940d
implemented `to_boost` in `Histogram` parent class to reduce code dup…
lobis Oct 31, 2022
ecf598e
remove now unecessary helper function `_slice_values_if_categorical_a…
lobis Oct 31, 2022
7ad4228
remove "temporary" `and False`
lobis Oct 31, 2022
c58c9f7
remove unused dependency
lobis Oct 31, 2022
c6f671a
add missing asserts
lobis Oct 31, 2022
052de1a
add test for TH1 weights from root
lobis Oct 31, 2022
9f10718
add test for issue
lobis Oct 31, 2022
1971918
alternative way to detect weighted hist
lobis Oct 31, 2022
02a1ecb
made `fSumw2` None if storage is not weights
lobis Oct 31, 2022
bd90779
remove comments from test
lobis Oct 31, 2022
a160bff
add test for hist with weights
lobis Oct 31, 2022
0bd7de6
fix bad check for storage type
lobis Oct 31, 2022
af7d023
remove comment
lobis Nov 1, 2022
23a76ea
add test for hist with(out) weights and labels
lobis Nov 1, 2022
7ba24ce
updated TH1 `to_boost` to handle weights/labels better
lobis Nov 1, 2022
860b6b3
placed histogram `to_boost` in parent `Histogram` class to reduce cod…
lobis Nov 1, 2022
017cb35
updated `weighted` property
lobis Nov 1, 2022
fc80387
implemented histogram `weighted` property in parent `Histogram` class
lobis Nov 1, 2022
6f50836
using weighted property instead of copying check
lobis Nov 1, 2022
0fe0c1b
do not use mutable default arguments
lobis Nov 1, 2022
ea7fbdd
fix calling property
lobis Nov 1, 2022
6d43cc7
add missing asserts to test
lobis Nov 1, 2022
17459cb
add test for issue #722
lobis Nov 1, 2022
d3157bf
add weight test for 2D and 3D histograms
lobis Nov 1, 2022
99ea10d
add temporary skip to test until file is uploaded
lobis Nov 1, 2022
43d7c01
Merge branch 'main' into fix-hist-weights
lobis Nov 1, 2022
aa8c764
update issue test file
lobis Nov 1, 2022
383eb2c
Merge remote-tracking branch 'origin/fix-hist-weights' into fix-hist-…
lobis Nov 1, 2022
c36ccf3
add back temporary skip until file is available
lobis Nov 1, 2022
2aff474
Apply suggestions from code review
lobis Nov 1, 2022
11c14c1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 1, 2022
7d49e6d
add suggestion from https://github.com/scikit-hep/uproot5/pull/774#di…
lobis Nov 1, 2022
6591e5d
add check for length of `fSumw2` greater than 0 so empty histograms a…
lobis Nov 1, 2022
abc2cf5
remove unnecessary subclass method implementation
lobis Nov 1, 2022
8fbe466
Merge remote-tracking branch 'upstream/main' into hist-categorical
lobis Nov 2, 2022
096d994
Merge branch 'fix-hist-weights' into hist-categorical
lobis Nov 2, 2022
6cc2a87
Merge branch 'main' into hist-categorical
lobis Nov 2, 2022
88b44fb
add check for axis class
lobis Nov 2, 2022
c294f8f
Merge branch 'hist-categorical' of github.com:lobis/uproot5 into hist…
lobis Nov 2, 2022
ff30f50
addressed https://github.com/scikit-hep/uproot5/pull/764#discussion_r…
lobis Nov 2, 2022
56 changes: 55 additions & 1 deletion src/uproot/writing/identify.py
@@ -326,6 +326,7 @@ def to_writable(obj):
                     fXmin=axis.edges[0],
                     fXmax=axis.edges[-1],
                     fXbins=_fXbins_maybe_regular(axis, boost_histogram),
+                    fLabels=_fLabels_maybe_categorical(axis, boost_histogram),
                 )
                 for axis, default_name in zip(obj.axes, ["xaxis", "yaxis", "zaxis"])
             ]
@@ -659,6 +660,31 @@ def _fXbins_maybe_regular(axis, boost_histogram):
         return axis.edges


+def _fLabels_maybe_categorical(axis, boost_histogram):
+    if boost_histogram is None:
+        return None
+
+    if not isinstance(axis, boost_histogram.axis.IntCategory) and not isinstance(
+        axis, boost_histogram.axis.StrCategory
+    ):
+        return None
+
+    labels = [str(label) for label in axis]
+    if isinstance(axis, boost_histogram.axis.IntCategory):
+        # Check labels are valid integers (this may be redundant)
+        for label in labels:
+            try:
+                int(label)
+            except ValueError:
+                raise ValueError(
+                    f"IntCategory labels must be valid integers. Found {label} on axis {axis}"
+                )
+
+    labels = [to_TObjString(label) for label in labels]
+
+    return to_THashList(labels)
+
+
 def _root_stats_1d(entries, edges):
     centers = (edges[:-1] + edges[1:]) / 2.0

@@ -735,12 +761,16 @@ def to_TObjString(string):
     This function is for developers to create TObjString objects that can be
     written to ROOT files, to implement conversion routines.
     """
+    tobject = uproot.models.TObject.Model_TObject.empty()
+    tobject._members["@fUniqueID"] = 0
+    tobject._members["@fBits"] = 0
+
     tobjstring = uproot.models.TObjString.Model_TObjString(str(string))
     tobjstring._deeply_writable = True
     tobjstring._cursor = None
     tobjstring._parent = None
     tobjstring._members = {}
-    tobjstring._bases = (uproot.models.TObject.Model_TObject(),)
+    tobjstring._bases = [tobject]
     tobjstring._num_bytes = len(string) + (1 if len(string) < 255 else 5) + 16
     tobjstring._instance_version = 1
     return tobjstring
@@ -777,6 +807,30 @@ def to_TList(data, name=""):
     return tlist


+def to_THashList(data, name=""):
+    """
+    Args:
+        data (:doc:`uproot.model.Model`): Python iterable to convert into a THashList.
+        name (str): Name of the list (usually empty: ``""``).
+
+    This function is for developers to create THashList objects that can be
+    written to ROOT files, to implement conversion routines.
+    """
+
+    if not all(isinstance(x, uproot.model.Model) for x in data):
+        raise TypeError(
+            "list to convert to THashList must only contain ROOT objects (uproot.Model)"
+        )
+
+    tlist = to_TList(data, name)
+
+    thashlist = uproot.models.THashList.Model_THashList.empty()
+
+    thashlist._bases.append(tlist)
+
+    return thashlist
+
+
 def to_TArray(data):
     """
     Args:
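
As a reading aid for the `_fLabels_maybe_categorical` hunk in this file, the following is a minimal, dependency-free sketch of the same decision logic. `FakeRegular`, `FakeIntCategory`, `FakeStrCategory`, and `labels_if_categorical` are hypothetical stand-ins invented for this illustration and are not part of uproot or boost-histogram. The function in the diff additionally wraps each label with `to_TObjString` and returns `to_THashList(labels)`; this sketch returns a plain list of strings instead.

```python
class FakeRegular(list):
    """Hypothetical stand-in for a non-categorical axis."""


class FakeIntCategory(list):
    """Hypothetical stand-in for boost_histogram.axis.IntCategory."""


class FakeStrCategory(list):
    """Hypothetical stand-in for boost_histogram.axis.StrCategory."""


def labels_if_categorical(axis):
    # Non-categorical axes carry no labels.
    if not isinstance(axis, (FakeIntCategory, FakeStrCategory)):
        return None

    # Every label is stored as a string, whatever the category type.
    labels = [str(label) for label in axis]

    if isinstance(axis, FakeIntCategory):
        # Same sanity check as in the diff: each label must parse as an int.
        for label in labels:
            try:
                int(label)
            except ValueError:
                raise ValueError(
                    f"IntCategory labels must be valid integers. Found {label}"
                ) from None

    return labels


print(labels_if_categorical(FakeStrCategory(["a", "b"])))  # ['a', 'b']
print(labels_if_categorical(FakeIntCategory([3, 7])))      # ['3', '7']
print(labels_if_categorical(FakeRegular([0.0, 1.0])))      # None
```

Because integer categories are converted to strings before writing, commit a32744d's approach of rebuilding an `IntCategory` when all labels convert to integers is what allows the round trip to recover the original axis type.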
19 changes: 18 additions & 1 deletion tests/test_0349-write-TObjString.py
@@ -2,7 +2,6 @@

 import os

-import numpy as np
 import pytest

 import uproot
@@ -78,3 +77,21 @@ def test_update(tmp_path):
         assert f6["subdir/wowie"] == "wowie"
         assert f6["subdir/zowie"] == "zowie"
         assert list(f6.file.streamers) == ["TObjString"]
+
+
+def test_serialization(tmp_path):
+    filename = os.path.join(tmp_path, "whatever.root")
+
+    string = "hey"
+    tobjstring = uproot.writing.identify.to_TObjString(string)
+    assert (
+        tobjstring.tojson()["_typename"] == "TObjString"
+    )  # https://github.com/scikit-hep/uproot5/issues/762
+
+    with uproot.recreate(filename) as f1:
+        f1["first"] = tobjstring
+        f1["second"] = str(tobjstring)  # also checks conversion to "str"
+
+    with uproot.open(filename) as f2:
+        assert f2["first"] == f2["second"]
+        assert str(f2["first"]) == string
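
The `_num_bytes` expression in `to_TObjString` (`len(string) + (1 if len(string) < 255 else 5) + 16`) reflects ROOT's length-prefixed string encoding: a single length byte for short strings, or a `255` marker byte followed by a 4-byte big-endian length for longer ones. The sketch below illustrates that arithmetic with the standard library only. `root_string_bytes` and `tobjstring_num_bytes` are hypothetical helper names, the input is assumed to be ASCII so that characters and bytes coincide, and the constant `16` is copied from the diff without further interpretation.

```python
import struct


def root_string_bytes(text):
    """Sketch of the length-prefixed string layout (assumes ASCII input)."""
    data = text.encode("ascii")
    if len(data) < 255:
        # Short string: one unsigned byte holds the length.
        prefix = struct.pack(">B", len(data))
    else:
        # Long string: marker byte 255, then a 4-byte big-endian length.
        prefix = b"\xff" + struct.pack(">i", len(data))
    return prefix + data


def tobjstring_num_bytes(text):
    # Same expression as in the diff; 16 is the fixed overhead used there.
    return len(text) + (1 if len(text) < 255 else 5) + 16


short, long_ = "hey", "x" * 300
print(len(root_string_bytes(short)), tobjstring_num_bytes(short))  # 4 20
print(len(root_string_bytes(long_)), tobjstring_num_bytes(long_))  # 305 321
```

In both cases the two numbers differ by exactly 16, which is the fixed term in the diff's expression.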