Unfortunately, the current parallel processing functionality for creating histograms from trees in `SampleSet.gethist` broke when switching to Python 3. The segmentation faults seem to be caused by a conflict between how Python and ROOT manage their objects in memory. (The parallel processing is done by (ab)using Python's multithreading.)
As a consequence, I am starting to look into completely redesigning the `SampleSet.gethist`/`MergedSample.gethist`/`Sample.gethist` routines using `RDataFrame`, which is native to ROOT since v6.14. I will probably make this the default, replacing the old routine, which is based on Python's multithreading via `Plotter/python/plot/MultiThread.py` and `MultiDraw.py`/`MultiDraw.cxx`. The latter also shows some unexpected behavior for array branches of variable length.
Besides solving the memory issues, this should be more performant because we can string together multiple instances of `RDataFrame` (see this section of the class reference) and let `RDataFrame` optimize the parallel processing of many histograms (multiple samples x variables x selections) by itself.
Furthermore, we could even think of processing multiple variables and selections in one go. (The previous setup would only process multiple variables and samples in parallel, but sequentially for selections.)
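As a rough sketch of the idea (not the final design), the booking could look something like the following. The function names `book_histograms` and `run`, the dictionary layouts, and the result keys are all hypothetical; only `Filter`, `Histo1D`, `GetValue` and `ROOT.RDF.RunGraphs` are existing ROOT calls, and `RunGraphs` is assumed to be available in the installed ROOT version and to accept a Python list of results.

```python
from itertools import product


def book_histograms(frames, variables, selections):
    """Lazily book one 1D histogram per (sample, selection, variable).

    frames:     {sample name: RDataFrame}                (hypothetical layout)
    variables:  {branch name: (nbins, xmin, xmax)}       (hypothetical layout)
    selections: {selection name: cut expression string}  (hypothetical layout)

    Nothing is processed here: Filter and Histo1D only extend the
    computation graph of each RDataFrame.
    """
    booked = {}
    for (sample, df), (selname, cut) in product(frames.items(), selections.items()):
        df_sel = df.Filter(cut, selname)  # lazy, shared by all variables below
        for var, (nbins, xmin, xmax) in variables.items():
            # (name, title, nbins, xmin, xmax) is converted to a TH1DModel
            model = (f"{sample}_{selname}_{var}", var, nbins, xmin, xmax)
            booked[(sample, selname, var)] = df_sel.Histo1D(model, var)
    return booked


def run(booked):
    """Trigger the event loops of all booked results, then fetch the histograms."""
    import ROOT  # imported here so the booking logic above does not need ROOT

    # Assumption: RunGraphs exists in this ROOT version and takes a list of results.
    ROOT.RDF.RunGraphs(list(booked.values()))
    return {key: result.GetValue() for key, result in booked.items()}
```

With something like this, each sample would get its own `RDataFrame`, all variables and selections would be booked up front, and every tree would be read only once. Multithreading would be left to ROOT by calling `ROOT.EnableImplicitMT()` before the dataframes are created, instead of going through Python's multithreading.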