Possible strange behaviour happening with chunking. #1142

jbrewster7 · 2024-07-29T14:31:23Z

Hello, I am using the coffea.nanoevents.NanoEventsFactory.from_root function from coffea 2024.5.0 and specifying chunking as described in https://github.com/scikit-hep/uproot5/blob/v5.1.2/src/uproot/_dask.py#L109-L132 (as suggested in coffea). I am running this on lxplus with files in an EOS folder accessed via XRootD. I am running into something that I find odd, though it may just be behaving differently than I expect.

Initially, I arbitrarily chose chunks of 10000 events (about 16 MB in the ROOT file). This worked until I started processing a larger number of files. With more files in total, my RAM would fill up and my script would crash during dask.compute(). When I used smaller chunks, my RAM filled up and the script crashed even faster (the smaller the chunks, the faster the crash). I ended up having to increase my chunk size by a factor of 10 to avoid the crash.
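For reference, the uproot docstring linked above describes explicit chunking as a per-file dict holding an object_path and a list of steps (entry start/stop pairs). The following is only a minimal sketch of building such a spec for 10000-event chunks: the file path, the tree name "Events", the entry count, and the helper make_steps are all hypothetical, and the commented-out coffea call is an assumption about how the dict would be passed, not a confirmed reproducer.

```python
def make_steps(num_entries, chunk_size):
    """Return [start, stop] entry ranges that cover num_entries in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        [start, min(start + chunk_size, num_entries)]
        for start in range(0, num_entries, chunk_size)
    ]


# Hypothetical file and entry count, for illustration only.
path = "root://example.invalid//path/to/nanoaod.root"
num_entries = 45_000

fileset = {
    path: {
        "object_path": "Events",                   # assumed tree name
        "steps": make_steps(num_entries, 10_000),  # 10000-event chunks
    }
}

# Assumed usage (not executed here):
# from coffea.nanoevents import NanoEventsFactory
# events = NanoEventsFactory.from_root(fileset).events()

# Each [start, stop] pair becomes one partition, so the partition count
# grows as the chunk size shrinks.
print(len(fileset[path]["steps"]))
```

With this layout, dividing the chunk size by 10 multiplies the number of partitions per file by roughly 10, which is one quantity worth reporting alongside the memory usage.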

Could this be happening because, with chunks this small, the amount of file I/O required overwhelms the RAM? Or is this possibly a bug in either coffea or uproot?

Thanks for your help!


lgray · 2024-07-29T15:00:41Z

Hello, can you post some of the code that causes this behavior? If you can isolate all this in a simple reproducer it'll help us identify the cause more quickly.

NJManganelli · 2024-07-29T17:09:37Z

I saw similar behavior while doing the coffea-casa scale tests a few weeks ago. Very small chunk sizes (initially from a bug where I accidentally passed an O(100) number in as step size instead of steps_per_file), presumably small fractions of the TBasket size, seem to lead to a serious struggle. I haven't followed up on that yet (and probably can't for the next couple of weeks), but I intended to scan over it for v1.1 of my simple-benchmark code.
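To illustrate how far apart the two settings are, here is a rough arithmetic sketch of what an O(100) value means as a step size (entries per chunk) versus as steps_per_file (chunks per file). The entry count and both helper functions are hypothetical, and the rounding is a simplification rather than uproot's exact partitioning logic.

```python
import math


def layout_from_step_size(num_entries, step_size):
    """(number of chunks, max entries per chunk) for a fixed step size."""
    return math.ceil(num_entries / step_size), step_size


def layout_from_steps_per_file(num_entries, steps_per_file):
    """(number of chunks, max entries per chunk) for a fixed chunk count."""
    return steps_per_file, math.ceil(num_entries / steps_per_file)


num_entries = 1_000_000  # hypothetical file size in entries

# The same number, 100, interpreted two ways:
print(layout_from_step_size(num_entries, 100))       # many tiny chunks
print(layout_from_steps_per_file(num_entries, 100))  # few large chunks
```

In this example the step-size interpretation produces 100 times more chunks per file, each far smaller than the other interpretation's chunks, which is consistent with the chunks ending up as small fractions of a TBasket.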

jbrewster7 added the "question" label (Further information is requested) · 2024-07-29
