Excessive memory required in `MergeBsseqObjects` #29

Nick-Eagles · 2023-09-15T17:33:45Z

Merging bsseq objects is currently extremely memory-inefficient. In initial tests, merging a list of bsseq objects (even a list of just 2 elements!) often results in peak memory exceeding 10 times the summed memory footprint of the individual objects; in theory, even the worst case of a well-behaved merging algorithm should not exceed 2 times this sum. Note that these tests quantified only the in-memory portions of the objects (not HDF5-backed assays).
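For reference, a ratio like the one described above can be approximated in base R along the following lines. This is a minimal sketch, not the code used for the tests in this issue: `extra_peak_mb` is a hypothetical helper, and two data frames stand in for bsseq objects. `gc()` only tracks memory managed by R itself and only records the maximum at garbage-collection time, so the numbers are approximate.

```r
# Hypothetical helper (not part of the package): approximate extra peak
# memory, in MB, that R uses while evaluating `expr`, relative to the
# memory already in use beforehand.
extra_peak_mb <- function(expr) {
    before <- gc(reset = TRUE)       # also resets the "max used" counters
    baseline <- sum(before[, 2])     # column 2: memory currently used, in MB
    force(expr)                      # evaluate the lazy argument now
    after <- gc()
    # The last column holds the max used (in MB) since the reset
    sum(after[, ncol(after)]) - baseline
}

# Stand-ins for two bsseq objects (assumption: plain data frames)
a <- data.frame(x = runif(1e5), y = runif(1e5))
b <- data.frame(x = runif(1e5), y = runif(1e5))

# Summed in-memory size of the inputs, in MB
input_mb <- (as.numeric(object.size(a)) + as.numeric(object.size(b))) / 2^20

# Extra peak memory used by the merge, and its ratio to the input size
peak_mb <- extra_peak_mb(merged <- do.call(rbind, list(a, b)))
ratio <- peak_mb / input_mb
message("Extra peak memory / summed input size: ", round(ratio, 2))
```

The measured peak includes the merged result itself, so a ratio near 1 is the floor for any merge that returns a new in-memory object.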

The end goal here is to (1) fix any inefficient code on my end (e.g. is `do.call` even expected to do things in a memory-efficient way?), and/or (2) raise GitHub issues on any buggy packages this code depends on.

(1) Questionable pieces of my code include:

- using `do.call` on a list of objects: should we expect `do.call` to iterate over the list in a memory-efficient way?
- using `rbind` instead of `combine` or `combineList` (the officially intended methods for this purpose), though I'm only using `rbind` because of this open bug, and a bsseq contributor claims `rbind` is suitable for our case
- possibly not using required HDF5Array or DelayedArray settings, such as `setAutoRealizationBackend("HDF5Array")`
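On the `do.call` question: in base R, `do.call` does not iterate over the list. It builds a single call with every list element as an argument, so the memory behavior is determined by the `rbind` method that receives them. The sketch below uses small matrices as stand-ins for bsseq objects (an assumption) to contrast this with a pairwise `Reduce`, which creates an intermediate result at each step. Which variant uses less peak memory for bsseq objects is not shown here and would need to be measured.

```r
# Small matrices standing in for bsseq objects (assumption)
mats <- list(
    matrix(1:6, nrow = 2),
    matrix(7:12, nrow = 2),
    matrix(13:18, nrow = 2)
)

# One call: equivalent to rbind(mats[[1]], mats[[2]], mats[[3]]).
# The rbind method sees all inputs at once.
all_at_once <- do.call(rbind, mats)

# Pairwise: rbind(rbind(mats[[1]], mats[[2]]), mats[[3]]).
# Each step allocates a new intermediate result.
pairwise <- Reduce(rbind, mats)

# Both strategies produce the same combined object
stopifnot(identical(all_at_once, pairwise))
```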

(2) Regarding probable issues in dependent packages: combining just the `rowRanges` of two bsseq objects results in peak memory usage hitting ~4 times the summed memory size of the individual ranges. This is arguably a bug, for which I'll need to make a reprex and open an issue on the GenomicRanges GitHub. But, as mentioned above, merging 2 bsseq objects is even more memory-inefficient, so this GenomicRanges bug is only part of the problem. It's possible that the HDF5 backend is not being used when merging assays, as noted in this (currently) open issue.
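A starting point for such a reprex might look like the sketch below. It is built on assumptions: the ranges are synthetic, their size is arbitrary, and `c()` is used to combine them, which may not be the code path that merging bsseq objects actually takes. The block does nothing if GenomicRanges is not installed.

```r
# Sketch of a reprex; the data, sizes, and the use of c() are assumptions.
if (requireNamespace("GenomicRanges", quietly = TRUE)) {
    # Hypothetical helper: width-1 ranges on chr1 at the given positions
    make_gr <- function(starts) {
        GenomicRanges::GRanges(
            seqnames = "chr1",
            ranges = IRanges::IRanges(start = starts, width = 1L)
        )
    }
    gr1 <- make_gr(seq(1L, 2000000L, by = 2L))
    gr2 <- make_gr(seq(2L, 2000000L, by = 2L))

    # Summed in-memory size of the two inputs, in MB
    gr_input_mb <- (as.numeric(object.size(gr1)) +
        as.numeric(object.size(gr2))) / 2^20

    before <- gc(reset = TRUE)   # reset the "max used" counters
    combined <- c(gr1, gr2)
    after <- gc()

    # Approximate extra peak memory (MB) while combining, as seen by R's gc()
    gr_peak_mb <- sum(after[, ncol(after)]) - sum(before[, 2])
    message(
        "Extra peak memory / summed input size: ",
        round(gr_peak_mb / gr_input_mb, 2)
    )
}
```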
