Size of combined BSSeq objects #88

Open
kperzel opened this issue Oct 17, 2019 · 1 comment


kperzel commented Oct 17, 2019

Hi,

I'm working with a large number of samples that were processed in different batches, so I built two BSseq objects with the same loci, which turned out to be 3 GB and 6 GB. I then combined them using combine(), as the documentation says. This produced a 46 GB object, which is difficult to work with. When I instead used cbind() on the two objects, the result was 9 GB. As far as I can tell, the two results contain the same information. What is the reason for the major size difference?

Thank you

PeteHaitch (Contributor) commented

combine() has some internal inefficiencies compared to cbind(). Note, though, that cbind() is only appropriate if the two objects contain exactly the same loci in exactly the same order.
Fixing combine() is on the to-do list.

In the interim, you can run realize(combined_BSseq, NULL) on the combined BSseq object to simplify it, which will reduce its memory footprint.
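A minimal sketch of that workflow, for illustration only. The toy data and the names `batch1`, `batch2`, `same_loci` and `combined` are hypothetical, and the `DelayedArray::realize()` call assumes the generic is the one from the DelayedArray package. It checks that the loci match before choosing cbind(), falls back to combine() otherwise, and then simplifies the result:

```r
library(bsseq)

# Toy methylation (M) and coverage (Cov) counts: 2 loci x 2 samples per batch.
# These values are made up purely so the example is self-contained.
M   <- matrix(c(1L, 2L, 3L, 4L), ncol = 2)
Cov <- matrix(5L, nrow = 2, ncol = 2)

batch1 <- BSseq(chr = c("chr1", "chr1"), pos = c(100L, 200L),
                M = M, Cov = Cov, sampleNames = c("s1", "s2"))
batch2 <- BSseq(chr = c("chr1", "chr1"), pos = c(100L, 200L),
                M = M, Cov = Cov, sampleNames = c("s3", "s4"))

# cbind() is only appropriate when both objects hold the same loci
# in the same order, so check that first.
same_loci <- identical(granges(batch1), granges(batch2))

combined <- if (same_loci) {
  cbind(batch1, batch2)
} else {
  combine(batch1, batch2)
}

# Simplify the combined object, as suggested above (BACKEND = NULL
# requests in-memory realization).
combined <- DelayedArray::realize(combined, NULL)

# Inspect the in-memory size of the result.
print(object.size(combined), units = "auto")
```

With real data, the same check on `granges()` is a cheap way to confirm that cbind() is safe before relying on it for the smaller object.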
