500,000 nuclei of multiome data has very high memory requirements for batch correction (1.2 TB) #214

Open
mgrantpeters opened this issue Mar 6, 2024 · 1 comment

Comments

@mgrantpeters

I'm experiencing very high memory requirements with a multiome dataset much smaller than the TAURUS data reported in the manuscript (Fig. 5). While I appreciate that paired ATAC and RNA requires more computation than RNA only, needing >1 TB seems excessive. Does panpipes implement any optional strategies to reduce memory requirements that could be useful in this case (e.g. sparse matrix conversion, or on-disk rather than in-memory storage)?

@bio-la
Collaborator

bio-la commented Mar 6, 2024

Hi @mgrantpeters, thanks for reaching out! I agree that the behaviour you observe is rather strange. To help us troubleshoot, could you please let me know:

  • Which integration method are you using?
  • How many features does your object have (ATAC and RNA)?

Thanks!
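As a rough illustration of why the feature counts matter here, the sketch below estimates how much memory a cells-by-features matrix would need when stored densely versus in CSR sparse form. Only the 500,000 nuclei figure comes from this issue; the gene count, peak count, and densities are assumed placeholder values, and the helper names `dense_bytes` and `csr_bytes` are hypothetical. It does not use or describe any panpipes API.

```python
# Back-of-the-envelope memory estimate for dense vs. CSR sparse storage.
# All feature counts and densities below are ASSUMED placeholders,
# not values reported in this issue.

def dense_bytes(n_obs, n_vars, itemsize=4):
    """Bytes needed for a dense n_obs x n_vars matrix (float32 by default)."""
    return n_obs * n_vars * itemsize


def csr_bytes(n_obs, n_vars, density, itemsize=4, index_size=4):
    """Approximate bytes for a CSR matrix with the given fraction of nonzeros.

    CSR stores one value and one column index per nonzero, plus a row
    pointer array of length n_obs + 1. Very large matrices may need
    8-byte indices, which this default does not account for.
    """
    nnz = int(n_obs * n_vars * density)
    return nnz * (itemsize + index_size) + (n_obs + 1) * index_size


def to_gb(n_bytes):
    """Convert bytes to gigabytes (10**9 bytes)."""
    return n_bytes / 1e9


if __name__ == "__main__":
    n_nuclei = 500_000  # from the issue title
    modalities = {
        # name: (assumed number of features, assumed fraction of nonzeros)
        "rna": (30_000, 0.05),
        "atac": (150_000, 0.02),
    }
    for name, (n_vars, density) in modalities.items():
        dense = to_gb(dense_bytes(n_nuclei, n_vars))
        sparse = to_gb(csr_bytes(n_nuclei, n_vars, density))
        print(f"{name}: dense ~{dense:.0f} GB, CSR ~{sparse:.1f} GB")
```

If an integration step converts either modality to a dense array, possibly more than once, estimates of this kind can add up quickly, which is why the actual RNA and ATAC feature counts are useful for troubleshooting.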
