controlForConfounders uses as much memory as available on machine #11
Comments
Hi Garrett, thanks for trying it out. Did you pass a BPPARAM object? Please try initializing BPPARAM before calling the function.
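(The thread does not show the exact call; a minimal sketch of what that initialization could look like, with an arbitrary worker count:)

```r
library(BiocParallel)
library(OUTRIDER)

# initialize a parallel backend explicitly before calling the function;
# the worker count of 4 here is an arbitrary example
bp <- MulticoreParam(workers = 4)
register(bp)

# 'ods' is an existing OutriderDataSet
ods <- controlForConfounders(ods, BPPARAM = bp)
```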
Thank you for your prompt reply. I get the same problem regardless of my BPPARAM worker count; I tried workers = 1, 2, and 20. For the 20-worker run, I went ahead and requested 400 GB of RAM so the job could complete successfully and also give a complete profile of the RAM usage. The code did complete correctly, but the peak RAM utilization was 232 GB. I know this is not actually required, as I ran the same data through the same code on my local Windows desktop with 8 GB of RAM. For your testing purposes: I am using Anaconda's latest (non-Microsoft) version of R, and I am using the following to ensure that the BLAS libraries do not use too many resources (actually, I find that Anaconda's R sometimes segfaults when it hits multithreaded BLAS libraries):
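(The snippet itself was not preserved in this thread; below is a sketch of one common way to pin BLAS to a single thread, using environment variables and the RhpcBLASctl package, which may or may not match what was originally posted:)

```r
# pin BLAS/OpenMP to one thread before any heavy linear algebra runs
# (RhpcBLASctl and these environment variables are an assumption; the
# original snippet from this comment was lost in extraction)
Sys.setenv(OMP_NUM_THREADS = "1", OPENBLAS_NUM_THREADS = "1")
library(RhpcBLASctl)
blas_set_num_threads(1)
omp_set_num_threads(1)
```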
Otherwise, I am pretty much just following the vignette with my own data. EDIT: This also means the title of this issue is incorrect; the machine had more than 400 GB of RAM and only 232 GB was used. Feel free to edit the issue title if you want.
Hey Garrett, I looked into this issue and found that MulticoreParam keeps accumulating memory. I guess this is because the machine has so much free memory that R does not see the need for garbage collection. To speed up the autoencoder fit, the cluster is kept running between calls. I created a bioc-dev branch with a switch to not keep the cluster running; this will reduce the speed a bit, but should be the solution to the memory leak. Could you install OUTRIDER from GitHub and run your pipeline with the new parameter, as sketched below?
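(The exact install command and parameter name were not preserved; a sketch, assuming the gagneurlab/OUTRIDER repository and using a placeholder name for the new switch:)

```r
# install the bioc-dev branch mentioned above
devtools::install_github("gagneurlab/OUTRIDER", ref = "bioc-dev")

library(OUTRIDER)
library(BiocParallel)

# run the controlling step as before; the idea of the new switch is to shut
# the workers down between batches so their memory is released.
# 'keepCluster' below is a placeholder, not the real argument name, which was
# lost in extraction:
ods <- controlForConfounders(ods, BPPARAM = MulticoreParam(20))
# ods <- controlForConfounders(ods, keepCluster = FALSE, BPPARAM = MulticoreParam(20))
```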
This should resolve your memory problem. If you watch the threads with htop on the cluster, you can see how they accumulate memory and don't free it. But with the new switch, the workers are torn down and the memory is released.
Thank you for your help here. This did work, although I went the manual route, which required changes to the subsequent function calls as well. This is what ended up working for me:
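(The commands themselves were not preserved; a sketch of the manual route, assuming the standard vignette steps with the BPPARAM passed explicitly at each parallelized step:)

```r
library(OUTRIDER)
library(BiocParallel)

# a single worker, per the discussion above; note that fit() and
# computePvalues() need the BPPARAM as well (see the comment below)
bp <- MulticoreParam(workers = 1)

ods <- estimateSizeFactors(ods)
ods <- controlForConfounders(ods, BPPARAM = bp)
ods <- fit(ods, BPPARAM = bp)
ods <- computePvalues(ods, BPPARAM = bp)
ods <- computeZscores(ods)
```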
Running fit or computePvalues without the BPPARAM would result in the memory issue being encountered at those steps. I guess we will keep our fingers crossed that MulticoreParam gets this corrected. Thanks again!
Sorry to ask a potentially unrelated question (I will open a new issue if you prefer), but I had a question regarding your experience with results changing as the cohort grows. We had a cohort of ~50 samples, all processed using the same chemistry/protocol/etc., and ran OUTRIDER. We then added ~10 more samples, and the existing results for two of our previous samples changed dramatically. For example, one sample had no genes with FDR < 0.05 and |Z-score| > 3 in the first round, and now it has over 500 genes meeting those criteria. Have you ever seen this before? I would value any thoughts you might have. I saw the downsampling analysis in your paper, so maybe this is just due to an increase in power, but it seems odd to go from no outliers detected to >500 when the cohort size did not change dramatically. Also, do you suggest using "sampleExclusionMask" to exclude these dramatic outlier samples from the analyses? I initially thought this might make sense, but when I exclude these samples and run OUTRIDER, I get even more samples with a large number of aberrant genes identified. So for now I only use "sampleExclusionMask" to mask out samples from related individuals (e.g., we have a few trios).
@WGarrettJenkinson We did see some changes when playing around with the number of samples in the test, but never such dramatic changes (from 0 to >500 outliers). What we do see is a higher number of outliers as we increase the number of samples, which is due to the increase in power. Hope this still helps despite the delay. Best,
Hello! Sorry to wake an old thread, but did the hotfix ever make it into master? We have a need for it too. Or was the code reworked so the issue no longer arises? Thanks.
We did not push this hotfix into the master branch, as the issue did not come up again. I'm also not sure whether this should rather be raised directly on BiocParallel from Bioconductor or on the backend you are using, or if it is just an OUTRIDER problem. Also, as I understood from Garrett, he just limited the number of threads to 1, which disables parallel computing entirely. Does your job survive if you register or pass a SerialParam() object?
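(For reference, the two standard ways to do that with BiocParallel:)

```r
library(BiocParallel)

# either register serial execution as the default backend...
register(SerialParam())

# ...or pass it explicitly to the individual calls
ods <- controlForConfounders(ods, BPPARAM = SerialParam())
```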
Since I'm not really sure whether the hotfix works, as Garrett did not try it, it's up to you. So @lsmainzer, if you want to give it a try, I could set up a branch with it.
Hi, sorry if it wasn't clear: my comment on Dec 18, 2018 was indicating that I had tested the fix and that it worked when I modified my function calls as listed in that comment (using just 1 core, as you note). As a matter of fact, we have stayed on that old dev commit because we knew it was tested and working, but it would be nice to have this on master so we could move off the old installation. I do not recall testing SerialParam, so I too would be curious whether that works for @lsmainzer (which would save me setting up a new install off of master to test it).
Thanks @WGarrettJenkinson for chipping in on this old thread. As you mentioned, you used only 1 thread, e.g. as in the sketch below. Per the MulticoreParam documentation, computation with a single worker is effectively serial on Linux as well:
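(The inline example was not preserved; presumably something along these lines:)

```r
# a single forked worker: effectively serial computation,
# with no extra child processes to accumulate memory
ods <- controlForConfounders(ods, BPPARAM = MulticoreParam(workers = 1))
```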
So in theory you should be fine to switch to the newest version and just run it with a single worker. Just to see whether the cluster re-initiation fixes the memory issue, could you monitor your memory consumption when increasing the thread count, e.g. as below?
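(A sketch of such a test; three workers is an arbitrary choice:)

```r
# rerun the controlling step with more workers and watch the per-thread
# memory, e.g. with htop, to see whether the cluster re-initiation releases it
ods <- controlForConfounders(ods, BPPARAM = MulticoreParam(workers = 3))
```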
Thanks for the follow-up and information. I just ran your suggested code with three threads on my installation with the patch from this issue, and my memory utilization never exceeded 8 GB. So I think it is safe to say that the fix from this issue is tested and working on our system. Let me know if I can provide any further information. If I get a chance, I will try to test your suggested setup on the newest version as well.
As an update, I tested the suggested setup on the newest version as well.
Just wanted to give some quick feedback here as well. I did not try your hotfix version, though...
Thank you for this very helpful package.
I've noticed this issue when moving my workflow over to a high-performance computing cluster. Specifically, I try to run the exact same code that runs locally (on a machine with 8 GB of RAM), but whenever I run this code on the cluster, it gets killed by the job manager for using too much memory (even when I provide 100 GB of RAM). In my most recent attempt, the code used 215.965 GB of memory when I ran the controlForConfounders function; just before running that function, my memory usage was only 3.47 GB. The code successfully completes on this cluster only when I max out the memory on the node.
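(For reference, a minimal sketch of the step in question, following the vignette; `cts` is a placeholder for the actual count matrix:)

```r
library(OUTRIDER)

ods <- OutriderDataSet(countData = cts)  # 'cts' stands in for the real counts
ods <- estimateSizeFactors(ods)
ods <- controlForConfounders(ods)        # the step whose memory use blows up on the cluster
```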
I have tried using the "ulimit -v" setting in bash before starting R to control the memory usage, but this does not change the behavior.
Please let me know if I can provide any further information. My sessionInfo() output:
```
Matrix products: default
BLAS/LAPACK: /~/miniconda/installDir/lib/R/lib/libRblas.so

locale:
[1] C

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets
[8] methods base

other attached packages:
[1] OUTRIDER_0.99.31 data.table_1.11.8
[3] SummarizedExperiment_1.10.1 DelayedArray_0.6.6
[5] matrixStats_0.54.0 GenomicFeatures_1.32.3
[7] AnnotationDbi_1.42.1 Biobase_2.40.0
[9] GenomicRanges_1.32.7 GenomeInfoDb_1.16.0
[11] IRanges_2.14.12 S4Vectors_0.18.3
[13] BiocGenerics_0.26.0 BiocParallel_1.14.2

loaded via a namespace (and not attached):
[1] bitops_1.0-6 bit64_0.9-7 RColorBrewer_1.1-2
[4] progress_1.2.0 PRROC_1.3.1 httr_1.3.1
[7] tools_3.5.1 backports_1.1.2 R6_2.3.0
[10] KernSmooth_2.23-15 rpart_4.1-13 Hmisc_4.1-1
[13] DBI_1.0.0 lazyeval_0.2.1 colorspace_1.3-2
[16] nnet_7.3-12 tidyselect_0.2.4 gridExtra_2.3
[19] prettyunits_1.0.2 DESeq2_1.20.0 bit_1.1-14
[22] compiler_3.5.1 htmlTable_1.12 plotly_4.8.0
[25] rtracklayer_1.40.6 caTools_1.17.1.1 scales_1.0.0
[28] checkmate_1.8.5 genefilter_1.62.0 stringr_1.3.1
[31] digest_0.6.17 Rsamtools_1.32.3 foreign_0.8-71
[34] XVector_0.20.0 base64enc_0.1-3 pkgconfig_2.0.2
[37] htmltools_0.3.6 htmlwidgets_1.3 rlang_0.2.2
[40] rstudioapi_0.8 RSQLite_2.1.1 BBmisc_1.11
[43] bindr_0.1.1 jsonlite_1.5 gtools_3.8.1
[46] acepack_1.4.1 dplyr_0.7.6 RCurl_1.95-4.11
[49] magrittr_1.5 GenomeInfoDbData_1.1.0 Formula_1.2-3
[52] Matrix_1.2-14 Rcpp_0.12.19 munsell_0.5.0
[55] reticulate_1.10 stringi_1.2.4 zlibbioc_1.26.0
[58] gplots_3.0.1 plyr_1.8.4 grid_3.5.1
[61] blob_1.1.1 gdata_2.18.0 crayon_1.3.4
[64] lattice_0.20-35 Biostrings_2.48.0 splines_3.5.1
[67] annotate_1.58.0 hms_0.4.2 locfit_1.5-9.1
[70] knitr_1.20 pillar_1.3.0 ggpubr_0.1.8
[73] geneplotter_1.58.0 biomaRt_2.36.1 XML_3.98-1.12
[76] glue_1.3.0 latticeExtra_0.6-28 pcaMethods_1.72.0
[79] tidyr_0.8.1 gtable_0.2.0 purrr_0.2.5
[82] assertthat_0.2.0 ggplot2_3.0.0 xtable_1.8-3
[85] viridisLite_0.3.0 survival_2.42-6 tibble_1.4.2
[88] GenomicAlignments_1.16.0 memoise_1.1.0 bindrcpp_0.2.2
[91] cluster_2.0.7-1
```