Error in xgb.get.handle(object): 'xgb.Booster' object is corrupted or is from an incompatible xgboost version. #447

Open
nick-youngblut opened this issue Sep 16, 2024 · 10 comments

Comments

@nick-youngblut

nick-youngblut commented Sep 16, 2024

My command:

bamba_ret = bambu(
    reads = bam_file, 
    annotations = gtf_file, 
    genome = fna_file, 
    quant = FALSE,
    ncore = 8
)

The error:

Error: BiocParallel errors
  1 remote errors, element index: 1
  0 unevaluated and other errors
  first remote error:
Error in xgb.get.handle(object): 'xgb.Booster' object is corrupted or is from an incompatible xgboost version.

Traceback:

1. bambu(reads = bam_file, annotations = gtf_file, genome = fna_file, 
 .     quant = FALSE, ncore = 8)
2. bambu.processReads(reads, annotations, genomeSequence = genome, 
 .     readClass.outputDir = rcOutDir, yieldSize, bpParameters, 
 .     stranded, verbose, isoreParameters, trackReads = trackReads, 
 .     fusionMode = fusionMode, lowMemory = lowMemory)
3. bplapply(names(reads), function(bamFileName) {
 .     bambu.processReadsByFile(bam.file = reads[bamFileName], genomeSequence = genomeSequence, 
 .         annotations = annotations, readClass.outputDir = readClass.outputDir, 
 .         stranded = stranded, min.readCount = min.readCount, fitReadClassModel = fitReadClassModel, 
 .         min.exonOverlap = min.exonOverlap, defaultModels = defaultModels, 
 .         returnModel = returnModel, verbose = verbose, lowMemory = lowMemory, 
 .         trackReads = trackReads, fusionMode = fusionMode)
 . }, BPPARAM = bpParameters)
4. bplapply(names(reads), function(bamFileName) {
 .     bambu.processReadsByFile(bam.file = reads[bamFileName], genomeSequence = genomeSequence, 
 .         annotations = annotations, readClass.outputDir = readClass.outputDir, 
 .         stranded = stranded, min.readCount = min.readCount, fitReadClassModel = fitReadClassModel, 
 .         min.exonOverlap = min.exonOverlap, defaultModels = defaultModels, 
 .         returnModel = returnModel, verbose = verbose, lowMemory = lowMemory, 
 .         trackReads = trackReads, fusionMode = fusionMode)
 . }, BPPARAM = bpParameters)
5. .bpinit(manager = manager, X = X, FUN = FUN, ARGS = ARGS, BPPARAM = BPPARAM, 
 .     BPOPTIONS = BPOPTIONS, BPREDO = BPREDO)

I've tried restarting the R kernel, but that did not help. It appears that the xgboost version that I've installed (xgboost_2.1.1.1) is not compatible with the model utilized by default by bambu.

I don't see any version specifications for xgboost in the README or elsewhere. Which versions of xgboost are compatible with the default model?

sessionInfo

R version 4.3.3 (2024-02-29)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 22.04.4 LTS

Matrix products: default
BLAS/LAPACK: /home/nickyoungblut/miniforge3/envs/ont_10x/lib/libopenblasp-r0.3.27.so;  LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: America/Los_Angeles
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] bambu_3.4.0                 BSgenome_1.70.1            
 [3] rtracklayer_1.62.0          BiocIO_1.12.0              
 [5] Biostrings_2.70.1           XVector_0.42.0             
 [7] SummarizedExperiment_1.32.0 Biobase_2.62.0             
 [9] GenomicRanges_1.54.1        GenomeInfoDb_1.38.1        
[11] IRanges_2.36.0              S4Vectors_0.40.2           
[13] BiocGenerics_0.48.1         MatrixGenerics_1.14.0      
[15] matrixStats_1.3.0           ArcRUtils_0.1.0            
[17] ggplot2_3.5.1               tidyr_1.3.1                
[19] dplyr_1.1.4                

loaded via a namespace (and not attached):
 [1] DBI_1.2.3                bitops_1.0-7             biomaRt_2.58.0          
 [4] rlang_1.1.3              magrittr_2.0.3           compiler_4.3.3          
 [7] RSQLite_2.3.7            GenomicFeatures_1.54.1   png_0.1-8               
[10] vctrs_0.6.5              stringr_1.5.1            pkgconfig_2.0.3         
[13] crayon_1.5.2             fastmap_1.1.1            dbplyr_2.5.0            
[16] utf8_1.2.4               Rsamtools_2.18.0         purrr_1.0.2             
[19] bit_4.0.5                zlibbioc_1.48.0          cachem_1.0.8            
[22] jsonlite_1.8.8           progress_1.2.3           blob_1.2.4              
[25] DelayedArray_0.28.0      uuid_1.2-0               BiocParallel_1.36.0     
[28] parallel_4.3.3           prettyunits_1.2.0        R6_2.5.1                
[31] stringi_1.8.4            xgboost_2.1.1.1          Rcpp_1.0.12             
[34] IRkernel_1.3.2           base64enc_0.1-3          Matrix_1.6-5            
[37] tidyselect_1.2.1         abind_1.4-5              yaml_2.3.8              
[40] codetools_0.2-20         curl_5.1.0               lattice_0.22-6          
[43] tibble_3.2.1             withr_3.0.0              KEGGREST_1.42.0         
[46] evaluate_0.23            BiocFileCache_2.10.1     xml2_1.3.6              
[49] BiocManager_1.30.23      pillar_1.9.0             filelock_1.0.3          
[52] generics_0.1.3           RCurl_1.98-1.14          IRdisplay_1.1           
[55] hms_1.1.3                munsell_0.5.1            scales_1.3.0            
[58] glue_1.7.0               tools_4.3.3              data.table_1.15.2       
[61] GenomicAlignments_1.38.0 pbdZMQ_0.3-11            XML_3.99-0.16.1         
[64] grid_4.3.3               AnnotationDbi_1.64.1     colorspace_2.1-0        
[67] GenomeInfoDbData_1.2.11  repr_1.1.7               restfulr_0.0.15         
[70] cli_3.6.2                rappdirs_0.3.3           fansi_1.0.6             
[73] S4Arrays_1.2.0           gtable_0.3.5             digest_0.6.35           
[76] SparseArray_1.2.2        rjson_0.2.21             memoise_2.0.1           
[79] htmltools_0.5.8.1        lifecycle_1.0.4          httr_1.4.7              
[82] bit64_4.0.5             
@nick-youngblut
Author

I see that this issue might be addressed with #386.

In that PR, I don't see any info on the original and updated versions. Which version of xgboost was used to train the default model, and which was used to create the updated models for the PR?

@andredsim
Collaborator

Thanks for reporting this.

Did you install Bambu via Bioconductor or GitHub? Was your installation of xgboost separate, or part of the dependency installation?
If possible, I would recommend updating to R >4.4 and reinstalling Bambu at the latest version (3.5.1) using Bioconductor, which should install a compatible version of xgboost (v1.7.8.1 from what I just tested). It is good to know that the latest version of xgboost causes this error, so we will need to be ready for when the compatible version no longer works with R.

Let me know if updating R/Bambu/downgrading xgboost solves your issue.

Kind Regards,
Andre Sim

@nick-youngblut
Author

Thanks for the suggestion.

Do you constrain the supported versions of xgboost in the package setup? I don't see any such constraints on xgboost in the DESCRIPTION.

If there are no version constraints, why would a re-install of bambu install anything other than the latest version of xgboost, which is now in v2, while you are testing on v1.7 ("v1.7.8.1 from what I just tested")?
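
For anyone who wants to check this locally, here is a rough sketch of how a version constraint could be looked up in a DESCRIPTION dependency field such as Imports. `dependency_constraint` is a hypothetical helper name (not part of bambu or base R), and the example field strings are made up for illustration; they are not bambu's actual Imports.

```r
# Hypothetical helper (not part of bambu or base R): return the version
# constraint declared for `pkg` in a DESCRIPTION dependency field such as
# Imports, or NA if the package is absent or listed without a constraint.
dependency_constraint <- function(field, pkg) {
  # Entries are comma-separated and may span several lines.
  entries <- trimws(strsplit(field, ",", fixed = TRUE)[[1]])
  # Strip any trailing "(...)" constraint to recover the bare package names.
  pkg_names <- sub("[[:space:]]*\\(.*$", "", entries)
  hit <- entries[pkg_names == pkg]
  if (length(hit) == 0) return(NA_character_)
  # Extract the parenthesised constraint, if there is one.
  constraint <- regmatches(hit[1], regexpr("\\(.*\\)", hit[1]))
  if (length(constraint) == 0) NA_character_ else constraint
}

# Illustrative field strings only; these are not bambu's real Imports.
dependency_constraint("Rcpp, xgboost (< 2.0.0), dplyr", "xgboost")
dependency_constraint("Rcpp, xgboost, dplyr", "xgboost")

# For an installed package, the real field could be inspected with:
# dependency_constraint(utils::packageDescription("bambu")$Imports, "xgboost")
```

An `NA` result for an installed package would mean that any xgboost version satisfies the declared dependency, so the installer is free to pick whatever its repository offers.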

@andredsim
Collaborator

The CRAN version of xgboost is still v1.7.8 (https://cran.rstudio.com/web/packages/xgboost/index.html), which is what Bioconductor will install as a dependency, I believe. To confirm that this is correct, may I ask whether you installed xgboost independently or via Bioconductor? Presumably xgboost's version will increase to v2 in the next Bioconductor release, so thanks to your report we will make sure to either constrain the version or make sure it's compatible.

@nick-youngblut
Author

The latest conda-forge version of xgboost is 2.1.1.

I used the bambu Bioconda recipe, which does not specify particular xgboost versions.

It might be best to contact the authors of this bioconda recipe:

[Screenshot, 2024-09-23]

@cying111
Collaborator

Hi @nick-youngblut ,

thanks for reporting this.

However, I believe there is some confusion here. As you know, xgboost can refer either to the software itself or to the R package. For bambu, the xgboost version always refers to that of the R xgboost package. If you check the latest version of that on CRAN, it is v1.7.8.1, which is the same version that is in Bioconductor; see the attached image:
[image]

Hope this clarifies your question!
Thanks,
Ying

@SwiftSeal

Hi both,

I am experiencing the same issue via conda - is there a solution for this? I'm using xgboost v1.7.8.1 which should be compatible. I also can't install R>4.4 as:

The following packages are incompatible
├─ bioconductor-bambu =* * is installable with the potential options
│  ├─ bioconductor-bambu [1.0.0|1.0.2] would require
│  │  └─ r-base >=4.0,<4.1.0a0 *, which can be installed;
│  ├─ bioconductor-bambu [1.2.0|2.0.0|2.0.3|2.0.5|2.0.6] would require
│  │  └─ r-base >=4.1,<4.2.0a0 *, which can be installed;
│  ├─ bioconductor-bambu [3.0.1|3.0.5|3.0.6|3.0.8] would require
│  │  └─ r-base >=4.2,<4.3.0a0 *, which can be installed;
│  └─ bioconductor-bambu [3.2.4|3.4.0] would require
│     └─ r-base >=4.3,<4.4.0a0 *, which can be installed;
└─ r-base =4.4 * is not installable because it conflicts with any installable versions previously reported.

Cheers,
Moray

@SwiftSeal

Quick update - the method which has brought me closest to a working (?) install is:

  1. conda install bioconductor-bambu r-biocmanager r-devtools
  2. >R library(devtools)
  3. >R install_github("GoekeLab/bambu")

This method resulted in xgboost_2.1.4.1 being reported in sessionInfo() - not sure how this occurs if CRAN is at 1.7*

So I followed this up with

  1. install.packages("xgboost")

This brought me to xgboost_1.7.8.1 but the same error occurs.
Full session info is:

R version 4.3.3 (2024-02-29)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Debian GNU/Linux 12 (bookworm)

Matrix products: default
BLAS/LAPACK: /mnt/apps/users/msmith/conda/envs/bambu/lib/libopenblasp-r0.3.29.so;  LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/London
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] bambu_3.9.3                 BSgenome_1.70.1            
 [3] rtracklayer_1.62.0          BiocIO_1.12.0              
 [5] Biostrings_2.70.1           XVector_0.42.0             
 [7] SummarizedExperiment_1.32.0 Biobase_2.62.0             
 [9] GenomicRanges_1.54.1        GenomeInfoDb_1.38.1        
[11] IRanges_2.36.0              S4Vectors_0.40.2           
[13] BiocGenerics_0.48.1         MatrixGenerics_1.14.0      
[15] matrixStats_1.5.0           xgboost_1.7.8.1            

loaded via a namespace (and not attached):
 [1] KEGGREST_1.42.0          rjson_0.2.23             lattice_0.22-6          
 [4] generics_0.1.3           vctrs_0.6.5              tools_4.3.3             
 [7] bitops_1.0-9             curl_6.2.1               parallel_4.3.3          
[10] tibble_3.2.1             AnnotationDbi_1.64.1     RSQLite_2.3.9           
[13] blob_1.2.4               pkgconfig_2.0.3          Matrix_1.6-5            
[16] data.table_1.16.4        dbplyr_2.5.0             lifecycle_1.0.4         
[19] GenomeInfoDbData_1.2.11  compiler_4.3.3           stringr_1.5.1           
[22] Rsamtools_2.18.0         progress_1.2.3           codetools_0.2-20        
[25] RCurl_1.98-1.16          yaml_2.3.10              tidyr_1.3.1             
[28] pillar_1.10.1            crayon_1.5.3             BiocParallel_1.36.0     
[31] DelayedArray_0.28.0      cachem_1.1.0             abind_1.4-8             
[34] tidyselect_1.2.1         digest_0.6.37            stringi_1.8.4           
[37] purrr_1.0.4              dplyr_1.1.4              restfulr_0.0.15         
[40] biomaRt_2.58.0           fastmap_1.2.0            grid_4.3.3              
[43] cli_3.6.4                SparseArray_1.2.2        magrittr_2.0.3          
[46] S4Arrays_1.2.0           GenomicFeatures_1.54.1   XML_3.99-0.17           
[49] rappdirs_0.3.3           filelock_1.0.3           prettyunits_1.2.0       
[52] bit64_4.6.0-1            httr_1.4.7               bit_4.5.0.1             
[55] png_0.1-8                hms_1.1.3                memoise_2.0.1           
[58] BiocFileCache_2.10.1     rlang_1.1.5              Rcpp_1.0.14             
[61] glue_1.8.0               DBI_1.2.3                xml2_1.3.6              
[64] jsonlite_1.8.9           R6_2.6.1                 GenomicAlignments_1.38.0
[67] zlibbioc_1.48.0         
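
When conda and install.packages() are mixed like this, it can help to see where each copy of xgboost lives and which version sits in each library. A minimal sketch using only base R; `installed_copies` is a hypothetical helper name, and whether duplicate copies explain the 2.1.4.1 version seen above is an assumption, not something confirmed in this thread.

```r
# Hypothetical helper: list every installed copy of a package across all
# library paths, with the version found in each location.
installed_copies <- function(pkg, lib_paths = .libPaths()) {
  ip <- utils::installed.packages(lib.loc = lib_paths)
  ip[rownames(ip) == pkg, c("LibPath", "Version"), drop = FALSE]
}

installed_copies("xgboost")  # zero rows if xgboost is not installed

# The version R would load in this session (errors if xgboost is missing):
# utils::packageVersion("xgboost")
```

More than one row would suggest that a conda-provided copy and a CRAN-installed copy coexist in different library paths.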

@N-Hoffmann

N-Hoffmann commented Feb 26, 2025

Hi all,
I'm experiencing the same issue when I use Bambu in the recent bioconda::bioconductor-bambu environments. It works in the bambu-3.0.8 release, but not in the 3.4.0 and 3.8.3 releases.
As @nick-youngblut mentioned, the Bioconda recipe doesn't specify a version for r-xgboost. In the bioconda envs, the 3.4.0 release has xgboost version 2.1.2.1, while the 3.0.8 release has version 1.7.1.1.

For now I managed to work around it for Bambu 3.4.0 by specifying in my environment:

  • bioconda::bioconductor-bambu=3.4.0
  • conda-forge::r-xgboost=1.7.6=cpu_r43had0c348_6

I used xgboost 1.7.6 here because it was the latest version before 2.0.3 in conda. I hope this helps you, @SwiftSeal. (Also, it's strange that you have bambu_3.9.3 if you installed it via conda; it should be 3.8.3 from what I understand.)
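
To catch the mismatch early, a script could check the installed xgboost version before calling bambu(). This is only a sketch: `xgboost_is_pre_v2` is a hypothetical helper, and the 2.0.0 cutoff is an assumption drawn from the versions reported in this thread, not a documented bambu requirement. A pre-2.0 version is also no guarantee, since one report above still hit the error with 1.7.8.1 alongside a GitHub build of bambu.

```r
# Hypothetical guard, based only on the versions reported in this thread:
# xgboost 1.7.x worked with the conda bambu 3.4.0 build, while 2.x did not.
# The 2.0.0 cutoff is an assumption, not a documented bambu requirement.
xgboost_is_pre_v2 <- function(version = utils::packageVersion("xgboost")) {
  package_version(as.character(version)) < package_version("2.0.0")
}

xgboost_is_pre_v2("1.7.6")    # TRUE
xgboost_is_pre_v2("2.1.2.1")  # FALSE

# Possible use at the top of a script, before calling bambu():
# if (!xgboost_is_pre_v2()) stop("xgboost >= 2.0.0 detected; see issue #447")
```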

Best regards,
Nicolaï

@SwiftSeal

SwiftSeal commented Feb 26, 2025

Hi @N-Hoffmann

Aye some confusion there - I installed bambu via conda then patched it via devtools::install_github. But your method worked (with the addition of r-biocmanager). It now begins to run, but quickly encounters:

--- Start generating read class files ---
  |                                                                      |   0%Error in reducer$value.cache[[as.character(idx)]] <- values : 
  wrong args for environment subassignment
In addition: Warning message:
In parallel::mccollect(wait = FALSE, timeout = 1) :
  1 parallel job did not deliver a result

I might open a separate issue for that, as it seems unrelated to this.

EDIT:

Turns out that was just an OOM - runs fine after increasing memory :shipit:
