Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bootstrap_enrichment_test throws "Error in exp_mats[[cc]][s, ] <- sort(expD[, cc]) : replacement has length zero" #86

Open
Kfarls opened this issue Sep 28, 2023 · 5 comments
Labels

Comments

@Kfarls
Copy link

Kfarls commented Sep 28, 2023

1. Bug description

When running EWCE with my own gene lists and single cell data, I get this error:
"Error in exp_mats[[cc]][s, ] <- sort(expD[, cc]) :
replacement has length zero"
traceback is:

  1. | generate_bootstrap_plots_exp_mats(exp_mats = exp_mats, sct_data = sct_data, annotLevel = annotLevel, bootstrap_list = bootstrap_list, reps = reps, combinedGenes = combinedGenes, hits = hits, verbose = verbose)

  2. | compute_gene_scores(sct_data = sct_data, annotLevel = annotLevel, bootstrap_list = bootstrap_list, hits = hits, combinedGenes = combinedGenes, verbose = verbose)

  3. | get_summed_proportions(hits = hits, sct_data = sct_data, annotLevel = annotLevel, reps = reps, no_cores = no_cores, geneSizeControl = geneSizeControl, controlledCT = controlledCT, control_network = control_network, verbose = verbose)

  4. EWCE::bootstrap_enrichment_test(sct_data = ctd, sctSpecies = "human",
    genelistSpecies = "human", hits = hits, reps = reps, annotLevel = annotLevel)

2. Reproducible example

Code

reps <- 100
# Use level 1 annotations)
annotLevel <- 1
hits <- GWAS_psych

psych_results <- EWCE::bootstrap_enrichment_test(sct_data = ctd,
                                                hits = hits, 
                                                sctSpecies_origin = "human",
                                                sctSpecies = "human",
                                                genelistSpecies = "human",
                                                reps = reps,
                                                annotLevel = annotLevel,
                                                geneSizeControl = TRUE,
                                                verbose = TRUE)

Data

(If possible, upload a small sample of your data so that we can reproduce the bug on our end. If that's not possible, please at least include a screenshot of your data and other relevant details.)
I have tried running EWCE with several of my own gene lists, the gene lists work fine when running with the example ctds however i get the above error if I run on my own ctd.
str(GWAS_neurol)
chr [1:45] "ANO3" "APOE" "DYNC1I1" "FHL5" "GALNT2" "PVRL2" "SCN2A" "ADSS" "ASH1L" "ASTN2" "ATG13" "BCL7C" "BTN2A2" "C14orf37" "CAMK1D" ...
ctd generated like so:
counts <- GetAssayData(seurat, slot = "counts")
temp <- list() # initalise empty list
celltype <- unique(metadata$major_clust)
temp <- sapply(1:length(celltype), function(x){
cellname <- celltype[x]
keep <- which(metadata$major_clust == cellname)
counts_subset <- counts[, keep]
keep <- which(tabulate(counts_subset@i + 1) > (0.05 * ncol(counts_subset)))
genes <- rownames(counts_subset)[keep]
updatedlist <- append(temp, list(genes))
})
a <- bitr(geneID = rownames(counts.filtered),
fromType = "ENTREZID",
toType = "SYMBOL",
OrgDb = "org.Hs.eg.db",
drop = FALSE) %>%
mutate(SYMBOL = if_else(is.na(SYMBOL), as.character(ENTREZID),
SYMBOL))
rownames(counts.filtered) <- a$SYMBOL
ctd <- generate_celltype_data(exp = counts.filtered,
annotLevels = annotLevels,
groupName = "pfc_dev",
input_species = "human",
output_species = "human",
savePath = "../results/single_cell/EWCE/")
ctd <- load_rdata("../results/single_cell/EWCE/ctd_pfc_dev.rda")

3. Session info

R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Australia/Perth
tzcode source: internal

attached base packages:
[1] stats4 stats graphics grDevices datasets utils methods base

other attached packages:
[1] devtools_2.4.5 usethis_2.2.2 ewceData_1.8.0 ExperimentHub_2.8.1 AnnotationHub_3.8.0
[6] BiocFileCache_2.8.0 dbplyr_2.3.4 ggdendro_0.1.23 tidyverse_2.0.0 hdWGCNA_0.2.23
[11] EWCE_1.9.3 RNOmni_1.0.1.2 dplyr_1.1.3 tibble_3.2.1 tidyr_1.3.0
[16] readr_2.1.4 purrr_1.0.2 stringr_1.5.0 lubridate_1.9.2 forcats_1.0.0
[21] reshape2_1.4.4 ggplot2_3.4.2 patchwork_1.1.3 gplots_3.1.3 clusterProfiler_4.8.1
[26] rtracklayer_1.60.0 GenomicRanges_1.52.0 GenomeInfoDb_1.36.3 SeuratObject_4.1.3 Seurat_4.3.0.1
[31] org.Hs.eg.db_3.17.0 AnnotationDbi_1.62.2 IRanges_2.34.1 S4Vectors_0.38.1 Biobase_2.60.0
[36] BiocGenerics_0.46.0 TCseq_1.23.0 pacman_0.5.1

@Kfarls Kfarls added the bug label Sep 28, 2023
@Al-Murphy
Copy link
Collaborator

Hey, would it be possible to share your ctd? It would nbot be possible for us to debug otherwise.

Thanks,
Alan.

@Lina0125
Copy link

I got the same error, when length of hits are small, this error wont happen, like 15 hits in background with 1350 genes. I tried several times with the same ctd file, only got this error when hits length are big, I'm saying like 500 hits in a background with 1350 genes

@Al-Murphy
Copy link
Collaborator

Can you share reproducible code demonstrating the issue? As I said, we can't debug effectively otherwise. Thanks!

@Lina0125
Copy link

Hi, Unfortunately I can not share any data, the code I used were

full_results <- EWCE::bootstrap_enrichment_test(sct_data = ctd_mtg_Ensemble2_final, 
                                                bg = background,
                                                sctSpecies = "human",
                                                genelistSpecies = "human",
                                                hits = background[1:500],
                                                reps = 10000,
                                                annotLevel = 1)

I'm not super familiar with this method, I thought it might because some hits not appear in ctd? Or some package problem. It's very strange I can get it run when I use background[1:15] as hits

@Al-Murphy
Copy link
Collaborator

EWCE should filter to the genes in the background or at least print a statement to say there was little/no overlap. I can't really help with example data - I would suggest using the vignette ctd and test your ideas of what the problem is on that to get the same error message. If you manage to do that and share the code, I will be able to help.

Thanks,
Alan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants