Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No Aberrant Expression was found in Targeted Panel RNA Sequence Data #70

Open
ellango85 opened this issue Aug 21, 2024 · 7 comments
Open

Comments

@ellango85
Copy link

Hi,
I have a Target RNA Panel Sequence (Set of Genes are enriched) data. I have tried the OUTRIDER with default parameter, but I couldn't able get any aberrant expression even in known positive samples, which has known Aberrant splicing event.

Does the OUTRIDER will work on the Targeted Panel RNA Sequence Data or it will only work on non-Targeted whole transcriptome data!.

Even I tried to merge the Target and non-target Count file and failed due to some error [Run error #31669 unevaluated and other errors !!!]
Kindly help me out to fix this issue.

Best,
Ellango

@gelowin
Copy link

gelowin commented Oct 8, 2024

Hi,
I have the same problem. I create a toy example and no aberrant genes was discovered.

counts <- data.frame(
    id1 = withr::with_seed(seed = 1, rnbinom(100, size = 200, mu = 400)),
    id2 = withr::with_seed(seed = 2, rnbinom(100, size = 20,  mu = 800)),
    id3 = withr::with_seed(seed = 3, rnbinom(100, size = 2000, mu = 100)))
## Simulate some aberrant observations for demonstration purposes
counts[2:3, ] <- 4000000
counts[4,1] <- 888888888

row.names(counts) <- paste0("gene", 1:100)
counts <- as.matrix(counts)
counts <- round(counts) ## counts must be integer
annot_db <- data.frame(sampleID = paste0("id", 1:3), sex = c("F", "M", "M"))
ods <- OUTRIDER::OutriderDataSet(countData = counts, colData = annot_db)
head(assay(ods))


ods <- OUTRIDER::OUTRIDER(ods, implementation = "pca")
outrider_output <- OUTRIDER::results(ods)
## No significant events: 

@AtaJadidAhari
Copy link
Collaborator

AtaJadidAhari commented Oct 8, 2024

Hi @gelowin, running OUTRIDER with such a small set of simulated data will not work. It is recommended to run OUTRIDER with at least 50 samples. With so few samples, you would not get any significant results after FDR correction.

@AtaJadidAhari
Copy link
Collaborator

Hi @ellango85, sorry for the late reply. How many samples do you have?

You can also try to check the expression values for some genes that you think might have outliers using:

plotExpressionRank(ods, "your_gene_of_interest", basePlot=TRUE) 

to see what's going on.

@gelowin
Copy link

gelowin commented Oct 9, 2024

@AtaJadidAhari Thank you very much for your response.

Perhaps I don't fully understand what an "aberrant" observation is. I initially thought it referred to a extreme outlier.

For instance, I can simulate 10k genes and 100 subjects, introduce two clear outliers, and still get results indicating "No significant events", for example:

counts <- replicate( n = 100, rnbinom(10000, size = 200, mu = 50))
colnames(counts) <- paste0("id", 1:100)
rownames(counts) <- paste0("gen", 1:10000)
## introducing two aberrant observations:
counts[2, 2] <- 999999999
counts[1, 3] <- 999999999

annot_db <- data.frame(sampleID = paste0("id", 1:100), sex = c("F", "M"))
ods <- OUTRIDER::OutriderDataSet(countData = counts, colData = annot_db)
ods <- OUTRIDER::OUTRIDER(ods, implementation = "pca")
outrider_output <- OUTRIDER::results(ods)

plotExpectedVsObservedCounts(ods, "gen2", basePlot = TRUE)
plotQQ(ods, "gen2")
plotExpressionRank(ods, "gen2", basePlot=TRUE) 

Could you kindly clarify what is meant by 'aberrant status' in this context? I initially thought it referred to the outlier status of an event, but I believe I may have misunderstood.

Thank you very much

@AtaJadidAhari
Copy link
Collaborator

Hi @gelowin, the problem with this simulation is that the "extreme" outliers are simply too extreme and OUTRIDER cannot fit an autoencoder properly to this data. I redid your analysis with less extreme values, e.g. 9999 instead of 999999999 and then OUTRIDER produced the expected results:
image

> outrider_output
Key: <geneID, sampleID>
   geneID sampleID       pValue      padjust zScore  l2fc rawcounts meanRawcounts normcounts meanCorrected theta aberrant
   <char>   <char>        <num>        <num>  <num> <num>     <int>         <num>      <num>         <num> <num>   <lgcl>
1:   gen1      id3 2.099035e-22 2.054453e-17   8.49  5.84      9999        150.72    3135.68         83.81  1.24     TRUE
2:   gen2      id2 5.766593e-21 5.644114e-16   9.36  7.36      9999        147.59    8523.82        133.06  0.63     TRUE
   AberrantBySample AberrantByGene padj_rank
              <num>          <num>     <num>
1:                1              1         1
2:                1              1         1

@gelowin
Copy link

gelowin commented Oct 9, 2024

Ok, thank you very much.
Yes, with 9999, works. Anyway it seems that it should work also for those more aberrant. I did not try the autoencoder, because it seems freeze all the time but with the "pca" is not detected.
Thank you very much for your response and for your package.

@vyepez88
Copy link
Contributor

Hi @gelowin, I guess the problem is that 999999999 is so high that it's unrealistic and beyond the sort of expected values for the negative binomial distribution. Not sure if with the autoencoder it "freezes" or it's actually working. It can take a couple of hours.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants