Multiple samples and single samples produce some transcripts of GTF somewhat differently #287

Open
Tang-pro opened this issue Feb 17, 2025 · 2 comments

Comments

@Tang-pro

Hi @andrewprzh

Thanks in advance.

First, some context: SUPPA2 derives alternative splicing events from a GTF file.

When I run IsoQuant on a single sample, the resulting GTF contains an intron retention event for one gene. But when I run it on all samples together (I measured 3 periods, 2 replicates), the combined GTF surprisingly has no intron retention for this gene at all. What is the reason for this? Is the structural annotation in a GTF obtained from a single sample inaccurate? The specific description, as seen in SUPPA2, is here:

comprna/SUPPA#207

By the way, is it still necessary to run SQANTI3 on the structural annotation file produced by IsoQuant? When I ran SQANTI3 on the GTF obtained with IsoQuant, the structural annotations of some transcripts came out different.

@andrewprzh
Collaborator

Dear @Tang-pro

In general, it's quite hard to predict the outcome of the transcript discovery algorithm: it uses a lot of different cut-offs, including cut-offs relative to gene expression. Thus, when a gene gets more reads, some novel isoforms may end up with insufficient read support relative to the gene. Moreover, when several replicates are provided, IsoQuant reports a novel isoform only if it is confirmed by at least 2 of them. So some novel isoforms may also be lost due to lack of support across replicates/samples.
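To make the two effects above concrete, here is a toy sketch. It is not IsoQuant's actual code: the function name, the 5% relative cut-off, and all read counts are made up for illustration. The only rule taken from the explanation above is that, with several replicates, a novel isoform needs support in at least 2 of them.

```python
def passes_toy_filters(isoform_reads_per_replicate, gene_reads_total,
                       min_fraction=0.05, min_replicates=2):
    """Toy filter, NOT IsoQuant's implementation.

    min_fraction is a hypothetical cut-off relative to gene expression;
    min_replicates mirrors the "at least 2 replicates" rule and is only
    applied when more than one replicate is given.
    """
    isoform_total = sum(isoform_reads_per_replicate)
    supporting = sum(1 for n in isoform_reads_per_replicate if n > 0)

    relative_ok = isoform_total / gene_reads_total >= min_fraction
    single_input = len(isoform_reads_per_replicate) == 1
    replicate_ok = single_input or supporting >= min_replicates
    return relative_ok and replicate_ok


# Single sample: 6 of 60 gene reads (10%) support the isoform.
single = passes_toy_filters([6], gene_reads_total=60)

# Pooled run: the gene now has 400 reads, the isoform only 7 (1.75%).
pooled = passes_toy_filters([6, 0, 0, 1, 0, 0], gene_reads_total=400)

# Enough reads overall (7.5%), but all of them come from one replicate.
one_replicate = passes_toy_filters([30, 0, 0, 0, 0, 0], gene_reads_total=400)

print(single, pooled, one_replicate)
```

In this toy setting the isoform is kept in the single-sample run and dropped in both pooled cases, once because of the relative cut-off and once because of the replicate rule.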

> By the way, is it still necessary to do SQANTI3 again for the structural annotation file obtained with Isoquant, because when I did SQANTI3 analysis with the GTF obtained with isoquant, it was found that the structural annotations of some transcripts were different.

Could you send me an example where SQANTI and IsoQuant output differs?
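One possible way to find such an example is to compare the intron chains of transcripts that share a `transcript_id` in both GTF files. The sketch below is plain Python and uses no IsoQuant or SQANTI3 API. It assumes that both files carry the same `transcript_id` values and use standard 9-column GTF exon records; the helper names and the demo records are hypothetical.

```python
import re
from collections import defaultdict


def intron_chains(gtf_lines):
    """Map transcript_id -> (chrom, strand, tuple of (intron_start, intron_end))."""
    exons = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match:
            exons[match.group(1)].append(
                (fields[0], fields[6], int(fields[3]), int(fields[4]))
            )

    chains = {}
    for tid, exon_list in exons.items():
        exon_list.sort(key=lambda e: e[2])  # order exons by start coordinate
        # An intron spans the gap between two consecutive exons.
        introns = tuple(
            (left[3] + 1, right[2] - 1)
            for left, right in zip(exon_list, exon_list[1:])
        )
        chains[tid] = (exon_list[0][0], exon_list[0][1], introns)
    return chains


def differing_transcripts(gtf_a_lines, gtf_b_lines):
    """Return shared transcript IDs whose intron chains differ between the files."""
    a, b = intron_chains(gtf_a_lines), intron_chains(gtf_b_lines)
    return sorted(t for t in a.keys() & b.keys() if a[t] != b[t])


# Made-up records: the second exon starts at 300 in one file and 310 in the other.
attrs = 'gene_id "g1"; transcript_id "t1";'
demo_a = [f"chr1\tdemo\texon\t100\t200\t.\t+\t.\t{attrs}",
          f"chr1\tdemo\texon\t300\t400\t.\t+\t.\t{attrs}"]
demo_b = [f"chr1\tdemo\texon\t100\t200\t.\t+\t.\t{attrs}",
          f"chr1\tdemo\texon\t310\t400\t.\t+\t.\t{attrs}"]

print(differing_transcripts(demo_a, demo_b))
```

For real data, the two arguments could be open file handles, for instance the IsoQuant `transcript_models.gtf` and the SQANTI3 corrected GTF. Any transcript ID that is printed is a candidate to post here together with its records from both files.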

Best
Andrey

@Tang-pro
Author

Image

Hi @andrewprzh

The top track is the original reference GTF obtained from short-read RNA-seq (used as the reference, containing only gene-level records), the second is the transcript_models.gtf produced by IsoQuant, the third is the corrected GTF produced by SQANTI3, and the fourth contains the predicted CDS sequence.
