"winnow" transcripts to filter by coverage #1

AlexGaithuma · 2020-12-24T07:35:09Z

Hi fishercera,

I read about your approach to de novo transcriptome assembly and am interested in it. I am not a bioinformatics expert, though, more of a DIY, "learn while doing it" kind of guy.

I want to follow the process and use it on my own data.
Could you be kind enough to provide an outline of the commands you used to reach the end result of ~20,000 transcripts? This would be very helpful. Thanks in advance. My email is [email protected]

Your words are as follows:

I started with >100,000 transcripts in a de-novo transcriptome made from
pooled siblings' tissues.
What I have done to "winnow" transcripts is to filter by coverage, as here:
https://github.com/trinityrnaseq/trinityrnaseq/wiki/Trinity-Transcript-Quantification#filtering-transcripts
Then I take the remaining transcripts that passed that filter and I predict
ORFs with something like Transdecoder (I used GeneMarkS-T).
THEN I cluster the predicted proteome at a 70% identity threshold using
USEARCH: https://www.drive5.com/usearch/
The centroid sequences you get from that are the ones that are most
representative of each cluster. I take the headers for the centroid
proteins and use them to pull the matching nucleotide transcripts from my
assembly.
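
For reference, here is how I picture the last step (going from centroid protein headers back to the nucleotide transcripts). This is only a minimal sketch in plain Python, not the commands actually used. The function names are my own, and the header mapping is an assumption: it takes the first word of the protein header and strips a TransDecoder-style `.p1` suffix. GeneMarkS-T headers may need a different rule.

```python
import re
from typing import Callable, Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def protein_to_transcript_id(protein_header: str) -> str:
    """Assumed mapping: first word of the header, minus a trailing '.p<N>'."""
    first_token = protein_header.split()[0]
    return re.sub(r"\.p\d+$", "", first_token)


def pull_transcripts(
    centroid_lines: Iterable[str],
    assembly_lines: Iterable[str],
    to_transcript_id: Callable[[str], str] = protein_to_transcript_id,
) -> List[Tuple[str, str]]:
    """Return assembly records whose ID matches a centroid protein."""
    wanted = {to_transcript_id(h) for h, _ in read_fasta(centroid_lines)}
    return [
        (header, seq)
        for header, seq in read_fasta(assembly_lines)
        if header.split()[0] in wanted
    ]
```

With real files, the two arguments would be open file handles for the centroid protein FASTA and the original assembly FASTA. The returned records can then be written back out with a `>` in front of each header.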

This has generally ended up with a nice manageable transcriptome of ~20,000
transcripts. The N50 goes up considerably. And my BUSCO results are quite
good!
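
My understanding of why the N50 rises: N50 is the length of the transcript at which the running total, counted from the longest transcript downward, first reaches half of the total assembled bases. So removing many short transcripts should push it up. A small sketch of that calculation, with made-up lengths purely for illustration:

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths (0 if empty)."""
    sorted_lengths = sorted(lengths, reverse=True)
    half_total = sum(sorted_lengths) / 2
    running = 0
    for length in sorted_lengths:
        running += length
        # The first length at which the cumulative sum reaches half the total
        if running >= half_total:
            return length
    return 0


# Hypothetical lengths: two long transcripts plus four short fragments
before = [1000, 800, 100, 100, 100, 100]
after = [1000, 800]  # the same set with the short fragments removed
print(n50(before), n50(after))
```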

AlexGaithuma changed the title ~~"winnow" transcripts is to filter by coverage~~ "winnow" transcripts to filter by coverage Dec 24, 2020
