about -noautoindex #24

Open
vpbrendel opened this issue Jan 6, 2021 · 0 comments


Hi Gordon et al.,
I have been trying to run gth in parallel, using a combination of -noautoindex and -intermediate. I only got it to work in a roundabout way, because I could not figure out how to build a proper index with either mkvtree or a first run of gth. The problem I ran into was the presence or absence of the .dna tag in the index file names. Below is the script I used to work around it. Surely there must be a more elegant solution?
Happy New Year, Volker

#!/bin/bash

NUMPRC=24
GENOME=IRBB7unm.fa
CDNAFILE=IRBB7trinityTranscripts.fa

# We'll run a toy spliced alignment to create the genome index:

head -2 ${CDNAFILE} > tmpcdna
gth -genomic ${GENOME} -cdna tmpcdna -species rice

# ... if everything worked as planned, we should now have the
# genome index files and can go ahead with the real work in parallel.
# However, the created genome index files have the extra tag .dna,
# which is then not recognized when using the -noautoindex option to
# gth next. As a workaround, we rename the index files to get rid of
# the .dna tag. Then gth -noautoindex works (it seems to copy the index
# files it needs, using again the .dna tag, but now we seem to have a
# working index ...).

ls -1 ${GENOME}.dna* > tmpcmda
sed -e "s/\.dna//" tmpcmda > tmpcmdb
sed -i -e "s/^/mv /" tmpcmda
paste tmpcmda tmpcmdb | bash
gth -noautoindex -genomic ${GENOME} -cdna tmpcdna -species rice
\rm tmpcmda tmpcmdb tmpcdna*

gt splitfasta -numfiles ${NUMPRC} ${CDNAFILE}

for cdnafile in ${CDNAFILE}.*
do
    gth -noautoindex -intermediate -xmlout -gzip -o gth.${cdnafile}.gz -genomic ${GENOME} -cdna ${cdnafile} -species rice &
done
wait
echo "... gth intermediate run done"

gthconsensus -o gth.TranscriptsOnIRBB7 gth.${CDNAFILE}.*.gz
echo "... gthconsensus run done"
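
The renaming step in the script (the ls/sed/paste pipeline) could probably be written more compactly with a plain bash loop. The following is only a minimal sketch: the function name strip_dna_tag is hypothetical and not part of gth or GenomeTools, and it assumes, as the script above does, that the index files are named ${GENOME}.dna followed by a further extension.

```shell
#!/bin/bash
# Hypothetical helper (not part of gth or GenomeTools): remove the ".dna"
# tag that directly follows the genome file name in each index file name.
strip_dna_tag() {
    local genome=$1 f rest
    for f in "${genome}".dna*; do
        # Skip the unexpanded pattern when no index files are present.
        [ -e "$f" ] || continue
        # Everything after "<genome>.dna", e.g. ".suf" (the extension here
        # is only an illustration).
        rest=${f#"${genome}.dna"}
        # Never rename a file onto the genome file itself.
        [ -n "$rest" ] || continue
        mv -- "$f" "${genome}${rest}"
    done
}
```

In the script above it would be called as strip_dna_tag "${GENOME}" in place of the four lines that build and run the mv commands. Unlike the sed version, it only touches the .dna that immediately follows the genome file name and needs no temporary files.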
