Hello @endixk,
Thank you for developing this amazing pipeline! Our team is currently working on a clinical outbreak investigation of one of the most challenging multidrug-resistant fungi, Candida auris.
We would like to build a specific marker gene set for Candida auris using the train module, based on all the Candida auris genomes available on NCBI at the complete/chromosome assembly level.
We have already downloaded a total of 45 genomes and created a directory containing all these reference genomes. However, when we tried to use the train module, we found that there was one more required option: -i STR Directory containing marker sequences in FASTA format (should be able to build an MSA).
May we know what data we should provide for this option?
Thank you very much!
Best regards,
Eddie
Based on your description of your aim, it sounds like you are trying to identify marker genes for Candida auris de novo from your genome sequences. Unfortunately, this is currently beyond the capabilities of our pipeline.
The train module of our pipeline is designed to generate profile HMMs from a pre-defined set of marker genes, using an iterative training process with the given set of genome sequences to improve sensitivity. This means that in order to use the module, you will need to first identify a set of candidate marker genes for Candida auris.
One potential resource for identifying marker genes for this organism could be the OrthoDB Saccharomycetes subset. Once you have identified a set of candidate marker genes, you can create a FASTA file for each marker by gathering a handful of protein sequences. Then, you can provide the directory containing all of these FASTA files as the input for the -i option, and the train module will generate profile HMMs for the marker genes you provided.
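As a rough illustration of the directory layout described above, here is a minimal Python sketch that writes one FASTA file per marker. It is not part of the pipeline. The function names, the marker names, the `.fa` extension, and the placeholder sequences are all assumptions made for the example, and the check for at least two sequences per marker is only a guess at what "should be able to build an MSA" requires.

```python
from pathlib import Path
import tempfile


def write_marker_fastas(markers, out_dir, width=60):
    """Write one FASTA file per marker into out_dir and return the paths.

    markers maps a marker name to a list of (sequence_id, protein_sequence).
    The ".fa" extension is an assumption, not a documented requirement.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for marker, records in markers.items():
        # An alignment needs more than one sequence, so reject single-record markers.
        if len(records) < 2:
            raise ValueError(f"marker {marker!r} needs at least 2 sequences")
        path = out / f"{marker}.fa"
        with path.open("w") as fh:
            for seq_id, seq in records:
                fh.write(f">{seq_id}\n")
                # Wrap the sequence at a fixed line width.
                for i in range(0, len(seq), width):
                    fh.write(seq[i:i + width] + "\n")
        paths.append(path)
    return paths


def count_records(path):
    """Count FASTA records in a file by counting header lines."""
    with open(path) as fh:
        return sum(1 for line in fh if line.startswith(">"))


if __name__ == "__main__":
    # Hypothetical marker names and placeholder sequences, not real proteins.
    example = {
        "marker_A": [("strain1_A", "MKTAYIAKQR"), ("strain2_A", "MKTAYIAKQK")],
        "marker_B": [("strain1_B", "MSDNELLKRA"), ("strain2_B", "MSDNELLKRG")],
    }
    with tempfile.TemporaryDirectory() as tmp:
        for p in write_marker_fastas(example, tmp):
            print(p.name, count_records(p))
```

The resulting directory (here a temporary one) is the kind of input the -i option expects: one file per marker, each holding a handful of protein sequences for that marker.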
If this explanation is unclear or if you have any further questions, please do not hesitate to ask.