[Question] Train module issue #6

llk578496 · 2022-12-07T01:55:38Z

Thank you for developing this amazing pipeline! Our team is currently working on a clinical outbreak investigation of one of the most challenging multidrug-resistant fungi, Candida auris.

We would like to build a specific marker gene set for Candida auris by using the train module, based on all the Candida auris genomes available on NCBI at the complete/chromosome assembly level.

We have already downloaded a total of 45 genomes and created a directory containing all these reference genomes. However, when we tried to use the train module, we found that there was one more required option: -i STR Directory containing marker sequences in FASTA format (should be able to build an MSA).

May we know what data we should provide for this option?

Thank you very much!

Best regards,
Eddie


endixk · 2022-12-08T02:21:08Z

Dear Eddie,

Thank you for using our pipeline!

Based on your description of your aim, it sounds like you are trying to identify marker genes for Candida auris de novo from your genome sequences. Unfortunately, this is currently beyond the capabilities of our pipeline.

The train module of our pipeline is designed to generate profile HMMs from a pre-defined set of marker genes, using an iterative training process with the given set of genome sequences to improve sensitivity. This means that in order to use the module, you will need to first identify a set of candidate marker genes for Candida auris.

One potential resource for identifying marker genes for this organism is the Saccharomycetes subset of OrthoDB. Once you have identified a set of candidate marker genes, you can create a FASTA file for each marker by gathering a handful of protein sequences. Then, you can provide a directory containing all of these FASTA files as the input for the -i option, and the train module will generate profile HMMs for the marker genes you provided.
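As a rough illustration of the directory layout described above, here is a minimal Python sketch that writes one protein FASTA file per marker into a single directory. It is not part of the pipeline: the function name `write_marker_fastas`, the `.fa` extension, the marker names, and the sequences are all hypothetical placeholders, and the file naming the train module actually expects should be checked against its documentation.

```python
import tempfile
from pathlib import Path


def write_marker_fastas(markers, out_dir, width=60):
    """Write one protein FASTA file per marker gene into out_dir.

    markers: dict mapping a marker name to a list of (sequence_id, protein_sequence).
    Returns the list of files written.
    NOTE: the ".fa" extension and file naming are assumptions, not pipeline requirements.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for marker, records in markers.items():
        # A multiple sequence alignment needs at least two sequences per marker.
        if len(records) < 2:
            raise ValueError(f"marker {marker!r} needs at least two sequences")
        path = out / f"{marker}.fa"
        with path.open("w") as handle:
            for seq_id, seq in records:
                handle.write(f">{seq_id}\n")
                # Wrap the sequence at a fixed line width.
                for start in range(0, len(seq), width):
                    handle.write(seq[start:start + width] + "\n")
        written.append(path)
    return written


if __name__ == "__main__":
    # Placeholder data only: real sequences would come from a resource such as OrthoDB.
    example = {
        "MARKER1": [("species_A_gene1", "MKTAYIAKQR"), ("species_B_gene1", "MKSAYLAKQR")],
        "MARKER2": [("species_A_gene2", "MSLLTEVETP"), ("species_B_gene2", "MSLLSEVETP")],
    }
    with tempfile.TemporaryDirectory() as tmp:
        for fasta in write_marker_fastas(example, tmp):
            print(fasta.name)
```

The resulting directory is what would be passed to the -i option, alongside the directory of reference genomes.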

If this explanation is unclear or if you have any further questions, please do not hesitate to ask.

Thanks!

Best wishes,
Daniel

endixk added the question Further information is requested label Dec 8, 2022
