expected training time. #13

Open
xiekunwhy opened this issue Dec 26, 2024 · 1 comment

Comments

@xiekunwhy

xiekunwhy commented Dec 26, 2024

Hi,

I want to train models myself. I have a computer without a GPU, only 64 threads and 512 GB of memory. I have about 150 plant genomes downloaded from Ensembl and/or other databases, each with BUSCO C > 90%.
Could you please give an expected training time for the above data and resources?

Best,
Kun

@LarsGab
Collaborator

LarsGab commented Jan 22, 2025

Hi,

It is challenging to estimate training time accurately, as it depends on, for example, the size of the training data, the input species, and the performance of your CPU. If your computational resources are limited, I recommend training only the pre-HMM part. For mammals, we trained the pre-HMM part for six days using 4xA100 GPUs. On our 64-thread CPU, this would take approximately 36 times longer with the same batch size. However, with 512 GB of memory, it might be possible to increase the batch size, which may reduce the training time.

Best,
Lars
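
For a rough sense of scale, the figures in the reply above (six days on 4xA100, roughly 36 times slower on a 64-thread CPU) can be turned into a back-of-the-envelope estimate. This is only a sketch: the function name `estimate_cpu_days` and the `batch_speedup` parameter are hypothetical, the 2x speedup is an illustrative assumption rather than a measured value, and the baseline comes from the mammalian training run, so a set of about 150 plant genomes may behave quite differently.

```python
def estimate_cpu_days(gpu_days: float, slowdown: float, batch_speedup: float = 1.0) -> float:
    """Scale a reported GPU training time to an approximate CPU training time.

    gpu_days      -- training time reported on the GPU setup (6 days in the reply)
    slowdown      -- how many times slower the CPU is at the same batch size (~36)
    batch_speedup -- hypothetical gain from a larger batch size (1.0 = no gain)
    """
    if gpu_days <= 0 or slowdown <= 0 or batch_speedup <= 0:
        raise ValueError("all arguments must be positive")
    return gpu_days * slowdown / batch_speedup


# Figures quoted in the reply: 6 days on 4xA100, about 36x slower on 64 CPU threads.
baseline = estimate_cpu_days(gpu_days=6, slowdown=36)
print(f"Same batch size: about {baseline:.0f} days")

# Assumed, not measured: a larger batch size that happens to double throughput.
optimistic = estimate_cpu_days(gpu_days=6, slowdown=36, batch_speedup=2.0)
print(f"With an assumed 2x batch-size gain: about {optimistic:.0f} days")
```

At the same batch size this works out to about 216 days for the pre-HMM part alone, which is why any gain from a larger batch size matters on a CPU-only machine.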
