
Ampere GPU RuntimeError Cuda and slow basecalling speed #335

Closed
jorisbalc opened this issue Mar 20, 2023 · 1 comment

Comments

@jorisbalc

Hi,

I'm trying to run a custom modified-base model with bonito, using the following command:

bonito basecaller --modified-base-model train_results/model_best.pt [email protected] BC3/ --device cuda:0 --reference /home/v313/ref-seqs/lambdagenomeref.fasta > ahyC_bonito_basecalls.bam

I get the following error, after which bonito continues basecalling on the CPU:

> reading pod5
> outputting aligned bam
> loading model [email protected]
> loading modified base model
> loaded modified base model to call (alt to C): a=5ahyC
> loading reference
> calling:   0%|                                                      | 0/40000 [00:00<?, ? reads/s]Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/home/v313/.local/lib/python3.8/site-packages/bonito/multiprocessing.py", line 110, in run
    for item in self.iterator:
  File "/home/v313/.local/lib/python3.8/site-packages/bonito/crf/basecall.py", line 71, in <genexpr>
    (read, compute_scores(model, batch, reverse=reverse)) for read, batch in batches
  File "/home/v313/.local/lib/python3.8/site-packages/bonito/crf/basecall.py", line 34, in compute_scores
    scores = model(batch.to(dtype).to(device))
  File "/home/v313/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/v313/.local/lib/python3.8/site-packages/bonito/crf/model.py", line 178, in forward
    return self.encoder(x)
  File "/home/v313/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/v313/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/home/v313/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/v313/.local/lib/python3.8/site-packages/koi/lstm.py", line 117, in forward
    layer(buff1, buff2, self.chunks)
  File "/home/v313/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/v313/.local/lib/python3.8/site-packages/koi/lstm.py", line 85, in forward
    void_ptr(input_buffer.data @ self.w_ih),
RuntimeError: CUDA out of memory. Tried to allocate 2.20 GiB (GPU 0; 7.79 GiB total capacity; 1.68 GiB already allocated; 1.81 GiB free; 4.40 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
> calling:   0%|                                           | 1/40000 [00:04<53:1

I'm running this on torch 1.12.1+cu113 (the build for Ampere GPUs), on a mobile RTX 3080.

Name: torch
Version: 1.12.1+cu113
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: [email protected]
License: BSD-3
Location: /home/v313/.local/lib/python3.8/site-packages
Requires: typing-extensions
Required-by: ont-bonito, ont-remora, thop, torchaudio, torchvision

Switching from the @4.0.0 model to @3.5.2 makes the error go away. My question is: is the error related to the slow basecalling speed, and should I expect greater speed on an Ampere GPU? With the 3.5.2 model I'm getting 3 reads/s, which seems extremely slow (and this is with the fast model as well).

Thanks in advance!

@davidnewman02
Collaborator

It looks like your GPU has insufficient memory (8 GB) to load the model with the default settings. The config for the [email protected] model sets [basecaller.batchsize] = 1536 by default; please try a smaller --batchsize.
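
For example, the original command rerun with a smaller batch (384 here is just an illustrative value, roughly a quarter of the default; it may take some trial and error to find the largest batch that fits in 8 GB):

bonito basecaller --modified-base-model train_results/model_best.pt [email protected] BC3/ --device cuda:0 --batchsize 384 --reference /home/v313/ref-seqs/lambdagenomeref.fasta > ahyC_bonito_basecalls.bam

Separately, the PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb setting mentioned in the error message can help with allocator fragmentation, but lowering the batch size is the more direct fix here.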
