
Data accumulation while computing in GPU #1094

Closed
ravikakaiya opened this issue Jun 16, 2022 · 1 comment
Labels
bug / fix · help wanted · v0.8.x

Comments

@ravikakaiya

🐛 Bug

To Reproduce

Create a DataLoader.
Pass images batch-wise to the GPU.
Compute SSIM on the GPU.
After each batch the GPU memory allocation keeps increasing, eventually resulting in a CUDA out-of-memory error (see the sketch after this list).
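A minimal sketch of the setup described above (the random-tensor dataset, image shapes, and batch size are assumptions for illustration, not taken from the issue):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torchmetrics import StructuralSimilarityIndexMeasure

# Hypothetical stand-in data: pairs of predicted and target images
dataset = TensorDataset(torch.rand(256, 3, 128, 128), torch.rand(256, 3, 128, 128))
loader = DataLoader(dataset, batch_size=8)

ssim = StructuralSimilarityIndexMeasure().to("cuda")

for preds, target in loader:
    preds, target = preds.to("cuda"), target.to("cuda")
    ssim(preds, target)  # each update stores preds/target on the GPU
    print(f"{torch.cuda.memory_allocated() / 2**20:.1f} MiB allocated")
```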

Expected behavior

GPU memory usage should stay roughly constant across batches. Instead, the run eventually fails with:
RuntimeError: CUDA out of memory. Tried to allocate 210.00 MiB (GPU 0; 15.78 GiB total capacity; 14.39 GiB already allocated; 138.50 MiB free; 14.42 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Environment

TorchMetrics version (and how you installed TM, e.g. conda, pip, build from source): 0.8.2
Python & PyTorch version: Python 3.9.12, PyTorch 1.11.0
Any other relevant information such as OS (e.g., Linux): Linux

ravikakaiya added the bug / fix and help wanted labels on Jun 16, 2022
@SkafteNicki (Member)

Hi @ravikakaiya,
This is warned about when using the metric: https://github.com/Lightning-AI/metrics/blob/203ab6b13cad0219b484f3e47c34b6e7c8831af1/src/torchmetrics/image/ssim.py#L86-L90
We need to store preds and target for computing over all batches.

If you only want to compute the value on the current batch, you could use the functional implementation
https://torchmetrics.readthedocs.io/en/stable/image/structural_similarity.html#functional-interface
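
For example, the functional version returns the score for one batch and keeps no state between calls (a sketch; the loop and variable names follow the example above):

```python
from torchmetrics.functional import structural_similarity_index_measure

for preds, target in loader:
    preds, target = preds.to("cuda"), target.to("cuda")
    batch_ssim = structural_similarity_index_measure(preds, target)
    # batch_ssim is an ordinary tensor; nothing is accumulated across batches
```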

Alternatively, if you are using the modular implementation, you can call metric.reset() after calling metric.compute() to reset the internal accumulation buffer.
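
A sketch of that pattern with the modular metric (reusing the names from the reproduction sketch above):

```python
ssim = StructuralSimilarityIndexMeasure().to("cuda")

for preds, target in loader:
    preds, target = preds.to("cuda"), target.to("cuda")
    ssim.update(preds, target)

score = ssim.compute()  # aggregated value over all batches seen so far
ssim.reset()            # clears the stored preds/target and frees the GPU memory
```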

Closing issue.
