Description
🐛 Describe the bug
nvJPEG leaks memory and fails with OOM after ~1-2k images.
import torch
from torchvision.io import read_file, decode_jpeg
for i in range(1000): # increase to your liking till gpu OOMs (:
    img_u8 = read_file('lena.jpg')
    img_nv = decode_jpeg(img_u8, device='cuda')
Probably related to the first response to #3848. The error
RuntimeError: nvjpegDecode failed: 5
is exactly the message you get after the GPU runs out of memory.
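To make the growth visible before the OOM hits, something like the sketch below could be used. It samples device-wide memory usage every N decodes. The helper names (track_memory, run_repro) are made up for illustration, and it assumes a PyTorch version that provides torch.cuda.mem_get_info, which may not be the case on 1.9.0. Device-wide usage is read instead of torch.cuda.memory_allocated on the assumption that the leaked memory may be allocated outside PyTorch's caching allocator.

```python
from typing import Callable, List


def track_memory(step: Callable[[], None],
                 used_bytes: Callable[[], int],
                 iterations: int,
                 every: int = 100) -> List[int]:
    """Run `step` repeatedly and sample `used_bytes()` every `every` iterations."""
    samples = [used_bytes()]  # baseline before the first decode
    for i in range(1, iterations + 1):
        step()
        if i % every == 0:
            samples.append(used_bytes())
    return samples


def run_repro(path: str = 'lena.jpg', iterations: int = 1000) -> None:
    # Imported here so track_memory can be used without torch installed.
    import torch
    from torchvision.io import read_file, decode_jpeg

    data = read_file(path)  # same input file as in the repro above

    def step() -> None:
        decode_jpeg(data, device='cuda')  # result is dropped immediately

    def used_bytes() -> int:
        # Assumption: torch.cuda.mem_get_info is available in this version.
        # It returns (free, total) for the current device.
        torch.cuda.synchronize()
        free, total = torch.cuda.mem_get_info()
        return total - free

    samples = track_memory(step, used_bytes, iterations, every=100)
    for n, used in enumerate(samples):
        print(f"after {n * 100:5d} decodes: {used / 2**20:.1f} MiB in use")
```

Calling run_repro() on a CUDA machine prints one line per 100 decodes. If usage keeps climbing even though every decoded tensor is dropped right away, that would be consistent with the leak described here.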
Versions
PyTorch version: 1.9.0+cu111
Is debug build: False
CUDA used to build PyTorch: 11.1
ROCM used to build PyTorch: N/A
OS: Arch Linux (x86_64)
GCC version: (GCC) 11.1.0
Clang version: 12.0.1
CMake version: version 3.21.1
Libc version: glibc-2.33
Python version: 3.8.7 (default, Jan 19 2021, 18:48:37) [GCC 10.2.0] (64-bit runtime)
Python platform: Linux-5.13.8-arch1-1-x86_64-with-glibc2.2.5
Is CUDA available: True
CUDA runtime version: 11.4.48
GPU models and configuration:
GPU 0: NVIDIA GeForce RTX 2080 Ti
GPU 1: NVIDIA GeForce RTX 2080 Ti
GPU 2: NVIDIA GeForce GTX 1080
Nvidia driver version: 470.57.02
cuDNN version: Probably one of the following:
/usr/lib/libcudnn.so.8.2.2
/usr/lib/libcudnn_adv_infer.so.8.2.2
/usr/lib/libcudnn_adv_train.so.8.2.2
/usr/lib/libcudnn_cnn_infer.so.8.2.2
/usr/lib/libcudnn_cnn_train.so.8.2.2
/usr/lib/libcudnn_ops_infer.so.8.2.2
/usr/lib/libcudnn_ops_train.so.8.2.2
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] adabelief-pytorch==0.2.0
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.19.5
[pip3] pytorch-lightning==1.4.5
[pip3] torch==1.9.0+cu111
[pip3] torchaudio==0.9.0
[pip3] torchfile==0.1.0
[pip3] torchmetrics==0.4.1
[pip3] torchvision==0.10.0+cu111
[conda] Could not collect