
CUDA Out of memory issue during inference #26

Open
sselgrad23 opened this issue Mar 6, 2025 · 1 comment

@sselgrad23

Hi! I wanted to first say that your results are very impressive.

When running run_video.py, I hit a CUDA out-of-memory error at the video-processing stage. This happens both when inferring on just the fifth video in data/samples and when running on all videos in that folder.

When inferring on the fifth video, the run ends with the following:
Predicting snippets with dilation 1: 100%|█████████▉| 1797/1798 [02:
Processing videos: 0%| | 0/1 [13:03<?, ?it/s]
Traceback (most recent call last):
  File "/cluster/scratch/sselgrad/RollingDepth/run_video.py", line 491, in <module>
    pipe_out: RollingDepthOutput = pipe(
  File "/cluster/scratch/sselgrad/RollingDepth/venv/rollingdepth/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/cluster/scratch/sselgrad/RollingDepth/rollingdepth/rollingdepth_pipeline.py", line 135, in __call__
    pipe_output = self.forward(
  File "/cluster/scratch/sselgrad/RollingDepth/venv/rollingdepth/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/cluster/scratch/sselgrad/RollingDepth/rollingdepth/rollingdepth_pipeline.py", line 291, in forward
    snippet_pred_ls = self.init_snippet_infer(
  File "/cluster/scratch/sselgrad/RollingDepth/rollingdepth/rollingdepth_pipeline.py", line 452, in init_snippet_infer
    triplets_decoded = self.decode_depth(
  File "/cluster/scratch/sselgrad/RollingDepth/rollingdepth/rollingdepth_pipeline.py", line 733, in decode_depth
    all_decoded = torch.cat(decoded_outputs, dim=0)
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 10.00 GiB. GPU 0 has a total capacity of 23.68 GiB of which 6.07 GiB is free. Including non-PyTorch memory, this process has 17.61 GiB memory in use. Of the allocated memory 16.26 GiB is allocated by PyTorch, and 1.04 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
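For completeness, the error message itself suggests setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to reduce allocator fragmentation. A minimal sketch of how that could be applied (my own assumption, not something from the repository) is to export the variable in the shell before launching, or set it at the very top of the script before the GPU is touched:

import os

# Allocator setting suggested by the error message; it has to be in place
# before the first CUDA allocation, i.e. before torch initializes the GPU.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "expandable_segments:True")

import torch  # imported only after the allocator config is set

This only helps when a lot of memory is "reserved but unallocated" due to fragmentation, so it may not be enough on its own here.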

When inferring on all five videos, it saves the predictions for the first four, but runs out of memory on the fifth:
INFO:root:Saving predictions to output/samples_fast_low_gpu_mem_footage/4_M9SFmxyAlrc-CceT_500frame_pred.npy
INFO:root:Using codec: h264
Writing video: 100%|██████████| 500/500 [00:04<00:00, 105.86it/s]
INFO:root:Using codec: h264
Writing video: 100%|██████████| 500/500 [00:12<00:00, 38.72it/s]
INFO:root:Using codec: h264
Writing video: 100%|██████████| 500/500 [00:17<00:00, 27.79it/s]
Processing videos: 67%|██████▋ | 4/6 [1
INFO:root:1800 frames loaded from video data/samples/5_M9SFmxyAlrc-CceT_long.mp4 ... [06:
Traceback (most recent call last):
  File "/cluster/scratch/sselgrad/RollingDepth/run_video.py", line 491, in <module>
    pipe_out: RollingDepthOutput = pipe(
  File "/cluster/scratch/sselgrad/RollingDepth/venv/rollingdepth/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/cluster/scratch/sselgrad/RollingDepth/rollingdepth/rollingdepth_pipeline.py", line 135, in __call__
    pipe_output = self.forward(
  File "/cluster/scratch/sselgrad/RollingDepth/venv/rollingdepth/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/cluster/scratch/sselgrad/RollingDepth/rollingdepth/rollingdepth_pipeline.py", line 291, in forward
    snippet_pred_ls = self.init_snippet_infer(
  File "/cluster/scratch/sselgrad/RollingDepth/rollingdepth/rollingdepth_pipeline.py", line 452, in init_snippet_infer
    triplets_decoded = self.decode_depth(
  File "/cluster/scratch/sselgrad/RollingDepth/rollingdepth/rollingdepth_pipeline.py", line 733, in decode_depth
    all_decoded = torch.cat(decoded_outputs, dim=0)
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 10.00 GiB. GPU 0 has a total capacity of 23.64 GiB of which 6.70 GiB is free. Including non-PyTorch memory, this process has 16.94 GiB memory in use. Of the allocated memory 16.29 GiB is allocated by PyTorch, and 191.46 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
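For reference, both failures happen at the final torch.cat over the decoded snippet outputs in decode_depth, where a single ~10 GiB tensor has to be assembled on the GPU. A generic mitigation sketch (not the repository's actual code; vae_decode_fn and latent_batches are hypothetical placeholders) would be to decode in small batches and move each decoded batch to the CPU right away, so the concatenation happens in host memory:

import torch

def decode_in_chunks(vae_decode_fn, latent_batches, device="cuda"):
    # Decode each batch of latents on the GPU, but keep only the current
    # batch resident there; accumulate results in host RAM.
    decoded_outputs = []
    for latents in latent_batches:
        with torch.no_grad():
            decoded = vae_decode_fn(latents.to(device))
        decoded_outputs.append(decoded.cpu())  # release GPU memory per chunk
        del decoded
        torch.cuda.empty_cache()
    # The concatenated result is built in host memory, not on the GPU.
    return torch.cat(decoded_outputs, dim=0)

The trade-off is an extra device-to-host copy per batch, but the peak GPU allocation shrinks to a single decoded batch instead of the full concatenated output.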

I get this issue even when passing the recommended options for a low GPU memory footprint.

If you could help me resolve this issue, I would be very grateful.

markkua added a commit that referenced this issue Mar 7, 2025
@markkua
Member

markkua commented Mar 7, 2025

Hi, thanks for the issue. I have pushed an update #27. Please pull and try again.
