
CUDA Out of memory issue during inference #26

Open
sselgrad23 opened this issue Mar 6, 2025 · 1 comment

@sselgrad23

Hi! I wanted to first say that your results are very impressive.

When running run_video.py, I hit a CUDA out-of-memory error at the video-processing stage. This happens both when inferring on just the fifth video in data/samples and when running on all videos in that folder.

When inferring on the fifth video, the run ends with the following:
Predicting snippets with dilation 1: 100%|█████████▉| 1797/1798 [02:
Processing videos: 0%| | 0/1 [13:03<?, ?it/s]
Traceback (most recent call last):
  File "/cluster/scratch/sselgrad/RollingDepth/run_video.py", line 491, in <module>
    pipe_out: RollingDepthOutput = pipe(
  File "/cluster/scratch/sselgrad/RollingDepth/venv/rollingdepth/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/cluster/scratch/sselgrad/RollingDepth/rollingdepth/rollingdepth_pipeline.py", line 135, in __call__
    pipe_output = self.forward(
  File "/cluster/scratch/sselgrad/RollingDepth/venv/rollingdepth/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/cluster/scratch/sselgrad/RollingDepth/rollingdepth/rollingdepth_pipeline.py", line 291, in forward
    snippet_pred_ls = self.init_snippet_infer(
  File "/cluster/scratch/sselgrad/RollingDepth/rollingdepth/rollingdepth_pipeline.py", line 452, in init_snippet_infer
    triplets_decoded = self.decode_depth(
  File "/cluster/scratch/sselgrad/RollingDepth/rollingdepth/rollingdepth_pipeline.py", line 733, in decode_depth
    all_decoded = torch.cat(decoded_outputs, dim=0)
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 10.00 GiB. GPU 0 has a total capacity of 23.68 GiB of which 6.07 GiB is free. Including non-PyTorch memory, this process has 17.61 GiB memory in use. Of the allocated memory 16.26 GiB is allocated by PyTorch, and 1.04 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
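For completeness, the error message itself suggests setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to reduce allocator fragmentation. A minimal sketch of how that could be applied (my own assumption, not something from the repository) is to export the variable in the shell before launching, or set it at the very top of the script before the GPU is touched:

import os

# Allocator setting suggested by the error message; it has to be in place
# before the first CUDA allocation, i.e. before torch initializes the GPU.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "expandable_segments:True")

import torch  # imported only after the allocator config is set

This only helps when a lot of memory is "reserved but unallocated" due to fragmentation, so it may not be enough on its own here.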

When inferring on all five videos, it saves the predictions for the first four, but runs out of memory on the fifth:
INFO:root:Saving predictions to output/samples_fast_low_gpu_mem_footage/4_M9SFmxyAlrc-CceT_500frame_pred.npy
INFO:root:Using codec: h264
Writing video: 100%|██████████| 500/500 [00:04<00:00, 105.86it/s]
INFO:root:Using codec: h264
Writing video: 100%|██████████| 500/500 [00:12<00:00, 38.72it/s]
INFO:root:Using codec: h264
Writing video: 100%|██████████| 500/500 [00:17<00:00, 27.79it/s]
Processing videos: 67%|██████▋ | 4/6 [1
INFO:root:1800 frames loaded from video data/samples/5_M9SFmxyAlrc-CceT_long.mp4 ... [06:
Traceback (most recent call last):
  File "/cluster/scratch/sselgrad/RollingDepth/run_video.py", line 491, in <module>
    pipe_out: RollingDepthOutput = pipe(
  File "/cluster/scratch/sselgrad/RollingDepth/venv/rollingdepth/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/cluster/scratch/sselgrad/RollingDepth/rollingdepth/rollingdepth_pipeline.py", line 135, in __call__
    pipe_output = self.forward(
  File "/cluster/scratch/sselgrad/RollingDepth/venv/rollingdepth/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/cluster/scratch/sselgrad/RollingDepth/rollingdepth/rollingdepth_pipeline.py", line 291, in forward
    snippet_pred_ls = self.init_snippet_infer(
  File "/cluster/scratch/sselgrad/RollingDepth/rollingdepth/rollingdepth_pipeline.py", line 452, in init_snippet_infer
    triplets_decoded = self.decode_depth(
  File "/cluster/scratch/sselgrad/RollingDepth/rollingdepth/rollingdepth_pipeline.py", line 733, in decode_depth
    all_decoded = torch.cat(decoded_outputs, dim=0)
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 10.00 GiB. GPU 0 has a total capacity of 23.64 GiB of which 6.70 GiB is free. Including non-PyTorch memory, this process has 16.94 GiB memory in use. Of the allocated memory 16.29 GiB is allocated by PyTorch, and 191.46 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
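For reference, both failures happen at the final torch.cat over the decoded snippet outputs in decode_depth, where a single ~10 GiB tensor has to be assembled on the GPU. A generic mitigation sketch (not the repository's actual code; vae_decode_fn and latent_batches are hypothetical placeholders) would be to decode in small batches and move each decoded batch to the CPU right away, so the concatenation happens in host memory:

import torch

def decode_in_chunks(vae_decode_fn, latent_batches, device="cuda"):
    # Decode each batch of latents on the GPU, but keep only the current
    # batch resident there; accumulate results in host RAM.
    decoded_outputs = []
    for latents in latent_batches:
        with torch.no_grad():
            decoded = vae_decode_fn(latents.to(device))
        decoded_outputs.append(decoded.cpu())  # release GPU memory per chunk
        del decoded
        torch.cuda.empty_cache()
    # The concatenated result is built in host memory, not on the GPU.
    return torch.cat(decoded_outputs, dim=0)

The trade-off is an extra device-to-host copy per batch, but the peak GPU allocation shrinks to a single decoded batch instead of the full concatenated output.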

I get this issue even when passing the recommended options for a low GPU memory footprint.

If you could help me resolve this issue, I would be very grateful.

markkua added a commit that referenced this issue Mar 7, 2025
@markkua
Member

markkua commented Mar 7, 2025

Hi, thanks for the issue. I have pushed an update #27. Please pull and try again.
