Can not train model. Tensor mismatch #3

Vadim2S · 2024-12-04T13:24:05Z

I am trying to run training. It is just a test run for now: one file is used for training and the same file for eval.

I get an error in train_ddp.py, around line 154.
Error: RuntimeError: The size of tensor a (10) must match the size of tensor b (9) at non-singleton dimension 0
Code: mse = torch.mean(torch.square(inputs - video_recon), dim=(1, 2, 3))

I investigated and found the following shapes (before rearrange):
inputs.shape = [1, 3, 10, 256, 256]
video_recon.shape = [1, 3, 9, 256, 256]

qqingzheng · 2024-12-05T03:33:28Z

May I see your training sh script? It seems that you set the wrong number of frames.

Because we employ temporal causal convolution instead of regular convolution, the number of frames must be of the form 4x+1 (if the temporal compression rate is 4).
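
A rough sketch of the arithmetic behind this rule, which may help explain the shapes reported above. This is not the repository's code: the function names are hypothetical, and the layout (first frame encoded on its own, every following full group of 4 frames compressed into one latent frame) is an assumption about how the causal compression behaves. Under that assumption, 10 input frames come back as 9, matching the error.

```python
def latent_frames(num_frames: int, rate: int = 4) -> int:
    """Latent frame count under the assumed layout:
    one for the first frame, plus one per full group of `rate` frames."""
    return 1 + (num_frames - 1) // rate


def reconstructed_frames(num_frames: int, rate: int = 4) -> int:
    """Frames produced when the latents are decoded back (assumed inverse)."""
    return 1 + rate * (latent_frames(num_frames, rate) - 1)


# Compare input length with the length after an encode/decode round trip.
for n in (10, 13, 17):
    print(n, latent_frames(n), reconstructed_frames(n))
# 10 3 9    <- 10 frames in, 9 out: the mismatch from the traceback
# 13 4 13
# 17 5 17
```

Only frame counts of the form 4x+1 survive the round trip unchanged, so only those let `inputs - video_recon` line up.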

Vadim2S · 2024-12-05T10:11:36Z

What exactly do --num_frames and --eval_num_frames mean? At first glance I thought they were something like batch_size or the train/val subset length.
Thanks! The "4*n+1" approach works! The num_frames parameters must be taken from the sequence 13, 17, 21, 25, 29, 33, etc. Any other value leads to a tensor mismatch.
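
For anyone hitting the same error, a small check like the one below could be run before launching training. It is only a sketch: the helper names are hypothetical and not part of the training script, and the default rate of 4 is taken from the reply above.

```python
def is_valid_num_frames(num_frames: int, rate: int = 4) -> bool:
    """True if num_frames has the form rate * n + 1."""
    return num_frames >= 1 and (num_frames - 1) % rate == 0


def round_down_num_frames(num_frames: int, rate: int = 4) -> int:
    """Largest valid frame count that does not exceed num_frames."""
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    return 1 + rate * ((num_frames - 1) // rate)


# Values accepted between 13 and 33.
print([n for n in range(13, 34) if is_valid_num_frames(n)])
# [13, 17, 21, 25, 29, 33]

# The value from the failing run, and the nearest valid value below it.
print(is_valid_num_frames(10), round_down_num_frames(10))
# False 9
```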
