Can not train model. Tensor mismatch #3

Open
Vadim2S opened this issue Dec 4, 2024 · 2 comments


Vadim2S commented Dec 4, 2024

I tried to run training. It is just a test run for now, with one file as the training set and the same file as the eval set.

I get an error in train_ddp.py, at around line 154.
Error: RuntimeError: The size of tensor a (10) must match the size of tensor b (9) at non-singleton dimension 0
Code: mse = torch.mean(torch.square(inputs - video_recon), dim=(1, 2, 3))

I investigated and found these shapes (before the rearrange):
inputs.shape = [1, 3, 10, 256, 256]
video_recon.shape = [1, 3, 9, 256, 256]

Collaborator

qqingzheng commented Dec 5, 2024

May I check your training sh script? It seems that you set the wrong number of frames.

Because we employ temporal causal convolution instead of regular convolution, the number of frames must be of the form 4n+1 (if the temporal compression rate is 4).
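
To see why 10 input frames come back as 9, here is a minimal sketch of the frame-count arithmetic. It assumes the usual causal layout, in which the first frame is encoded on its own and every following group of 4 frames becomes one latent frame. The function names `latent_frames` and `reconstructed_frames` are hypothetical and not part of this repository.

```python
def latent_frames(num_frames: int, rate: int = 4) -> int:
    """Latent frame count under the assumed causal layout:
    1 latent frame for the first input frame, plus 1 per full group of `rate` frames."""
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    return 1 + (num_frames - 1) // rate


def reconstructed_frames(num_frames: int, rate: int = 4) -> int:
    """Frame count after decoding: each latent frame after the first expands to `rate` frames."""
    return 1 + rate * (latent_frames(num_frames, rate) - 1)


for t in (9, 10, 13):
    # Only counts of the form rate*n + 1 survive the round trip unchanged.
    print(t, "->", reconstructed_frames(t))
```

Under this assumption, 10 frames are compressed to 3 latent frames and decoded to 9, which matches the shapes reported above, while 9 and 13 round-trip unchanged.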

Author

Vadim2S commented Dec 5, 2024

What exactly do --num_frames and --eval_num_frames mean? At first glance I thought they were something like batch_size or the train/val subset length.
Thanks! The "4*n+1" approach works! The num_frames parameters must be taken from the series 13, 17, 21, 25, 29, 33, etc. Any other value leads to a tensor mismatch.
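
For anyone else hitting this, a small pre-flight check along these lines could catch a bad value before training starts. This is only a sketch: `is_valid_num_frames` and `nearest_valid_below` are hypothetical helpers, not part of the training script, and the default rate of 4 is the compression rate mentioned above. By the 4n+1 rule alone, smaller values such as 5 or 9 also qualify, but this thread does not confirm whether the script accepts them.

```python
def is_valid_num_frames(num_frames: int, rate: int = 4) -> bool:
    """True if num_frames has the form rate*n + 1 for some n >= 0."""
    return num_frames >= 1 and (num_frames - 1) % rate == 0


def nearest_valid_below(num_frames: int, rate: int = 4) -> int:
    """Largest value of the form rate*n + 1 that does not exceed num_frames."""
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    return num_frames - (num_frames - 1) % rate


requested = 10
if not is_valid_num_frames(requested):
    print(f"num_frames={requested} is not of the form 4n+1; "
          f"try {nearest_valid_below(requested)} or {nearest_valid_below(requested) + 4}")
```

The same check applies to both --num_frames and --eval_num_frames.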
