Why is pad_size set to int(max(known_num))? #48

Open
Flyooofly opened this issue Dec 18, 2022 · 1 comment

Comments

@Flyooofly

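Hello, I printed the shapes of some intermediate variables and there is one part I don't quite understand.
My understanding, looking only at the tgt part, is this: the 300 entries are the learnable embeddings, and the pad part holds the labels with noise added.
[image]

As in the image, the batch size is 2 and the two images have 4 and 16 labels respectively, so after the noised-label tensor is repeated `scalar` times its shape becomes 20×5=100.
But if pad_size is set only to the maximum of known_num, the pad part has size 16×5=80.
In that case the new tgt has size 380 while there are 100 noised labels, so they would take up 20 slots of the non-denoising part.

Of course, with the training parameters you provide (batch_size=1) this problem does not arise, but batch_size=1 is a bit slow. For batchsize>1, could pad_size be set to sum(known_num) instead? Would that change affect the performance of the whole model? Thanks.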

@FengLi-ust
Collaborator

Hello, our implementation does support batchsize>1. If you set batchsize=2 and debug here, you should be able to see how it works.
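
One possible reading of this reply, as a minimal sketch rather than the repository's actual code: assume tgt keeps a separate batch dimension, so each image gets its own pad_size + 300 slots, and each noised label is written into the row of the image it belongs to. Under that assumption the 100 noised labels are split 20/80 across the two images and never reach the 300 learnable queries. All names below (`single_pad`, `batch_idx`, `slot_idx`, the string placeholders) are illustrative only.

```python
# Numbers taken from the question; the layout itself is an assumption.
known_num = [4, 16]   # labels per image in a batch of 2
scalar = 5            # number of denoising groups
num_queries = 300     # learnable queries

single_pad = max(known_num)      # 16 slots per denoising group
pad_size = single_pad * scalar   # 80 denoising slots per image

# For every noised label, record which image it belongs to and which
# slot it occupies inside that image's pad region.
batch_idx, slot_idx = [], []
for group in range(scalar):
    for b, n in enumerate(known_num):
        for k in range(n):
            batch_idx.append(b)
            slot_idx.append(group * single_pad + k)

# One row per image: pad_size denoising slots followed by the queries.
tgt = [["empty"] * pad_size + ["query"] * num_queries for _ in known_num]
for b, s in zip(batch_idx, slot_idx):
    tgt[b][s] = "noised"

total_noised = len(slot_idx)                        # 100 across the batch
per_image = [row.count("noised") for row in tgt]    # [20, 80]
queries_left = [row.count("query") for row in tgt]  # [300, 300]
print(total_noised, per_image, queries_left)
```

In this layout the image with 4 labels simply leaves 60 of its 80 pad slots empty, which is why max(known_num) is enough and sum(known_num) would only add unused padding.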
