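Hi, the current data-processing code concatenates texts directly up to max_length, separating them with eos. With this approach, text2 can actually see the content of text1 when attention is computed. If the two texts are unrelated, does this have any effect? In practice, do you mask out text1's tokens, or do you make each text as long as possible, so that one sample contains only one text?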
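To make the masking option concrete, here is a minimal sketch of what "masking out text1's tokens" could look like for a packed sample. It is not taken from this repository: the names `segment_ids` and `packed_causal_mask`, the EOS id of 0, and the toy token ids are all assumptions for illustration. It builds a causal mask that is also block-diagonal, so a token can attend only to earlier tokens of the same text.

```python
from typing import List

def segment_ids(token_ids: List[int], eos_id: int) -> List[int]:
    """Label each position with the index of the text it belongs to.

    Assumption: an EOS token closes its own text, so the position
    right after an EOS starts the next text.
    """
    ids, current = [], 0
    for tok in token_ids:
        ids.append(current)
        if tok == eos_id:
            current += 1
    return ids

def packed_causal_mask(token_ids: List[int], eos_id: int) -> List[List[bool]]:
    """mask[i][j] is True when position i may attend to position j.

    Two conditions must hold: j is not in the future (causal), and
    i and j belong to the same text (block-diagonal).
    """
    seg = segment_ids(token_ids, eos_id)
    n = len(seg)
    return [[j <= i and seg[i] == seg[j] for j in range(n)] for i in range(n)]

# Hypothetical packed sample: text1 = [5, 6], text2 = [7, 8, 9], EOS id = 0.
EOS = 0
packed = [5, 6, EOS, 7, 8, 9, EOS]
mask = packed_causal_mask(packed, EOS)

# Print the mask as a grid: 1 = may attend, 0 = masked.
for row in mask:
    print(" ".join("1" if allowed else "0" for allowed in row))
```

Without the `seg[i] == seg[j]` condition this reduces to the ordinary causal mask, which is the behavior described in the question: every token of text2 can attend to all of text1.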