Why the max_seq_length = 512 for XLNet? #263
Comments
I was having the same question. @zihangdai could you please help us with this?
or maybe @kimiyoung ?
Assuming you are familiar with Transformer-XL, Then, why the value 512?
@zihangdai thank you for your fast reply!
Hi,
Just a conceptual question:
The paper says that XLNet borrows parts of Transformer-XL, which isn't limited to a fixed context length, yet the hyperparameters section gives a max sequence length of 512.
Can you please help me better understand it?
Thanks!
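
To make the question concrete, here is a minimal sketch of how a Transformer-XL-style model can process text in fixed-length segments while still seeing earlier context through a cached memory. It is not taken from the XLNet code: `iter_segments`, `seg_len`, and `mem_len` are hypothetical names, and the memory length of 384 is only an illustrative value. It shows the bookkeeping only (which earlier positions each segment could attend to), not the model itself.

```python
from typing import Iterator, List, Tuple


def iter_segments(
    tokens: List[int], seg_len: int = 512, mem_len: int = 384
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (memory, segment) pairs over a long token sequence.

    Hypothetical helper for illustration: each segment holds at most
    `seg_len` tokens, and `memory` holds the last `mem_len` tokens seen
    before it, standing in for cached hidden states.
    """
    memory: List[int] = []
    for start in range(0, len(tokens), seg_len):
        segment = tokens[start:start + seg_len]
        yield memory, segment
        # Carry the most recent positions forward as the next memory.
        memory = (memory + segment)[-mem_len:] if mem_len > 0 else []


if __name__ == "__main__":
    tokens = list(range(1200))  # stand-in for token ids
    for memory, segment in iter_segments(tokens):
        print(f"segment of {len(segment)} tokens, memory of {len(memory)} tokens")
```

Under these assumptions, a fixed segment length bounds how many tokens are processed at once, while the carried memory is what lets context extend past a single segment.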