-
Notifications
You must be signed in to change notification settings - Fork 244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add GPT-2 Model and its Variants #354
Conversation
This looks great! Will take a deeper look, but you can do two things for now
|
Edit: Here is the updated notebook: https://colab.research.google.com/drive/1vI3dDmFEFOQM0GbDBfVf6sanATjkwV9X?usp=sharing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall lgtm, just a few minor comments.
keras_nlp/models/gpt2.py
Outdated
|
||
|
||
class Gpt2Custom(keras.Model): | ||
"""Generative Pretrained Transformer-2 (GPT-2) network. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should get out of the habit of listing the full model name in a first line. That should be short, scannable, and informative.
GPT-2 core network with custom hyperparmeters.
or something like that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool! I have shifted the full name below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work, thank you!
One request: please move your style changes of other files to a separate PR.
Also just a heads up I will be pushing a name change cleanup that will probably require updating any checkpoints conversion scripts using |
@mattdangerw, updated: https://colab.research.google.com/drive/1vI3dDmFEFOQM0GbDBfVf6sanATjkwV9X?usp=sharing! |
Please hold off from merging this; I'll validate ckpt conversion for all 4 sizes |
Good to go! :) |
Partially resolves #337