Question about GODEL_XL (GPT-J) model size #19
Comments
The parameter count isn't a very reliable statistic of a model's capability. With newer models that exploit sparsely connected networks and model distillation, one can drastically reduce the number of parameters while improving the model's speed and performance (i.e., faster and better, with fewer parameters).
Also, models that exploit knowledge retrieval can run circles around large language models.
@meatflavourdev thanks for your responses. While I don't dispute anything you said, it doesn't address my question. To be clear, here is my issue:
My guess is that the number of parameters listed in the table in the README of this repo for the GODEL_XL model is a typo and it should say "6B" instead of "2.7B". This is a relatively minor point, but I was hoping that one of the authors could confirm just to be sure.
First of all, thank you for making this work public!
I'm curious about the model size shown in the README for the released GODEL_XL model (based on GPT-J). The table in the README lists the model size as "2.7B", but my understanding is that GPT-J has 6B parameters.
Is the number of parameters for GODEL XL listed in the README correct?
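For reference, a back-of-the-envelope count from GPT-J's published dimensions (d_model=4096, 28 layers, vocabulary 50400, untied output head) lands near 6B rather than 2.7B. This is only a rough sketch: it ignores biases, layernorm weights, and other small terms, so it slightly undercounts.

```python
# Rough parameter estimate for GPT-J from its published config.
# Biases and layernorm weights are ignored (a small undercount).
d_model = 4096
n_layers = 28
vocab = 50400

attn = 4 * d_model ** 2                 # q, k, v, and output projections
mlp = 2 * d_model * (4 * d_model)       # up- and down-projections
embeddings = vocab * d_model            # token embedding matrix
lm_head = vocab * d_model               # GPT-J's output head is untied

total = n_layers * (attn + mlp) + embeddings + lm_head
print(f"{total / 1e9:.2f}B")            # prints "6.05B"
```

So a "2.7B" entry would be surprising for a GPT-J-based model; 2.7B is closer to the size of GPT-Neo-2.7B.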