Support GPT-J #154

Open
lorenzosinisi opened this issue Jan 29, 2023 · 6 comments
Labels
kind:feature New feature or request

Comments

@lorenzosinisi

Are there any plans to add support for togethercomputer/GPT-JT (https://huggingface.co/spaces/togethercomputer/GPT-JT)?

It seems like the closest alternative to GPT-3. What do you think? I would love to help, but I don't know where to start.

@josevalim
Contributor

The above is an app. Do you have a link to the model implementation and params from HF? :)

@lorenzosinisi
Author

lorenzosinisi commented Jan 30, 2023

Oh sorry! Yes, this is the model: https://huggingface.co/togethercomputer/GPT-JT-6B-v1. Does it share similarities with other models that I could try to implement or even use directly? I have no idea where to start, but I am willing to help.

@josevalim
Contributor

Ah, nice! We support GPT-2, so maybe that can be used as a building block? Or at least you can compare the Python implementation of GPT-2 with our GPT-2 and then do the same to implement your own. :)

@lorenzosinisi
Author

Thanks! I will give it a try :) Is there any MR I can look at for reference?

BTW, this model already knows some Elixir:

Screenshot 2023-01-30 at 10 10 49


@jonatanklosko
Member

The reference hf/transformers implementation of GPT-J is here. The implementation should be, for the most part, similar to any other text model we have, like GPT-2. From a brief look, I think we may need to adjust or extend our attention implementation to support rotary position embedding, but it's fine to modify the current code as necessary; we can find the best way to make it configurable later.
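
For anyone new to rotary position embedding, here is a minimal sketch of the idea in plain Python. It is not taken from Bumblebee or from hf/transformers; the function names `rotary_embed` and `dot` are made up for illustration, and the choices of rotating consecutive feature pairs, using a base of 10000, and rotating only the first `rotary_dim` features are assumptions about how GPT-J-style models apply it. The sketch rotates each pair of features in a query or key vector by an angle proportional to the token position, so that the query-key dot product depends only on the relative distance between the two positions.

```python
import math


def rotary_embed(vec, position, rotary_dim=None, base=10000.0):
    """Rotate consecutive pairs (vec[2i], vec[2i+1]) by position-dependent angles.

    Only the first `rotary_dim` features are rotated; the rest pass through
    unchanged. Pairing and base are assumptions made for this sketch.
    """
    dim = len(vec)
    rotary_dim = dim if rotary_dim is None else rotary_dim
    if rotary_dim % 2 != 0 or rotary_dim > dim:
        raise ValueError("rotary_dim must be even and at most len(vec)")

    out = list(vec)
    for i in range(rotary_dim // 2):
        # Each pair gets its own frequency; lower pairs rotate faster.
        theta = position * base ** (-2.0 * i / rotary_dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[2 * i], vec[2 * i + 1]
        # Standard 2D rotation of the pair (x, y) by angle theta.
        out[2 * i] = x * cos_t - y * sin_t
        out[2 * i + 1] = x * sin_t + y * cos_t
    return out


def dot(a, b):
    """Plain dot product, standing in for one attention score."""
    return sum(x * y for x, y in zip(a, b))


query = [0.5, -1.0, 2.0, 0.25, 1.5, -0.75]
key = [1.0, 0.5, -0.5, 2.0, 0.1, 0.3]

# Positions (5, 3) and (2, 0) have the same offset, so the scores match.
score_a = dot(rotary_embed(query, 5, rotary_dim=4), rotary_embed(key, 3, rotary_dim=4))
score_b = dot(rotary_embed(query, 2, rotary_dim=4), rotary_embed(key, 0, rotary_dim=4))
```

In an attention layer, this rotation would be applied to the query and key of every head before computing the scores, in place of adding a learned position embedding to the input as GPT-2 does.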

@lorenzosinisi
Author

Working on this, but it is going to take a while because I am new to transformers.

@jonatanklosko changed the title from "Support for togethercomputer/GPT-JT ?" to "Support GPT-J" Mar 31, 2023
@jonatanklosko added the kind:feature (New feature or request) label Mar 31, 2023