
Converting a Gemma MaxText-compatible checkpoint to Hugging Face format #829

Open · salrowili opened this issue Aug 16, 2024 · 3 comments


salrowili commented Aug 16, 2024

I have looked around for a script that converts MaxText Gemma and Gemma 2 checkpoints to Hugging Face format, but I have not found one. This may be related to #581.

salrowili (Author) commented:

Any update on this?

gobbleturk (Collaborator) commented:

Indeed, #581 was adding support for this. Out of curiosity, what is your use case for this?

@salrowili
Copy link
Author

salrowili commented Sep 18, 2024

Hi @gobbleturk,
#581 does not work for Gemma 2 because Gemma 2 interleaves local (sliding-window) and global attention; I think each of the q, k, and v attention weights has a local layer followed by a global one. My use case is that I did continual pre-training of the Gemma 2 2B model on a monolingual pre-training dataset, and I now want to use the HF SFT trainer for supervised fine-tuning.
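For illustration, here is a minimal NumPy sketch of the interleaving a converter would need, assuming MaxText stacks each pair's local and global attention weights along a leading layer axis. All array names, shapes, and checkpoint keys below are hypothetical placeholders for illustration, not MaxText's actual pytree layout:

```python
import numpy as np

# Hypothetical layout: local/global query kernels stacked per pair,
# shaped [pair, embed_dim, num_heads, head_dim]. Gemma 2 2B is commonly
# described as 26 decoder layers, hidden size 2304, and 8 query heads of
# head_dim 256 -- treat these numbers as assumptions to verify.
num_pairs = 13  # 26 layers -> 13 (local, global) pairs
local_q = np.zeros((num_pairs, 2304, 8, 256))
global_q = np.zeros((num_pairs, 2304, 8, 256))

hf_state = {}
for pair in range(num_pairs):
    for offset, stacked in ((0, local_q), (1, global_q)):
        # Assumes local attention lands on even HF layer indices,
        # global on odd; confirm the order against the checkpoint.
        layer_idx = 2 * pair + offset
        # HF stores q_proj.weight as [num_heads * head_dim, hidden]:
        # fold the head axes together and transpose.
        kernel = stacked[pair].reshape(2304, 8 * 256).T
        hf_state[f"model.layers.{layer_idx}.self_attn.q_proj.weight"] = kernel
```

The same interleaving would apply to the k, v, and output projections (with the k/v head count adjusted for grouped-query attention); the local-first vs. global-first ordering has to be verified against the checkpoint itself.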

Thank you though for taking care of this.
