
Converting a Gemma MaxText-compatible checkpoint to Hugging Face format #829

Open · salrowili opened this issue Aug 16, 2024 · 3 comments


salrowili commented Aug 16, 2024

I have looked around for a script that converts MaxText Gemma and Gemma 2 checkpoints to Hugging Face format, but I have not found one. This may be related to #581.

salrowili (Author) commented:

Any update on this?

gobbleturk (Collaborator) commented:

Indeed, #581 was adding support for this. Out of curiosity, what is your use case for this?

@salrowili
Copy link
Author

salrowili commented Sep 18, 2024

Hi @gobbleturk,
#581 does not work for Gemma 2 because Gemma 2 interleaves local (sliding-window) and global attention; I think each of the q, k, and v attention weights has a local layer followed by a global one. My use case is that I did continual pre-training of the Gemma 2 2B model on a monolingual pre-training dataset, and I now want to use the HF SFT trainer for supervised fine-tuning.
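For illustration, here is a minimal NumPy sketch of the interleaving a converter would need, assuming MaxText stacks each pair's local and global attention weights along a leading layer axis. All array names, shapes, and checkpoint keys below are hypothetical placeholders for illustration, not MaxText's actual pytree layout:

```python
import numpy as np

# Hypothetical layout: local/global query kernels stacked per pair,
# shaped [pair, embed_dim, num_heads, head_dim]. Gemma 2 2B is commonly
# described as 26 decoder layers, hidden size 2304, and 8 query heads of
# head_dim 256 -- treat these numbers as assumptions to verify.
num_pairs = 13  # 26 layers -> 13 (local, global) pairs
local_q = np.zeros((num_pairs, 2304, 8, 256))
global_q = np.zeros((num_pairs, 2304, 8, 256))

hf_state = {}
for pair in range(num_pairs):
    for offset, stacked in ((0, local_q), (1, global_q)):
        # Assumes local attention lands on even HF layer indices,
        # global on odd; confirm the order against the checkpoint.
        layer_idx = 2 * pair + offset
        # HF stores q_proj.weight as [num_heads * head_dim, hidden]:
        # fold the head axes together and transpose.
        kernel = stacked[pair].reshape(2304, 8 * 256).T
        hf_state[f"model.layers.{layer_idx}.self_attn.q_proj.weight"] = kernel
```

The same interleaving would apply to the k, v, and output projections (with the k/v head count adjusted for grouped-query attention); the local-first vs. global-first ordering has to be verified against the checkpoint itself.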

Thank you though for taking care of this.
