💎🌍🇮🇹 Gemma Neogenesis

Improving Gemma 2 for a Specific Language on a Budget: Post-Training Recipe

Additional resources for Gemma Neogenesis, a 📓 Kaggle notebook for improving Gemma 2 for a specific language on a budget. The notebook participates to the Kaggle competition: Google - Unlock Global Communication with Gemma.

Notebook intro

The notebook demonstrates a case study on improving Gemma 2 2B's performance in Italian through Post-Training, combining Supervised Fine Tuning and Preference Tuning. The process uses both existing datasets and synthetic data generated specifically for this competition. While focused on Italian, the cost-effective methods demonstrated can inspire similar fine-tuning approaches for other languages.

👣 Navigating this repository

📝 Evaluation Prompts: prompts for evaluating the quality of translated instructions and responses, using an LLM as a Judge, in the context of LLM-aided translation.
👁️ Qualitative Evaluation/Vibe Checking: qualitative evaluation of the model, compared to gemma-2-2b-it on about10 varied questions/tasks.
🌐⚙️ Scale Translation: code for scaling LLM-aided-translation.
🎯 Spectrum results: results of the Signal to Noise Ratio analysis done with Spectrum.
📚 References: curated collection of resources and references used in the notebook.
🖼️ Images.

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
evaluation_prompts		evaluation_prompts
images		images
qualitative_evaluation		qualitative_evaluation
scale_translation		scale_translation
spectrum_results		spectrum_results
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
references.md		references.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

💎🌍🇮🇹 Gemma Neogenesis

Notebook intro

👣 Navigating this repository

About

Languages

License

anakin87/gemma-neogenesis

Folders and files

Latest commit

History

Repository files navigation

💎🌍🇮🇹 Gemma Neogenesis

Notebook intro

👣 Navigating this repository

About

Resources

License

Stars

Watchers

Forks

Languages