Instruction Tuning

This module will guide you through instruction tuning language models. Instruction tuning involves adapting pre-trained models to specific tasks by further training them on task-specific datasets. This process helps models improve their performance on targeted tasks.

In this module, we will explore two topics: 1) Chat Templates and 2) Supervised Fine-Tuning.

1️⃣ Chat Templates

Chat templates structure interactions between users and AI models, ensuring consistent and contextually appropriate responses. They include components like system prompts and role-based messages. For more detailed information, refer to the Chat Templates section.

2️⃣ Supervised Fine-Tuning

Supervised Fine-Tuning (SFT) is a critical process for adapting pre-trained language models to specific tasks. It involves training the model on a task-specific dataset with labeled examples. For a detailed guide on SFT, including key steps and best practices, see the Supervised Fine-Tuning page.

Exercise Notebooks

Title	Description	Exercise	Link	Colab
Chat Templates	Learn how to use chat templates with SmolLM2 and process datasets into chatml format	🐢 Convert the `HuggingFaceTB/smoltalk` dataset into chatml format 🐕 Convert the `openai/gsm8k` dataset into chatml format	Notebook
Supervised Fine-Tuning	Learn how to fine-tune SmolLM2 using the SFTTrainer	🐢 Use the `HuggingFaceTB/smoltalk` dataset 🐕 Try out the `bigcode/the-stack-smol` dataset 🦁 Select a dataset for a real world use case	Notebook

References

Transformers documentation on chat templates
Script for Supervised Fine-Tuning in TRL
SFTTrainer in TRL
Direct Preference Optimization Paper
Supervised Fine-Tuning with TRL
How to fine-tune Google Gemma with ChatML and Hugging Face TRL
Fine-tuning LLM to Generate Persian Product Catalogs in JSON Format

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Instruction Tuning

1️⃣ Chat Templates

2️⃣ Supervised Fine-Tuning

Exercise Notebooks

References

Files

README.md

Latest commit

History

README.md

File metadata and controls

Instruction Tuning

1️⃣ Chat Templates

2️⃣ Supervised Fine-Tuning

Exercise Notebooks

References