Merge pull request #4 from djliden/dl/reorganize
Dl/reorganize
Showing 20 changed files with 1,364 additions and 27 deletions.
@@ -1,20 +1,24 @@
-# Fine-Tuning LLMs
+# Jupyter Notebook Examples
 
-*View these notebooks in a more readable format at [danliden.com/fine-tuning](https://danliden.com/fine-tuning).*
+*View these notebooks in a more readable format at [danliden.com/notebooks](https://danliden.com/notebooks).*
 
-This series of notebooks is intended to show how to fine-tune language models, starting from smaller models on single-node single-GPU setups and gradually scaling up to multi-GPU and multi-node configurations.
+This repository contains a collection of Jupyter notebooks demonstrating various concepts and techniques across different fields. Currently, it includes a series on fine-tuning language models, but it will expand to cover other topics in the future.
+
+## Fine-Tuning LLMs
+
+The fine-tuning section shows how to fine-tune language models, starting from smaller models on single-node single-GPU setups and gradually scaling up to multi-GPU and multi-node configurations.
 
 Existing examples and learning resources generally do not bridge the practical gap between single-node single-GPU training when all parameters fit in VRAM, and the various forms of distributed training. These examples, when complete, are intended to show how to train smaller models given sufficient compute resources and then scale the models up until we encounter compute and/or memory constraints. We will then introduce various distributed training approaches aimed at overcoming these issues.
 
 This will, hopefully, serve as a practical and conceptual bridge from single-node single-GPU training to distributed training with tools such as deepspeed and FSDP.
 
 ## How to use this repository
 
-The examples in this repository are intended to be read sequentially. Later examples build on earlier examples and gradually add scale and complexity.
+The examples in this repository are organized by topic. Within each topic, the notebooks are intended to be read sequentially. Later examples often build on earlier examples and gradually add complexity.
 
 ## Contributing
 
 Contributions are welcome, and there are a few different ways to get involved.
-- If you see an error or bug, please [open an issue](https://github.com/djliden/fine-tuning/issues/new) or open a PR.
-- If you have a question about this repository, or you want to request a specific example, please [open an issue](https://github.com/djliden/fine-tuning/issues/new).
-- If you're interested in contributing an example, I encourage you to get in touch. You can [open an issue](https://github.com/djliden/fine-tuning/issues/new) or reach out by email or social media.
+- If you see an error or bug, please [open an issue](https://github.com/djliden/notebooks/issues/new) or open a PR.
+- If you have a question about this repository, or you want to request a specific example, please [open an issue](https://github.com/djliden/notebooks/issues/new).
+- If you're interested in contributing an example, I encourage you to get in touch. You can [open an issue](https://github.com/djliden/notebooks/issues/new) or reach out by email or social media.
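
To make the starting point described in the README concrete, here is a minimal sketch of the kind of single-node, single-GPU fine-tuning run the series begins with, using the Hugging Face Trainer. It is not taken from the repository's notebooks; the model, dataset, and hyperparameters are illustrative assumptions.

```python
# Minimal single-GPU fine-tuning sketch (illustrative; not from this repository).
# Assumes the `transformers` and `datasets` packages and a GPU with enough VRAM
# to hold a small causal language model plus its gradients and optimizer state.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "distilgpt2"  # assumption: any small causal LM that fits on one GPU
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-style tokenizers have no pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Assumption: a small public dataset stands in for whatever corpus you care about.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="finetune-demo",
        per_device_train_batch_size=8,
        num_train_epochs=1,
        logging_steps=50,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Later sections swap in larger models and more data until a setup like this runs out of memory, which is where the distributed approaches come in.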
@@ -0,0 +1,6 @@
+# Appendix
+
+This appendix contains a collection of miscellaneous resources and supplementary materials that complement the main sections of the AI training materials.
+
+```{tableofcontents}
+```
9 files renamed without changes.
@@ -0,0 +1,12 @@
+# LLM Training/Fine-tuning
+
+Welcome to the AI Training module! This notebook serves as an introduction to the fundamental concepts and techniques used in training artificial intelligence models.
+
+In this module, we'll explore various aspects of AI training, including:
+- Data preparation and preprocessing
+- Model selection and architecture
+- Training algorithms and optimization techniques
+- Evaluation metrics and performance assessment
+
+```{tableofcontents}
+```
@@ -1,14 +1,19 @@
-# Fine-Tuning LLMs
+# Guides and Examples
 
-This series of notebooks is intended to show how to fine-tune language models, starting from smaller models on single-node single-GPU setups and gradually scaling up to multi-GPU and multi-node configurations.
+```{attention} This site was previously dedicated solely to fine-tuning LLMs. I have since expanded the scope to include other topics. The process of converting the site is still in progress, so you might encounter broken links or other issues. Let me know if you do! You can submit an issue with the GitHub button on the top right.
+```
 
+This repository contains a collection of Jupyter notebooks demonstrating various concepts and techniques across different fields. Currently, it includes a series on fine-tuning language models, but it will expand to cover other topics in the future.
+
+## AI Training: Fine-Tuning LLMs
+
-Existing examples and learning resources generally do not bridge the practical gap between single-node single-GPU training when all parameters fit in VRAM, and the various forms of distributed training. These examples, when complete, are intended to show how to train smaller models given sufficient compute resources and then scale the models up until we encounter compute and/or memory constraints. We will then introduce various distributed training approaches aimed at overcoming these issues.
+The AI Training section currently focuses on fine-tuning language models. It shows how to fine-tune models starting from smaller, single-GPU setups and gradually scaling up to multi-GPU and multi-node configurations.
 
-This will, hopefully, serve as a practical and conceptual bridge from single-node single-GPU training to distributed training with tools such as deepspeed and FSDP.
+These examples aim to bridge the gap between single-node single-GPU training and various forms of distributed training, serving as a practical and conceptual guide for scaling up model training.
 
 ## How to use this repository
 
-The examples in this repository are intended to be read sequentially. Later examples build on earlier examples and gradually add scale and complexity.
+The examples in this repository are organized by topic. Within each topic, the notebooks are intended to be read sequentially. Later examples often build on earlier examples and gradually add complexity.
 
 ```{tableofcontents}
 ```
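
The claim that larger models force a move from a single GPU to distributed training follows from simple arithmetic. The sketch below uses the standard mixed-precision Adam estimate of roughly 16 bytes of state per parameter; the parameter counts are illustrative assumptions, not measurements from these notebooks.

```python
# Back-of-the-envelope training-memory estimate (standard mixed-precision Adam
# accounting; the parameter counts below are illustrative assumptions).
def training_state_gb(n_params: float) -> float:
    bytes_per_param = (
        2    # fp16/bf16 weights
        + 2  # fp16/bf16 gradients
        + 4  # fp32 master copy of the weights
        + 8  # two fp32 Adam moments
    )
    return n_params * bytes_per_param / 1e9

for n_params in (125e6, 1.3e9, 7e9):
    print(f"{n_params / 1e9:>5.2f}B params -> ~{training_state_gb(n_params):.0f} GB of state, before activations")
```

A 7B-parameter model already needs on the order of 112 GB for weights, gradients, and optimizer state alone, which is why the later examples turn to sharding approaches such as FSDP and DeepSpeed rather than a single GPU.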
Binary file not shown.
@@ -0,0 +1,9 @@
+[project]
+name = "fine-tuning"
+version = "0.1.0"
+description = "Add your description here"
+readme = "README.md"
+requires-python = ">=3.12"
+dependencies = [
+    "jupyter-book>=1.0.2",
+]