Tiny-DeepSpeed

Welcome to Tiny-DeepSpeed, a minimalistic re-implementation of the DeepSpeed library. This project provides a simple, easy-to-follow codebase that helps learners and developers understand the core functionality of DeepSpeed, a powerful library for accelerating deep learning training.

Give us a ⭐ if this GitHub repo helps you.

If you have any questions, please feel free to contact us: open an issue or send an email to [email protected].

This project is heavily inspired by CoreScheduler, a high-performance scheduler for large model training.

Getting Started

Prerequisites

Before you begin, ensure you have the following installed (a quick version check follows the list):

  • Python 3.11
  • PyTorch (CUDA) 2.3.1
  • triton 2.3.1
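
To verify that your environment matches these versions, you can run a short check like the one below (an optional sketch, not part of the repository):

import torch
import triton

print("PyTorch:", torch.__version__)                # expected: 2.3.1
print("Triton:", triton.__version__)                # expected: 2.3.1
print("CUDA available:", torch.cuda.is_available()) # should be True for GPU runs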

Installation

Clone this repository to your local machine:

git clone https://github.com/liangyuwang/Tiny-DeepSpeed.git
cd Tiny-DeepSpeed

Running the Demo

To run the Tiny-DeepSpeed demo, use one of the following commands (replace num_device with the number of devices on your machine):

# Single Device
python example/single_device/train.py

# DDP mode
torchrun --nproc_per_node num_device --nnodes 1 example/ddp/train.py

# Zero1 mode
torchrun --nproc_per_node num_device --nnodes 1 example/zero1/train.py

# Zero2 mode
torchrun --nproc_per_node num_device --nnodes 1 example/zero2/train.py

This will initiate a simple training loop using the Tiny-DeepSpeed framework.
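
For orientation, the sketch below shows what a minimal DDP-style training loop launched with torchrun typically looks like in plain PyTorch. It is an illustrative assumption about the general pattern, not the actual code in example/ddp/train.py, and the placeholder model and dummy loss are hypothetical.

import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each process
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(1024, 1024).cuda(local_rank)  # placeholder model
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

    for step in range(10):
        x = torch.randn(32, 1024, device=local_rank)
        loss = model(x).pow(2).mean()   # dummy loss for illustration
        loss.backward()                 # DDP all-reduces gradients here
        optimizer.step()
        optimizer.zero_grad()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()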

Feel free to try our demo online in the Kaggle Notebook.

Features

  • Simplified Codebase: Stripped down to the essential components to facilitate learning and experimentation with DeepSpeed.
  • Meta Device Model Initialization: Loads model parameters on the meta device, avoiding actual parameter initialization and reducing initial memory usage.
  • Parameter Distribution via Cache Rank Map: Implements a cache rank map table to distribute model parameters across ranks. Each parameter is assigned a rank ID based on the number of participants, allowing for efficient and targeted initialization (see the sketch after this list).
  • Scalability and Flexibility: Demonstrates basic principles of distributed training and parameter management that can be scaled up for more complex implementations.
  • Educational Tool: Serves as a practical guide for those new to model optimization and distributed computing in machine learning.
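
To make the meta-device and cache-rank-map ideas above concrete, here is a minimal sketch. The names (world_size, rank_map) and the round-robin assignment are hypothetical; this illustrates the general technique, not the repository's actual API.

import torch
import torch.nn as nn

world_size = 4  # number of participating ranks (assumed)

# Build the model on the meta device: tensors carry shapes and dtypes but no
# storage, so no real memory is allocated and no initialization work is done yet.
with torch.device("meta"):
    model = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024))

# Hypothetical "cache rank map": assign each parameter an owning rank, e.g. round-robin.
rank_map = {name: i % world_size for i, (name, _) in enumerate(model.named_parameters())}

# Each rank would then materialize and initialize only the parameters it owns,
# e.g. by allocating real storage on its own GPU for those entries.
print(rank_map)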

TODO:

  • Single Device
  • DDP
  • Zero1
  • Zero2
  • Zero3
  • AMP support
  • Compute-communication overlap
  • Meta initialization
  • Multi nodes
  • Communication Bucket
