
Commit 338a466: init

juncongmoo committed Feb 23, 2023
1 parent 743eaa6 commit 338a466
Showing 54 changed files with 63 additions and 73 deletions.
34 changes: 10 additions & 24 deletions README.md
@@ -2,22 +2,8 @@

+> To Train ChatGPT In 5 Minutes
Implementation of RLHF (Reinforcement Learning with Human Feedback) powered by Colossal-AI. It supports distributed training and offloading, which can fit extremely large models. More details can be found in the [blog](https://www.hpc-ai.tech/blog/colossal-ai-chatgpt).

-<p align="center">
-<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/chatgpt.png" width=700/>
-</p>
-
-## Training process (step 3)
-<p align="center">
-<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/experience.jpg" width=500/>
-</p>
-<p align="center">
-<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/train.jpg" width=500/>
-</p>


## Install

```shell
pip install minichatgpt
```
@@ -34,9 +20,9 @@ The main entrypoint is `Trainer`. We only support PPO trainer now. We support many training strategies:
Simplest usage:

```python
-from chatgpt.trainer import PPOTrainer
-from chatgpt.trainer.strategies import ColossalAIStrategy
-from chatgpt.nn import GPTActor, GPTCritic, RewardModel
+from minichatgpt.trainer import PPOTrainer
+from minichatgpt.trainer.strategies import ColossalAIStrategy
+from minichatgpt.nn import GPTActor, GPTCritic, RewardModel
from copy import deepcopy
from colossalai.nn.optimizer import HybridAdam

```

@@ -140,7 +126,7 @@ strategy.load_optimizer(actor_optim, 'actor_optim_checkpoint.pt')
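The diff view collapses the rest of this example. For orientation, here is a minimal end-to-end sketch of how these imports plausibly fit together; the `PPOTrainer` constructor arguments, the `fit` signature, and the `model_init_context` usage are assumptions inferred from the imports and the `strategy.load_optimizer` call in the hunk header above, not confirmed API:

```python
import torch
from copy import deepcopy
from colossalai.nn.optimizer import HybridAdam
from minichatgpt.nn import GPTActor, GPTCritic, RewardModel
from minichatgpt.trainer import PPOTrainer
from minichatgpt.trainer.strategies import ColossalAIStrategy

strategy = ColossalAIStrategy(stage=3, placement_policy='cuda')

with strategy.model_init_context():          # assumed context manager
    actor = GPTActor().cuda()
    critic = GPTCritic().cuda()
    initial_model = deepcopy(actor)          # frozen reference model for the KL penalty
    reward_model = RewardModel(deepcopy(critic.model)).cuda()

actor_optim = HybridAdam(actor.parameters(), lr=5e-6)
critic_optim = HybridAdam(critic.parameters(), lr=5e-6)

# Hypothetical constructor: argument names are guesses from the imports above.
trainer = PPOTrainer(strategy,
                     actor, critic, reward_model, initial_model,
                     actor_optim, critic_optim,
                     max_epochs=1, train_batch_size=8)

random_prompts = torch.randint(0, 50257, (64, 64), device=torch.cuda.current_device())
trainer.fit(random_prompts, num_episodes=10, max_timesteps=3, update_timesteps=3)

# Checkpointing goes through the strategy, matching the hunk header above.
strategy.save_model(actor, 'actor_checkpoint.pt')          # assumed counterpart to load_*
strategy.load_optimizer(actor_optim, 'actor_optim_checkpoint.pt')
```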
- [ ] support more RL paradigms, like Implicit Language Q-Learning (ILQL)

## Invitation to open-source contribution
-Referring to the successful attempts of [BLOOM](https://bigscience.huggingface.co/) and [Stable Diffusion](https://en.wikipedia.org/wiki/Stable_Diffusion), any and all developers and partners with computing powers, datasets, models are welcome to join and build an ecosystem with Colossal-AI, making efforts towards the era of big AI models from the starting point of replicating ChatGPT!
+Referring to the successful attempts of [BLOOM](https://bigscience.huggingface.co/) and [Stable Diffusion](https://en.wikipedia.org/wiki/Stable_Diffusion), any and all developers and partners with computing powers, datasets, models are welcome to join and build an ecosystem with Colossal-AI, making efforts towards the era of big AI models from the starting point of replicating minichatgpt!

You may contact us or participate in the following ways:
1. Posting an [issue](https://github.com/hpcaitech/ColossalAI/issues/new/choose) or submitting a [PR](https://github.com/hpcaitech/ColossalAI/pulls) on GitHub
@@ -153,21 +139,21 @@ and [WeChat](https://raw.githubusercontent.com/hpcaitech/public_assets/main/colo
Thanks so much to all of our amazing contributors!

## Quick Preview
<p id="ChatGPT_scaling" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/ChatGPT%20scaling.png" width=800/>
<p id="minichatgpt_scaling" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/minichatgpt/minichatgpt%20scaling.png" width=800/>
</p>

- Up to 7.73 times faster for single server training and 1.42 times faster for single-GPU inference

<p id="ChatGPT-1GPU" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/ChatGPT-1GPU.jpg" width=450/>
<p id="minichatgpt-1GPU" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/minichatgpt/minichatgpt-1GPU.jpg" width=450/>
</p>

- Up to 10.3x growth in model capacity on one GPU
- A mini demo training process requires only 1.62GB of GPU memory (any consumer-grade GPU)

<p id="inference" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/LoRA%20data.jpg" width=600/>
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/minichatgpt/LoRA%20data.jpg" width=600/>
</p>

- Increase the capacity of the fine-tuning model by up to 3.7 times on a single GPU
8 changes: 4 additions & 4 deletions benchmarks/benchmark_gpt_dummy.py
@@ -4,10 +4,10 @@
import torch
import torch.distributed as dist
import torch.nn as nn
-from chatgpt.nn import GPTActor, GPTCritic, RewardModel
-from chatgpt.trainer import PPOTrainer
-from chatgpt.trainer.callbacks import PerformanceEvaluator
-from chatgpt.trainer.strategies import ColossalAIStrategy, DDPStrategy, Strategy
+from minichatgpt.nn import GPTActor, GPTCritic, RewardModel
+from minichatgpt.trainer import PPOTrainer
+from minichatgpt.trainer.callbacks import PerformanceEvaluator
+from minichatgpt.trainer.strategies import ColossalAIStrategy, DDPStrategy, Strategy
from torch.optim import Adam
from transformers.models.gpt2.configuration_gpt2 import GPT2Config
from transformers.models.gpt2.tokenization_gpt2 import GPT2Tokenizer
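This benchmark attaches a `PerformanceEvaluator` callback to the trainer. To illustrate that callback seam, here is a hypothetical minimal callback of the same kind; the hook names and the `Callback` import path are assumptions modeled on the `trainer/callbacks/base.py` hunk later in this diff:

```python
import time
from minichatgpt.trainer.callbacks import Callback  # assumed export path


class ThroughputTimer(Callback):
    """Hypothetical callback that times each learning step."""

    def __init__(self):
        self.durations = []
        self._start = 0.0

    def on_learn_batch_start(self) -> None:  # assumed hook name
        self._start = time.time()

    def on_learn_batch_end(self, metrics, experience) -> None:  # assumed hook name
        self.durations.append(time.time() - self._start)

    def mean_duration(self) -> float:
        return sum(self.durations) / max(len(self.durations), 1)
```

A callback list would then presumably be handed to the trainer, e.g. `PPOTrainer(..., callbacks=[ThroughputTimer()])`, which is how `PerformanceEvaluator` appears to be used here.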
8 changes: 4 additions & 4 deletions benchmarks/benchmark_opt_lora_dummy.py
@@ -4,10 +4,10 @@
import torch
import torch.distributed as dist
import torch.nn as nn
-from chatgpt.nn import OPTActor, OPTCritic, RewardModel
-from chatgpt.trainer import PPOTrainer
-from chatgpt.trainer.callbacks import PerformanceEvaluator
-from chatgpt.trainer.strategies import ColossalAIStrategy, DDPStrategy, Strategy
+from minichatgpt.nn import OPTActor, OPTCritic, RewardModel
+from minichatgpt.trainer import PPOTrainer
+from minichatgpt.trainer.callbacks import PerformanceEvaluator
+from minichatgpt.trainer.strategies import ColossalAIStrategy, DDPStrategy, Strategy
from torch.optim import Adam
from transformers import AutoTokenizer
from transformers.models.opt.configuration_opt import OPTConfig
Empty file removed chatgpt/__init__.py
2 changes: 1 addition & 1 deletion examples/README.md
@@ -35,7 +35,7 @@ torchrun --standalone --nproc_per_node=2 train_dummy.py --strategy colossalai

## Train with real prompt data

-We use [awesome-chatgpt-prompts](https://huggingface.co/datasets/fka/awesome-chatgpt-prompts) as example dataset. It is a small dataset with hundreds of prompts.
+We use [awesome-minichatgpt-prompts](https://huggingface.co/datasets/fka/awesome-minichatgpt-prompts) as example dataset. It is a small dataset with hundreds of prompts.

You should download `prompts.csv` first.

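`train_prompts.py` imports pandas (see its hunk below), so the prompt-loading step is presumably a one-liner over the downloaded CSV. A sketch, in which the `prompt` column name is an assumption:

```python
import pandas as pd

# prompts.csv from the example dataset; the 'prompt' column name is an assumption.
prompts = pd.read_csv('prompts.csv')['prompt'].tolist()
print(f'loaded {len(prompts)} prompts, e.g. {prompts[0]!r}')
```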
6 changes: 3 additions & 3 deletions examples/train_dummy.py
@@ -2,9 +2,9 @@
from copy import deepcopy

import torch
-from chatgpt.nn import BLOOMActor, BLOOMCritic, GPTActor, GPTCritic, OPTActor, OPTCritic, RewardModel
-from chatgpt.trainer import PPOTrainer
-from chatgpt.trainer.strategies import ColossalAIStrategy, DDPStrategy, NaiveStrategy
+from minichatgpt.nn import BLOOMActor, BLOOMCritic, GPTActor, GPTCritic, OPTActor, OPTCritic, RewardModel
+from minichatgpt.trainer import PPOTrainer
+from minichatgpt.trainer.strategies import ColossalAIStrategy, DDPStrategy, NaiveStrategy
from torch.optim import Adam
from transformers import AutoTokenizer, BloomTokenizerFast
from transformers.models.gpt2.tokenization_gpt2 import GPT2Tokenizer
6 changes: 3 additions & 3 deletions examples/train_prompts.py
@@ -3,9 +3,9 @@

import pandas as pd
import torch
-from chatgpt.nn import BLOOMActor, BLOOMCritic, GPTActor, GPTCritic, OPTActor, OPTCritic, RewardModel
-from chatgpt.trainer import PPOTrainer
-from chatgpt.trainer.strategies import ColossalAIStrategy, DDPStrategy, NaiveStrategy
+from minichatgpt.nn import BLOOMActor, BLOOMCritic, GPTActor, GPTCritic, OPTActor, OPTCritic, RewardModel
+from minichatgpt.trainer import PPOTrainer
+from minichatgpt.trainer.strategies import ColossalAIStrategy, DDPStrategy, NaiveStrategy
from torch.optim import Adam
from transformers import AutoTokenizer, BloomTokenizerFast
from transformers.models.gpt2.tokenization_gpt2 import GPT2Tokenizer
8 changes: 4 additions & 4 deletions examples/train_reward_model.py
@@ -2,10 +2,10 @@

import loralib as lora
import torch
-from chatgpt.dataset import RewardDataset
-from chatgpt.nn import BLOOMRM, GPTRM, OPTRM
-from chatgpt.trainer import RewardModelTrainer
-from chatgpt.trainer.strategies import ColossalAIStrategy, DDPStrategy, NaiveStrategy
+from minichatgpt.dataset import RewardDataset
+from minichatgpt.nn import BLOOMRM, GPTRM, OPTRM
+from minichatgpt.trainer import RewardModelTrainer
+from minichatgpt.trainer.strategies import ColossalAIStrategy, DDPStrategy, NaiveStrategy
from datasets import load_dataset
from torch.optim import Adam
from transformers import AutoTokenizer, BloomTokenizerFast
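`train_reward_model.py` wires a ranking model, `RewardDataset`, and `RewardModelTrainer` together. A hedged sketch of that flow; the dataset name, the constructor arguments, and the `fit` signature are assumptions inferred from the imports, not the script's confirmed contents:

```python
from datasets import load_dataset
from torch.optim import Adam
from transformers import BloomTokenizerFast

from minichatgpt.dataset import RewardDataset
from minichatgpt.nn import BLOOMRM
from minichatgpt.trainer import RewardModelTrainer
from minichatgpt.trainer.strategies import NaiveStrategy

strategy = NaiveStrategy()
model = BLOOMRM(pretrained='bigscience/bloom-560m').cuda()  # assumed kwarg
tokenizer = BloomTokenizerFast.from_pretrained('bigscience/bloom-560m')
optim = Adam(model.parameters(), lr=5e-6)

# 'Dahoas/rm-static' is an assumed pairwise-preference dataset with
# chosen/rejected text columns; substitute whatever the script actually uses.
data = load_dataset('Dahoas/rm-static')
train_ds = RewardDataset(data['train'], tokenizer, max_len=512)  # assumed args
eval_ds = RewardDataset(data['test'], tokenizer, max_len=512)

trainer = RewardModelTrainer(model=model, strategy=strategy, optim=optim,
                             train_dataset=train_ds, eval_dataset=eval_ds,
                             batch_size=4, max_epochs=1)
trainer.fit(use_lora=0)  # assumed signature; LoRA is optional via loralib
```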
3 changes: 3 additions & 0 deletions minichatgpt/__init__.py
@@ -0,0 +1,3 @@
+__version__='0.0.1'


File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -4,7 +4,7 @@

import torch
import torch.nn as nn
-from chatgpt.nn.actor import Actor
+from minichatgpt.nn.actor import Actor


@dataclass
@@ -1,5 +1,5 @@
import torch
-from chatgpt.nn.utils import compute_reward, normalize
+from minichatgpt.nn.utils import compute_reward, normalize

from .base import Experience, ExperienceMaker

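The naive experience maker imports `compute_reward` and `normalize` from `minichatgpt.nn.utils`. A common shape for such a helper in RLHF is to fold a KL penalty against the frozen reference model into the reward-model score; a self-contained sketch of that idea, not the library's confirmed implementation:

```python
import torch


def compute_reward_sketch(r: torch.Tensor,
                          kl_coef: float,
                          log_probs: torch.Tensor,
                          log_probs_base: torch.Tensor,
                          action_mask: torch.Tensor) -> torch.Tensor:
    """Hypothetical: reward-model score minus a per-sequence KL penalty.

    r:              (B,)   scalar scores from the reward model
    log_probs:      (B, T) actor log-probs of the generated tokens
    log_probs_base: (B, T) reference (initial) model log-probs
    action_mask:    (B, T) 1 for generated tokens, 0 for prompt/padding
    """
    # Approximate per-sequence KL(actor || reference) over generated tokens.
    kl = ((log_probs - log_probs_base) * action_mask).sum(dim=-1) / action_mask.sum(dim=-1)
    return r - kl_coef * kl
```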
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -1,7 +1,7 @@
from abc import ABC, abstractmethod
from typing import Any

-from chatgpt.experience_maker.base import Experience
+from minichatgpt.experience_maker.base import Experience


class ReplayBuffer(ABC):
@@ -2,7 +2,7 @@
from typing import List

import torch
-from chatgpt.experience_maker.base import Experience
+from minichatgpt.experience_maker.base import Experience

from .base import ReplayBuffer
from .utils import BufferItem, make_experience_batch, split_experience_batch
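The helper names `split_experience_batch` and `make_experience_batch` imported here suggest the usual replay pattern: a batched `Experience` is split into per-sample items for storage, then re-collated into training batches. A schematic sketch with simplified, assumed fields:

```python
import random
from dataclasses import dataclass
from typing import List

import torch


@dataclass
class Item:
    """Stand-in for BufferItem; the real fields almost certainly differ."""
    sequence: torch.Tensor
    reward: torch.Tensor


class NaiveBufferSketch:
    """Hypothetical in-memory buffer, one entry per rollout sample."""

    def __init__(self, sample_batch_size: int, limit: int = 0):
        self.sample_batch_size = sample_batch_size
        self.limit = limit  # 0 means unbounded
        self.items: List[Item] = []

    def append(self, sequences: torch.Tensor, rewards: torch.Tensor) -> None:
        # split_experience_batch: one item per row of the incoming batch
        self.items.extend(Item(s, r) for s, r in zip(sequences, rewards))
        if self.limit and len(self.items) > self.limit:
            self.items = self.items[-self.limit:]

    def sample(self) -> Item:
        # make_experience_batch: re-stack sampled items into a batch
        picked = random.sample(self.items, self.sample_batch_size)
        return Item(torch.stack([i.sequence for i in picked]),
                    torch.stack([i.reward for i in picked]))
```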
@@ -3,7 +3,7 @@

import torch
import torch.nn.functional as F
-from chatgpt.experience_maker.base import Experience
+from minichatgpt.experience_maker.base import Experience


@dataclass
File renamed without changes.
4 changes: 2 additions & 2 deletions chatgpt/trainer/base.py → minichatgpt/trainer/base.py
@@ -3,8 +3,8 @@
from typing import Any, Callable, Dict, List, Optional, Union

import torch
-from chatgpt.experience_maker import Experience, ExperienceMaker
-from chatgpt.replay_buffer import ReplayBuffer
+from minichatgpt.experience_maker import Experience, ExperienceMaker
+from minichatgpt.replay_buffer import ReplayBuffer
from torch import Tensor
from torch.utils.data import DistributedSampler
from tqdm import tqdm
File renamed without changes.
@@ -1,6 +1,6 @@
from abc import ABC

-from chatgpt.experience_maker import Experience
+from minichatgpt.experience_maker import Experience


class Callback(ABC):
@@ -3,7 +3,7 @@

import torch
import torch.distributed as dist
-from chatgpt.experience_maker import Experience
+from minichatgpt.experience_maker import Experience

from .base import Callback

8 changes: 4 additions & 4 deletions chatgpt/trainer/ppo.py → minichatgpt/trainer/ppo.py
@@ -1,10 +1,10 @@
from typing import Any, Callable, Dict, List, Optional

import torch.nn as nn
-from chatgpt.experience_maker import Experience, NaiveExperienceMaker
-from chatgpt.nn import Actor, Critic, PolicyLoss, ValueLoss
-from chatgpt.nn.generation_utils import update_model_kwargs_fn
-from chatgpt.replay_buffer import NaiveReplayBuffer
+from minichatgpt.experience_maker import Experience, NaiveExperienceMaker
+from minichatgpt.nn import Actor, Critic, PolicyLoss, ValueLoss
+from minichatgpt.nn.generation_utils import update_model_kwargs_fn
+from minichatgpt.replay_buffer import NaiveReplayBuffer
from torch.optim import Optimizer

from .base import Trainer
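`PolicyLoss` and `ValueLoss`, imported by the PPO trainer above, are PPO's two objectives. For reference, the standard clipped-surrogate forms they presumably implement (a sketch, not the library's verified code):

```python
import torch


def policy_loss_sketch(log_probs: torch.Tensor,
                       old_log_probs: torch.Tensor,
                       advantages: torch.Tensor,
                       clip_eps: float = 0.2) -> torch.Tensor:
    # Standard PPO clipped surrogate objective.
    ratio = (log_probs - old_log_probs).exp()
    surr1 = ratio * advantages
    surr2 = ratio.clamp(1 - clip_eps, 1 + clip_eps) * advantages
    return -torch.min(surr1, surr2).mean()


def value_loss_sketch(values: torch.Tensor,
                      old_values: torch.Tensor,
                      returns: torch.Tensor,
                      clip_eps: float = 0.4) -> torch.Tensor:
    # Clipping keeps the critic update close to its previous estimate.
    clipped = old_values + (values - old_values).clamp(-clip_eps, clip_eps)
    return torch.max((values - returns) ** 2, (clipped - returns) ** 2).mean()
```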
4 changes: 2 additions & 2 deletions chatgpt/trainer/rm.py → minichatgpt/trainer/rm.py
@@ -2,8 +2,8 @@

import loralib as lora
import torch
-from chatgpt.dataset import RewardDataset
-from chatgpt.nn import PairWiseLoss
+from minichatgpt.dataset import RewardDataset
+from minichatgpt.nn import PairWiseLoss
from torch.optim import Adam, Optimizer
from torch.utils.data import DataLoader
from tqdm import tqdm
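`PairWiseLoss`, imported by the reward-model trainer above, names the Bradley-Terry style ranking objective used for reward models: the chosen response should score above the rejected one. Its standard form (a sketch; the library's exact reduction may differ):

```python
import torch
import torch.nn.functional as F


def pairwise_loss_sketch(chosen_reward: torch.Tensor,
                         rejected_reward: torch.Tensor) -> torch.Tensor:
    # -log sigmoid(r_chosen - r_rejected), averaged over the batch.
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()
```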
File renamed without changes.
@@ -4,8 +4,8 @@

import torch
import torch.nn as nn
-from chatgpt.nn import Actor, Critic, RewardModel
-from chatgpt.replay_buffer import ReplayBuffer
+from minichatgpt.nn import Actor, Critic, RewardModel
+from minichatgpt.replay_buffer import ReplayBuffer
from torch.optim import Optimizer
from torch.utils.data import DataLoader

@@ -5,7 +5,7 @@
import torch.distributed as dist
import torch.nn as nn
import torch.optim as optim
-from chatgpt.nn import Actor
+from minichatgpt.nn import Actor
from torch.optim import Optimizer

import colossalai
@@ -5,8 +5,8 @@
import torch
import torch.distributed as dist
import torch.nn as nn
-from chatgpt.nn import Actor
-from chatgpt.replay_buffer import ReplayBuffer
+from minichatgpt.nn import Actor
+from minichatgpt.replay_buffer import ReplayBuffer
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.optim import Optimizer
from torch.utils.data import DataLoader, DistributedSampler
@@ -3,7 +3,7 @@
import torch
import torch.nn as nn
import torch.optim as optim
-from chatgpt.replay_buffer import ReplayBuffer
+from minichatgpt.replay_buffer import ReplayBuffer
from torch.optim import Optimizer
from torch.utils.data import DataLoader

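The three strategies (Naive, DDP, ColossalAI) share one seam: the trainer hands them models, optimizers, dataloaders, and backward/step calls, and each strategy decides how to distribute the work. A hypothetical minimal strategy showing the shape of that interface; every method name here is an assumption, not the library's confirmed API:

```python
import torch
from torch.optim import Optimizer
from torch.utils.data import DataLoader, Dataset


class NaiveStrategySketch:
    """Hypothetical single-process strategy; method names are assumptions."""

    def setup_model(self, model: torch.nn.Module) -> torch.nn.Module:
        return model  # DDPStrategy would wrap this in DistributedDataParallel

    def setup_optimizer(self, optimizer: Optimizer, model: torch.nn.Module) -> Optimizer:
        return optimizer

    def setup_dataloader(self, dataset: Dataset, batch_size: int) -> DataLoader:
        # DDPStrategy would attach a DistributedSampler here instead.
        return DataLoader(dataset, batch_size=batch_size, shuffle=True)

    def backward(self, loss: torch.Tensor, model: torch.nn.Module,
                 optimizer: Optimizer) -> None:
        loss.backward()  # ColossalAIStrategy would route this through its engine

    def optimizer_step(self, optimizer: Optimizer) -> None:
        optimizer.step()
        optimizer.zero_grad()
```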
File renamed without changes.
5 changes: 3 additions & 2 deletions setup.py
@@ -5,17 +5,18 @@
setup(
name='minichatgpt',
maintainer='Juncong Moo',
maintainer_email='[email protected]',
version=read_file('version.txt',by_line=False),
packages=find_packages(exclude=(
'tests',
'benchmarks',
'*.egg-info',
)),
-description='A RLFH implementation (ChatGPT) powered by ColossalAI',
+description='minichatgpt - Traing ChatGPT In 5 Minutes',
long_description=read_file('README.md',by_line=False),
long_description_content_type='text/markdown',
license='Apache Software License 2.0',
-url='https://github.com/hpcaitech/ChatGPT',
+url='https://github.com/juncongmoo/minichatgpt',
install_requires=read_file('requirements.txt'),
python_requires='>=3.6',
classifiers=[
4 changes: 2 additions & 2 deletions tests/test_checkpoint.py
@@ -7,8 +7,8 @@
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
-from chatgpt.nn import GPTActor
-from chatgpt.trainer.strategies import ColossalAIStrategy, DDPStrategy
+from minichatgpt.nn import GPTActor
+from minichatgpt.trainer.strategies import ColossalAIStrategy, DDPStrategy
from transformers.models.gpt2.configuration_gpt2 import GPT2Config

from colossalai.nn.optimizer import HybridAdam
8 changes: 4 additions & 4 deletions tests/test_data.py
@@ -6,10 +6,10 @@
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
-from chatgpt.experience_maker import NaiveExperienceMaker
-from chatgpt.nn import GPTActor, GPTCritic, RewardModel
-from chatgpt.replay_buffer import NaiveReplayBuffer
-from chatgpt.trainer.strategies import ColossalAIStrategy, DDPStrategy
+from minichatgpt.experience_maker import NaiveExperienceMaker
+from minichatgpt.nn import GPTActor, GPTCritic, RewardModel
+from minichatgpt.replay_buffer import NaiveReplayBuffer
+from minichatgpt.trainer.strategies import ColossalAIStrategy, DDPStrategy
from transformers.models.gpt2.configuration_gpt2 import GPT2Config

from colossalai.testing import rerun_if_address_is_in_use
2 changes: 1 addition & 1 deletion version.txt
@@ -1 +1 @@
-0.0.1a0
+0.0.1a1
