make actual changes to recipes
RdoubleA committed Feb 23, 2024
1 parent 8f65b67 commit 3ab9a99
Showing 21 changed files with 205 additions and 1,422 deletions.
6 changes: 5 additions & 1 deletion recipes/__init__.py
@@ -5,7 +5,11 @@
# LICENSE file in the root directory of this source tree.


_RECIPE_LIST = ["full_finetune", "lora_finetune", "alpaca_generate", "full_finetune_hydra"]
_RECIPE_LIST = [
    "full_finetune",
    "lora_finetune",
    "alpaca_generate",
]
_CONFIG_LISTS = {
    "full_finetune": ["alpaca_llama2_full_finetune"],
    "lora_finetune": ["alpaca_llama2_lora_finetune"],
32 changes: 25 additions & 7 deletions recipes/configs/alpaca_llama2_full_finetune.yaml
@@ -4,25 +4,43 @@
# tune --nnodes 1 --nproc_per_node 1 --config alpaca_llama2_full_finetune --override model_checkpoint=<your_checkpoint_dir> ...

# Dataset and Dataloader
dataset: alpaca
dataset:
  _target_: torchtune.datasets.AlpacaDataset
  train_on_input: True

seed: null
shuffle: True

# Model Arguments
model: llama2_7b
model:
  _target_: torchtune.models.llama2_7b

model_checkpoint: /tmp/llama2-7b
tokenizer: llama2_tokenizer
tokenizer_checkpoint: /tmp/tokenizer.model
tokenizer:
  _target_: torchtune.models.llama2_tokenizer
  path: /tmp/tokenizer.model

# Fine-tuning arguments
batch_size: 2
lr: 2e-5
epochs: 3
optimizer: SGD
loss: CrossEntropyLoss
optimizer:
  _target_: torch.optim.SGD
  lr: 2e-5
max_steps_per_epoch: null
gradient_accumulation_steps: 1
log_every_n_steps: null
run_generation: null

loss:
  _target_: torch.nn.CrossEntropyLoss

output_dir: /tmp/alpaca-llama2-finetune
device: cuda
dtype: fp32
enable_fsdp: True
enable_activation_checkpointing: True
cpu_offload: False
resume_from_checkpoint: False
metric_logger:
  _target_: torchtune.utils.metric_logging.DiskLogger
  log_dir: ${output_dir}
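Each component in the updated config now carries a `_target_` dotted path plus its keyword arguments, so the recipe can construct objects generically instead of branching on string names. A minimal sketch of how such an entry could be instantiated, using only the standard library; the actual recipes may rely on torchtune's own config utilities or Hydra's instantiate rather than a helper like this:

import importlib
from typing import Any, Dict


def instantiate_from_config(cfg: Dict[str, Any], **overrides: Any) -> Any:
    """Resolve the dotted `_target_` path and call it with the remaining kwargs."""
    kwargs = dict(cfg)
    module_path, _, attr_name = kwargs.pop("_target_").rpartition(".")
    target = getattr(importlib.import_module(module_path), attr_name)
    return target(**{**kwargs, **overrides})


# Example with the optimizer entry from the config above (the `model` object is
# hypothetical and would come from the recipe's setup code):
# optimizer_cfg = {"_target_": "torch.optim.SGD", "lr": 2e-5}
# optimizer = instantiate_from_config(optimizer_cfg, params=model.parameters())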
46 changes: 0 additions & 46 deletions recipes/configs/alpaca_llama2_full_finetune_hydra.yaml

This file was deleted.

43 changes: 28 additions & 15 deletions recipes/configs/alpaca_llama2_lora_finetune.yaml
@@ -1,36 +1,49 @@
# Model Arguments
model: lora_llama2_7b
model:
  _target_: lora_llama2_7b
  lora_attn_modules: ['q_proj', 'v_proj']
  lora_rank: 8
  lora_alpha: 16

model_checkpoint: /tmp/llama2-7b
lora_attn_modules: ['q_proj', 'v_proj']
lora_rank: 8
lora_alpha: 16
lora_checkpoint: null

# Tokenizer
tokenizer: llama2_tokenizer
tokenizer_checkpoint: /tmp/tokenizer.model

# Dataset and Sampler
dataset: alpaca
train_on_input: True
dataset:
  _target_: torchtune.datasets.AlpacaDataset
  train_on_input: True
  use_clean: True
shuffle: True
batch_size: 2

# Optimizer and Scheduler
optimizer: AdamW
weight_decay: 0.01
lr: 3e-4
lr_scheduler: cosine_with_warmup
num_warmup_steps: 100
loss: CrossEntropyLoss
optimizer:
  _target_: AdamW
  weight_decay: 0.01
  lr: 3e-4
lr_scheduler: # TODO: this is a partial instantiation, make this more elegant
  _target_: torchtune.modules.get_cosine_schedule_with_warmup
  num_warmup_steps: 100

loss:
  _target_: torch.nn.CrossEntropyLoss

# Training
epochs: 1
resume_from_checkpoint: False

# Logging
output_dir: /tmp/lora_finetune_output
metric_logger:
  _target_: torchtune.utils.metric_logging.DiskLogger
  log_dir: ${output_dir}

# Environment
device: cuda
dtype: fp32

# Logging
output_dir: /tmp/lora_finetune_output
enable_fsdp: True
enable_activation_checkpointing: True
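The lr_scheduler entry flagged with a TODO above is only a partial specification: get_cosine_schedule_with_warmup also needs the optimizer and the total number of training steps, which only exist at runtime. One way such a partial instantiation could be handled is to bind the config-time kwargs now and supply the rest later; this is a sketch, and the runtime argument names (optimizer, num_training_steps) are assumptions rather than the recipe's actual implementation:

from functools import partial

from torchtune.modules import get_cosine_schedule_with_warmup

# Bind the argument that comes from the config; leave runtime arguments open.
scheduler_factory = partial(get_cosine_schedule_with_warmup, num_warmup_steps=100)

# Later, once the optimizer and step count are known inside the recipe:
# lr_scheduler = scheduler_factory(optimizer=optimizer, num_training_steps=total_steps)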
