Implement options for selecting layers to apply LoRA to during training #163

Open · wants to merge 1 commit into main
Conversation

viktorhargitai

Implements a feature for specifying which layers of the model LoRA should be applied to during training. This enables further reductions in training memory use, as well as additional QLoRA training experiments (including, but not limited to, reproducing the relevant ablation experiments from the QLoRA paper).

Layer selection is controlled by the lora_modules string argument (a sketch of one possible resolver follows the list below). Its value can be:

  • a regex pattern for exact matching of arbitrary layer names;
  • 'all' (the default), which selects all linear transformer-block layers (i.e. identical behavior to the previous commit);
  • 'attention', which selects only the attention layers;
  • 'ffn', which selects only the feed-forward layers.
    (The latter two aim to reproduce ablation experiments from the paper, but also work for e.g. Falcon models, not just LLaMA.)
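
As a rough illustration, a resolver for these options might look like the sketch below. The function name resolve_lora_modules and the suffix tuples are assumptions for illustration, not identifiers from the actual commit; in the real QLoRA code path the type check would presumably target the quantized bitsandbytes linear class rather than torch.nn.Linear.

```python
import re

import torch.nn as nn

# Illustrative module-name suffixes; real models may use different names.
ATTENTION_SUFFIXES = (
    "q_proj", "k_proj", "v_proj", "o_proj",  # LLaMA-style attention projections
    "query_key_value", "dense",              # Falcon-style attention projections
)
FFN_SUFFIXES = (
    "gate_proj", "up_proj", "down_proj",     # LLaMA-style feed-forward layers
    "dense_h_to_4h", "dense_4h_to_h",        # Falcon-style feed-forward layers
)


def resolve_lora_modules(model: nn.Module, lora_modules: str = "all") -> list[str]:
    """Return the module names that LoRA adapters should be attached to."""
    targets = set()
    for name, module in model.named_modules():
        if not isinstance(module, nn.Linear):  # QLoRA would check the 4-bit linear class here
            continue
        suffix = name.split(".")[-1]
        if suffix == "lm_head":  # the output head is conventionally excluded from LoRA
            continue
        if lora_modules == "all":
            targets.add(suffix)
        elif lora_modules == "attention" and suffix in ATTENTION_SUFFIXES:
            targets.add(suffix)
        elif lora_modules == "ffn" and suffix in FFN_SUFFIXES:
            targets.add(suffix)
        elif lora_modules not in ("all", "attention", "ffn") and re.fullmatch(lora_modules, name):
            targets.add(name)  # regex patterns match against the full module path
    return sorted(targets)
```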

The "lora_modules" argument was already present in the guanaco training scripts, but it was ignored: this repo only implemented applying LoRA to all linear layers.
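
Hooking the selection into a training script could then look roughly like the following, reusing the hypothetical resolve_lora_modules sketch above; the model id and hyperparameter values are placeholders, not values taken from the patch.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder model; the guanaco scripts would supply their own.
model = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")

# Restrict LoRA to the attention layers only.
target_modules = resolve_lora_modules(model, lora_modules="attention")
peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=target_modules,
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
```

Note that PEFT's LoraConfig also accepts a plain string for target_modules and treats it as a regex, so the regex option could presumably be passed straight through instead of being expanded into a list first.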

viktorhargitai changed the title from "Implement layer selection options for applying LoRA to during training" to "Implement options for selecting layers to apply LoRA to during training" on Jun 11, 2023