Phi-3 conversation format, example training script and perplexity metric #1582
Conversation
Amazing! 👍
I can help get the tests/linter passing tomorrow
@brianfitzgerald Made some additional fixes; can you check whether everything still seems correct? Thanks!
LGTM!
This LGTM, but while testing it out I may have found an issue with Phi-3 and flash-attention. On a 4xA100 node, a warning appears when training Phi-3 (this occurs only with Phi-3; I tested other models as well).
This might be related to huggingface/transformers#30547
I tried out this branch and ran into an issue where it hangs forever while tokenizing prompts at the start of a training run. It gets through a few prompts, then hangs on a dataset of size 1. Ctrl+C is also very slow/hangs; I need to press it twice. Config:
By the way, my issue here was resolved when I turned off sample packing. Maybe Phi-3 sample packing isn't compatible with flash-attention. @brianfitzgerald
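For reference, the workaround described above would look roughly like this in an axolotl YAML config. This is a hypothetical excerpt, assuming the standard `sample_packing` and `flash_attention` keys:

```yaml
# Workaround sketch for the Phi-3 hang described above:
# disable sample packing while keeping flash-attention enabled.
sample_packing: false
flash_attention: true
```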
Merged. Thanks @brianfitzgerald!
@winglian @brianfitzgerald I didn't find a way to train Phi-3 with a conversation dataset. We should rely on the
Phi-3 conversation format, example training script and perplexity metric (#1582)
* phi-3 support and perplexity metric
* phi-3 chat template
* metrics updates
* chore: lint
* fix assertion on Tensor
* fix tests since tokenization happens in the metric
* fix perplexity value of shorter passage
Co-authored-by: Wing Lian <[email protected]>
Adds a Phi-3 conversation template and an example script to demonstrate how to fine-tune on Alpaca-format datasets using the Phi-3 pretraining format.
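As a rough sketch of what such a template produces (the helper name `to_phi3_prompt` is hypothetical, not part of this PR), Phi-3's chat format wraps each turn in `<|role|>` and `<|end|>` special tokens:

```python
def to_phi3_prompt(turns):
    """Render (role, text) pairs into Phi-3's chat format.

    Each turn is delimited by a <|role|> tag and an <|end|> token;
    a trailing <|assistant|> tag cues the model to generate a reply.
    """
    rendered = "".join(f"<|{role}|>\n{text}<|end|>\n" for role, text in turns)
    return rendered + "<|assistant|>\n"

prompt = to_phi3_prompt([
    ("system", "You are a helpful assistant."),
    ("user", "What is 2 + 2?"),
])
```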
I've also added a Perplexity metric. I ended up writing a variant of the Hugging Face `evaluate` Perplexity metric, as that implementation both re-tokenizes the inputs and loads a separate copy of the LLM within the metric. Instead, this version uses the already-loaded model and tokenizer, and the already-tokenized validation samples, as the inputs for scoring perplexity.
How has this been tested?
Trained an example LoRA with the provided config.
A unit test is provided for the Perplexity metric.
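As a rough illustration of what a perplexity metric computes (this is a minimal sketch, not the implementation in this PR), perplexity is the exponential of the average per-token negative log-likelihood:

```python
import math

def perplexity(token_logprobs):
    """Perplexity from per-token log-probabilities (natural log).

    A model that assigns every token a uniform probability over N
    choices scores a perplexity of exactly N.
    """
    if not token_logprobs:
        raise ValueError("need at least one token")
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)

# Uniform probability 1/4 per token -> perplexity of about 4.
uniform_ppl = perplexity([math.log(0.25)] * 10)
```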
Full list of changes
Social Handles (Optional)
https://twitter.com/bfitzgerald242
https://brianfitzgerald.xyz/
Thanks for the review!