Figured it out not long after asking. For those who hit this: you do need to modify the `preprocess` function in `train.py`, specifically the section starting at `# Mask targets. Only compute loss on the assistant outputs.` (the line `sep = conv.sep + conv.roles[1] + ": "` and the loop `for conversation, target in zip(conversations, targets):`), and make sure it aligns with the template you use. I changed it based on ideas from #2423.
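To illustrate why a mismatched template makes the loss collapse to 0, here is a minimal sketch of template-dependent target masking. It is not FastChat's exact code: the "tokenizer" is just whitespace splitting, and `mask_targets` is a hypothetical helper, but the mechanism is the same one `preprocess` uses. If `sep` does not match the conversation format, the split never finds an assistant reply, every token gets masked with the ignore id, and there is nothing left to compute loss on.

```python
# Sketch (not FastChat's actual implementation) of masking targets so that
# loss is computed only on assistant outputs. For simplicity, "tokens" here
# are whitespace-separated words, one id per word.

IGNORE_TOKEN_ID = -100  # value ignored by PyTorch's CrossEntropyLoss


def mask_targets(input_ids, turns, sep):
    """Mask everything except the assistant reply in each turn.

    input_ids: one id per word of the full conversation.
    turns: list of turn strings, each "user prompt + sep + assistant reply".
    sep: template-specific separator before the assistant reply, e.g.
         "ASSISTANT: " for vicuna-style templates. Using a separator from
         a different template means it is never found in the text.
    """
    targets = list(input_ids)
    cur = 0
    for turn in turns:
        turn_len = len(turn.split())
        parts = turn.split(sep)
        if len(parts) != 2:
            # Separator not found: template mismatch, so the entire turn
            # stays masked -- this is how the training loss ends up 0.
            targets[cur:cur + turn_len] = [IGNORE_TOKEN_ID] * turn_len
        else:
            # Mask only the user prompt (including the separator itself);
            # the assistant reply after it keeps its labels.
            prompt_len = len((parts[0] + sep).split())
            targets[cur:cur + prompt_len] = [IGNORE_TOKEN_ID] * prompt_len
        cur += turn_len
    return targets


turn = "USER: Hi ASSISTANT: Hello there"
ids = list(range(5))

# Matching separator: only the assistant's two words keep labels.
print(mask_targets(ids, [turn], "ASSISTANT: "))  # [-100, -100, -100, 3, 4]

# Wrong template's separator: everything is masked, loss would be 0.
print(mask_targets(ids, [turn], "[/INST] "))     # [-100, -100, -100, -100, -100]
```

The takeaway is that the masking logic and the template that rendered the conversation must agree on the separator string, which is exactly why switching the template name alone (without updating `preprocess`) breaks training.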
Thanks for the awesome work! I am not sure whether I have missed this.
I see that `train.py` uses the "vicuna" template (FastChat/fastchat/train/train.py, lines 87 to 88 at c3ad73a).
Do I need to change the template to "llama-2" if I want to finetune a Llama 2 model? I am a bit confused: if I set it to "llama-2", the training loss is always 0, but if I stick with "vicuna", the loss is fine.
I also took a look at `train_baichuan.py`; it likewise uses "vicuna" instead of "baichuan..." (FastChat/fastchat/train/train_baichuan.py, lines 80 to 82 at c3ad73a).
In that case, when should these templates be used?