
Ask about the usage of template #2517

Open
coranholmes opened this issue Oct 5, 2023 · 2 comments

@coranholmes commented Oct 5, 2023

Thanks for the awesome work! I am not sure whether I have missed this.

I see from here that train.py uses the vicuna template:

```python
conv = get_conversation_template("vicuna")
roles = {"human": conv.roles[0], "gpt": conv.roles[1]}
```

Do I need to change the template to "llama-2" if I want to fine-tune a Llama 2 model? I am a bit confused because if I set it to "llama-2", the training loss is always 0, but if I stick to "vicuna", the loss looks fine.
I also took a look at train_baichuan.py; it uses "vicuna" as well, rather than a "baichuan..." template:

```python
def apply_prompt_template(sources, systems=None):
    conv = get_conversation_template("vicuna")
    roles = {"human": conv.roles[0], "gpt": conv.roles[1]}
```

In that case, when should these templates be used?
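
For anyone comparing the two templates, a minimal sketch like the following (assuming FastChat is installed; the import mirrors the one in train.py) shows that the separator train.py builds, `conv.sep + conv.roles[1] + ": "`, only occurs in prompts rendered by the vicuna template, which would explain why every target gets masked and the loss stays at 0 with "llama-2":

```python
# Minimal sketch, assuming FastChat is installed; import mirrors train.py.
from fastchat.model.model_adapter import get_conversation_template

def show_separator(template_name: str) -> None:
    conv = get_conversation_template(template_name)
    conv.append_message(conv.roles[0], "Hello, how are you?")
    conv.append_message(conv.roles[1], "I am fine, thank you.")
    prompt = conv.get_prompt()

    # This is the separator train.py builds to locate assistant replies.
    sep = conv.sep + conv.roles[1] + ": "
    print(f"{template_name}: roles={conv.roles}, sep={sep!r}")
    print(f"  separator present in rendered prompt: {sep in prompt}")

for name in ["vicuna", "llama-2"]:
    show_separator(name)
```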

@akujuou-sony

I am facing the same issue when fine-tuning Mistral. Has anyone figured this out yet?

@akujuou-sony

Figured it out not long after asking. For anyone who hits this: you do need to modify the preprocess function in train.py, specifically this part:

```python
# Mask targets. Only compute loss on the assistant outputs.
sep = conv.sep + conv.roles[1] + ": "
for conversation, target in zip(conversations, targets):
```

and make sure the separator aligns with the template you use. I had to change it using ideas from #2423.
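
For context, a simplified sketch of what that masking block is doing (an illustration only, not the exact train.py code; the tokenizer and conv objects are the ones preprocess already has in scope) shows where a mismatched separator silently masks everything:

```python
import torch

IGNORE_TOKEN_ID = -100  # the sentinel train.py uses (LabelSmoother.ignore_index)

def mask_non_assistant(conversation: str, target: torch.Tensor, tokenizer, conv) -> None:
    # Separator marking where the assistant reply starts in each turn.
    # This is the piece that must match the template you train with.
    sep = conv.sep + conv.roles[1] + ": "
    cur_len = 0
    for turn in conversation.split(conv.sep2):
        if turn == "":
            break
        turn_len = len(tokenizer(turn).input_ids)
        parts = turn.split(sep)
        if len(parts) != 2:
            # Separator not found in this turn: nothing is left unmasked,
            # and if this happens for every turn the training loss is 0.
            break
        # Mask the user/instruction part, keep the assistant reply.
        instruction_len = len(tokenizer(parts[0] + sep).input_ids)
        target[cur_len : cur_len + instruction_len] = IGNORE_TOKEN_ID
        cur_len += turn_len
    # Anything past the last processed turn (padding, truncation) stays masked.
    target[cur_len:] = IGNORE_TOKEN_ID
```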
