-
The user's inputs are still fed into the model, so the model sees them during training. But they don't contribute to the language modeling loss -- the loss is computed only on the output (response/answer) tokens. The intuition is that you don't need to train the model to mimic the user inputs; you just need to train it to mimic the outputs given those inputs.
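To make this concrete, here is a minimal sketch of how such input masking is typically implemented for causal-LM fine-tuning. It is not the exact code from any particular repo: the function name `build_example`, the flag name `train_on_inputs`, and the use of a Hugging Face tokenizer are assumptions for illustration. The key point is that `-100` is the ignore index of PyTorch's cross-entropy loss, so masked prompt positions produce no gradient even though the model still attends to them.

```python
# Sketch only: illustrates the usual "mask the prompt in the labels" pattern,
# not the exact implementation behind the "train on inputs" setting.
from transformers import AutoTokenizer

IGNORE_INDEX = -100  # positions with this label are skipped by the loss


def build_example(prompt: str, response: str, tokenizer, train_on_inputs: bool):
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    response_ids = tokenizer(response, add_special_tokens=False)["input_ids"]
    response_ids = response_ids + [tokenizer.eos_token_id]

    # The model always sees the full prompt + response as context.
    input_ids = prompt_ids + response_ids

    if train_on_inputs:
        # Loss on every token, prompt included.
        labels = list(input_ids)
    else:
        # Prompt tokens are context only: they get IGNORE_INDEX labels,
        # so only the response tokens contribute to the loss.
        labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)

    return {"input_ids": input_ids, "labels": labels}


if __name__ == "__main__":
    tok = AutoTokenizer.from_pretrained("gpt2")
    ex = build_example("Q: What is 2+2?\nA: ", "4", tok, train_on_inputs=False)
    # Prompt positions show -100; only the answer tokens keep their ids.
    print(ex["labels"])
```

With `train_on_inputs=False`, gradients flow only through the response positions, while the prompt still shapes the predictions as conditioning context.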
-
Hello,
What is the idea behind the "train on inputs" setting? At what point is the input masked when it is set to false, and what does that achieve?
My question is whether the model still retains the ability to answer specific questions and simply does not learn from the wording of the questions (which might be noisy if they come from real-world users), or whether the input is just not used for error propagation...
In what situations should we set this to false, and when to true?
Thanks.