Question about the metrics reported in the paper
Hello! I am new to NLP. I am confused about the pipeline (pretrain -> fine-tune -> test) used for pre-trained large language models.
I would like to know at which stage the unlabeled dataset (e.g., C4) and the labeled datasets (e.g., GLUE, SuperGLUE, WMT) are used, respectively.
In Section 2.4 of the paper, I find: "We instead allow for separately fine-tuning the model on each individual task and use short task prefixes instead of an explicit question-answer format."
Regarding Table 1 of the paper: was the T5 model pre-trained on C4, then fine-tuned separately on GLUE, CNNDM, SQuAD, SGLUE, and WMT, with the resulting scores reported in Table 1?
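To make the pipeline I am asking about concrete, here is a minimal sketch (not code from the paper or this repo) of what "pre-trained on C4, then fine-tuned on one labeled task" might look like, assuming the Hugging Face `transformers` and `datasets` libraries and the `t5-small` checkpoint (which the authors already pre-trained on C4); GLUE CoLA with its task prefix is used as the example task:

```python
# Illustrative sketch only: fine-tuning a C4-pre-trained T5 checkpoint on one
# labeled task (GLUE CoLA), using a short task prefix per Section 2.4.
import torch
from datasets import load_dataset
from transformers import T5ForConditionalGeneration, T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")  # already pre-trained on C4

dataset = load_dataset("glue", "cola", split="train[:1%]")      # small slice for illustration
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

model.train()
for example in dataset:
    # Task prefix instead of an explicit question-answer format.
    source = "cola sentence: " + example["sentence"]
    target = "acceptable" if example["label"] == 1 else "unacceptable"
    inputs = tokenizer(source, return_tensors="pt", truncation=True)
    labels = tokenizer(target, return_tensors="pt", truncation=True).input_ids
    loss = model(**inputs, labels=labels).loss  # standard seq2seq cross-entropy
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# After fine-tuning, the task's validation/test metric would be computed and
# reported, which is what I understand Table 1 to show for each task.
```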
For other large language models, like GPT and GPT-2, were those models fine-tuned on labeled datasets before their scores were reported?
Thank you!