An officially supported task in the examples folder
My own task or dataset (give details below)
Reproduction
from transformers import pipeline, AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead
from trl.core import LengthSampler
from trl.extras import BestOfNSampler

ref_model = AutoModelForCausalLMWithValueHead.from_pretrained(ref_model_name)
reward_pipe = pipeline("sentiment-analysis", model=reward_model, device=device)
tokenizer = AutoTokenizer.from_pretrained(ref_model_name)
tokenizer.pad_token = tokenizer.eos_token

# callable that takes a list of raw text and returns a list of corresponding reward scores
def queries_to_scores(list_of_strings):
    return [output["score"] for output in reward_pipe(list_of_strings)]

best_of_n = BestOfNSampler(model, tokenizer, queries_to_scores, length_sampler=output_length_sampler)
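For context, the snippet hands BestOfNSampler a callable that maps a list of strings to a list of scores. The sketch below illustrates that contract and the general best-of-n idea (generate several candidates, score them, keep the top one) in plain Python. It is not TRL's implementation: `toy_generate`, `toy_queries_to_scores`, and `best_of_n_sketch` are hypothetical stand-ins invented for illustration, and no model is loaded.

```python
import random
from typing import Callable, List


def toy_generate(query: str, n: int, rng: random.Random) -> List[str]:
    """Hypothetical stand-in for a language model: returns n candidate continuations."""
    endings = ["was great", "was fine", "was awful", "was wonderful"]
    return [f"{query} {rng.choice(endings)}" for _ in range(n)]


def toy_queries_to_scores(list_of_strings: List[str]) -> List[float]:
    """Hypothetical reward with the same contract as queries_to_scores:
    a list of raw texts in, a list of scores (one per text) out."""
    word_scores = {"great": 0.9, "wonderful": 1.0, "fine": 0.5, "awful": 0.0}
    return [
        max((score for word, score in word_scores.items() if word in text), default=0.0)
        for text in list_of_strings
    ]


def best_of_n_sketch(
    query: str,
    n: int,
    scorer: Callable[[List[str]], List[float]],
    rng: random.Random,
    n_best: int = 1,
) -> List[str]:
    """Generate n candidates, score them all at once, and keep the n_best highest."""
    candidates = toy_generate(query, n, rng)
    scores = scorer(candidates)
    ranked = sorted(zip(scores, candidates), key=lambda pair: pair[0], reverse=True)
    return [text for _, text in ranked[:n_best]]


if __name__ == "__main__":
    print(best_of_n_sketch("The movie", 8, toy_queries_to_scores, random.Random(0)))
```

The point of the sketch is only that the scorer is called once on the whole candidate list and must return one score per string, which is the shape queries_to_scores has in the reproduction code.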
Output:
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-4-6cdab4e940d5> in <cell line: 6>()
4 from trl.extras import BestOfNSampler
5
----> 6 ref_model = AutoModelForCausalLMWithValueHead.from_pretrained(ref_model_name)
7 reward_pipe = pipeline("sentiment-analysis", model=reward_model, device=device)
8 tokenizer = AutoTokenizer.from_pretrained(ref_model_name)
NameError: name 'ref_model_name' is not defined
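Python stops at the first undefined name, so the traceback only mentions ref_model_name. A rough static scan of the snippet suggests other names are never defined either. The sketch below uses only the standard library `ast` module; the helper `undefined_names` is hypothetical and deliberately simple (it ignores scoping and definition order), so treat its result as a hint, not a full analysis. Nothing from transformers or trl is imported, because the code is only parsed, not executed.

```python
import ast
import builtins
from typing import List

# The documentation snippet from this report, as a string (parsed, never executed).
SNIPPET = '''
from transformers import pipeline, AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead
from trl.core import LengthSampler
from trl.extras import BestOfNSampler

ref_model = AutoModelForCausalLMWithValueHead.from_pretrained(ref_model_name)
reward_pipe = pipeline("sentiment-analysis", model=reward_model, device=device)
tokenizer = AutoTokenizer.from_pretrained(ref_model_name)
tokenizer.pad_token = tokenizer.eos_token

def queries_to_scores(list_of_strings):
    return [output["score"] for output in reward_pipe(list_of_strings)]

best_of_n = BestOfNSampler(model, tokenizer, queries_to_scores, length_sampler=output_length_sampler)
'''


def undefined_names(source: str) -> List[str]:
    """Return names that are read somewhere but never bound anywhere in the source.

    Flat approximation: every import, assignment target, function or class name,
    and function argument counts as bound, regardless of scope or order.
    """
    tree = ast.parse(source)
    bound = set(dir(builtins))
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                bound.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            bound.add(node.id)

    # Visit reads in source order so the report matches the order of first use.
    loads = sorted(
        (n for n in ast.walk(tree)
         if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load)),
        key=lambda n: (n.lineno, n.col_offset),
    )
    missing: List[str] = []
    for name in loads:
        if name.id not in bound and name.id not in missing:
            missing.append(name.id)
    return missing


if __name__ == "__main__":
    print(undefined_names(SNIPPET))
    # → ['ref_model_name', 'reward_model', 'device', 'model', 'output_length_sampler']
```

If this scan is right, defining ref_model_name alone would only move the NameError to the next line, so a complete example would also need reward_model, device, model, and output_length_sampler.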
Expected behavior
Please define the variable ref_model_name in the example.
Checklist
I have checked that my issue isn't already filed (see open issues)
I have included my system information
Any code provided is minimal, complete, and reproducible (more on MREs)
Any code provided is properly formatted in code blocks, (no screenshot, more on code blocks)
Any traceback provided is complete
System Info
I copied the code from the TRL documentation page: https://github.com/huggingface/trl/blob/main/docs/source/best_of_n.mdx
Could you define ref_model_name in that example?