
Absence of ref_model_name in docs/source/best_of_n.mdx #2508

Open · 7 of 9 tasks
aivolcano opened this issue Dec 20, 2024 · 1 comment

@aivolcano
System Info

I pasted the code below from the documentation page https://github.com/huggingface/trl/blob/main/docs/source/best_of_n.mdx (the snippet is repeated under Reproduction). Could you define ref_model_name in the script?

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

from transformers import pipeline, AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead
from trl.core import LengthSampler
from trl.extras import BestOfNSampler

ref_model = AutoModelForCausalLMWithValueHead.from_pretrained(ref_model_name)
reward_pipe = pipeline("sentiment-analysis", model=reward_model, device=device)
tokenizer = AutoTokenizer.from_pretrained(ref_model_name)
tokenizer.pad_token = tokenizer.eos_token


# callable that takes a list of raw text and returns a list of corresponding reward scores
def queries_to_scores(list_of_strings):
    return [output["score"] for output in reward_pipe(list_of_strings)]

best_of_n = BestOfNSampler(model, tokenizer, queries_to_scores, length_sampler=output_length_sampler)

outputs:

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-4-6cdab4e940d5> in <cell line: 6>()
      4 from trl.extras import BestOfNSampler
      5 
----> 6 ref_model = AutoModelForCausalLMWithValueHead.from_pretrained(ref_model_name)
      7 reward_pipe = pipeline("sentiment-analysis", model=reward_model, device=device)
      8 tokenizer = AutoTokenizer.from_pretrained(ref_model_name)

NameError: name 'ref_model_name' is not defined

Expected behavior

Please define the ref_model_name variable in the script (reward_model, device, model, and output_length_sampler are also undefined).
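For context, here is a minimal sketch of the extra definitions that would make the snippet run end to end. The checkpoint names are assumptions borrowed from the TRL sentiment-tuning examples (lvwerra/gpt2-imdb as the base/reference model, lvwerra/distilbert-imdb as the reward model); they are not stated on the docs page, and the length-sampler bounds are likewise guesses:

import torch
from transformers import pipeline, AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead
from trl.core import LengthSampler
from trl.extras import BestOfNSampler

# Assumed checkpoints (NOT from the docs page) -- any causal LM plus a
# matching text-classification reward model would work here.
ref_model_name = "lvwerra/gpt2-imdb"
reward_model = "lvwerra/distilbert-imdb"
device = 0 if torch.cuda.is_available() else "cpu"

model = AutoModelForCausalLMWithValueHead.from_pretrained(ref_model_name)
ref_model = AutoModelForCausalLMWithValueHead.from_pretrained(ref_model_name)
reward_pipe = pipeline("sentiment-analysis", model=reward_model, device=device)
tokenizer = AutoTokenizer.from_pretrained(ref_model_name)
tokenizer.pad_token = tokenizer.eos_token

# Assumed bounds for the sampled response length.
output_length_sampler = LengthSampler(4, 16)


# callable that takes a list of raw text and returns a list of corresponding reward scores
def queries_to_scores(list_of_strings):
    return [output["score"] for output in reward_pipe(list_of_strings)]


best_of_n = BestOfNSampler(model, tokenizer, queries_to_scores, length_sampler=output_length_sampler)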

Checklist

  • I have checked that my issue isn't already filed (see open issues)
  • I have included my system information
  • Any code provided is minimal, complete, and reproducible (more on MREs)
  • Any code provided is properly formatted in code blocks (no screenshot, more on code blocks)
  • Any traceback provided is complete
@metric-space
Contributor

@aivolcano There is a notebook related to this. The updated notebook is here: https://github.com/huggingface/trl/blob/main/examples/notebooks/best_of_n.ipynb
