Compatibility with Models from PyReft Library #2012

Open
crux82 opened this issue Jun 23, 2024 · 6 comments

Comments

@crux82

crux82 commented Jun 23, 2024

Hi everyone,

First, I'm sorry if this issue has already been raised.

I wanted to ask whether the framework supports models obtained through the PyReft library (https://github.com/stanfordnlp/pyreft). Currently, lm-eval supports models obtained by applying LoRA through PEFT, but I haven’t found any information about loading models obtained via LoReFT.

Is there anyone who can help me with this?

Thank you for your time and help!

@haileyschoelkopf
Contributor

Hi there! Thanks for your interest.

PyREFT is a very cool project, but I think we ultimately can't support every external library/option without either making the maintenance overhead too high or making the code far less modifiable for the majority of users. I'm therefore disinclined to add this as a feature natively, though if many users request it, or if a wide variety of ready-to-use REFT models become available on the HF Hub, then perhaps we can reconsider.

I'd recommend modifying the __main__.py script (or your own script that calls lm_eval.evaluate() or lm_eval.simple_evaluate()) to apply the REFT modules / interventions to a loaded HF model, and then passing that initialized model via HFLM(pretrained=my_loaded_reft_model). Alternatively, you could subclass lm_eval.models.huggingface.HFLM and override the relevant logic there, if that's more convenient. It should not require a significant amount of code change! See e.g. https://github.com/state-spaces/mamba/blob/main/evals/lm_harness_eval.py for a minimal example of how this might be done; a rough sketch of the first option is below. Hope this is helpful!
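
Something along these lines (untested; the model and checkpoint names are placeholders, and it assumes your installed lm-eval version accepts an already-initialized model object for pretrained= and that HFLM tolerates the pyreft wrapper object; if it doesn't, the subclassing route is the way to go):

import torch, transformers, pyreft

import lm_eval
from lm_eval.models.huggingface import HFLM

base_model_name = "your-org/your-base-model"       # placeholder
reft_checkpoint = "your-org/your-reft-checkpoint"  # placeholder

# build the base HF model and attach the trained REFT interventions to it
base_model = transformers.AutoModelForCausalLM.from_pretrained(
    base_model_name, torch_dtype=torch.bfloat16, device_map="cuda")
my_loaded_reft_model = pyreft.ReftModel.load(
    reft_checkpoint, base_model, from_huggingface_hub=True)
my_loaded_reft_model.set_device("cuda")

# hand the pre-initialized model to the harness; the tokenizer is passed explicitly
# because it cannot be inferred from the wrapper object
lm = HFLM(
    pretrained=my_loaded_reft_model,
    tokenizer=transformers.AutoTokenizer.from_pretrained(base_model_name),
    batch_size=8,
)

results = lm_eval.simple_evaluate(model=lm, tasks=["hellaswag"])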

I might leave this issue open for now, though, so that others can express interest in case it turns out to be an often-requested feature.

@crux82
Author

crux82 commented Jun 24, 2024

Hi @haileyschoelkopf,

Thank you very much for your prompt and detailed response. I completely understand that it's almost impossible to support every new model or library out there.

Regarding your suggestions, I found the example at https://github.com/state-spaces/mamba/blob/main/evals/lm_harness_eval.py quite helpful. However, I'm still missing some contextual information to confidently proceed with customizing the library for a specific model.

Would it be possible to provide a minimal guide or some additional support for writing a main script, for instance one inspired by the tutorial on loading REFT models available at the following link?

https://medium.com/@syed_hasan/finetuning-llama-3-using-reft-representation-fine-tuning-technique-00f4fe1f497c

In this tutorial, the model is essentially loaded with:

import torch, transformers, pyreft
device = "cuda"

model_name_or_path = "meta-llama/Meta-Llama-3-8B"
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_name_or_path, torch_dtype=torch.bfloat16, device_map=device)

# the tutorial also loads the matching tokenizer, which is used below
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name_or_path)

# load the trained LoReFT interventions from the HF Hub and attach them to the base model
reft_model = pyreft.ReftModel.load(
    "Syed-Hasan-8503/Llama-3-openhermes-reft", model, from_huggingface_hub=True
)

reft_model.set_device("cuda")

And used with:

instruction = "A rectangular garden has a length of 25 feet and a width of 15 feet. If you want to build a fence around the entire garden, how many feet of fencing will you need?"

# tokenize and prepare the input
# (prompt_no_input_template is the instruction prompt template defined earlier in the tutorial)
prompt = prompt_no_input_template % instruction
prompt = tokenizer(prompt, return_tensors="pt").to(device)

# intervene on the last prompt token and generate with the REFT-augmented model
base_unit_location = prompt["input_ids"].shape[-1] - 1  # last position
_, reft_response = reft_model.generate(
    prompt, unit_locations={"sources->base": (None, [[[base_unit_location]]])},
    intervene_on_prompt=True, max_new_tokens=512, do_sample=True,
    eos_token_id=tokenizer.eos_token_id, early_stopping=True
)
print(tokenizer.decode(reft_response[0], skip_special_tokens=True))

I'd appreciate any guidance or resources you could provide to help with integrating REFT models into the lm-eval framework. This could also serve as a first script to be added to the examples section, benefiting other users with similar needs.

Thank you again for your time and assistance!

@haileyschoelkopf
Contributor

haileyschoelkopf commented Jun 24, 2024

The Mamba example is pretty nice in that you can simply call cli_evaluate() and not hack any of the rest of the script.

I'd recommend in this instance subclassing HFLM and overriding the _create_model() method to include the loading logic you posted above for the REFT model! That'd be the simplest, something along the lines of the sketch below.
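
A rough, untested sketch of that subclass (the reft_path argument is made up, and the exact _create_model() signature varies across lm-eval versions, so check your installed copy):

import pyreft

from lm_eval.api.registry import register_model
from lm_eval.models.huggingface import HFLM


@register_model("reft")
class ReftLM(HFLM):
    def __init__(self, reft_path=None, **kwargs):
        # hypothetical extra argument pointing at the REFT checkpoint (HF Hub repo or local dir)
        self._reft_path = reft_path
        super().__init__(**kwargs)

    def _create_model(self, pretrained, **kwargs) -> None:
        # let HFLM build the base transformers model as usual ...
        super()._create_model(pretrained, **kwargs)
        # ... then wrap it with the trained REFT interventions, as in the pyreft tutorial
        reft_model = pyreft.ReftModel.load(
            self._reft_path, self._model, from_huggingface_hub=True)
        reft_model.set_device(self._device)
        self._model = reft_model

With the model registered this way, a small wrapper script that calls cli_evaluate() (as the Mamba example does) should let you run something like --model reft --model_args pretrained=...,reft_path=... from the command line.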

@crux82
Author

crux82 commented Jun 24, 2024

Great! I assume that I also need to override the _model_generate() method, or is that not necessary?

@LSinev
Contributor

LSinev commented Jun 24, 2024

For cases that can be solved with a subclass of the LM class, the ability to load such subclasses externally, in the same way that included tasks can be loaded, might be a solution. But this functionality is still awaiting PRs: #1457

@crux82
Author

crux82 commented Jun 25, 2024

Hi @LSinev! I think the solution suggested by @haileyschoelkopf can be "easy" enough.

I think I just need to reimplement __init__ and the _model_call() method.
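
For reference, a very rough (untested) sketch of what such a _model_call() override might look like, continuing the hypothetical ReftLM subclass above. The unit_locations layout ([num_interventions, batch, positions], mirroring pyreft's training code) and the choice to intervene only on the final column of the batch are assumptions, and per-example padding is not handled:

import torch

from lm_eval.models.huggingface import HFLM


class ReftLM(HFLM):
    # __init__ and _create_model as sketched earlier in the thread ...

    def _model_call(self, inps, attn_mask=None, labels=None):
        # inps: (batch, seq_len) token ids that HFLM scores for loglikelihood
        batch_size, seq_len = inps.shape
        # one intervention location (the final column) per example; with a right-padded
        # batch the true last token of shorter sequences sits earlier, so refine as needed
        unit_locations = {"sources->base": (None, [[[seq_len - 1]] * batch_size])}
        with torch.no_grad():
            # pyreft/pyvene models return (base_outputs, intervened_outputs)
            _, intervened = self.model(
                {"input_ids": inps},
                unit_locations=unit_locations,
            )
        return intervened.logits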
