
Merge multiple LoRA adapters per request (linear, TIES, DARE) #212

Merged: 25 commits into main from lora-merge on Feb 1, 2024

Conversation

tgaddair (Contributor) commented Jan 30, 2024

Closes #159.

Takes large parts of the implementation from huggingface/peft#1364.

Notably, this approach merges adapters statically at load time rather than during the forward pass. The benefit is that repeated requests with the same merge params are much faster; the tradeoff is that it can be slightly slower when the user wants to change the merge params frequently from one request to the next.

Usage:

"merged_adapters": {
    "ids": ["smangrul/tinyllama_lora_norobots", "smangrul/tinyllama_lora_sql", "smangrul/tinyllama_lora_adcopy"],
    "weights": [2.0, 0.3, 0.7],
    "merge_strategy": "ties",
    "density": 0.2,
    "majority_sign_method": "total"
}
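
For context, here is a minimal sketch of how a client might send these merge params to a running LoRAX server. It assumes the merged_adapters object is passed under parameters of the /generate endpoint alongside the usual generation options; treat the endpoint, port, payload shape, and response field as assumptions rather than something specified in this PR:

import requests  # assumes the `requests` package is installed

# Hypothetical local LoRAX deployment.
url = "http://127.0.0.1:8080/generate"

payload = {
    "inputs": "Write a short ad copy for a robot vacuum.",
    "parameters": {
        "max_new_tokens": 64,
        # Same merge params as in the example above.
        "merged_adapters": {
            "ids": [
                "smangrul/tinyllama_lora_norobots",
                "smangrul/tinyllama_lora_sql",
                "smangrul/tinyllama_lora_adcopy",
            ],
            "weights": [2.0, 0.3, 0.7],
            "merge_strategy": "ties",
            "density": 0.2,
            "majority_sign_method": "total",
        },
    },
}

resp = requests.post(url, json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["generated_text"])  # assumed response field, as in TGI-style servers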

flozi00 (Collaborator) commented Jan 30, 2024

Great addition
Lorax Moe coming 😂

tgaddair marked this pull request as ready for review on January 31, 2024 at 01:09.
tgaddair merged commit 3b4c973 into main on February 1, 2024 (2 checks passed).
tgaddair deleted the lora-merge branch on February 1, 2024 at 05:39.
prateeky2806 commented

Hi @tgaddair, I am Prateek, the author of TIES-Merging. It's great to see that people find TIES and its follow-ups like DARE useful, and thanks for integrating them into the library.

I see that you have already added the merging methods to the docs; however, it would also be great to mention them in the GitHub README so that people know this feature exists and can read more about it. Let me know what you think, and whether you need help with this.

Thanks for your amazing library and work!
Prateek Yadav

tgaddair (Contributor, Author) commented Feb 2, 2024

Hey @prateeky2806, thanks for your work on TIES-Merging and for your suggestion here! I agree, mentioning this on the README is a good idea. It's something I was hoping to do as a follow-up, as I wasn't sure what the best place to put it would be, but I should be able to put something together in the next day or two.

tgaddair (Contributor, Author) commented Feb 2, 2024

Added #218 to track README updates.

def merge(self, task_tensors: List[torch.Tensor], weights: torch.Tensor) -> torch.Tensor:
    # sparsify
    task_tensors = [prune(tensor, self.density, method="magnitude") for tensor in task_tensors]
    weighted_task_tensors = _apply_weights(task_tensors, weights)
prateeky2806 commented Feb 13, 2024

Hi @tgaddair, I think this should be fixed similarly to what was done in the HF PEFT library. Their initial commit had the wrong ordering of the steps; I would suggest looking at the discussion and the fix in that PR.

Basically, the weight step should be done after obtaining the majority sign.
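
For illustration, here is a minimal, self-contained sketch of that corrected ordering (sparsify, elect the majority sign from the unweighted tensors, then weight, then disjoint-merge). The function and helper names are made up for this sketch and only the "total" sign method is shown; this is not the actual lorax or PEFT code:

from typing import List

import torch


def _magnitude_prune(t: torch.Tensor, density: float) -> torch.Tensor:
    # Keep the top-`density` fraction of entries by absolute value, zero out the rest.
    k = max(1, int(density * t.numel()))
    threshold = t.abs().flatten().kthvalue(t.numel() - k + 1).values
    return torch.where(t.abs() >= threshold, t, torch.zeros_like(t))


def ties_merge_sketch(task_tensors: List[torch.Tensor], weights: torch.Tensor, density: float) -> torch.Tensor:
    # 1. Sparsify each adapter's task tensor.
    pruned = torch.stack([_magnitude_prune(t, density) for t in task_tensors], dim=0)

    # 2. Elect the majority sign from the *unweighted* pruned tensors
    #    ("total" method: sign of the summed values per parameter).
    sign_total = pruned.sum(dim=0).sign()
    sign_total[sign_total == 0] = 1.0
    majority_sign_mask = pruned.sign() == sign_total

    # 3. Only now apply the per-adapter weights.
    weighted = pruned * weights.reshape(-1, *([1] * (pruned.dim() - 1)))

    # 4. Disjoint merge: average each parameter over the adapters that agree with the elected sign.
    return (weighted * majority_sign_mask).sum(dim=0) / majority_sign_mask.sum(dim=0).clamp(min=1)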

tgaddair (Contributor, Author) commented

Thanks @prateeky2806! Let me put together a quick PR.

prateeky2806 commented Feb 13, 2024

FYI you would also need to change the order of the steps in dare_ties.
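
As a hedged illustration of the difference, dare_ties swaps the magnitude pruning for random (Bernoulli) pruning with rescaling, while the corrected ordering (prune, elect sign, weight, disjoint merge) stays the same as in the sketch above; the function name here is made up:

import torch


def _random_prune_rescale(t: torch.Tensor, density: float) -> torch.Tensor:
    # DARE-style drop: keep each entry with probability `density`, then rescale
    # the survivors by 1/density so the expected value is preserved.
    mask = torch.bernoulli(torch.full_like(t, density))
    return t * mask / density

# In a dare_ties merge, this pruning would take the place of the magnitude
# pruning step in the TIES sketch above; everything after it is unchanged.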

tgaddair (Contributor, Author) commented

@prateeky2806 can you take a look at #239 and verify the fix looks correct? Thanks again for pointing this out!

prateeky2806 commented

Hi @tgaddair, these changes look good; there is just one last thing. It would be ideal to set the default weights for TIES to all ones, because that is the setting we experimented with in our paper, and it works reasonably well when people do not want to try out different values.
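
For what it's worth, a tiny sketch of the kind of defaulting being suggested; the function name and signature are made up for illustration:

import torch


def resolve_merge_weights(weights, num_adapters: int) -> torch.Tensor:
    # Default to all-ones weights when the user does not supply any, matching
    # the setting used in the TIES-Merging paper experiments.
    if weights is None:
        return torch.ones(num_adapters)
    return torch.tensor(weights, dtype=torch.float32)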

prateeky2806 commented

I have also left a similar comment on the HF PEFT pull request, which I think will be fixed in the coming days.

tgaddair (Contributor, Author) commented

Thanks @prateeky2806, I believe setting default weights to 1s is handled here: https://github.com/predibase/lorax/blob/main/server/lorax_server/utils/merges/strategies.py#L107

But let me know if I am misunderstanding something. Happy to change if needed.

prateeky2806 commented

Thanks @tgaddair, this looks correct. I missed this when I was going over the code.
Everything looks correct to me now.

tgaddair (Contributor, Author) commented

Thanks for verifying @prateeky2806!

Linked issue: Support weighted adapters (#159). 3 participants.