
Add use_exact_model_name option to prevent automatic model name modification #1339

Open · wants to merge 1 commit into main
Conversation

niryuu commented Nov 26, 2024

When loading quantized models, Unsloth automatically rewrites the model name so that an optimized version is loaded instead. While this is helpful in most cases, it can lead to duplicate model caching when users specifically want to load both the original and the quantized version of a model.

This PR adds a new use_exact_model_name parameter that allows users to bypass this automatic modification and load the exact model specified.

Example:

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    "google/gemma-2-9b",
    load_in_4bit=True,
    use_exact_model_name=True,  # load exactly this model, without name remapping
)

danielhanchen (Contributor) commented:
So the goal is instead of loading google/gemma-2-9b-bnb-4bit, it should load directly from the cache of google/gemma-2-9b?

niryuu (Author) commented Nov 26, 2024

Yes. Unsloth has a mechanism that rewrites the given model name according to unsloth/models/mapper.py for efficiency. However, there are cases where we want to use the exact model name, such as for cache control. The purpose of this PR is to provide an option for that case.
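The remapping-plus-bypass idea can be sketched as follows. This is a minimal illustration, not Unsloth's actual implementation: the `MODEL_MAPPER` dict and `resolve_model_name` function are hypothetical stand-ins for the logic in unsloth/models/mapper.py, and the single mapping entry uses the remapped name mentioned earlier in this thread.

```python
# Hypothetical sketch of the name-remapping bypass discussed above.
# MODEL_MAPPER and resolve_model_name are illustrative, not Unsloth's API.

MODEL_MAPPER = {
    # example entry taken from the discussion in this PR
    "google/gemma-2-9b": "google/gemma-2-9b-bnb-4bit",
}

def resolve_model_name(model_name: str, use_exact_model_name: bool = False) -> str:
    """Return the model name that would actually be loaded from the hub/cache."""
    if use_exact_model_name:
        # Bypass the mapper entirely: load exactly what the caller asked for,
        # so the cache is not populated with a second, remapped copy.
        return model_name
    # Default behavior: substitute an optimized variant when one is known.
    return MODEL_MAPPER.get(model_name, model_name)

print(resolve_model_name("google/gemma-2-9b"))
# -> google/gemma-2-9b-bnb-4bit
print(resolve_model_name("google/gemma-2-9b", use_exact_model_name=True))
# -> google/gemma-2-9b
```

With the flag set, the requested name passes through untouched, which keeps the Hugging Face cache keyed on the exact identifier the user supplied.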
