fix/load-checkpoint-add-new-tokens #1225

Draft: wants to merge 1 commit into base: main

Conversation

Erland366 (Contributor)

#1215

Given this issue, where we can't immediately use the changed vocab size because of the size mismatch between the adapter and the base model, we need to resize the base model before merging the LoRA adapter into it.
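
Roughly, the idea is something like the following. This is only a minimal sketch using plain transformers/PEFT calls; the checkpoint path and base model name are placeholders, not the actual Unsloth code path:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# The tokenizer saved with the LoRA checkpoint already contains the added
# tokens, so the base model's embeddings must be resized to that vocabulary
# size *before* the adapter is attached and merged.
tokenizer = AutoTokenizer.from_pretrained("lora-checkpoint-dir")        # placeholder path
base_model = AutoModelForCausalLM.from_pretrained("base-model-name")    # placeholder name

# Resize first, otherwise the adapter's embedding shapes do not match the base model.
base_model.resize_token_embeddings(len(tokenizer))

model = PeftModel.from_pretrained(base_model, "lora-checkpoint-dir")
model = model.merge_and_unload()
```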

Note that this requires changes to unsloth-zoo, for which I also created a PR:

unslothai/unsloth-zoo#9

@Erland366 (Contributor, Author)

I'd like to discuss the embedding handling, though, since I did not implement a way to specify the method used to extend the embeddings. For example, if the user chose interpolation when training the embeddings, then when we load the checkpoint and resize the base model again, we need to make sure the resize method is the same as in training.

Maybe we can store the method as an additional parameter in model.config? Then we can read it back when we load the checkpoint and resize.
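
Something along these lines, for example. The config key `embedding_resize_method` here is hypothetical, purely to illustrate persisting the method alongside the checkpoint; as far as I know, a custom attribute set on a `PretrainedConfig` gets serialized into config.json by `save_pretrained`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# --- at training time, after the embeddings were extended ---
model = AutoModelForCausalLM.from_pretrained("base-model-name")   # placeholder
model.config.embedding_resize_method = "interpolation"            # hypothetical key
model.save_pretrained("checkpoint-dir")                           # key is written to config.json

# --- at load time ---
tokenizer = AutoTokenizer.from_pretrained("checkpoint-dir")
reloaded = AutoModelForCausalLM.from_pretrained("checkpoint-dir")
resize_method = getattr(reloaded.config, "embedding_resize_method", "mean")

# Resize to the new vocabulary, then re-initialize the new rows using the
# same method that was recorded at training time.
reloaded.resize_token_embeddings(len(tokenizer))
print(f"New embedding rows should be initialized with: {resize_method}")
```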

@Erland366 changed the title from "Add functionality to update model vocabulary with new tokenizer tokens" to "fix/load-checkpoint-add-new-tokens" on Oct 31, 2024
@Erland366 (Contributor, Author)

Also, while I'm here: it seems the value of tokenizer.vocab_size is unchanged when we call add_new_tokens. Does tokenizer.vocab_size only count non-special tokens? Since we add all of the new tokens as special tokens, would that explain why the attribute value is not increasing?
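
For reference, this is the behaviour I am seeing with a plain Hugging Face tokenizer (the model name is just an example). If I understand it correctly, `vocab_size` reports only the base vocabulary, while `len(tokenizer)` also counts added tokens:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")  # example model

print(tokenizer.vocab_size)   # base vocabulary only
print(len(tokenizer))         # base vocabulary + added tokens

# Added tokens go into the added-tokens table rather than the base
# vocabulary, so vocab_size stays the same after this call.
tokenizer.add_tokens(["<NEW_TOKEN_1>", "<NEW_TOKEN_2>"], special_tokens=True)

print(tokenizer.vocab_size)   # unchanged
print(len(tokenizer))         # increased by 2

# So len(tokenizer) is the number to use for model.resize_token_embeddings(...)
```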
