After successfully training, how would I use the LLaMA-based model in Hugging Face? I pushed the contents of the `lora_models` folder, which I labeled uniquely, but it is apparently missing the base model needed to use it with the Inference API.

Replies: 2 comments
-
To my knowledge, the Inference API does not support adapter models. You might need to merge the LoRA adapter into the base model and push the merged weights instead:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then apply the LoRA adapter on top of it.
model = AutoModelForCausalLM.from_pretrained(
    base_model_name_or_path, device_map='auto')
model = PeftModel.from_pretrained(
    model,
    lora_model_name_or_path,
    device_map='auto',
)

# Fold the adapter weights into the base model so the result is a
# plain transformers checkpoint. Needs peft>=0.3.0.
model = model.merge_and_unload()
model.push_to_hub(model_name)

# For the Inference API to work, we need to push the tokenizer too.
tokenizer = AutoTokenizer.from_pretrained(base_model_name_or_path)
tokenizer.push_to_hub(model_name)
```
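Once the merged model and tokenizer are on the Hub, the hosted Inference API can be queried over plain HTTP. A minimal sketch, assuming a hypothetical repo id `your-username/your-merged-model` and an access token in the `HF_TOKEN` environment variable:

```python
import os
import requests

# Hypothetical repo id; substitute the name you pushed the merged model to.
model_name = "your-username/your-merged-model"
api_url = f"https://api-inference.huggingface.co/models/{model_name}"
headers = {"Authorization": f"Bearer {os.environ['HF_TOKEN']}"}

# Send a text-generation request to the hosted Inference API.
response = requests.post(
    api_url,
    headers=headers,
    json={"inputs": "Once upon a time", "parameters": {"max_new_tokens": 50}},
)
print(response.json())
```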
-
This is very helpful. Thank you!