-
-
Notifications
You must be signed in to change notification settings - Fork 927
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Glaive conversation format support #1365
Add Glaive conversation format support #1365
Conversation
@brianfitzgerald thanks for this PR. the ability to train models for function calling is very much needed. Currently, this PR breaks some of the existing tests for |
This is an exciting development. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank for putting this all together! 🚀
…brianfitzgerald/axolotl into glaive-function-calling-support
Thanks for the updated functionality. I would like to similarly treat any other tool handling dataset like for example this version of glaive lilacai/glaive-function-calling-v2-sharegpt , which already is in sharegpt, or similarly any other custom dataset of that format. When trying to do so and orientating at the above mentioned configs, I cannot make it work to recognize the tool role as 3rd role in axolotl and don't really know if it is possible and how the config has to look like. datasets: It would be nice if someone could help. |
* Add Glaive conversation format support * fix black formatting errors * Fix black and pylint formatting errors * only set role_key_tool if provided in the dataset constructor * Update src/axolotl/prompt_strategies/sharegpt.py Co-authored-by: Wing Lian <[email protected]> * sharegpt test * tokenizer test * fix formatting --------- Co-authored-by: Wing Lian <[email protected]>
Adds support for the Glaive function calling dataset. This dataset has 2 columns,
system
andchat
; and an additionaltool
role in the conversation. This role contains the output from a tool call.In this PR we add support for the Glaive dataset, which we convert to the ShareGPT format; tool calls are masked, and consecutive tool calls are merged to one message.
How has this been tested?
I SFT trained a TinyLlama lora with the config below.