
Add Glaive conversation format support #1365

Merged

Conversation

@brianfitzgerald (Contributor) commented Mar 6, 2024

Adds support for the Glaive function-calling dataset. This dataset has two columns, system and chat, and introduces an additional tool role in the conversation; this role contains the output from a tool call.

In this PR we convert the Glaive dataset to the ShareGPT format; tool calls are masked, and consecutive tool calls are merged into one message.
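For illustration, the mapping might look roughly like the sketch below. This is only a sketch: the record content and the USER/ASSISTANT/FUNCTION RESPONSE markers are invented for the example, while the two source columns and the extra tool role follow the PR description.

# Glaive-style source record (illustrative content)
system: "SYSTEM: You are a helpful assistant with access to a get_weather function."
chat: "USER: What's the weather in Paris? ASSISTANT: <functioncall> {...} FUNCTION RESPONSE: {...} ASSISTANT: It is 18C and sunny."

# Converted ShareGPT-style conversation (illustrative)
conversations:
  - from: system
    value: "You are a helpful assistant with access to a get_weather function."
  - from: human
    value: "What's the weather in Paris?"
  - from: gpt
    value: "<functioncall> {...}"
  - from: tool
    value: "{...}"   # output from the tool call; per the PR, tool calls are masked and consecutive ones merged
  - from: gpt
    value: "It is 18C and sunny."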

How has this been tested?

I SFT-trained a TinyLlama LoRA with the config below.

base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
model_type: LlamaForCausalLM
tokenizer_type: LlamaTokenizer
is_llama_derived_model: true

load_in_8bit: true
load_in_4bit: false
strict: false

datasets:
  - path: glaiveai/glaive-function-calling-v2
    type: sharegpt.load_glaive
    conversation: chatml_glaive
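
(The snippet above shows only the dataset section. For context, a minimal sketch of the remaining fields for such a LoRA run follows; the hyperparameter values are illustrative assumptions, not taken from the PR.)

# Illustrative continuation -- values are assumptions, not from the PR
adapter: lora
lora_r: 8
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - v_proj

sequence_len: 2048
micro_batch_size: 2
gradient_accumulation_steps: 4
num_epochs: 1
learning_rate: 0.0002
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
output_dir: ./outputs/tinyllama-glaive-lora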

@winglian (Collaborator) commented Mar 6, 2024

@brianfitzgerald thanks for this PR. The ability to train models for function calling is very much needed. Currently, this PR breaks some of the existing tests for sharegpt datasets. If we can fix the changes so they don't break existing functionality, and add new data fixtures and tests for the new functionality, we can get this merged. Let me know if you'd like some help with this.

@ehartford (Collaborator)

This is an exciting development. I will use this to add Glaive to all my models.

@winglian (Collaborator) left a review comment


Thanks for putting this all together! 🚀

@winglian merged commit b7d8a7d into axolotl-ai-cloud:main on Mar 11, 2024 (6 checks passed)
@brianfitzgerald deleted the glaive-function-calling-support branch on March 12, 2024
@hasan9090 commented Jul 18, 2024

Thanks for the updated functionality. I would like to handle other tool-calling datasets in the same way, for example this version of Glaive, lilacai/glaive-function-calling-v2-sharegpt, which is already in ShareGPT format, or any other custom dataset in that format. When I try this, following the configs mentioned above, I cannot get axolotl to recognize the tool role as a third role, and I don't know whether it is possible or what the config should look like.
From the above, I think setting type to sharegpt.load_glaive would not work, since the dataset is already in ShareGPT format and that loader seems to convert it to ShareGPT first. But I believe I still need to set conversation to "chatml_glaive", since that handles the tool role. The following config does not work for me:

datasets:
  - path: lilacai/glaive-function-calling-v2-sharegpt
    type: sharegpt
    conversation: chatml_glaive
    field_human: human
    field_model: gpt

It would be nice if someone could help.
Thanks, Hasan

djsaunde pushed a commit that referenced this pull request Dec 17, 2024
* Add Glaive conversation format support

* fix black formatting errors

* Fix black and pylint formatting errors

* only set role_key_tool if provided in the dataset constructor

* Update src/axolotl/prompt_strategies/sharegpt.py

Co-authored-by: Wing Lian <[email protected]>

* sharegpt test

* tokenizer test

* fix formatting

---------

Co-authored-by: Wing Lian <[email protected]>