[Model] tool calling support for ibm-granite/granite-20b-functioncalling (vllm-project#8339)

Signed-off-by: Max de Bayser <[email protected]>
Co-authored-by: Max de Bayser <[email protected]>
Co-authored-by: Maximilien de Bayser <[email protected]>
3 people authored Oct 29, 2024
1 parent 67bdf8e commit 882a1ad
Showing 7 changed files with 456 additions and 28 deletions.
21 changes: 20 additions & 1 deletion docs/source/serving/openai_compatible_server.md
@@ -185,7 +185,9 @@ from HuggingFace; and you can find an example of this in a `tokenizer_config.jso

If your favorite tool-calling model is not supported, please feel free to contribute a parser & tool use chat template!


#### Hermes Models (`hermes`)

All Nous Research Hermes-series models newer than Hermes 2 Pro should be supported.
* `NousResearch/Hermes-2-Pro-*`
* `NousResearch/Hermes-2-Theta-*`
@@ -197,7 +199,9 @@ step in their creation_.

Flags: `--tool-call-parser hermes`


#### Mistral Models (`mistral`)

Supported models:
* `mistralai/Mistral-7B-Instruct-v0.3` (confirmed)
* Additional mistral function-calling models are compatible as well.
@@ -216,7 +220,9 @@ when tools are provided, that results in much better reliability when working wi

Recommended flags: `--tool-call-parser mistral --chat-template examples/tool_chat_template_mistral_parallel.jinja`


#### Llama Models (`llama3_json`)

Supported models:
* `meta-llama/Meta-Llama-3.1-8B-Instruct`
* `meta-llama/Meta-Llama-3.1-70B-Instruct`
@@ -236,7 +242,9 @@ it works better with vLLM.

Recommended flags: `--tool-call-parser llama3_json --chat-template examples/tool_chat_template_llama3_json.jinja`


#### InternLM Models (`internlm`)

Supported models:
* `internlm/internlm2_5-7b-chat` (confirmed)
* Additional internlm2.5 function-calling models are compatible as well
@@ -246,6 +254,7 @@ Known issues:

Recommended flags: `--tool-call-parser internlm --chat-template examples/tool_chat_template_internlm2_tool.jinja`


#### Jamba Models (`jamba`)
AI21's Jamba-1.5 models are supported.
* `ai21labs/AI21-Jamba-1.5-Mini`
@@ -255,6 +264,16 @@ AI21's Jamba-1.5 models are supported.
Flags: `--tool-call-parser jamba`


#### IBM Granite (`granite-20b-fc`)

Supported models:
* `ibm-granite/granite-20b-functioncalling`

Flags: `--tool-call-parser granite-20b-fc --chat-template examples/tool_chat_template_granite_20b_fc.jinja`

The example chat template deviates slightly from the original on Hugging Face, which is not vLLM-compatible. It blends function description elements from the Hermes template and uses the same system prompt as the "Response Generation" mode from [the paper](https://arxiv.org/abs/2407.00121). Parallel function calls are supported.
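
As a rough illustration of what parallel function calls look like on the wire, the sketch below splits a completion into individual calls, assuming each call is emitted as `<function_call> ` followed by a JSON object with `name` and `arguments` keys (the format the example chat template uses for assistant turns). The helper name `parse_function_calls` is hypothetical and this is not the `Granite20bFCToolParser` implementation, only a minimal stand-in for the idea.

```python
import json
from typing import Any, Dict, List

FUNCTION_CALL_TOKEN = "<function_call>"


def parse_function_calls(text: str) -> List[Dict[str, Any]]:
    """Collect every JSON object that follows a <function_call> marker.

    Hypothetical helper for illustration; not part of vLLM.
    """
    decoder = json.JSONDecoder()
    calls: List[Dict[str, Any]] = []
    pos = text.find(FUNCTION_CALL_TOKEN)
    while pos != -1:
        start = pos + len(FUNCTION_CALL_TOKEN)
        # raw_decode does not skip leading whitespace, so do it here.
        while start < len(text) and text[start].isspace():
            start += 1
        try:
            # raw_decode returns the object and the index where it ended.
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            break  # stop at the first malformed or truncated call
        calls.append(obj)
        pos = text.find(FUNCTION_CALL_TOKEN, end)
    return calls


completion = (
    '<function_call> {"name": "get_weather", "arguments": {"city": "Dallas"}}'
    '<function_call> {"name": "get_time", "arguments": {"tz": "CST"}}'
)
calls = parse_function_calls(completion)
```

Text without the marker yields an empty list, which a caller could treat as an ordinary natural-language response.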


### How to write a tool parser plugin

A tool parser plugin is a Python file containing one or more ToolParser implementations. You can write a ToolParser similar to the `Hermes2ProToolParser` in vllm/entrypoints/openai/tool_parsers/hermes_tool_parser.py.
@@ -312,5 +331,5 @@ Then you can use this plugin in the command line like this.
--tool-parser-plugin <absolute path of the plugin file> \
--tool-call-parser example \
--chat-template <your chat template>
```

130 changes: 130 additions & 0 deletions examples/tool_chat_template_granite_20b_fc.jinja
@@ -0,0 +1,130 @@
{%- macro json_to_python_type(json_spec) %}
{%- set basic_type_map = {
"string": "str",
"number": "float",
"integer": "int",
"boolean": "bool"
} %}

{%- if basic_type_map[json_spec.type] is defined %}
{{- basic_type_map[json_spec.type] }}
{%- elif json_spec.type == "array" %}
{{- "list[" + json_to_python_type(json_spec["items"]) + "]" }}
{%- elif json_spec.type == "object" %}
{%- if json_spec.additionalProperties is defined %}
{{- "dict[str, " + json_to_python_type(json_spec.additionalProperties) + ']' }}
{%- else %}
{{- "dict" }}
{%- endif %}
{%- elif json_spec.type is iterable %}
{{- "Union[" }}
{%- for t in json_spec.type %}
{{- json_to_python_type({"type": t}) }}
{%- if not loop.last %}
{{- "," }}
{%- endif %}
{%- endfor %}
{{- "]" }}
{%- else %}
{{- "Any" }}
{%- endif %}
{%- endmacro %}

{%- if not full_function_description is defined %}
{%- set full_function_description = false %}
{%- endif %}

{%- macro full_description(tool) %}
{{- tool.name + '(' }}
{%- if tool.parameters is defined %}
{%- for param_name, param_fields in tool.parameters.properties|items %}
{{- param_name + ": " + json_to_python_type(param_fields) }}
{%- if not loop.last %}
{{- ", " }}
{%- endif %}
{%- endfor %}
{%- endif %}
{{- ")" }}
{%- if tool.return is defined %}
{{- " -> " + json_to_python_type(tool.return) }}
{%- endif %}
{{- " - " + tool.description + "\n\n" }}
{%- if tool.parameters is defined %}
{%- for param_name, param_fields in tool.parameters.properties|items %}
{%- if loop.first %}
{{- " Args:\n" }}
{%- endif %}
{{- " " + param_name + "(" + json_to_python_type(param_fields) + "): " + param_fields.description|trim }}
{%- endfor %}
{%- endif %}
{%- if tool.return is defined and tool.return.description is defined %}
{{- "\n Returns:\n " + tool.return.description }}
{%- endif %}
{{- '"' }}
{%- endmacro %}

{%- macro simple_description(tool) %}
{{- tool.description }}
{%- endmacro %}

{%- macro function_description(tool) %}
{%- if full_function_description %}
{{- full_description(tool) }}
{%- else %}
{{- simple_description(tool) }}
{%- endif %}
{%- endmacro %}

{%- if messages[0]["role"] == "system" %}
{%- set sys_prompt = messages[0]["content"] %}
{%- set loop_messages = messages[1:] %}
{%- else %}
{%- set loop_messages = messages %}
{% set sys_prompt = 'You are a helpful assistant with access to the following function calls. Your task is to understand the given conversation with function calls and responses and generate natural language response as the ASSISTANT to continue the conversation. You may use the following function calls to understand how to respond to the user query.' %}
{%- endif %}

{{ 'SYSTEM: ' + sys_prompt }}
{% if tools is iterable and tools | length > 0 %}
<|function_call_library|>
{%- for tool in tools %}
{%- if tool.function is defined %}
{%- set tool = tool.function %}
{%- endif %}
{{- '{"name": "' + tool.name + '", ' }}
{{- '"description": "' + function_description(tool) }}
{{- ', "parameters": ' }}
{%- if not tool.parameters is defined or tool.parameters.properties | length == 0 %}
{{- "{}" }}
{%- else %}
{{- tool.parameters|tojson }}
{%- endif %}
{{- "}" }}
{%- if not loop.last %}
{{- "\n" }}
{%- endif %}
{%- endfor %}
If none of the functions are relevant or the given question lacks the parameters required by the function, please output \"<function_call> {\"name\": \"no_function\", \"arguments\": {}}\".
{%- endif %}



{% for message in loop_messages %}
{% if message['role'] == 'user' %}
{{- '\nUSER: ' + message['content'] }}
{% elif message['role'] == 'assistant' and message.tool_calls is defined %}
{{- '\nASSISTANT:' }}
{% for tc in message.tool_calls %}
{{- '<function_call> ' + {'name': tc.function.name, 'arguments': tc.function.arguments}|tojson }}
{% endfor %}
{{- '<|endoftext|>' }}
{% elif message['role'] == 'assistant' %}
{{- '\nASSISTANT: ' + message['content'] + ' <|endoftext|>' }}
{% elif message['role'] == 'tool' %}
{{- '<function_response> ' + message['content'] }}
{%- else %}
{{- raise_exception("Unexpected combination of role and message content") }}
{% endif %}
{% if loop.last and add_generation_prompt %}
{{- '\nASSISTANT: ' }}
{% endif %}
{% endfor %}
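
For readers less familiar with Jinja, the type mapping that the `json_to_python_type` macro is meant to perform can be sketched in plain Python. This is an illustrative approximation and not code from the commit: it assumes array element types come from the schema's `items` key and treats only list-valued `type` fields as unions.

```python
from typing import Any, Dict

BASIC_TYPE_MAP = {
    "string": "str",
    "number": "float",
    "integer": "int",
    "boolean": "bool",
}


def json_to_python_type(spec: Any) -> str:
    """Map a JSON Schema fragment to a Python-style type string (sketch)."""
    if not isinstance(spec, dict):
        return "Any"
    json_type = spec.get("type")
    if isinstance(json_type, str) and json_type in BASIC_TYPE_MAP:
        return BASIC_TYPE_MAP[json_type]
    if json_type == "array":
        # Assumption: element type lives under the schema's "items" key.
        return "list[" + json_to_python_type(spec.get("items")) + "]"
    if json_type == "object":
        if "additionalProperties" in spec:
            inner = json_to_python_type(spec["additionalProperties"])
            return "dict[str, " + inner + "]"
        return "dict"
    if isinstance(json_type, (list, tuple)):
        # A list of types becomes a Union, joined with "," as in the macro.
        parts = [json_to_python_type({"type": t}) for t in json_type]
        return "Union[" + ",".join(parts) + "]"
    return "Any"


signature_type = json_to_python_type(
    {"type": "array", "items": {"type": "string"}}
)
```

These strings only appear in the prompt when `full_function_description` is enabled; with the default setting the template emits the plain tool description instead.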
12 changes: 12 additions & 0 deletions tests/tool_use/utils.py
@@ -88,6 +88,18 @@ def ensure_system_prompt(messages: List[Dict[str, Any]],
"without calling a tool. DO NOT CALL A TOOL THAT IS IRRELEVANT "
"to the user's question - just respond to it normally."
},
## FIXME: temporarily disabled due to lack of hardware specification
## for individual runs
#"granite20b": {
# "model":
# "ibm-granite/granite-20b-functioncalling",
# "arguments": [
# "--tool-call-parser", "granite-20b-fc", "--chat-template",
# str(VLLM_PATH / "examples/tool_chat_template_granite_20b_fc.jinja")
# ],
# "supports_parallel":
# False,
#},
"internlm": {
"model":
"internlm/internlm2_5-7b-chat",
7 changes: 4 additions & 3 deletions vllm/entrypoints/openai/tool_parsers/__init__.py
@@ -1,12 +1,13 @@
from .abstract_tool_parser import ToolParser, ToolParserManager
from .granite_20b_fc_tool_parser import Granite20bFCToolParser
from .hermes_tool_parser import Hermes2ProToolParser
from .internlm2_tool_parser import Internlm2ToolParser
from .jamba_tool_parser import JambaToolParser
from .llama_tool_parser import Llama3JsonToolParser
from .mistral_tool_parser import MistralToolParser

__all__ = [
"ToolParser", "ToolParserManager", "Hermes2ProToolParser",
"MistralToolParser", "Internlm2ToolParser", "Llama3JsonToolParser",
"JambaToolParser"
"ToolParser", "ToolParserManager", "Granite20bFCToolParser",
"Hermes2ProToolParser", "MistralToolParser", "Internlm2ToolParser",
"Llama3JsonToolParser", "JambaToolParser"
]