Python: New Feature: Python: Isn’t there an option to turn off only the final response generation in FunctionCallBehavior? #6998

Closed
yuichiromukaiyama opened this issue Jun 28, 2024 · 6 comments
Assignees
Labels
kernel (Issues or pull requests impacting the core kernel) · python (Pull requests for the Python Semantic Kernel)

Comments

@yuichiromukaiyama
Contributor

I’m using function calling in the following way. This implementation is very simple and currently helps me a lot.

(service, settings) = kernel.get_llm_service()

settings.tool_choice = "auto"
settings.function_call_behavior = FunctionCallBehavior.AutoInvokeKernelFunctions()

result = await service.get_chat_message_contents(
    chat_history=chat_history,
    settings=settings,
    kernel=kernel.kernel,
    arguments=KernelArguments(settings=settings),
)

print(result[0].inner_content)

However, for use cases like the following, the model generates a final response after every function-calling step, which significantly increases processing time.

  1. Execute a function call for preprocessing to retrieve specific data in advance (the goal is to gather data from various unspecified sources).
  2. Execute a function call to invoke multiple APIs for auditing the retrieved data.
  3. Execute the final function call.

This is an extreme example, but the point is that a generated response is only needed at step 3. In more complex workflows, steps 1 and 2 don't require response generation; only the external data obtained through the function calls is needed.

In these cases, is it possible with the current functionality to use function calls at steps 1 and 2 without generating the final response, and at step 3, use function calls while also generating the final response with the generative AI?
(Should I use OpenAIChatCompletionBase._process_function_call?)

environment

  • python 3.10
  • semantic-kernel==1.0.3
@markwallace-microsoft markwallace-microsoft added the python (Pull requests for the Python Semantic Kernel) and triage labels Jun 28, 2024
@github-actions github-actions bot changed the title New Feature: Python: Isn’t there an option to turn off only the final response generation in FunctionCallBehavior? Python: New Feature: Python: Isn’t there an option to turn off only the final response generation in FunctionCallBehavior? Jun 28, 2024
@nmoeller
Contributor

nmoeller commented Jun 28, 2024

There are currently three FunctionCallBehaviors supported:

  1. AutoInvokeKernelFunctions(): takes all available functions and executes any calls automatically.
  2. EnableFunctions(auto_invoke=False): lets you control which functions are included or excluded, and auto_invoke controls whether calls are processed automatically. When set to False, you get the FunctionCall back without it being executed.
  3. RequiredFunction(): forces the model to make a function call.

I think you are looking for option 2 here with auto_invoke=False; you can then invoke the functions yourself for steps 1 and 2.
For step 3 you can use the LLM again.

To invoke the function manually you can use kernel.invoke_from_function_call.
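
As a rough sketch (not verbatim from any SK sample; it reuses the service, settings, and chat_history names from the snippet in the issue description, and assumes the 1.0-era FunctionCallContent helpers split_name() and parse_arguments()), steps 1 and 2 could look like this:

# Sketch only: steps 1 and 2 with auto_invoke=False, so the model proposes
# function calls but nothing is executed and no final answer is generated.
from semantic_kernel.connectors.ai.function_call_behavior import FunctionCallBehavior
from semantic_kernel.contents.function_call_content import FunctionCallContent
from semantic_kernel.functions.kernel_arguments import KernelArguments

settings.function_call_behavior = FunctionCallBehavior.EnableFunctions(
    auto_invoke=False, filters={}
)

result = await service.get_chat_message_contents(
    chat_history=chat_history,
    settings=settings,
    kernel=kernel,
    arguments=KernelArguments(settings=settings),
)

# With auto_invoke=False the message items are FunctionCallContent entries
# rather than generated text, so execute them yourself.
for item in result[0].items:
    if isinstance(item, FunctionCallContent):
        plugin_name, function_name = item.split_name()
        args = item.parse_arguments() or {}
        function_result = await kernel.invoke(
            plugin_name=plugin_name,
            function_name=function_name,
            arguments=KernelArguments(**args),
        )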

Hopefully I've got the question right and could help you. Someone from the SK team will probably take a look at this issue later as well.

One more note: there is an update coming to FunctionCallBehavior. It will soon become FunctionChoiceBehavior, with the following options (an illustrative usage sketch follows the list):

  1. FunctionChoiceBehavior.Auto()
  2. FunctionChoiceBehavior.Required()
  3. FunctionChoiceBehavior.NoneInvoke()
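
Just to illustrate how these might be used (the import path and exact usage here are guesses about the upcoming API, so they may differ when it ships):

# Illustrative only: setting the upcoming options on the prompt execution
# settings. Import path and attribute name are assumptions, not confirmed here.
from semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior

settings.function_choice_behavior = FunctionChoiceBehavior.Auto()        # model decides whether to call functions
settings.function_choice_behavior = FunctionChoiceBehavior.Required()    # model must call a function
settings.function_choice_behavior = FunctionChoiceBehavior.NoneInvoke()  # functions are advertised but never invoked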

@moonbox3
Contributor

Hi @yuichiromukaiyama, if I understand your request properly, it seems you'd be able to use filters to help achieve your goal. The filters can give you an "early out" after function calling that will not call the model for a final answer. Here is a sample of that.

If I misunderstood your request then please help clarify.

@moonbox3 moonbox3 self-assigned this Jun 28, 2024
@moonbox3 moonbox3 added the kernel (Issues or pull requests impacting the core kernel) label and removed the triage label Jun 28, 2024
@yuichiromukaiyama
Contributor Author

@moonbox3

Thank you! Using the example you provided, I was able to achieve the behavior I wanted. (The code I wrote is almost identical to the sample you shared.)
Just one more thing: using the filter below, I was able to get the tool call information before inference. Does Semantic Kernel have a feature to execute a plugin based on this tool call information?

# Imports assumed for this snippet (paths per semantic-kernel 1.x):
from semantic_kernel.filters.auto_function_invocation.auto_function_invocation_context import (
    AutoFunctionInvocationContext,
)
from semantic_kernel.filters.filter_types import FilterTypes
from semantic_kernel.functions.function_result import FunctionResult

@kernel.filter(FilterTypes.AUTO_FUNCTION_INVOCATION)
async def auto_function_invocation_filter(context: AutoFunctionInvocationContext, next):
    """A filter that will be called for each function call in the response."""
    print("\nAuto function invocation filter")
    print(f"Function: {context.function.name}")
    print(f"Request sequence: {context.request_sequence_index}")
    print(f"Function sequence: {context.function_sequence_index}")

    # as an example: the function calls proposed in the latest model response
    function_calls = context.chat_history.messages[-1].items
    print(f"Number of function calls: {len(function_calls)}")
    # if we don't call next, it will skip this function, and go to the next one
    await next(context)
    result = context.function_result
    for fc in function_calls:
        if fc.plugin_name == "math":
            context.function_result = FunctionResult(
                function=result.function, value="Stop trying to ask me to do math, I don't like it!"
            )
            # terminate=True ends the auto-invoke loop here, giving the
            # "early out" before the model generates a final answer
            context.terminate = True

Alternatively, would it be appropriate to parse the obtained plugin name and argument strings myself and execute them as follows? For example, the code would look like this:

await kernel.invoke(
    plugin_name="${tool call plugin name}",
    function_name="${tool call function name}",
    arguments=KernelArguments(**json.loads("${tool call arguments}")),
)

@nmoeller
Thank you for the information! The function you mentioned at the end sounds very promising. This might be the feature I was looking for and mentioned above. However, I couldn’t pinpoint its exact location. (kernel.invoke_from_function_call)

@nmoeller
Contributor

nmoeller commented Jul 1, 2024

@yuichiromukaiyama you can find the function here.
It is pretty new as far as I know, and may not be available in your version yet.

@moonbox3
Contributor

moonbox3 commented Jul 1, 2024

Hi @yuichiromukaiyama, @nmoeller is correct -- we have added the new kernel method invoke_function_call(), which will be available in our next release (either today or tomorrow) -- a rough usage sketch follows below. Thanks for your help here, @nmoeller.

As a side note: we've updated our function calling abstractions from FunctionCallBehavior to FunctionChoiceBehavior. This isn't a breaking change, so FunctionCallBehavior will continue to work, but we suggest updating to FunctionChoiceBehavior as soon as possible: the new abstraction allows function calling to be configured for new models (that support it) when we release the connectors. Azure's Model-as-a-Service will implement this new FunctionChoiceBehavior for models that support function calling.
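
As a rough sketch of using the new invoke_function_call() mentioned above (the parameter names here are assumptions based on the description in this thread, so check the source for the exact signature; result and chat_history reuse the names from the earlier snippets):

# Hypothetical sketch: execute each function call the model proposed via the
# new kernel method. Parameter names are assumed, not confirmed here.
from semantic_kernel.contents.function_call_content import FunctionCallContent

for item in result[0].items:
    if isinstance(item, FunctionCallContent):
        await kernel.invoke_function_call(
            function_call=item,         # the FunctionCallContent from the model
            chat_history=chat_history,  # the function result gets appended here
        )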

@moonbox3
Contributor

moonbox3 commented Jul 3, 2024

Hi @yuichiromukaiyama, is there more action needed on this issue? If so, please feel free to re-open. Closing for now.

@moonbox3 moonbox3 closed this as completed Jul 3, 2024