Python: New Feature: Python: Isn’t there an option to turn off only the final response generation in FunctionCallBehavior? #6998

Closed
yuichiromukaiyama opened this issue Jun 28, 2024 · 6 comments
Assignees
Labels
kernel (Issues or pull requests impacting the core kernel) · python (Pull requests for the Python Semantic Kernel)

Comments

@yuichiromukaiyama
Contributor

I’m using function calling in the following way. This implementation is very simple and currently helps me a lot.

(service, settings) = kernel.get_llm_service()

settings.tool_choice = "auto"
settings.function_call_behavior = FunctionCallBehavior.AutoInvokeKernelFunctions()

result = await service.get_chat_message_contents(
    chat_history=chat_history,
    settings=settings,
    kernel=kernel.kernel,
    arguments=KernelArguments(settings=settings),
)

print(result[0].inner_content)

However, for use cases like the following, the model generates a final response after every function-calling step, which significantly increases processing time.

  1. Execute a function call for preprocessing to retrieve specific data in advance (the goal is to gather data from various unspecified sources).
  2. Execute a function call to invoke multiple APIs for auditing the retrieved data.
  3. Execute the final function call.

This is an extreme example, but the point is that a generated response is only needed at step 3. In more complex workflows, steps 1 and 2 don't require response generation; only the external data obtained through the function calls is needed.

In these cases, is it possible with the current functionality to use function calls at steps 1 and 2 without generating the final response, and at step 3, use function calls while also generating the final response with the generative AI?
(Should I use OpenAIChatCompletionBase._process_function_call?)

environment

  • python 3.10
  • semantic-kernel==1.0.3
@markwallace-microsoft markwallace-microsoft added the python (Pull requests for the Python Semantic Kernel) and triage labels Jun 28, 2024
@github-actions github-actions bot changed the title New Feature: Python: Isn’t there an option to turn off only the final response generation in FunctionCallBehavior? Python: New Feature: Python: Isn’t there an option to turn off only the final response generation in FunctionCallBehavior? Jun 28, 2024
@nmoeller
Contributor

nmoeller commented Jun 28, 2024

There are currently three FunctionCallBehaviors supported:

  1. AutoInvokeKernelFunctions(): takes all available functions and executes any calls automatically.
  2. EnableFunctions(auto_invoke=False): lets you control which functions are included or excluded, and auto_invoke controls whether calls are processed automatically. When set to False, you get the FunctionCall back without it being executed.
  3. RequiredFunction(): forces the model to make a function call.

I think you are looking for option 2 here with auto_invoke=False; you can then invoke the functions yourself for steps 1 and 2.
For step 3 you can use the LLM again.

To invoke the function manually you can use kernel.invoke_from_function_call.
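
As a rough sketch (not verbatim from any SK sample; it reuses the service, settings, and chat_history names from the snippet in the issue description, and assumes the 1.0-era FunctionCallContent helpers split_name() and parse_arguments()), steps 1 and 2 could look like this:

# Sketch only: steps 1 and 2 with auto_invoke=False, so the model proposes
# function calls but nothing is executed and no final answer is generated.
from semantic_kernel.connectors.ai.function_call_behavior import FunctionCallBehavior
from semantic_kernel.contents.function_call_content import FunctionCallContent
from semantic_kernel.functions.kernel_arguments import KernelArguments

settings.function_call_behavior = FunctionCallBehavior.EnableFunctions(
    auto_invoke=False, filters={}
)

result = await service.get_chat_message_contents(
    chat_history=chat_history,
    settings=settings,
    kernel=kernel,
    arguments=KernelArguments(settings=settings),
)

# With auto_invoke=False the message items are FunctionCallContent entries
# rather than generated text, so execute them yourself.
for item in result[0].items:
    if isinstance(item, FunctionCallContent):
        plugin_name, function_name = item.split_name()
        args = item.parse_arguments() or {}
        function_result = await kernel.invoke(
            plugin_name=plugin_name,
            function_name=function_name,
            arguments=KernelArguments(**args),
        )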

Hopefully I've got the question right and could help you. Someone from the SK team will probably take a look at this issue later as well.

One more note: there is an update coming to FunctionCallBehavior. It will soon become FunctionChoiceBehavior, with the following options (an illustrative usage sketch follows the list):

  1. FunctionChoiceBehavior.Auto()
  2. FunctionChoiceBehavior.Required()
  3. FunctionChoiceBehavior.NoneInvoke()
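
Just to illustrate how these might be used (the import path and exact usage here are guesses about the upcoming API, so they may differ when it ships):

# Illustrative only: setting the upcoming options on the prompt execution
# settings. Import path and attribute name are assumptions, not confirmed here.
from semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior

settings.function_choice_behavior = FunctionChoiceBehavior.Auto()        # model decides whether to call functions
settings.function_choice_behavior = FunctionChoiceBehavior.Required()    # model must call a function
settings.function_choice_behavior = FunctionChoiceBehavior.NoneInvoke()  # functions are advertised but never invoked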

@moonbox3
Contributor

Hi @yuichiromukaiyama, if I understand your request properly, it seems you'd be able to use filters to help achieve your goal. The filters can give you an "early out" after function calling that will not call the model for a final answer. Here is a sample of that.

If I misunderstood your request then please help clarify.

@moonbox3 moonbox3 self-assigned this Jun 28, 2024
@moonbox3 moonbox3 added the kernel (Issues or pull requests impacting the core kernel) label and removed the triage label Jun 28, 2024
@yuichiromukaiyama
Contributor Author

@moonbox3

Thank you! Using the example you provided, I was able to achieve the behavior I wanted. (The code I wrote is almost identical to the sample you shared.)
Just one more thing: using the filter below, I was able to get the tool call information before inference. Does Semantic Kernel have a feature to execute a plugin based on this tool call information?

# Imports assumed for this snippet (paths per semantic-kernel 1.x):
from semantic_kernel.filters.auto_function_invocation.auto_function_invocation_context import (
    AutoFunctionInvocationContext,
)
from semantic_kernel.filters.filter_types import FilterTypes
from semantic_kernel.functions.function_result import FunctionResult

@kernel.filter(FilterTypes.AUTO_FUNCTION_INVOCATION)
async def auto_function_invocation_filter(context: AutoFunctionInvocationContext, next):
    """A filter that will be called for each function call in the response."""
    print("\nAuto function invocation filter")
    print(f"Function: {context.function.name}")
    print(f"Request sequence: {context.request_sequence_index}")
    print(f"Function sequence: {context.function_sequence_index}")

    # as an example: the function calls proposed in the latest model response
    function_calls = context.chat_history.messages[-1].items
    print(f"Number of function calls: {len(function_calls)}")
    # if we don't call next, it will skip this function, and go to the next one
    await next(context)
    result = context.function_result
    for fc in function_calls:
        if fc.plugin_name == "math":
            context.function_result = FunctionResult(
                function=result.function, value="Stop trying to ask me to do math, I don't like it!"
            )
            # terminate=True ends the auto-invoke loop here, giving the
            # "early out" before the model generates a final answer
            context.terminate = True

Alternatively, would it be appropriate to parse the obtained plugin name and argument strings myself and execute them as follows? For example, the code would look like this:

await kernel.invoke(
    plugin_name="${tool call plugin name}",
    function_name="${tool call function name}",
    arguments=KernelArguments(**json.loads("${tool call arguments}")),
)

@nmoeller
Thank you for the information! The function you mentioned at the end sounds very promising. This might be the feature I was looking for and mentioned above. However, I couldn’t pinpoint its exact location. (kernel.invoke_from_function_call)

@nmoeller
Contributor

nmoeller commented Jul 1, 2024

@yuichiromukaiyama you can find the function here.
It is pretty new as far as I know, and may not be available in your version yet.

@moonbox3
Contributor

moonbox3 commented Jul 1, 2024

Hi @yuichiromukaiyama, @nmoeller is correct -- we have added the new kernel method invoke_function_call(), which will be available in our next release (either today or tomorrow) -- a rough usage sketch follows below. Thanks for your help here, @nmoeller.

As a side note: we've updated our function calling abstractions from FunctionCallBehavior to FunctionChoiceBehavior. This isn't a breaking change, so FunctionCallBehavior will continue to work, but we suggest updating to FunctionChoiceBehavior as soon as possible: the new abstraction allows function calling to be configured for new models (that support it) when we release the connectors. Azure's Model-as-a-Service will implement this new FunctionChoiceBehavior for models that support function calling.
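
As a rough sketch of using the new invoke_function_call() mentioned above (the parameter names here are assumptions based on the description in this thread, so check the source for the exact signature; result and chat_history reuse the names from the earlier snippets):

# Hypothetical sketch: execute each function call the model proposed via the
# new kernel method. Parameter names are assumed, not confirmed here.
from semantic_kernel.contents.function_call_content import FunctionCallContent

for item in result[0].items:
    if isinstance(item, FunctionCallContent):
        await kernel.invoke_function_call(
            function_call=item,         # the FunctionCallContent from the model
            chat_history=chat_history,  # the function result gets appended here
        )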

@moonbox3
Contributor

moonbox3 commented Jul 3, 2024

Hi @yuichiromukaiyama, is there more action needed on this issue? If so, please feel free to re-open. Closing for now.

@moonbox3 moonbox3 closed this as completed Jul 3, 2024