Gemini models via Openrouter not supported #5621

Open
ravishqureshi opened this issue Feb 19, 2025 · 15 comments

@ravishqureshi

ravishqureshi commented Feb 19, 2025

What happened?

The following code snippet works:

from autogen_ext.models.openai import OpenAIChatCompletionClient

config = {
    "model": "anthropic/claude-3.5-sonnet",
    "base_url": "https://openrouter.ai/api/v1",
    "model_info": {
        "vision": True,
        "function_calling": True,
        "json_output": False,
        "family": "claude-3.5-sonnet",
    },
}
model = config["model"]
api_key = settings.OPENROUTER_KEY  # my OpenRouter API key
base_url = config["base_url"]
model_info = config.get("model_info", {})

model_client = OpenAIChatCompletionClient(
    model=model,
    api_key=api_key,
    base_url=base_url,
    model_info=model_info,
)
# do other stuff like create the agent etc...
##### rest of the code #####
response = await agent.on_messages(messages=messages, cancellation_token=cancellation_token)

The above code also works when we change
"model": "anthropic/claude-3.5-sonnet" -> "model": "openai/gpt-4o-2024-11-20"
and
"family": "claude-3.5-sonnet" -> "family": "gpt-4o"

However, when I change the model to Gemini Flash from here - https://openrouter.ai/google/gemini-2.0-flash-001
i.e. "model": "google/gemini-2.0-flash-001"
and
"family": "gemini-2.0-flash" (picked up from https://microsoft.github.io/autogen/stable//reference/python/autogen_core.models.html#autogen_core.models.ModelInfo), the code fails (traceback below). I tried with family "unknown" as well.

venv_autogen_latest/lib/python3.12/site-packages/autogen_agentchat/agents/_assistant_agent.py:416: UserWarning: Resolved model mismatch: google/gemini-2.0-flash-001 != None. Model mapping in autogen_ext.models.openai may be incorrect.
  model_result = await self._model_client.create(
=== Exception during agent.on_messages call ===
'NoneType' object is not subscriptable
Traceback (most recent call last):
  File "/Users/ravishq/Library/CloudStorage/[email protected]/My Drive/iamai/autogen-ms/agent_backyard.py", line 38, in run_task
    response = await agent.on_messages(messages=messages,cancellation_token=cancellation_token)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ravishq/Library/CloudStorage/[email protected]/My Drive/iamai/venv_autogen_latest/lib/python3.12/site-packages/autogen_agentchat/agents/_assistant_agent.py", line 370, in on_messages
    async for message in self.on_messages_stream(messages, cancellation_token):
  File "/Users/ravishq/Library/CloudStorage/[email protected]/My Drive/iamai/venv_autogen_latest/lib/python3.12/site-packages/autogen_agentchat/agents/_assistant_agent.py", line 416, in on_messages_stream
    model_result = await self._model_client.create(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ravishq/Library/CloudStorage/[email protected]/My Drive/iamai/venv_autogen_latest/lib/python3.12/site-packages/autogen_ext/models/openai/_openai_client.py", line 569, in create
    choice: Union[ParsedChoice[Any], ParsedChoice[BaseModel], Choice] = result.choices[0]
                                                                        ~~~~~~~~~~~~~~^^^
TypeError: 'NoneType' object is not subscriptable
Error: 'NoneType' object is not subscriptable

Which packages was the bug in?

Python AgentChat (autogen-agentchat>=0.4.0)

AutoGen library version.

Python 0.4.7

Other library version.

No response

Model used

gpt4o, sonnet 3.5, gemini flash 2.0

Model provider

OpenRouter

Other model provider

No response

Python version

3.12

.NET version

None

Operating system

MacOS

@jackgerrits
Member

jackgerrits commented Feb 20, 2025

@ekzhu do you know if OpenRouter presents all models as openai compatible or is gemini different?

@ravishqureshi
Author

@jackgerrits - OpenRouter's claim to fame is that it provides a unified API, and all models can be accessed via an OpenAI-compatible API schema.

https://openrouter.ai/docs/quickstart
https://openrouter.ai/docs/api-reference/overview
Verbatim from the above link:
OpenRouter’s request and response schemas are very similar to the OpenAI Chat API, with a few small differences. At a high level, OpenRouter normalizes the schema across models and providers so you only need to learn one.

Another verbatim quote, from this link - https://openrouter.ai/openai/o1/api:
OpenRouter provides an OpenAI-compatible completion API to 300+ models & providers that you can call directly, or using the OpenAI SDK. Additionally, some third-party SDKs are available.

They do say "very similar", but there is a reason the AI community is doubling down on OpenRouter and LiteLLM: we want a single interface for all AI models so that integrations stay model-agnostic. Hope this helps. If you find that Gemini's response is not OpenAI-compatible, print the logs here and I will log a bug with OpenRouter. However, it seems to me that this is not even about the API response format; there is some other mapping problem in AutoGen:
UserWarning: Resolved model mismatch: google/gemini-2.0-flash-001 != None. Model mapping in autogen_ext.models.openai may be incorrect.

(Copy-pasting the exact error from this ticket again so that you can look for it in your codebase.)

Waiting for this fix! Let's keep building!! Hyped that MS is putting some of the best minds in the world on this, so I'm doubling down on AutoGen over LangChain, CrewAI, and smolagents... LFG!

@jackgerrits
Member

UserWarning: Resolved model mismatch: google/gemini-2.0-flash-001 != None. Model mapping in autogen_ext.models.openai may be incorrect.

Yeah, I think this warning is probably okay, but I could be wrong here.

The error in the issue indicates that result.choices is None. It would be good to reduce the repro down to just a model client call.

I don't have access to OpenRouter at the moment, so I will wait to see what @ekzhu thinks.

@Zochory

Zochory commented Feb 21, 2025

You should try to include a header like this:

Image

In that example it worked; I have not tried it with OpenRouter yet.

@ekzhu
Collaborator

ekzhu commented Feb 21, 2025

I am getting the same error from OpenRouter when using Claude models. But it works with OpenAI models.

See my response in #5583

At this point I don't know what the cause is.

@ravishqureshi
Author

ravishqureshi commented Feb 21, 2025

@ekzhu I am hoping that you are changing the family in your code when trying Claude in the code snippet you pasted here - #5583

Because Claude works just fine. This is the config that works:

{
    "model": "anthropic/claude-3.5-sonnet",
    "base_url": "https://openrouter.ai/api/v1",
    "api_type": "anthropic",
    "model_info": {
        "vision": True,
        "function_calling": True,
        "json_output": False,
        "family": "claude-3.5-sonnet",
    },
}

ignore "api_type" key. Let me know if using above as well doesnt work for you for Claude models. Like i said, Claude works, Gemini doesnt. So we need to be on same page in terms of "reproducibility" of this issue else it will die a slow death and so would my project :D

Awaiting for your response on this...

@ekzhu
Collaborator

ekzhu commented Feb 21, 2025

It's more about the model name than the model family. Have you tried calling OpenRouter directly using the openai library? From the error message, it seems the failure happened because the server returned None in result.choices.
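
A minimal sketch of what I mean (untested; the key is a placeholder and the model id is the one from this issue):

from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="sk-or-...")

resp = client.chat.completions.create(
    model="google/gemini-2.0-flash-001",
    messages=[{"role": "user", "content": "Say hello"}],
)
print(resp.choices)  # if this is already None, the problem is on the OpenRouter/provider side
print(resp)          # inspect the full response body for an embedded error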

@philippHorn
Contributor

philippHorn commented Feb 22, 2025

I had the same error and inspected the result where the traceback comes from:

result.model_extra
Out[2]: 
{'error': {'message': 'Provider returned error',
  'code': 400,
  'metadata': {'raw': '{"type":"error","error":{"type":"invalid_request_error","message":"Requests which include `tool_use` or `tool_result` blocks must define tools."}}',
   'provider_name': 'Google',
   'isDownstreamPipeClean': True,
   'isErrorUpstreamFault': False}},
 'user_id': 'xxx'}

@philippHorn
Contributor

I looked a bit more. I think the problem is:

  • OpenAI allows tool calls to be in the message history, even when the current API call does not include tools for the model
  • Some OpenRouter models seem not to allow this

These were the messages sent to the LLM when I had the error:

[{'content': 'You are a helpful assistant.', 'role': 'system'},
 {'content': 'What is the weather in New York?',
  'role': 'user',
  'name': 'user'},
 {'tool_calls': [{'id': 'toolu_vrtx_01UonpGhPPQbzMNj8JaSREjv',
    'function': {'arguments': '{"city": "New York"}', 'name': 'get_weather'},
    'type': 'function'}],
  'role': 'assistant',
  'name': 'weather_agent'},
 {'content': 'The weather in New York is 73 degrees and Sunny.',
  'role': 'tool',
  'tool_call_id': 'toolu_vrtx_01UonpGhPPQbzMNj8JaSREjv'}]
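
To check this outside of AutoGen, a sketch like the following (untested; key and model are placeholders) replays the same history through the plain openai client against OpenRouter, without a tools parameter:

from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="sk-or-...")

history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the weather in New York?", "name": "user"},
    {
        "role": "assistant",
        "name": "weather_agent",
        "tool_calls": [
            {
                "id": "toolu_vrtx_01UonpGhPPQbzMNj8JaSREjv",
                "type": "function",
                "function": {"name": "get_weather", "arguments": '{"city": "New York"}'},
            }
        ],
    },
    {
        "role": "tool",
        "tool_call_id": "toolu_vrtx_01UonpGhPPQbzMNj8JaSREjv",
        "content": "The weather in New York is 73 degrees and Sunny.",
    },
]

# No tools= argument, mirroring the reflection call that fails.
resp = client.chat.completions.create(model="anthropic/claude-3.5-sonnet", messages=history)
print(resp.choices)      # expected: None when the provider rejects the request
print(resp.model_extra)  # OpenRouter puts the "Provider returned error" payload here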

@ravishqureshi
Author

@philippHorn thanks for sharing the messages. This means the issue lies with OpenRouter. Maybe I will write a wrapper in my code so that if the model is Gemini I don't call agent.on_messages(...), but instead get the response using Gemini's official Python package and attach whatever response I get to the agent's state in the format it expects... still thinking about how to work around it.

@pengjunfeng11
Contributor

Can I participate in this issue and submit a PR to fix it? I want to help the community.

@jackgerrits jackgerrits added this to the 0.4.x-python milestone Feb 25, 2025
@philippHorn
Contributor

@ravishqureshi You're welcome. By the way, the issue is not specific to Gemini; I have it on Claude as well.

I'd be curious to understand this a bit.
What seems to happen:

  • The agent makes the first LLM call with the tools included for the LLM to select
  • The LLM responds with the tool call and the tool is run
  • Another LLM call is made, where the output from the tool is fed into the LLM as a tool message, but without giving the LLM the tool to call anymore

Why are the tools not available in the second LLM call? Is it to force the LLM to compile an answer instead and to prevent too many LLM API calls? Does that mean the agent can't call the same tool multiple times within a run?
I think the tradeoff here is that the LLM loses access to the tool description; not sure if that is a big problem.

Here is a full script that reproduces the issue; you just need to swap in your own OpenRouter API key:

import asyncio
from pprint import pprint

from autogen_agentchat.agents import AssistantAgent
from autogen_core.models import ModelFamily, ModelInfo
from autogen_ext.models.openai import (
    OpenAIChatCompletionClient,
)
from langsmith.wrappers import wrap_openai

info: ModelInfo = {
    "vision": False,
    "function_calling": True,
    "json_output": False,
    "family": ModelFamily.CLAUDE_3_5_SONNET,
}
model_client = OpenAIChatCompletionClient(
    model="anthropic/claude-3.5-sonnet",
    api_key="xxxx",
    model_info=info,
    base_url="https://openrouter.ai/api/v1",
)


def get_weather(city: str) -> str:
    """Get the weather for a given city."""
    return f"The weather is 73 degrees and Sunny."


agent = AssistantAgent(
    name="weather_agent",
    model_client=model_client,
    tools=[get_weather],
    system_message="You are a helpful assistant.",
    reflect_on_tool_use=True,
)

model_client._client = wrap_openai(model_client._client)


def main() -> None:
    result = asyncio.run(agent.run(task="What is the weather in New York?"))
    pprint(result)


main()

I've confirmed that the crash goes away if I add this line, giving the LLM tool access on the second call:
Image

@ekzhu
Collaborator

ekzhu commented Feb 25, 2025

@philippHorn the reason for not including the tools in the reflection step is that we want to force a text response.

You can set reflect_on_tool_use=False to disable the second inference, and repeatedly call the agent without a task or message to get the agent to repeatedly execute tools.
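
For example, a minimal tweak to the repro script from above (sketch):

agent = AssistantAgent(
    name="weather_agent",
    model_client=model_client,
    tools=[get_weather],
    system_message="You are a helpful assistant.",
    reflect_on_tool_use=False,  # skip the second, tool-less inference that triggers the error
)

With reflection disabled the agent returns a summary of the tool results instead of a reflected text answer.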

On the model provider compatibility issue: is this only related to OpenRouter? If you use Gemini directly with OpenAIChatCompletionClient, it works fine.

A more complete fix would be to add a tool_choice parameter to the extra_create_args in the reflection inference call, and set tool_choice=None. However, this syntax is only for OpenAI and is not translated to other providers: https://platform.openai.com/docs/api-reference/chat/create#chat-create-tool_choice. We will need to add a new tool_choice parameter to the ChatCompletionClient base class for this to work.
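
As an illustration only (this is not current behavior; llm_messages and cancellation_token stand in for what the agent already has in scope), the reflection call could pass it like this, using OpenAI's string value "none" to forbid tool calls:

model_result = await self._model_client.create(
    llm_messages,
    extra_create_args={"tool_choice": "none"},  # OpenAI-specific; forces a plain text response
    cancellation_token=cancellation_token,
)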

@philippHorn
Contributor

Thanks, for now I use this as a workaround:

# assumes the agent from my repro script above; MAX_LLM_CALLS is my own retry cap,
# and TextMessage comes from autogen_agentchat.messages
for attempt in range(MAX_LLM_CALLS):
    # re-run the agent (executing tools) until it produces an actual text answer
    result = asyncio.run(agent.run(task=None))
    if any(isinstance(message, TextMessage) for message in result.messages):
        break
else:
    raise ValueError("Max attempts exceeded without LLM answer")

It seems to work well in practice, but in this form it is probably not production-ready.

@ekzhu
Collaborator

ekzhu commented Feb 26, 2025

@philippHorn, I created an issue to address this: #5732
