feat: support openai responses api #3
IMPORTANT: I am publishing this as a release candidate rather than a proper release because I want to be able to dogfood it in our own API before we release it generally. The test coverage is pretty good, but I want to make sure that this is actually the interface we want. Automated tests can only test logic, not ergonomics.
## Overview
With this PR, LLM Bridge supports OpenAI's Responses API. This API differs from the completions APIs we've been dealing with so far in that it is, or at least can be, stateful, and it manages that state through extra provider-specific parameters. For simplicity's sake, LLM Bridge remains unopinionated about the statefulness of requests: we treat the messages passed in as exactly those messages, and we don't append state or hook into the state-management portion of the API in any way. Doing so is outside the scope of this library and can be trivially done in one's own implementation. We do pass through Responses state hints like `store` and `previous_response_id`, so statefulness in the actual API requests is not broken.
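A minimal sketch of that pass-through behavior (illustrative only; this is not the actual LLM Bridge translator, and only the `provider_params` keys `store` and `previous_response_id` come from this PR):

```python
# Sketch of the state-hint pass-through described above. Not LLM Bridge's
# real code -- just the shape of the mapping.
def emit_responses_state_hints(universal_request: dict) -> dict:
    """Copy Responses state hints from provider_params into the outgoing body."""
    hints = universal_request.get("provider_params", {})
    return {k: hints[k] for k in ("store", "previous_response_id") if k in hints}

body = emit_responses_state_hints({
    "messages": [{"role": "user", "content": "What did I just ask you?"}],
    "provider_params": {"store": True, "previous_response_id": "resp_123"},
})
assert body == {"store": True, "previous_response_id": "resp_123"}
```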
## TL;DR

- Requests targeting `/v1/responses` emit a Responses body; other OpenAI endpoints emit Chat bodies by default.
- Set `provider_params.openai_target = "responses"` before translating back to provider shape to force a Responses body elsewhere (see the sketch after this list).
- Responses parameters `store`, `previous_response_id`, `include`, `text`, `parallel_tool_calls`, `service_tier`, `truncation`, `background`, `user`, and `metadata` are preserved.
- Function tools map to `universal.tools` (JSON Schema based).
- Built-in tools (`web_search_preview`, `file_search`, `code_interpreter`) round-trip via `provider_params.responses_builtin_tools`.
- `max_tokens` maps to Responses `max_output_tokens` when emitted.
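Forcing the Responses shape is just a matter of annotating the universal request before translation (a sketch under the assumption that your universal request is a plain dict; only the `provider_params` key comes from this PR):

```python
# Sketch: force the OpenAI formatter to emit a Responses body even when the
# target URL is not /v1/responses. Only the openai_target key is from this PR;
# the surrounding plumbing is illustrative.
universal_request = {
    "messages": [{"role": "user", "content": "Hello"}],
    "provider_params": {},
}
universal_request["provider_params"]["openai_target"] = "responses"
# ...then translate back to provider shape as usual.
```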
## What changed (high level)
- The OpenAI translators now parse Responses-shaped requests (`instructions`, `input[]`, etc.) into the universal shape and emit a valid Responses body when appropriate.
- The universal handler detects `/v1/responses` in the target URL and automatically annotates the universal request so the OpenAI formatter emits a Responses body.

## Before → After: Universal handler
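A sketch of the new behavior (illustrative, not the handler's actual code; only the `/v1/responses` detection and the `openai_target` annotation are described by this PR):

```python
# Before: every OpenAI target was translated to a Chat Completions body.
# After: /v1/responses targets are annotated so the formatter emits a
# Responses body instead. Function name and structure are illustrative.
def annotate_for_openai(universal_request: dict, target_url: str) -> dict:
    if target_url.rstrip("/").endswith("/v1/responses"):
        universal_request.setdefault("provider_params", {})["openai_target"] = "responses"
    return universal_request

req = annotate_for_openai({"messages": []}, "https://api.openai.com/v1/responses")
assert req["provider_params"]["openai_target"] == "responses"
```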
Key differences:

- Responses uses `instructions` and `input[]` parts; Chat uses `messages[]`.
- `store` and `previous_response_id` are passed through on `provider_params` and preserved.
- `max_tokens` (Chat) maps to `max_output_tokens` (Responses) when emitted.
- Built-in tools round-trip via `provider_params.responses_builtin_tools`.

## Before → After: Direct translators
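To make the shape difference concrete, here is the same logical request in both OpenAI shapes (hand-written payloads following OpenAI's public API docs, not captured translator output):

```python
# Chat Completions shape.
chat_body = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You are terse."},
        {"role": "user", "content": "Summarize this PR."},
    ],
    "max_tokens": 256,
}

# Responses shape for the same request.
responses_body = {
    "model": "gpt-4o",
    "instructions": "You are terse.",  # system prompt moves here
    "input": [
        {
            "role": "user",
            "content": [{"type": "input_text", "text": "Summarize this PR."}],
        }
    ],
    "max_output_tokens": 256,          # mapped from universal max_tokens
    "store": False,
}
```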
## Tools: function tools vs built-ins
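The distinction in a sketch: function tools carry JSON Schema and live in `universal.tools`, while built-ins are provider-specific and ride along in `provider_params.responses_builtin_tools`. The universal field names come from this PR; the built-in tool payloads are illustrative shapes per OpenAI's docs:

```python
# Function tool: JSON Schema based, lives in universal.tools.
function_tool = {
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

# Built-in tools: provider-specific, preserved untouched via
# provider_params.responses_builtin_tools so they round-trip.
builtin_tools = [
    {"type": "web_search_preview"},
    {"type": "file_search", "vector_store_ids": ["vs_123"]},  # illustrative fields
    {"type": "code_interpreter", "container": {"type": "auto"}},
]

universal_request = {
    "messages": [{"role": "user", "content": "What's the weather in Oslo?"}],
    "tools": [function_tool],
    "provider_params": {"responses_builtin_tools": builtin_tools},
}
```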
## Streaming and token limits

- Chat: `stream`, `max_tokens`
- Responses: `stream`, `max_output_tokens` (mapped from universal `max_tokens`)
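A sketch of the limit mapping (illustrative, not the translator's code):

```python
# Universal max_tokens maps to the per-API limit field.
def limit_field(universal: dict, target: str) -> dict:
    limit = universal.get("max_tokens")
    if limit is None:
        return {}
    key = "max_output_tokens" if target == "responses" else "max_tokens"
    return {key: limit}

assert limit_field({"max_tokens": 512}, "responses") == {"max_output_tokens": 512}
assert limit_field({"max_tokens": 512}, "chat") == {"max_tokens": 512}
```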
## Migration tips
- Target `/v1/responses` (auto-emits Responses). To force the Responses shape elsewhere, set `provider_params.openai_target = "responses"` prior to translation.
- Responses requests use `input` parts such as `{ type: "input_text" }` and `{ type: "input_image" }`. These map to universal `messages` with `text`/`image` content.
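For example, a Responses `input` message and the universal message it maps to (the Responses shape follows OpenAI's docs; the exact universal content-part field names are illustrative, since the PR only states text/image content):

```python
# A Responses input message (OpenAI shape)...
responses_input = [
    {
        "role": "user",
        "content": [
            {"type": "input_text", "text": "Describe this image."},
            {"type": "input_image", "image_url": "https://example.com/cat.png"},
        ],
    }
]

# ...and the universal message it maps to (field names illustrative).
universal_message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this image."},
        {"type": "image", "url": "https://example.com/cat.png"},
    ],
}
```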
## References
- `docs/openai-responses.md`
- https://platform.openai.com/docs/api-reference/responses