@alexf37 alexf37 commented Aug 25, 2025

IMPORTANT: I am publishing this as a release candidate rather than a proper release because I want to be able to dogfood it in our own API before we release it generally. The test coverage is pretty good, but I want to make sure that this is actually the interface we want. Automated tests can only test logic, not ergonomics.

Overview

With this PR, LLM Bridge supports OpenAI's Responses API. This API differs from the completions APIs we've dealt with so far in that it is, or at least can be, stateful, and it manages that state through extra provider-specific parameters. For simplicity's sake, LLM Bridge remains unopinionated about the statefulness of the requests being made: we treat the messages that are passed in as exactly that, and we don't try to append state or hook into this API's state management in any way. Doing so is outside the scope of this library and can be done trivially in one's own implementation. We do pass through Responses state hints like store and previous_response_id, so statefulness in the actual API requests is not broken.
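To make the "do it in your own implementation" point concrete, here is a minimal caller-side sketch of threading previous_response_id across turns. The lastResponseId bookkeeping and the assumption that the returned body exposes the Responses id field are mine for this example; the library's only commitment is that the hints in provider_params are passed through untouched.

// Hypothetical caller-side state threading (not part of LLM Bridge)
import { handleUniversalRequest } from "llm-bridge"

let lastResponseId: string | undefined

async function editFunction(request: any) {
  return {
    request: {
      ...request,
      provider_params: {
        ...(request.provider_params ?? {}),
        store: true,
        // Only chain once a previous turn exists
        ...(lastResponseId ? { previous_response_id: lastResponseId } : {}),
      },
    },
    contextModified: false,
  }
}

const { response } = await handleUniversalRequest(
  "https://api.openai.com/v1/responses",
  {
    model: "gpt-4o",
    input: [
      {
        type: "message",
        role: "user",
        content: [{ type: "input_text", text: "Hello" }],
      },
    ],
    stream: false,
  },
  { Authorization: `Bearer ${process.env.OPENAI_API_KEY}` },
  "POST",
  editFunction,
)
// Assuming the raw Responses payload comes back, capture its id for the next turn
lastResponseId = (response as any)?.id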

TL;DR

  • Auto shape selection: Calls to /v1/responses emit a Responses body; other OpenAI endpoints emit Chat bodies by default.
  • Manual override: Force Responses emission via provider_params.openai_target = "responses" before translating back to provider shape.
  • State hints pass-through: store, previous_response_id, include, text, parallel_tool_calls, service_tier, truncation, background, user, and metadata are preserved.
  • Tools:
    • Custom function tools map to/from universal.tools (JSON Schema based).
    • Built-ins (e.g., web_search_preview, file_search, code_interpreter) round‑trip via provider_params.responses_builtin_tools.
  • Token limits: Universal max_tokens maps to Responses max_output_tokens when emitted.

What changed (high level)

  • The translator can parse a Responses‑shaped request (instructions, input[], etc.) into the universal shape and emit a valid Responses body when appropriate.
  • The handler detects /v1/responses in the target URL and automatically annotates the universal request so the OpenAI formatter emits a Responses body (sketched below).
  • You can continue sending Chat requests; no changes required unless you want to opt‑in to Responses.
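As a rough sketch of that URL detection (the handler's actual predicate may differ), shape selection only needs the target URL's path:

// Illustrative only; not the library's actual source
function targetsResponsesApi(targetUrl: string): boolean {
  return new URL(targetUrl).pathname.endsWith("/v1/responses")
}

targetsResponsesApi("https://api.openai.com/v1/responses") // true  -> emit Responses body
targetsResponsesApi("https://api.openai.com/v1/chat/completions") // false -> emit Chat body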

Before → After: Universal handler

// BEFORE: Chat Completions
import { handleUniversalRequest } from "llm-bridge"

async function editFunction(request: any) {
  return { request, contextModified: false }
}

const { response } = await handleUniversalRequest(
  "https://api.openai.com/v1/chat/completions",
  {
    model: "gpt-4o",
    messages: [{ role: "user", content: "Hello" }],
    stream: false,
  },
  { Authorization: `Bearer ${process.env.OPENAI_API_KEY}` },
  "POST",
  editFunction,
)

// AFTER: Responses API (auto-emitted when URL is /v1/responses)
import { handleUniversalRequest } from "llm-bridge"

async function editFunction(request: any) {
  // Optional: pass through Responses state hints via provider_params
  return {
    request: {
      ...request,
      provider_params: {
        ...(request.provider_params ?? {}),
        store: true,
        previous_response_id: "resp_123",
      },
    },
    contextModified: false,
  }
}

const { response } = await handleUniversalRequest(
  "https://api.openai.com/v1/responses",
  {
    model: "gpt-4o",
    // Either `instructions` + `input[]` (Responses shape) or `messages[]` (Chat) is accepted.
    // If the URL is /v1/responses, LLM Bridge will emit a Responses body.
    input: [
      {
        type: "message",
        role: "user",
        content: [{ type: "input_text", text: "Hello" }],
      },
    ],
    stream: false,
    store: true, // preserved via provider_params
  },
  { Authorization: `Bearer ${process.env.OPENAI_API_KEY}` },
  "POST",
  editFunction,
)

Key differences:

  • Shape: Responses uses instructions and input[] parts; Chat uses messages[].
  • State: store and previous_response_id are passed through on provider_params and preserved.
  • Tokens: max_tokens (Chat) maps to max_output_tokens (Responses) when emitted.
  • Built-in tools: Responses built-ins round‑trip via provider_params.responses_builtin_tools.

Before → After: Direct translators

// BEFORE: Chat -> universal -> Chat
import { toUniversal, fromUniversal } from "llm-bridge"

const chatReq = {
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hi" }],
}
const universal = toUniversal("openai", chatReq)
const backToChat = fromUniversal("openai", universal) // emits Chat body

// AFTER: Responses -> universal -> Responses (forced)
import { toUniversal, fromUniversal } from "llm-bridge"

const responsesReq = {
  model: "gpt-4o",
  instructions: "You are helpful.",
  input: [
    {
      type: "message",
      role: "user",
      content: [{ type: "input_text", text: "Hi" }],
    },
  ],
  store: true,
}
const universal = toUniversal("openai", responsesReq)

const emittedResponses = fromUniversal("openai", {
  ...universal,
  provider_params: {
    ...(universal.provider_params ?? {}),
    openai_target: "responses", // force Responses emission if not calling /v1/responses
  },
})
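The reverse also falls out of the design: parse a Responses-shaped request into the universal shape, then emit a Chat body by simply not setting the override, since non-/v1/responses OpenAI targets default to Chat. A short sketch built on the universal value above:

// Responses -> universal -> Chat: with no openai_target override,
// the default Chat body is emitted for OpenAI
const emittedChat = fromUniversal("openai", universal)
// The messages parsed from `instructions`/`input[]` emit as Chat `messages[]`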

Tools: function tools vs built-ins

// Function tools (both Chat and Responses)
const universalWithFunctionTool = {
  ...universal,
  tools: [
    {
      name: "get_weather",
      description: "Get weather",
      parameters: { type: "object", properties: { city: { type: "string" } } },
    },
  ],
}

// Built-in Responses tools (preserved in provider_params)
const withBuiltin = {
  ...universal,
  provider_params: {
    ...(universal.provider_params ?? {}),
    responses_builtin_tools: [{ type: "web_search_preview" }],
  },
}
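The two channels compose on emission as well. Here is a sketch of forcing a Responses body that carries both kinds of tools; the expectation in the final comment is my reading of the round-trip guarantee, not captured library output:

// Emit a Responses body carrying both the function tool and a built-in
const emittedWithTools = fromUniversal("openai", {
  ...universalWithFunctionTool,
  provider_params: {
    ...(universalWithFunctionTool.provider_params ?? {}),
    openai_target: "responses",
    responses_builtin_tools: [{ type: "web_search_preview" }],
  },
})
// Expectation: the emitted tools array holds get_weather (from universal.tools)
// alongside web_search_preview (from responses_builtin_tools)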

Streaming and token limits

  • Chat: stream, max_tokens
  • Responses: stream, max_output_tokens (mapped from universal max_tokens)
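Concretely, one universal limit emits under two spellings depending on the target; a minimal sketch, assuming max_tokens sits on the universal request as described in the TL;DR:

// One universal max_tokens, two emitted field names
const limited = { ...universal, max_tokens: 256 }

const chatBody = fromUniversal("openai", limited)
// -> Chat body with max_tokens: 256

const responsesBody = fromUniversal("openai", {
  ...limited,
  provider_params: {
    ...(limited.provider_params ?? {}),
    openai_target: "responses",
  },
})
// -> Responses body with max_output_tokens: 256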

Migration tips

  • Keep the same business logic. The handler stays stateless and accepts both Chat and Responses‑shaped inputs.
  • To adopt Responses, call the universal handler against /v1/responses (auto‑emits Responses). To force Responses shape elsewhere, set provider_params.openai_target = "responses" prior to translation.
  • For multimodal input under Responses, use input parts such as { type: "input_text" }, { type: "input_image" }. These map to universal messages with text/image content.
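For the multimodal tip, a hedged sketch of a mixed text-and-image Responses request; the input_image/image_url part shape follows OpenAI's Responses API docs, and the mapping to universal text/image content is the behavior described above:

// Responses-shaped multimodal input -> universal messages with text/image parts
import { toUniversal } from "llm-bridge"

const multimodalReq = {
  model: "gpt-4o",
  input: [
    {
      type: "message",
      role: "user",
      content: [
        { type: "input_text", text: "What is in this image?" },
        { type: "input_image", image_url: "https://example.com/cat.png" },
      ],
    },
  ],
}
const universalMultimodal = toUniversal("openai", multimodalReq)
// universalMultimodal.messages[0] now carries both text and image content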

References

  • Detailed mapping and examples: docs/openai-responses.md
  • OpenAI docs: https://platform.openai.com/docs/api-reference/responses
