[Feature]: support x-request-id header #9593

Closed
1 task done
cjackal opened this issue Oct 22, 2024 · 0 comments · Fixed by #9594

🚀 The feature, motivation and pitch

Related: #9550

Server admins commonly trace client requests via a "request id" (cf. this SE post): a unique identifier assigned to each HTTP request, which makes it easy to grep that request's info from the server log.

OpenAI supports this in the form of a response header (cf. API Reference - OpenAI API). Every response from the OpenAI API server carries an x-request-id header, and the openai-python package provides a convenient way to retrieve it (cf. openai/_models.py).

In the referenced PR, we observed the need for request identification (a typical use case of x-request-id), so it may be a good time to support this optional HTTP header in online serving.

Suggestion:

  • Each API response is sent with an x-request-id header
  • When the user request includes an x-request-id header, echo the same value back in the response header ("Idempotency")
  • Otherwise, x-request-id is a random uuid hex value ("compatibility with OpenAI API behavior")
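
The three rules above can be sketched as a small ASGI middleware. This is only an illustration under stated assumptions, not the implementation from the fixing PR: the names `resolve_request_id` and `RequestIdMiddleware` are hypothetical, and the sketch uses the plain ASGI interface with the standard library only, so it could wrap any ASGI app served by uvicorn.

```python
import uuid

HEADER = b"x-request-id"  # ASGI passes header names as lowercase bytes


def resolve_request_id(request_headers):
    """Return the client's X-Request-Id if present, else a random uuid hex."""
    for name, value in request_headers:
        if name.lower() == HEADER and value:
            return value.decode("latin-1")
    return uuid.uuid4().hex


class RequestIdMiddleware:
    """Hypothetical pure-ASGI middleware: adds x-request-id to each HTTP response."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Leave lifespan/websocket traffic untouched.
            await self.app(scope, receive, send)
            return

        request_id = resolve_request_id(scope.get("headers", []))

        async def send_with_id(message):
            if message["type"] == "http.response.start":
                # Drop any x-request-id set downstream, then append ours.
                headers = [
                    (k, v) for k, v in message.get("headers", [])
                    if k.lower() != HEADER
                ]
                headers.append((HEADER, request_id.encode("latin-1")))
                message = {**message, "headers": headers}
            await send(message)

        await self.app(scope, receive, send_with_id)
```

Handling the header in a middleware rather than in each route means error responses produced by the wrapped app receive the header as well, which is usually when a request id is most useful.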

Demo:

Case 1. X-Request-Id header is not given - return random hex

$ curl -v -X POST http://localhost:8000/v1/chat/completions \
> -d '{"model":"meta-llama/Llama-3.2-1B-Instruct","messages":[{"role":"user","content":"Hi"}]}' \
> -H 'Content-Type: Application/json'
*   Trying 127.0.0.1:8000...
* Connected to localhost (127.0.0.1) port 8000 (#0)
> POST /v1/chat/completions HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.68.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 88
> 
< HTTP/1.1 200 OK
< date: Tue, 22 Oct 2024 15:58:16 GMT
< server: uvicorn
< content-length: 230
< content-type: application/json
< x-request-id: 26970bda2e124bb8ad293d3d25d8a4cf
...

Case 2. X-Request-Id header is specified - pass it back

$ curl -v -X POST http://localhost:8000/v1/chat/completions \
> -d '{"model":"meta-llama/Llama-3.2-1B-Instruct","messages":[{"role":"user","content":"Hi"}]}' \
> -H 'X-Request-Id: aaaa' \
> -H 'Content-Type: Application/json'
*   Trying 127.0.0.1:8000...
* Connected to localhost (127.0.0.1) port 8000 (#0)
> POST /v1/chat/completions HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.68.0
> Accept: */*
> X-Request-Id: aaaa
> Content-Type: application/json
> Content-Length: 88
> 
< HTTP/1.1 200 OK
< date: Tue, 22 Oct 2024 15:58:16 GMT
< server: uvicorn
< content-length: 230
< content-type: application/json
< x-request-id: aaaa
...
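
The same two cases can be checked from Python instead of curl. The following is a minimal sketch using only the standard library; `build_request` and `post_chat` are hypothetical helper names, and the URL and model in the commented usage are taken from the demo above and assume a server is listening there.

```python
import json
import urllib.request


def build_request(url, payload, request_id=None):
    """Build a JSON POST request, optionally carrying an X-Request-Id header."""
    headers = {"Content-Type": "application/json"}
    if request_id is not None:
        headers["X-Request-Id"] = request_id
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


def post_chat(url, payload, request_id=None):
    """Send the request and return (x-request-id response header, parsed body)."""
    req = build_request(url, payload, request_id)
    with urllib.request.urlopen(req) as resp:
        # Header lookup on the response is case-insensitive.
        return resp.headers.get("x-request-id"), json.loads(resp.read())


# Usage (requires a running server, so it is left commented out):
# payload = {"model": "meta-llama/Llama-3.2-1B-Instruct",
#            "messages": [{"role": "user", "content": "Hi"}]}
# rid, body = post_chat("http://localhost:8000/v1/chat/completions",
#                       payload, request_id="aaaa")
# With the proposed behavior, rid would equal "aaaa"; with request_id=None
# it would be a server-generated hex string.
```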

Alternatives

#9550 achieves a similar goal, but this proposal has some advantages:

  • this proposal stays within the OpenAI API spec, while #9550 ([Frontend] Support custom request_id from request) extends the API spec (possible API conflict in the future?)
  • OpenAI SDKs support x-request-id by returning the exact value set by the client (e.g. the response._request_id attribute in openai-python), so no post-processing is needed
  • an HTTP header is easier to handle than the HTTP body, e.g. it is easier to grep the request id from curl output

Additional context

Some other language model APIs also send a request id (anthropic, hyperclova).
