Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support chat models in dstack-proxy #1953

Merged
merged 2 commits into from
Nov 5, 2024
Merged

Support chat models in dstack-proxy #1953

merged 2 commits into from
Nov 5, 2024

Conversation

jvstme
Copy link
Collaborator

@jvstme jvstme commented Nov 4, 2024

This commit adds the OpenAI-compatible endpoint
to dstack-proxy, which effectively allows running
services with model mappings without a gateway.

Most of the OpenAI- and TGI-specific code is
copied from dstack-gateway. This code
duplication will be eliminated later, once
dstack-proxy supports running on gateways.

The commit also contains some refactoring in
dstack-proxy: introduces ProxyError and
UnexpectedProxyError exceptions and simplifies
error logging in service_proxy.py.

Behind the PROXY feature flag.

Part of #1595

This commit adds the OpenAI-compatible endpoint to
`dstack-proxy`, which effectively allow running
services with model mappings without a gateway.

Most of the OpenAI- and TGI-specific code is
copied from `dstack-gateway`. This code
duplication will be eliminated later, once
`dstack-proxy` supports running on gateways.

The commit also contains some refactoring in
`dstack-proxy`: introduces `ProxyError` and
`UnexpectedProxyError` exceptions and simplifies
error logging in `service_proxy.py`.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copied from dstack-gateway with minor adjustments

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The next 3 files were copied from dstack-gateway with minor adjustments

@jvstme jvstme requested a review from r4victor November 4, 2024 14:05
Pydantic can actually work with two discriminators
as long as all classes define both.
@jvstme jvstme merged commit f139f01 into master Nov 5, 2024
23 checks passed
@jvstme jvstme deleted the issue_1595_openai branch November 5, 2024 07:43
superprat pushed a commit to bahaal-tech/dstack that referenced this pull request Dec 20, 2024
This commit adds the OpenAI-compatible endpoint to
`dstack-proxy`, which effectively allows running
services with model mappings without a gateway.

Most of the OpenAI- and TGI-specific code is
copied from `dstack-gateway`. This code
duplication will be eliminated later, once
`dstack-proxy` supports running on gateways.

The commit also contains some refactoring in
`dstack-proxy`: introduces `ProxyError` and
`UnexpectedProxyError` exceptions and simplifies
error logging in `service_proxy.py`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants