[Feature]: Allow running services and models without a gateway #1595

Closed
Tracked by #1782
peterschmidt85 opened this issue Aug 21, 2024 · 6 comments

@peterschmidt85
Contributor

Use case:

  • As a user, I want to run a service/model and make it accessible to my project team.
  • I don't want to set up a domain.
  • I don't want to make the endpoint public.

Suggestion:

  • In addition to the existing ability to create "managed" gateways (public IP address, custom domain), allow the dstack server to handle service/model traffic directly.

Motivation:

  • So far, services have primarily been used for development, but the need to set up a domain and create a public endpoint poses a significant barrier. If the dstack server could run services/models out of the box, it would greatly simplify usage for both individuals and companies.
@peterschmidt85
Contributor Author

Another reason this issue is important: in our documentation, we often use tasks for deploying models because tasks don't require creating a gateway, and gateways are not supported for all backends. However, when a model is running as a task, the user cannot use the UI to interact with the model.

@jvstme
Collaborator

jvstme commented Sep 26, 2024

Implementation progress

  • Python-based reverse proxy for services
    • HTTP proxying
    • Request/response streaming
    • Websocket proxying
    • Proxying to multiple service replicas
    • Auth
    • Configuring whether path prefix is stripped
  • Running services without a gateway
  • OpenAI endpoint
  • [UI]: Support in-server model proxy #1954
  • Configuring which gateway to use for the run
  • Collecting stats for replicas autoscaling
  • Using the new proxy implementation on gateways
    • Nginx- and certbot-related logic
    • Installation and updates

Usage instructions for the current version

  1. Make sure your project doesn't have a default gateway, or set gateway: false in your service configuration.
  2. Run the service.
  3. Your service is now available at <dstack-server-base-url>/proxy/services/<project-name>/<run-name>/. If the service defines a model mapping, the OpenAI-compatible API is available at <dstack-server-base-url>/proxy/models/<project-name>/.
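
To make the URL scheme from step 3 concrete, here is a minimal Python sketch that builds both endpoints from their parts. The helper names (service_url, models_url) and the example values (http://localhost:3000, main, llama-service) are assumptions made for illustration only; just the two path patterns come from the instructions above. Authentication is not shown.

```python
from urllib.parse import quote


def service_url(server_base_url: str, project: str, run_name: str) -> str:
    """Build the in-server proxy URL for a service run.

    Follows the pattern from the usage instructions:
    <dstack-server-base-url>/proxy/services/<project-name>/<run-name>/
    """
    base = server_base_url.rstrip("/")  # tolerate a trailing slash on the base URL
    return f"{base}/proxy/services/{quote(project, safe='')}/{quote(run_name, safe='')}/"


def models_url(server_base_url: str, project: str) -> str:
    """Build the base URL of the project's OpenAI-compatible API.

    Follows the pattern from the usage instructions:
    <dstack-server-base-url>/proxy/models/<project-name>/
    """
    base = server_base_url.rstrip("/")
    return f"{base}/proxy/models/{quote(project, safe='')}/"


if __name__ == "__main__":
    # Hypothetical values: replace with your own server URL, project, and run name.
    print(service_url("http://localhost:3000", "main", "llama-service"))
    print(models_url("http://localhost:3000", "main"))
```

The value returned by models_url is what you would presumably pass as the base URL to an OpenAI-compatible client, provided the service defines a model mapping.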

@peterschmidt85
Contributor Author

peterschmidt85 commented Sep 26, 2024

Plus, ensure the UI can leverage the built-in proxy

github-actions bot commented Dec 8, 2024

This issue is stale because it has been open for 30 days with no activity.

@jvstme jvstme closed this as completed Jan 31, 2025