Add ability to scale on tgi custom metrics #263

Merged
merged 9 commits into from
Mar 6, 2024

rsgowman
Collaborator

No description provided.

See changes for details. Some extra reasoning behind the changes:
* Prefix all GCS buckets with project_id since GCS buckets are globally
  namespaced.
* "gcloud storage buckets add-iam-policy-binding" only accepts bucket
  URLs

This allows the tgi workload to scale up based on CPU demand (and
eventually other metrics).

NB: CPU is a poor choice for this workload, but acts as a baseline that
we can use to evaluate other metrics.
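
For illustration only, CPU-based scaling of this kind could be expressed as a Kubernetes `autoscaling/v2` HorizontalPodAutoscaler. The sketch below builds such a manifest as a Python dict and applies the replica formula from the Kubernetes HPA documentation. The Deployment name `tgi`, the replica bounds, and the 50% CPU target are assumptions chosen for the example, not values taken from this PR, and the real controller's tolerance band and min/max clamping are left out.

```python
import json
import math


def cpu_hpa_manifest(deployment, min_replicas, max_replicas, target_cpu_percent):
    """Build an autoscaling/v2 HorizontalPodAutoscaler manifest as a dict.

    All argument values are hypothetical; the manifest used in this PR
    is not shown in the conversation.
    """
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": f"{deployment}-hpa"},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [
                {
                    # Resource metric: average CPU utilization across pods,
                    # as a percentage of each pod's CPU request.
                    "type": "Resource",
                    "resource": {
                        "name": "cpu",
                        "target": {
                            "type": "Utilization",
                            "averageUtilization": target_cpu_percent,
                        },
                    },
                }
            ],
        },
    }


def desired_replicas(current_replicas, current_value, target_value):
    """Core HPA formula: ceil(currentReplicas * currentMetric / targetMetric).

    Simplified: ignores the controller's tolerance and min/max clamping.
    """
    return math.ceil(current_replicas * current_value / target_value)


if __name__ == "__main__":
    # Hypothetical values: a "tgi" Deployment scaled between 1 and 5 replicas.
    print(json.dumps(cpu_hpa_manifest("tgi", 1, 5, 50), indent=2))
```

Swapping the `metrics` entry for a `Pods` or `External` metric is how an HPA would move from the CPU baseline to another signal; which metrics are eventually used is not stated here.
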
@rsgowman rsgowman marked this pull request as ready for review March 5, 2024 22:43
@rsgowman
Collaborator Author

rsgowman commented Mar 5, 2024

/gcbrun

@rsgowman
Collaborator Author

rsgowman commented Mar 5, 2024

/gcbrun

@rsgowman
Collaborator Author

rsgowman commented Mar 6, 2024

/gcbrun

@rsgowman
Collaborator Author

rsgowman commented Mar 6, 2024

/gcbrun

@rsgowman rsgowman merged commit a176e7d into GoogleCloudPlatform:main Mar 6, 2024
3 checks passed
@rsgowman rsgowman deleted the hpa_custommetrics branch March 6, 2024 17:56
3 participants