[AQUA][GPT-OSS] Add Shape-Specific Env Config for GPT-OSS Models in AQUA Deployment Config Reader #1244
Conversation
ads/aqua/common/utils.py
Outdated
@@ -997,6 +997,45 @@ def get_container_params_type(container_type_name: str) -> str:
    return UNKNOWN


@lru_cache(maxsize=None)
We need an expiration here; otherwise, if the config is updated, the cached value might not reflect the change.
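One way to get the expiration asked for above is a small time-to-live cache in place of an unbounded `lru_cache`. The following is a minimal sketch using only the standard library; the decorator name `ttl_cache`, the 60-second TTL, and the `load_deployment_config` function are all hypothetical and are not taken from this PR, whose actual fix is not shown here.

```python
import time
from functools import wraps


def ttl_cache(ttl_seconds, clock=time.monotonic):
    """Cache results per positional-argument tuple; entries expire after ttl_seconds."""

    def decorator(func):
        store = {}  # maps args -> (expires_at, value)

        @wraps(func)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]  # entry is still fresh
            value = func(*args)  # missing or expired: recompute
            store[args] = (now + ttl_seconds, value)
            return value

        wrapper.cache_clear = store.clear  # manual invalidation hook
        return wrapper

    return decorator


@ttl_cache(ttl_seconds=60)
def load_deployment_config(model_name):
    # Hypothetical stand-in for an expensive config read.
    return {"model": model_name, "loaded_at": time.time()}
```

With this shape, repeated calls within the TTL reuse the cached value, and the first call after expiry re-reads the config, so an updated config is picked up after at most `ttl_seconds`.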
done
I never knew about casefold()!
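For context, `str.casefold()` is a more aggressive variant of `str.lower()` intended for caseless comparison. A small illustration follows; the helper name `same_name` and the shape-name strings are hypothetical inputs, not necessarily how the PR uses it.

```python
def same_name(a: str, b: str) -> bool:
    """Compare two identifiers without regard to case."""
    return a.casefold() == b.casefold()


print(same_name("BM.GPU.A10.4", "bm.gpu.a10.4"))  # True

# casefold() also normalizes characters that lower() leaves alone.
print("Straße".lower() == "strasse")     # False
print("Straße".casefold() == "strasse")  # True
```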
Description:
Summary
This PR updates the AQUA deployment config reader to support shape-specific environment variables for GPT-OSS model deployments. The change introduces an `env` section in the deployment configuration for A-series GPU shapes, allowing AQUA handlers to return custom environment variables alongside existing parameter sets.
Example
A request to:
will now return:
This change is required to ensure GPT-OSS model deployments on A-series shapes use the correct vLLM attention backend (`TRITON_ATTN_VLLM_V1`), which improves compatibility and performance for these hardware configurations.
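To make the idea concrete, here is a rough sketch of a per-shape lookup that returns environment variables alongside parameters. The config layout, the function name `get_shape_settings`, the shape names, the parameter string, and the `VLLM_ATTENTION_BACKEND` variable name are all assumptions for illustration; the actual AQUA schema and handler code may differ.

```python
# Hypothetical deployment config; the real AQUA schema may look different.
DEPLOYMENT_CONFIG = {
    "configuration": {
        "BM.GPU.A10.4": {
            "parameters": {"VLLM_PARAMS": "--max-model-len 4096"},
            # Shape-specific env section (assumed variable name).
            "env": {"VLLM_ATTENTION_BACKEND": "TRITON_ATTN_VLLM_V1"},
        },
        "BM.GPU.H100.8": {
            "parameters": {"VLLM_PARAMS": "--max-model-len 8192"},
            # No env section: this shape needs no overrides.
        },
    }
}


def get_shape_settings(config: dict, shape: str) -> dict:
    """Return the parameters and env overrides for a shape (empty if absent)."""
    entry = config.get("configuration", {}).get(shape, {})
    return {
        "parameters": dict(entry.get("parameters", {})),
        "env": dict(entry.get("env", {})),
    }


a10 = get_shape_settings(DEPLOYMENT_CONFIG, "BM.GPU.A10.4")
print(a10["env"])  # {'VLLM_ATTENTION_BACKEND': 'TRITON_ATTN_VLLM_V1'}
```

Defaulting `env` to an empty dict keeps the response shape the same for every GPU shape, so callers do not need to special-case shapes without overrides.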