[AQUA][GPT-OSS] Add Shape-Specific Env Config for GPT-OSS Models in AQUA Deployment Config Reader #1244

Merged
merged 4 commits into from
Aug 10, 2025


@mrDzurb mrDzurb commented Aug 10, 2025

Description:

Summary
This PR updates the AQUA deployment config reader to support shape-specific environment variables for GPT-OSS model deployments. The change introduces an env section in the deployment configuration for A-series GPU shapes, allowing AQUA handlers to return custom environment variables alongside existing parameter sets.

Example
A request to:

/aqua/deployments/{model_ocid}/params?instance_shape=BM.GPU4.8

will now return:

{
  "data": [
    "--trust-remote-code",
    "--gpu-memory-utilization 0.98",
    "--enforce-eager",
    "--max-num-seqs 32",
    "--max_model_len 130000",
    "--dtype bfloat16"
  ],
  "env": {
    "VLLM_ATTENTION_BACKEND": "TRITON_ATTN_VLLM_V1"
  }
}

This change is required so that GPT-OSS model deployments on A-series shapes use the correct vLLM attention backend (TRITON_ATTN_VLLM_V1), which improves compatibility and performance on this hardware.
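
For readers who want to see the shape of the lookup, here is a minimal sketch of how a per-shape `env` section could be resolved alongside the parameter list. The config layout, the `SAMPLE_CONFIG` name, the `get_shape_params` helper, and the second shape entry are all assumptions made for illustration; the actual AQUA deployment config schema and reader are not shown in this description.

```python
from typing import Any, Dict, List

# Hypothetical config layout for illustration only; the real AQUA
# deployment config schema is not shown in this PR description.
SAMPLE_CONFIG: Dict[str, Any] = {
    "configuration": {
        "BM.GPU4.8": {
            "parameters": [
                "--trust-remote-code",
                "--gpu-memory-utilization 0.98",
                "--enforce-eager",
                "--max-num-seqs 32",
                "--max_model_len 130000",
                "--dtype bfloat16",
            ],
            "env": {"VLLM_ATTENTION_BACKEND": "TRITON_ATTN_VLLM_V1"},
        },
        # A shape without an env section (hypothetical entry).
        "BM.GPU.H100.8": {"parameters": ["--trust-remote-code"]},
    }
}


def get_shape_params(config: Dict[str, Any], instance_shape: str) -> Dict[str, Any]:
    """Return the params and env for one shape (hypothetical helper)."""
    shape_cfg = config.get("configuration", {}).get(instance_shape, {})
    # Copy the values so callers cannot mutate the shared config.
    params: List[str] = list(shape_cfg.get("parameters", []))
    env: Dict[str, str] = dict(shape_cfg.get("env", {}))
    return {"data": params, "env": env}
```

In this sketch, shapes without an `env` section return an empty mapping, so callers can always read the `env` key without checking for its presence.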

@oracle-contributor-agreement bot added the "OCA Verified" label (All contributors have signed the Oracle Contributor Agreement.) Aug 10, 2025

darenr previously approved these changes Aug 10, 2025

📌 Cov diff with main:

Coverage-94%

📌 Overall coverage:

Coverage-58.47%

@@ -997,6 +997,45 @@ def get_container_params_type(container_type_name: str) -> str:
    return UNKNOWN


@lru_cache(maxsize=None)
We need expiration here; otherwise, if the config is updated, the change might not be reflected.

done
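
For context on the review point above: `@lru_cache(maxsize=None)` never evicts entries, so a cached config would persist for the life of the process. One common way to add expiration using only the standard library is to fold a time bucket into the cache key. This is a sketch under that assumption, not the change made in this PR (the final code is not shown here); `ttl_lru_cache` and its `clock` parameter are hypothetical names.

```python
import time
from functools import lru_cache, wraps
from typing import Callable


def ttl_lru_cache(ttl_seconds: float, maxsize: int = 128,
                  clock: Callable[[], float] = time.monotonic):
    """lru_cache variant whose entries expire at fixed time-bucket boundaries."""

    def decorator(func):
        @lru_cache(maxsize=maxsize)
        def _cached(_bucket, *args, **kwargs):
            # _bucket is only part of the cache key; it is not passed on.
            return func(*args, **kwargs)

        @wraps(func)
        def wrapper(*args, **kwargs):
            # The bucket index changes every ttl_seconds, which forces a miss.
            bucket = int(clock() // ttl_seconds)
            return _cached(bucket, *args, **kwargs)

        wrapper.cache_clear = _cached.cache_clear
        return wrapper

    return decorator


@ttl_lru_cache(ttl_seconds=300)
def load_config(name: str) -> dict:
    # Placeholder loader (hypothetical); a real one would fetch the config.
    return {"name": name, "loaded_at": time.monotonic()}
```

Note that entries expire when the bucket rolls over, so an entry may live for less than `ttl_seconds`, but never longer. A library such as `cachetools` offers per-entry TTLs if exact lifetimes matter.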

@darenr darenr left a comment

I never knew about casefold()!
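
For anyone else new to it: `str.casefold()` is a more aggressive form of `str.lower()` intended for caseless matching. A small illustration follows; the `same_name` helper is hypothetical, and where exactly the PR calls `casefold()` is not shown in this conversation.

```python
def same_name(a: str, b: str) -> bool:
    """Compare two strings without regard to case (hypothetical helper)."""
    return a.casefold() == b.casefold()


# Plain ASCII behaves the same as lower().
print(same_name("BM.GPU4.8", "bm.gpu4.8"))

# casefold() also maps characters such as the German sharp s to "ss",
# which lower() leaves unchanged.
print(same_name("Straße", "STRASSE"))
print("Straße".lower() == "STRASSE".lower())
```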

@mrDzurb mrDzurb merged commit ca17053 into main Aug 10, 2025
22 checks passed
📌 Cov diff with main:

Coverage-92%

📌 Overall coverage:

Coverage-58.47%
