Can I assign a GPU resource to an enclave? #543
Comments
Hello @BaiChienKao, enabling GPU attachment for Enclaves is on our radar, but there are no immediate plans to implement this feature.
This should now be a top priority for AWS, given how quickly AI technologies are evolving and the arrival of the first discrete GPUs with TEE support (NVIDIA's Hopper H100/H200 and the Blackwell architecture) offering confidential computing (CC) mode on the GPU, and because P5 and P5e EC2 instances with Hopper GPUs are already available in AWS. But it looks like Nitro still does not support GPU TEE and does not support enabling discrete adapters on a PCI bus, even though the NSM itself is a virtual (virtio-based) PCI device used to interact with the Nitro hypervisor (I hope its code will be published as well, since it is based on KVM; this would improve the chain of trust and give better attestation for all components of the Nitro platform). There are other options available: KVM/QEMU VMs with support for AMD SEV-SNP or Intel TDX (VM-based CPU TEEs), and NVIDIA's Hopper/Blackwell MIG TEE enabled with NVtrust. Guys and gals, you definitely should give this closer consideration and implement it as soon as possible. Cc @meerd @andraprs @eugkoira @axlprv @agraf @jdbean Our ML research and cloud infrastructure teams at @sentient-xyz (https://sentient.foundation) really do need the GPU TEE feature on P5 and P5e instances with H100/H200 GPUs, with support for on-chip confidential computing (MIG-based TEE in the Hopper architecture) in isolated GPU memory. I found only this article, which mentions P5, P5e and Nitro but doesn't give any meaningful information about GPU TEE support and only sets false expectations.
I would also like this. For now, I'm connecting my enclave to another provider that runs their stuff on Azure's confidential compute in order to get the H100 TEE feature.
With the new Trainium 2, is this essentially possible now? Would the enclave part work out of the box with one of those instances?
I'm currently engaged in research involving enclaves and I'm interested in optimizing certain applications by utilizing GPU resources. Unfortunately, I cannot find a way to assign a GPU resource to an enclave. My research from 2021 indicated that this feature was not supported. I'm curious if there have been any developments since then, and whether GPU assignment for enclaves is now possible.