Issues: GoogleCloudPlatform/ai-infra-cluster-provisioning
Megatron GKE workload: RuntimeError: CUDA error: an illegal memory access was encountered
#383
opened Oct 24, 2024 by
xibinliu
Megatron GKE workload: MountVolume.SetUp failed for volume "dmabuf" : hostPath type check failed
#382
opened Oct 24, 2024 by
xibinliu
[P2] Support passing existing resource policy for MIG or GKE cluster set up.
enhancement
New feature or request
[P4] Add resource label for billing.
Feature
New feature request
good first issue
Good for newcomers
[P3] Disable master_authorized_network_config in GKE cluster
bug
Something isn't working
good first issue
Good for newcomers
move metadata to calling modules
bug
Something isn't working
#240
opened Jun 27, 2023 by
stevenBorisko
`local` apparently won't fail if subshell in variable assignment fails
bug
#232
opened Jun 22, 2023 by
stevenBorisko
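The behaviour named in the title of #232 is general bash behaviour and can be reproduced with a short sketch. The function names `masked` and `unmasked` are hypothetical and not taken from the repository. When `local` and a command substitution share one line, the exit status reported is that of `local` itself, so a failing subshell goes unnoticed (including under `set -e`). Declaring the variable first and assigning on a separate line is one common way to expose the failure.

```shell
#!/usr/bin/env bash

# Combined form: the exit status of `local` (0) masks the failing subshell.
masked() {
    local value=$(false)
    echo "masked status: $?"
}

# Split form: a plain assignment carries the exit status of the
# command substitution, so the failure is visible to the caller.
unmasked() {
    local value
    value=$(false)
    echo "unmasked status: $?"
}

masked     # prints "masked status: 0"
unmasked   # prints "unmasked status: 1"
```

Whether the repository's scripts should adopt the split form everywhere is a judgment for the maintainers; the sketch only shows why the combined form does not fail.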
search env vars for tfvars if not in file
bug
Something isn't working
#231
opened Jun 22, 2023 by
stevenBorisko
update slurm image name
enhancement
New feature or request
#222
opened Jun 21, 2023 by
stevenBorisko
Validate perf of GCS fuse in parallel containers
bug
Something isn't working
#207
opened Jun 12, 2023 by
soumyapani
Don't create ssh firewalls for production A3 VMs
bug
Something isn't working
#167
opened May 9, 2023 by
sdlin
service account does not exist in a brand new project
invalid
This doesn't seem right
#106
opened Mar 15, 2023 by
stevenBorisko