Releases: GoogleCloudPlatform/ai-infra-cluster-provisioning
Release v0.14.0
What's Changed
- Adjust subnet cidr range to allow for more hosts by @sdlin in #272
- Adding BYOPP support for GKE beta. by @soumyapani in #273
- Install GPU driver before node pool is fully created. by @soumyapani in #274
- Adding support for TCPX in gke-beta by @soumyapani in #275
- Release v0.14.0 by @soumyapani in #276
Full Changelog: v0.13.0...v0.14.0
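As a rough illustration of why the subnet CIDR change in #272 allows more hosts: a shorter prefix length leaves more bits for host addresses. The sketch below uses only Python's standard `ipaddress` module. The two ranges shown are hypothetical examples, not the values used in the pull request, and cloud providers typically reserve a few addresses per subnet, so the usable count is slightly lower.

```python
import ipaddress


def total_addresses(cidr: str) -> int:
    """Return the total number of addresses in an IPv4 or IPv6 CIDR range."""
    return ipaddress.ip_network(cidr).num_addresses


# Hypothetical before/after ranges, chosen only to show the effect of
# shortening the prefix; they are not taken from the repository.
narrow = total_addresses("10.0.0.0/24")
wide = total_addresses("10.0.0.0/19")
print(f"/24 holds {narrow} addresses, /19 holds {wide} addresses")
```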
v0.13.0
What's Changed
- Adding gke-beta cluster type for multi-NIC gke cluster creation. by @soumyapani in #264
- Fixing bugs in gke-beta. by @soumyapani in #265
- Adding support to create resource policy. by @soumyapani in #266
- Fixing resource policy creation. by @soumyapani in #267
- Remove the unnecessary labeling by @valentinali2008 in #269
- Adding support for compact placement policy for MIG creation. by @soumyapani in #268
Full Changelog: v0.12.0...v0.13.0
v0.12.0
What's Changed
- add gke-beta to entrypoint by @stevenBorisko in #256
- adding a maintenance interval by @valentinali2008 in #258
- add maintenance interval variable by @stevenBorisko in #261
Full Changelog: v0.11.1...v0.12.0
v0.11.1
v0.11.0: A3/TCPD
What's Changed
- add a3 sample by @stevenBorisko in #248
- tcpd script by @stevenBorisko in #251
Full Changelog: v0.10.0...v0.11.0
force recreation of instances on template changes
What's Changed
- fix startup script by @stevenBorisko in #243
- Adding support to update existing VMs when instance_template changes. by @soumyapani in #244
- fix ci by @stevenBorisko in #245
Full Changelog: v0.9.0...v0.10.0
Support for creating an A3 multi-NIC MIG with COS
What's Changed
- Adding static naming for instance_template. by @soumyapani in #228
- Adding common module for instance_group_manager and changing network routing mode. by @soumyapani in #233
- Add metadata to instance vm to enable fluent bit logging by @valentinali2008 in #235
- Using google_compute_network to create VPCs. by @soumyapani in #236
- Adding project_id in network module. by @soumyapani in #237
- restructure templating by @stevenBorisko in #239
- Release v0.9.0 by @soumyapani in #241
New Contributors
- @valentinali2008 made their first contribution in #235
Full Changelog: v0.8.1...v0.9.0
Support for MIG with COS VMs and Containers.
What's Changed
- merge main into develop by @soumyapani in #201
- Continuous test to run daily in develop branch by @soumyapani in #204
- Enabling GCSFuse driver for GKE cluster. by @soumyapani in #206
- use cloud-init to start container on cos via mig by @stevenBorisko in #203
- Adding support for running private container image via cloud-init in … by @soumyapani in #208
- Making MIG maintenance interval periodic. by @soumyapani in #209
- Support cloud logging in mig-with-container. by @soumyapani in #210
- add metadata to mig by @stevenBorisko in #212
- cloudinit no container by @stevenBorisko in #211
- increase continuous timeout by @stevenBorisko in #216
- Release v0.8.0 by @stevenBorisko in #218
- Add iptables setting to startup script. by @sdlin in #219
- remove line that mysteriously breaks everything by @stevenBorisko in #221
- Main to develop by @stevenBorisko in #224
- Release v0.8.1 by @stevenBorisko in #225
Full Changelog: v0.7.0...v0.8.1
v0.7.0: Individual modules for MIG, GKE and Slurm. Changing input config to .tfvar format.
What's Changed
- Create module for MIG by @stevenBorisko in #174
- Create module for Slurm by @stevenBorisko in #177
- Create module for GKE by @soumyapani in #179
- Adding examples for MIG, Slurm and GKE by @stevenBorisko and @soumyapani
- Add continuous test by @stevenBorisko in #194
- Release v0.7.0 by @soumyapani in #200
Full Changelog: v0.6.0...v0.7.0
Adding support for slurm cluster creation to the tool and additional post creation setup for GKE cluster.
What's Changed
- 1. Enable GKE compact placement. 2. Disable GKE maintenance. by @soumyapani in #148
- Adding instructions to use aiinfra-cluster terraform module. by @soumyapani in #152
- slurm cluster module by @stevenBorisko in #141
- 1. Using project SA as default. 2. Adding SA binding for default KSA in GKE cluster. by @soumyapani in #156
- Updating Terraform config example for GKE. by @soumyapani in #163
- Install nvidia gpu driver for GKE cluster as a daemonset. by @soumyapani and @nick-stroud in #165
Full Changelog: v0.5.0...v0.6.0