diff --git a/docs/faq.md b/docs/faq.md
index 5906fcd1ff..ff29b1007c 100644
--- a/docs/faq.md
+++ b/docs/faq.md
@@ -227,7 +227,7 @@ across all instances and allows easy user control with
 
 By default, the [slurm_cluster](../terraform/slurm_cluster/README.md) terraform
 module uses the latest Slurm image family (e.g.
-`slurm-gcp-6-1-hpc-rocky-linux-8`). As new Slurm image families are released,
+`slurm-gcp-6-2-hpc-rocky-linux-8`). As new Slurm image families are released,
 coenciding with periodic Slurm releases, the terraform module will be updated to
 track the newest image family by setting it as the new default. This update can
 be considered a breaking change.
diff --git a/docs/images.md b/docs/images.md
index aac8d1edec..18cbf6ee0e 100644
--- a/docs/images.md
+++ b/docs/images.md
@@ -74,28 +74,28 @@ For the [TPU](./glossary.md#tpu) nodes docker images are also released.
 
 |       Project        | Image Family                        | Arch   | Status         |
 | :------------------: | :---------------------------------- | :----- | :------------- |
-| schedmd-slurm-public | slurm-gcp-6-1-debian-11             | x86_64 | Supported      |
-| schedmd-slurm-public | slurm-gcp-6-1-hpc-rocky-linux-8     | x86_64 | Supported      |
-| schedmd-slurm-public | slurm-gcp-6-1-ubuntu-2004-lts       | x86_64 | Supported      |
-| schedmd-slurm-public | slurm-gcp-6-1-ubuntu-2204-lts-arm64 | ARM64  | Supported      |
-| schedmd-slurm-public | slurm-gcp-6-1-hpc-centos-7-k80      | x86_64 | EOL 2024-05-01 |
-| schedmd-slurm-public | slurm-gcp-6-1-hpc-centos-7          | x86_64 | EOL 2024-01-01 |
+| schedmd-slurm-public | slurm-gcp-6-2-debian-11             | x86_64 | Supported      |
+| schedmd-slurm-public | slurm-gcp-6-2-hpc-rocky-linux-8     | x86_64 | Supported      |
+| schedmd-slurm-public | slurm-gcp-6-2-ubuntu-2004-lts       | x86_64 | Supported      |
+| schedmd-slurm-public | slurm-gcp-6-2-ubuntu-2204-lts-arm64 | ARM64  | Supported      |
+| schedmd-slurm-public | slurm-gcp-6-2-hpc-centos-7-k80      | x86_64 | EOL 2024-05-01 |
+| schedmd-slurm-public | slurm-gcp-6-2-hpc-centos-7          | x86_64 | EOL 2024-01-01 |
 
 ### Published Docker Image Family
 
 |       Project        | Image Family                | Status    |
 | :------------------: | :-------------------------- | :-------- |
-| schedmd-slurm-public | tpu:slurm-gcp-6-1-tf-2.8.0  | Supported |
-| schedmd-slurm-public | tpu:slurm-gcp-6-1-tf-2.8.3  | Supported |
-| schedmd-slurm-public | tpu:slurm-gcp-6-1-tf-2.9.1  | Supported |
-| schedmd-slurm-public | tpu:slurm-gcp-6-1-tf-2.9.3  | Supported |
-| schedmd-slurm-public | tpu:slurm-gcp-6-1-tf-2.10.0 | Supported |
-| schedmd-slurm-public | tpu:slurm-gcp-6-1-tf-2.10.1 | Supported |
-| schedmd-slurm-public | tpu:slurm-gcp-6-1-tf-2.11.0 | Supported |
-| schedmd-slurm-public | tpu:slurm-gcp-6-1-tf-2.11.1 | Supported |
-| schedmd-slurm-public | tpu:slurm-gcp-6-1-tf-2.12.0 | Supported |
-| schedmd-slurm-public | tpu:slurm-gcp-6-1-tf-2.12.1 | Supported |
-| schedmd-slurm-public | tpu:slurm-gcp-6-1-tf-2.13.0 | Supported |
+| schedmd-slurm-public | tpu:slurm-gcp-6-2-tf-2.8.0  | Supported |
+| schedmd-slurm-public | tpu:slurm-gcp-6-2-tf-2.8.3  | Supported |
+| schedmd-slurm-public | tpu:slurm-gcp-6-2-tf-2.9.1  | Supported |
+| schedmd-slurm-public | tpu:slurm-gcp-6-2-tf-2.9.3  | Supported |
+| schedmd-slurm-public | tpu:slurm-gcp-6-2-tf-2.10.0 | Supported |
+| schedmd-slurm-public | tpu:slurm-gcp-6-2-tf-2.10.1 | Supported |
+| schedmd-slurm-public | tpu:slurm-gcp-6-2-tf-2.11.0 | Supported |
+| schedmd-slurm-public | tpu:slurm-gcp-6-2-tf-2.11.1 | Supported |
+| schedmd-slurm-public | tpu:slurm-gcp-6-2-tf-2.12.0 | Supported |
+| schedmd-slurm-public | tpu:slurm-gcp-6-2-tf-2.12.1 | Supported |
+| schedmd-slurm-public | tpu:slurm-gcp-6-2-tf-2.13.0 | Supported |
 
 ## Custom Image
 
diff --git a/docs/tpu.md b/docs/tpu.md
index f00a189204..70663e50ff 100644
--- a/docs/tpu.md
+++ b/docs/tpu.md
@@ -72,12 +72,12 @@ state we will also include if it is tested or not.
 
 |       Project        | Image Family                        | Arch   | TPU Status  |
 | :------------------: | :---------------------------------- | :----- | :---------- |
-| schedmd-slurm-public | slurm-gcp-6-1-debian-11             | x86_64 | Untested    |
-| schedmd-slurm-public | slurm-gcp-6-1-hpc-rocky-linux-8     | x86_64 | Tested      |
-| schedmd-slurm-public | slurm-gcp-6-1-ubuntu-2004-lts       | x86_64 | Untested    |
-| schedmd-slurm-public | slurm-gcp-6-1-ubuntu-2204-lts-arm64 | ARM64  | Untested    |
-| schedmd-slurm-public | slurm-gcp-6-1-hpc-centos-7-k80      | x86_64 | Unsupported |
-| schedmd-slurm-public | slurm-gcp-6-1-hpc-centos-7          | x86_64 | Unsupported |
+| schedmd-slurm-public | slurm-gcp-6-2-debian-11             | x86_64 | Untested    |
+| schedmd-slurm-public | slurm-gcp-6-2-hpc-rocky-linux-8     | x86_64 | Tested      |
+| schedmd-slurm-public | slurm-gcp-6-2-ubuntu-2004-lts       | x86_64 | Untested    |
+| schedmd-slurm-public | slurm-gcp-6-2-ubuntu-2204-lts-arm64 | ARM64  | Untested    |
+| schedmd-slurm-public | slurm-gcp-6-2-hpc-centos-7-k80      | x86_64 | Unsupported |
+| schedmd-slurm-public | slurm-gcp-6-2-hpc-centos-7          | x86_64 | Unsupported |
 
 ## Terraform
 
diff --git a/terraform/slurm_cluster/modules/slurm_instance_template/main.tf b/terraform/slurm_cluster/modules/slurm_instance_template/main.tf
index bebd277d58..2552ab843e 100644
--- a/terraform/slurm_cluster/modules/slurm_instance_template/main.tf
+++ b/terraform/slurm_cluster/modules/slurm_instance_template/main.tf
@@ -47,7 +47,7 @@ locals {
   source_image_family = (
     var.source_image_family != "" && var.source_image_family != null
     ? var.source_image_family
-    : "slurm-gcp-6-1-hpc-rocky-linux-8"
+    : "slurm-gcp-6-2-hpc-rocky-linux-8"
   )
   source_image_project = (
     var.source_image_project != "" && var.source_image_project != null
diff --git a/terraform/slurm_cluster/modules/slurm_nodeset_tpu/README_TF.md b/terraform/slurm_cluster/modules/slurm_nodeset_tpu/README_TF.md
index a989fb1a6c..1d9157573f 100644
--- a/terraform/slurm_cluster/modules/slurm_nodeset_tpu/README_TF.md
+++ b/terraform/slurm_cluster/modules/slurm_nodeset_tpu/README_TF.md
@@ -49,7 +49,7 @@ No modules.
 |------|-------------|------|---------|:--------:|
 | [accelerator\_config](#input\_accelerator\_config) | Nodeset accelerator config, see https://cloud.google.com/tpu/docs/supported-tpu-configurations for details. | <pre>object({<br>    topology = string<br>    version  = string<br>  })</pre> | <pre>{<br>  "topology": "",<br>  "version": ""<br>}</pre> | no |
 | [data\_disks](#input\_data\_disks) | The data disks to include in the TPU node | `list(string)` | `[]` | no |
-| [docker\_image](#input\_docker\_image) | The gcp container registry id docker image to use in the TPU vms, it defaults to gcr.io/schedmd-slurm-public/tpu:slurm-gcp-6-1-tf- | `string` | `""` | no |
+| [docker\_image](#input\_docker\_image) | The gcp container registry id docker image to use in the TPU vms, it defaults to gcr.io/schedmd-slurm-public/tpu:slurm-gcp-6-2-tf- | `string` | `""` | no |
 | [enable\_public\_ip](#input\_enable\_public\_ip) | Enables IP address to access the Internet. | `bool` | `false` | no |
 | [network](#input\_network) | The name of the network to attach the TPU-vm of this nodeset to. | `string` | `""` | no |
 | [node\_count\_dynamic\_max](#input\_node\_count\_dynamic\_max) | Maximum number of nodes allowed in this partition to be created dynamically. | `number` | `0` | no |
diff --git a/terraform/slurm_cluster/modules/slurm_nodeset_tpu/main.tf b/terraform/slurm_cluster/modules/slurm_nodeset_tpu/main.tf
index 9ce97deda5..fbb8988577 100644
--- a/terraform/slurm_cluster/modules/slurm_nodeset_tpu/main.tf
+++ b/terraform/slurm_cluster/modules/slurm_nodeset_tpu/main.tf
@@ -68,7 +68,7 @@ locals {
   service_account = var.service_account != null ? var.service_account : local.service_account
   preserve_tpu    = local.can_preempt ? var.preserve_tpu : false
   data_disks      = var.data_disks
-  docker_image    = var.docker_image != "" ? var.docker_image : "gcr.io/schedmd-slurm-public/tpu:slurm-gcp-6-1-tf-${var.tf_version}"
+  docker_image    = var.docker_image != "" ? var.docker_image : "gcr.io/schedmd-slurm-public/tpu:slurm-gcp-6-2-tf-${var.tf_version}"
   network         = var.network
   subnetwork      = local.snetwork
 }
diff --git a/terraform/slurm_cluster/modules/slurm_nodeset_tpu/variables.tf b/terraform/slurm_cluster/modules/slurm_nodeset_tpu/variables.tf
index 934875a826..1b6c270cd8 100644
--- a/terraform/slurm_cluster/modules/slurm_nodeset_tpu/variables.tf
+++ b/terraform/slurm_cluster/modules/slurm_nodeset_tpu/variables.tf
@@ -51,7 +51,7 @@ variable "accelerator_config" {
 }
 
 variable "docker_image" {
-  description = "The gcp container registry id docker image to use in the TPU vms, it defaults to gcr.io/schedmd-slurm-public/tpu:slurm-gcp-6-1-tf-<var.tf_version>"
+  description = "The gcp container registry id docker image to use in the TPU vms, it defaults to gcr.io/schedmd-slurm-public/tpu:slurm-gcp-6-2-tf-<var.tf_version>"
   type        = string
   default     = ""
 }