Note: The Slurm-gcp-v5-login module is deprecated. See this update for specific recommendations and timelines.
This module creates a login node for a Slurm cluster based on the SchedMD/slurm-gcp slurm_instance_template and slurm_login_instance Terraform modules. The login node is used in conjunction with the Slurm controller.

    - id: slurm_login
      source: community/modules/scheduler/schedmd-slurm-gcp-v5-login
      use:
      - network1
      - slurm_controller
      settings:
        machine_type: n2-standard-4

This creates a Slurm login node which is:
- connected to the primary subnet of network1 via `use`
- associated with the `slurm_controller` module as the Slurm controller via `use`
- of VM machine type `n2-standard-4`
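
As a further illustration, the fragment below extends the example above with a few optional settings from the inputs table. It is a minimal sketch, not a recommended configuration: the disk size and disk type values are assumptions chosen for illustration only.

```yaml
- id: slurm_login
  source: community/modules/scheduler/schedmd-slurm-gcp-v5-login
  use:
  - network1
  - slurm_controller
  settings:
    machine_type: n2-standard-4
    # Assumption: a public IP is wanted on the login node.
    # The default (true) leaves the node without a public IP.
    disable_login_public_ips: false
    # Assumption: illustrative boot disk values.
    # The defaults are 50 GB and "pd-standard".
    disk_size_gb: 100
    disk_type: pd-balanced
```

Per the inputs table, disable_login_public_ips is ignored if access_config is set.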
For more information on creating valid custom images for the login node VM instances or for custom instance templates, see our vm-images.md documentation page.
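
A custom image is selected through the instance_image setting, using the family (or name) and project fields described in the inputs table. The fragment below is only a sketch: the image family and project names are hypothetical placeholders, and instance_image_custom is set to true on the assumption that the image is not one of the compatible families.

```yaml
- id: slurm_login
  source: community/modules/scheduler/schedmd-slurm-gcp-v5-login
  use:
  - network1
  - slurm_controller
  settings:
    instance_image:
      # Hypothetical names; replace with an image built for Slurm on GCP.
      # "family" and "name" are mutually exclusive.
      family: my-slurm-image-family
      project: my-image-project
    # Acknowledges that the image is custom, which disables the compatibility check.
    instance_image_custom: true
```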
More information on GPU support in Slurm on GCP and other Cluster Toolkit modules can be found at docs/gpu-support.md
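
As a rough sketch of attaching an accelerator to the login node, the fragment below uses the guest_accelerator setting. It assumes that each list entry takes the keys `type` and `count`, following the description of that input. The machine type and accelerator type are illustrative assumptions, and their availability depends on the zone. on_host_maintenance is set to TERMINATE because GPU instances cannot be live migrated.

```yaml
- id: slurm_login
  source: community/modules/scheduler/schedmd-slurm-gcp-v5-login
  use:
  - network1
  - slurm_controller
  settings:
    # Assumption: an N1 machine type paired with a T4 GPU, for illustration only.
    machine_type: n1-standard-8
    guest_accelerator:
    - type: nvidia-tesla-t4
      count: 1
    # GPU instances must stop, rather than migrate, during host maintenance.
    on_host_maintenance: TERMINATE
```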
The Cluster Toolkit team maintains the wrapper around the slurm-on-gcp Terraform modules. For support with the underlying modules, see the instructions in the slurm-gcp README.
Copyright 2023 Google LLC
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Name | Version |
---|---|
terraform | >= 1.1 |
>= 3.83 |
Name | Version |
---|---|
>= 3.83 |
Name | Source | Version |
---|---|---|
slurm_login_instance | github.com/GoogleCloudPlatform/slurm-gcp.git//terraform/slurm_cluster/modules/slurm_login_instance | 5.12.2 |
slurm_login_template | github.com/GoogleCloudPlatform/slurm-gcp.git//terraform/slurm_cluster/modules/slurm_instance_template | 5.12.2 |
Name | Type |
---|---|
google_compute_default_service_account.default | data source |
google_compute_image.slurm | data source |
Name | Description | Type | Default | Required |
---|---|---|---|---|
access_config | Access configurations, i.e. IPs via which the VM instance can be accessed via the Internet. | list(object({ | [] | no |
additional_disks | List of maps of disks. | list(object({ | [] | no |
allow_automatic_updates | If false, disables automatic system package updates on the created instances. This feature is only available on supported images (or images derived from them). For more details, see https://cloud.google.com/compute/docs/instances/create-hpc-vm#disable_automatic_updates | bool | true | no |
can_ip_forward | Enable IP forwarding, for NAT instances for example. | bool | false | no |
controller_instance_id | The server-assigned unique identifier of the controller instance. This value must be supplied as an output of the controller module, typically via `use`. | string | n/a | yes |
deployment_name | Name of the deployment. | string | n/a | yes |
disable_login_public_ips | If set to false, the login node will have a random public IP assigned to it. Ignored if access_config is set. | bool | true | no |
disable_smt | Disables Simultaneous Multi-Threading (SMT) on instance. | bool | true | no |
disk_auto_delete | Whether or not the boot disk should be auto-deleted. | bool | true | no |
disk_labels | Labels specific to the boot disk. These will be merged with var.labels. | map(string) | {} | no |
disk_size_gb | Boot disk size in GB. | number | 50 | no |
disk_type | Boot disk type. | string | "pd-standard" | no |
enable_confidential_vm | Enable the Confidential VM configuration. Note: the instance image must support this option. | bool | false | no |
enable_oslogin | Enables Google Cloud os-login for user login and authentication for VMs. See https://cloud.google.com/compute/docs/oslogin | bool | true | no |
enable_reconfigure | Enables automatic Slurm reconfigure when the Slurm configuration changes (e.g. slurm.conf.tpl, partition details). NOTE: Requires Google Pub/Sub API. | bool | false | no |
enable_shielded_vm | Enable the Shielded VM configuration. Note: the instance image must support this option. | bool | false | no |
gpu | DEPRECATED: use var.guest_accelerator | object({ | null | no |
guest_accelerator | List of the type and count of accelerator cards attached to the instance. | list(object({ | [] | no |
instance_image | Defines the image that will be used in the Slurm login node VM instances. Expected Fields: name: The name of the image. Mutually exclusive with family. family: The image family to use. Mutually exclusive with name. project: The project where the image is hosted. For more information on creating custom images that comply with Slurm on GCP see the "Slurm on GCP Custom Images" section in docs/vm-images.md. | map(string) | { | no |
instance_image_custom | A flag that designates that the user is aware that they are requesting to use a custom and potentially incompatible image for this Slurm on GCP module. If the field is set to false, only the compatible families and project names will be accepted. The deployment will fail with any other image family or name. If set to true, no checks will be done. See: https://goo.gle/hpc-slurm-images | bool | false | no |
instance_template | Self link to a custom instance template. If set, other VM definition variables such as machine_type and instance_image will be ignored in favor of the provided instance template. For more information on creating custom images for the instance template that comply with Slurm on GCP see the "Slurm on GCP Custom Images" section in docs/vm-images.md. | string | null | no |
labels | Labels, provided as a map. | map(string) | {} | no |
machine_type | Machine type to create. | string | "n2-standard-2" | no |
metadata | Metadata, provided as a map. | map(string) | {} | no |
min_cpu_platform | Specifies a minimum CPU platform. Applicable values are the friendly names of CPU platforms, such as Intel Haswell or Intel Skylake. See the complete list: https://cloud.google.com/compute/docs/instances/specify-min-cpu-platform | string | null | no |
network_ip | DEPRECATED: Use the static_ips variable to assign an internal static IP address. | string | null | no |
network_self_link | Network to deploy to. Either network_self_link or subnetwork_self_link must be specified. | string | null | no |
num_instances | Number of instances to create. This value is ignored if static_ips is provided. | number | 1 | no |
on_host_maintenance | Instance availability policy. | string | "MIGRATE" | no |
preemptible | Allow the instance to be preempted. | bool | false | no |
project_id | Project ID to create resources in. | string | n/a | yes |
pubsub_topic | The cluster pubsub topic created by the controller when enable_reconfigure=true. | string | null | no |
region | Region where the instances should be created. Note: region will be ignored if it can be extracted from subnetwork. | string | null | no |
service_account | Service account to attach to the login instance. If not set, the default compute service account for the given project will be used with the "https://www.googleapis.com/auth/cloud-platform" scope. | object({ | null | no |
shielded_instance_config | Shielded VM configuration for the instance. Note: not used unless enable_shielded_vm is 'true'. - enable_integrity_monitoring : Compare the most recent boot measurements to the integrity policy baseline and return a pair of pass/fail results depending on whether they match or not. - enable_secure_boot : Verify the digital signature of all boot components, and halt the boot process if signature verification fails. - enable_vtpm : Use a virtualized trusted platform module, which is a specialized computer chip you can use to encrypt objects like keys and certificates. | object({ | { | no |
slurm_cluster_name | Cluster name, used for resource naming and slurm accounting. If not provided it will default to the first 8 characters of the deployment name (removing any invalid characters). | string | null | no |
source_image | DEPRECATED: Use instance_image instead. | string | null | no |
source_image_family | DEPRECATED: Use instance_image instead. | string | null | no |
source_image_project | DEPRECATED: Use instance_image instead. | string | null | no |
startup_script | Startup script that will be used by the login node VM. | string | "" | no |
static_ips | List of static IPs for VM instances. | list(string) | [] | no |
subnetwork_project | The project that subnetwork belongs to. | string | null | no |
subnetwork_self_link | Subnet to deploy to. Either network_self_link or subnetwork_self_link must be specified. | string | null | no |
tags | Network tag list. | list(string) | [] | no |
zone | Zone where the instances should be created. If not specified, instances will be spread across available zones in the region. | string | null | no |
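
To illustrate how the Shielded VM inputs fit together, the fragment below enables enable_shielded_vm and sets all three fields named in the shielded_instance_config description. It is a sketch under assumptions: the chosen values are illustrative, and the instance image must support Shielded VM for this to apply.

```yaml
- id: slurm_login
  source: community/modules/scheduler/schedmd-slurm-gcp-v5-login
  use:
  - network1
  - slurm_controller
  settings:
    # shielded_instance_config is not used unless this is true.
    enable_shielded_vm: true
    # Assumption: all three protections enabled, for illustration only.
    shielded_instance_config:
      enable_integrity_monitoring: true
      enable_secure_boot: true
      enable_vtpm: true
```
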
No outputs.