Custom Images in the Cluster Toolkit (formerly HPC Toolkit)

Please review the introduction to image building for general information on building custom images using the Toolkit.

Introduction

This module uses Packer to create an image within a Cluster Toolkit deployment. Packer operates by provisioning a short-lived VM in Google Cloud on which it executes scripts to customize the boot disk for repeated use. The VM's boot disk is specified from a source image that defaults to the HPC VM Image. This Packer "template" supports the following customization approaches, which should be combined as described under Recommended use:

startup-script metadata from raw string or file
Shell scripts uploaded from the Packer execution environment to the VM
Ansible playbooks uploaded from the Packer execution environment to the VM

They can be specified independently of one another, so that anywhere from 1 to 3 solutions can be used simultaneously. In the case that 0 scripts are supplied, the source boot disk is effectively copied to your project without customization. This can be useful in scenarios where increased control over the image maintenance lifecycle is desired or when policies restrict the use of images to internal projects.

Minimum requirements

Outbound internet access

Most customization scripts require access to resources on the public internet. This can be achieved by one of the following 2 approaches:

Using a public IP address on the VM

Set var.omit_external_ip to false

Configuring a VPC with a Cloud NAT in the region of the VM

Use the vpc module which automates NAT creation

Inbound internet access

Read order of execution below for a discussion of VM customization solutions and their requirements for inbound SSH access. Environments without SSH access should use the metadata-based startup-script solution.

A simple way to enable inbound SSH access is to use the VPC module with allowed_ssh_ip_ranges set to 0.0.0.0/0.

User or service account running Packer

The user or service account running Packer must have the permission to create VMs in the selected VPC network and, if use_iap is set, must have the "IAP-Secured Tunnel User" role. Recommended roles are:

roles/compute.instanceAdmin.v1
roles/iap.tunnelResourceAccessor
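
The two roles above could be granted with gcloud along the following lines. This is a minimal sketch rather than a procedure from the Toolkit: the function name grant_packer_roles and the DRY_RUN switch are hypothetical helpers introduced here, and the project ID and member are placeholders you would replace with your own values.

```shell
#!/bin/bash
# Sketch only: grant_packer_roles and DRY_RUN are illustrative helpers,
# not part of the Cluster Toolkit.

# Grant the recommended roles to the user or service account that runs Packer.
#   $1: project ID
#   $2: IAM member, e.g. user:alice@example.com or serviceAccount:NAME@PROJECT.iam.gserviceaccount.com
# Set DRY_RUN=1 to print the gcloud commands instead of executing them.
grant_packer_roles() {
  local project_id="$1" member="$2" role
  for role in roles/compute.instanceAdmin.v1 roles/iap.tunnelResourceAccessor; do
    ${DRY_RUN:+echo} gcloud projects add-iam-policy-binding "$project_id" \
      --member="$member" --role="$role"
  done
}
```

For example, `DRY_RUN=1 grant_packer_roles my-project user:alice@example.com` prints the two commands for review before anything is changed.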

VM service account roles

The service account attached to the temporary build VM created by Packer should have the ability to write Cloud Logging entries so that you may inspect and debug build logs. When using the metadata startup-script customization solution, the service account attached to the temporary build VM created by Packer must have the permission to modify its own metadata and to read from Cloud Storage buckets. Recommended roles are:

roles/compute.instanceAdmin.v1
roles/logging.logWriter
roles/monitoring.metricWriter
roles/storage.objectViewer

It is recommended to create this service account as a separate step outside a blueprint because of a known delay in the propagation of IAM bindings.
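
One way to perform that separate step is with gcloud before deploying the blueprint. The sketch below assumes the standard service account email form NAME@PROJECT_ID.iam.gserviceaccount.com; the function name create_build_vm_service_account, the display name, and the DRY_RUN switch are hypothetical and introduced only for illustration.

```shell
#!/bin/bash
# Sketch only: the helper name, display name, and DRY_RUN switch are
# illustrative, not part of the Cluster Toolkit.

# Create a service account for the temporary build VM and grant it the
# recommended roles listed above.
#   $1: project ID
#   $2: service account name (the part before the "@")
# Set DRY_RUN=1 to print the gcloud commands instead of executing them.
create_build_vm_service_account() {
  local project_id="$1" sa_name="$2" role
  # Assumes the standard email form for service accounts.
  local sa_email="${sa_name}@${project_id}.iam.gserviceaccount.com"

  ${DRY_RUN:+echo} gcloud iam service-accounts create "$sa_name" \
    --project="$project_id" --display-name="Packer build VM"

  for role in roles/compute.instanceAdmin.v1 roles/logging.logWriter \
              roles/monitoring.metricWriter roles/storage.objectViewer; do
    ${DRY_RUN:+echo} gcloud projects add-iam-policy-binding "$project_id" \
      --member="serviceAccount:${sa_email}" --role="$role"
  done
}
```

The resulting email address can then be supplied to this module through the service_account_email input.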

Example blueprints

A recommended pattern for building images with this module is to use the Terraform-based startup-script module along with this Packer custom-image module. Below you can find links to several examples of this pattern, including usage instructions.

Image Builder

The Image Builder blueprint demonstrates a solution that builds an image using:

The HPC VM Image as a base upon which to customize
A VPC network with firewall rules that allow IAP-based SSH tunnels
A Toolkit runner that installs a custom script

Please review the examples README for usage instructions.

Order of execution

The startup script specified in metadata executes in parallel with the other supported methods. However, the remaining methods execute in a well-defined order relative to one another.

All shell scripts will execute in the configured order
After shell scripts complete, all Ansible playbooks will execute in the configured order

NOTE: if both startup_script and startup_script_file are specified, then startup_script_file takes precedence.

Recommended use

Because the metadata startup script executes in parallel with the other solutions, conflicts can arise, especially when package managers (yum or apt) lock their databases during package installation. Therefore, it is recommended to choose one of the following approaches:

Specify either startup_script or startup_script_file and do not specify shell_scripts or ansible_playbooks.
- This can be especially useful in environments that restrict SSH access
Specify any combination of shell_scripts and ansible_playbooks and do not specify startup_script or startup_script_file.

If the startup script, supplied by either approach, fails by returning an exit code other than 0, Packer will determine that the build has failed and refuse to save the image.
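
Because a non-zero exit code fails the build, it can help to write the startup script so that it stops at the first failing step and reports which one failed. The following is a minimal sketch, assuming a bash startup script; run_step is a hypothetical helper, not something the Toolkit provides, and the yum commands simply mirror the example used elsewhere in this document.

```shell
#!/bin/bash
# Sketch only: run_step is an illustrative helper, not part of the Toolkit.

# Run one command, announce it, and return that command's exit code.
# On failure, a message naming the step is written to stderr.
run_step() {
  local description="$1"
  shift
  echo "startup-script step: ${description}"
  local rc=0
  "$@" || rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "step failed (exit ${rc}): ${description}" >&2
  fi
  return "$rc"
}

# Example body (commented out so this sketch has no side effects):
# run_step "install EPEL" yum install -y epel-release || exit 1
# run_step "install jq"   yum install -y jq           || exit 1
```

Exiting explicitly after a failed step keeps later commands from masking the failure with a successful final exit code.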

External access with SSH

The shell scripts and Ansible playbooks customization solutions both require SSH access to the VM from the Packer execution environment. SSH access can be enabled in one of 2 ways:

The VM is created without a public IP address and SSH tunnels are created using Identity-Aware Proxy (IAP).
- Allow use_iap to take on its default value of true
The VM is created with an IP address on the public internet and firewall rules allow SSH access from the Packer execution environment.
- Set omit_external_ip = false (or omit_external_ip: false in a blueprint)
- Add firewall rules that open SSH to the VM

The Packer template defaults to the 1st, IAP-based solution because it is more secure (no exposure to the public internet) and because the vpc module automatically sets up all necessary firewall rules for SSH tunneling and outbound-only access to the internet through Cloud NAT.

In either SSH solution, customization scripts should be supplied as files in the shell_scripts and ansible_playbooks settings.
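
For the 2nd solution (public IP address), a firewall rule that opens SSH only to the Packer execution environment might be created as sketched below. This is an assumption-laden example, not the Toolkit's own method: the function name allow_packer_ssh, the rule name allow-packer-ssh, and the DRY_RUN switch are hypothetical, and the network, source range, and tag are placeholders.

```shell
#!/bin/bash
# Sketch only: helper name, rule name, and DRY_RUN switch are illustrative.

# Create an ingress rule that allows SSH (TCP port 22) to tagged VMs.
#   $1: project ID
#   $2: VPC network name
#   $3: source CIDR range of the Packer execution environment
#   $4: network tag carried by the build VM
# Set DRY_RUN=1 to print the gcloud command instead of executing it.
allow_packer_ssh() {
  local project_id="$1" network="$2" source_range="$3" target_tag="$4"
  ${DRY_RUN:+echo} gcloud compute firewall-rules create "allow-packer-ssh" \
    --project="$project_id" --network="$network" \
    --direction=INGRESS --allow=tcp:22 \
    --source-ranges="$source_range" --target-tags="$target_tag"
}
```

For the rule to apply, the same tag would need to be listed in this module's tags input so that it is attached to the build VM.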

Environments without SSH access

Many network environments disallow SSH access to VMs. In these environments, the metadata-based startup scripts are appropriate because they execute entirely independently of the Packer execution environment.

In this scenario, a single script should be supplied in the form of a string to the startup_script input variable. This solution integrates well with Toolkit runners. Runners operate by using a single startup script whose behavior is extended by downloading and executing a customizable set of runners from Cloud Storage at startup.

NOTE: Packer will attempt to use SSH if either shell_scripts or ansible_playbooks are set to non-empty values. Leave them at their default, empty values to ensure access by SSH is disabled.

Supplying startup script as a string

The startup_script parameter accepts scripts formatted as strings. In Packer and Terraform, multi-line strings can be specified using heredoc syntax in an input Packer variables file (*.pkrvars.hcl). For example, the following snippet defines a multi-line bash script followed by an integer representing the size, in GiB, of the resulting image:

startup_script = <<-EOT
  #!/bin/bash
  yum install -y epel-release
  yum install -y jq
  EOT

disk_size = 100

In a blueprint, the equivalent syntax is:

...
    settings:
      startup_script: |
        #!/bin/bash
        yum install -y epel-release
        yum install -y jq
      disk_size: 100
...

Monitoring startup script execution

When using startup script customization, Packer will print very limited output to the console. For example:

==> example.googlecompute.toolkit_image: Waiting for any running startup script to finish...
==> example.googlecompute.toolkit_image: Startup script not finished yet. Waiting...
==> example.googlecompute.toolkit_image: Startup script not finished yet. Waiting...
==> example.googlecompute.toolkit_image: Startup script, if any, has finished running.

Using the default value for var.service_account_scopes (which replaces the deprecated var.scopes), the output of startup script execution will be stored in Cloud Logging. It can be examined using the Cloud Logging Console or with a gcloud logging read command (substituting <<PROJECT_ID>> with your project ID):

$ gcloud logging --project <<PROJECT_ID>> read \
    'logName="projects/<<PROJECT_ID>>/logs/GCEMetadataScripts" AND jsonPayload.message=~"^startup-script: "' \
    --format="table[box](timestamp, resource.labels.instance_id, jsonPayload.message)" --freshness 2h

Note that this command will print all startup script entries within the project within the "freshness" window in reverse order. You may need to identify the instance ID of the Packer VM and filter further by that value using gcloud or grep. To print the entries in the order they would have appeared on your console, we recommend piping the output of this command to the standard Linux utility tac.
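
The filtering and reordering described above could be combined as in the sketch below. It assumes the instance ID of the Packer VM is already known; the function names startup_log_filter and read_startup_logs are hypothetical helpers, and the value(...) output format is chosen here in place of the boxed table so that the reversed output stays readable.

```shell
#!/bin/bash
# Sketch only: both helper names are illustrative, not part of the Toolkit.

# Print a Cloud Logging filter that matches startup-script entries
# from a single VM.
#   $1: project ID
#   $2: numeric instance ID of the Packer build VM
startup_log_filter() {
  local project_id="$1" instance_id="$2"
  printf 'logName="projects/%s/logs/GCEMetadataScripts" AND resource.labels.instance_id="%s" AND jsonPayload.message=~"^startup-script: "' \
    "$project_id" "$instance_id"
}

# Read the matching entries and reverse them with tac so they appear
# in the order they were written.
read_startup_logs() {
  local project_id="$1" instance_id="$2"
  gcloud logging read "$(startup_log_filter "$project_id" "$instance_id")" \
    --project "$project_id" \
    --format="value(timestamp, jsonPayload.message)" --freshness 2h | tac
}
```

A call such as `read_startup_logs my-project 1234567890123456789` would then print only that VM's startup script output, oldest entry first.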

License

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

 http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Requirements

No requirements.

Providers

No providers.

Modules

No modules.

Resources

No resources.

Inputs

Name	Description	Type	Default	Required
accelerator_count	Number of accelerator cards to attach to the VM; not necessary for families that always include GPUs (A2).	`number`	`null`	no
accelerator_type	Type of accelerator cards to attach to the VM; not necessary for families that always include GPUs (A2).	`string`	`null`	no
ansible_playbooks	A list of Ansible playbook configurations that will be uploaded to customize the VM image	`list(object({ playbook_file = string galaxy_file = string extra_arguments = list(string) }))`	`[]`	no
communicator	Communicator to use for provisioners that require access to VM ("ssh" or "winrm")	`string`	`null`	no
deployment_name	Cluster Toolkit deployment name	`string`	n/a	yes
disk_size	Size of disk image in GB	`number`	`null`	no
disk_type	Type of persistent disk to provision	`string`	`"pd-balanced"`	no
enable_shielded_vm	Enable the Shielded VM configuration (var.shielded_instance_config).	`bool`	`false`	no
image_family	The family name of the image to be built. Defaults to `deployment_name`	`string`	`null`	no
image_name	The name of the image to be built. If not supplied, it will be set to image_family-$ISO_TIMESTAMP	`string`	`null`	no
image_storage_locations	Storage location, either regional or multi-regional, where snapshot content is to be stored and only accepts 1 value. See https://developer.hashicorp.com/packer/plugins/builders/googlecompute#image_storage_locations	`list(string)`	`null`	no
labels	Labels to apply to the short-lived VM	`map(string)`	`null`	no
machine_type	VM machine type on which to build new image	`string`	`"n2-standard-4"`	no
manifest_file	File to which to write Packer build manifest	`string`	`"packer-manifest.json"`	no
metadata	Instance metadata for the builder VM (use var.startup_script or var.startup_script_file to set startup-script metadata)	`map(string)`	`{}`	no
network_project_id	Project ID of Shared VPC network	`string`	`null`	no
omit_external_ip	Provision the image building VM without a public IP address	`bool`	`true`	no
on_host_maintenance	Describes maintenance behavior for the instance. If left blank this will default to `MIGRATE` except the use of GPUs requires it to be `TERMINATE`	`string`	`null`	no
project_id	Project in which to create VM and image	`string`	n/a	yes
scopes	DEPRECATED: use var.service_account_scopes	`set(string)`	`null`	no
service_account_email	The service account email to use. If null or 'default', then the default Compute Engine service account will be used.	`string`	`null`	no
service_account_scopes	Service account scopes to attach to the instance. See https://cloud.google.com/compute/docs/access/service-accounts.	`set(string)`	`[ "https://www.googleapis.com/auth/cloud-platform" ]`	no
shell_scripts	A list of paths to local shell scripts which will be uploaded to customize the VM image	`list(string)`	`[]`	no
shielded_instance_config	Shielded VM configuration for the instance (must set var.enable_shielded_vm)	`object({ enable_secure_boot = bool enable_vtpm = bool enable_integrity_monitoring = bool })`	`{ "enable_integrity_monitoring": true, "enable_secure_boot": true, "enable_vtpm": true }`	no
source_image	Source OS image to build from	`string`	`null`	no
source_image_family	Alternative to source_image. Specify image family to build from latest image in family	`string`	`"hpc-rocky-linux-8"`	no
source_image_project_id	A list of project IDs to search for the source image. Packer will search the first project ID in the list first, and fall back to the next in the list, until it finds the source image.	`list(string)`	`null`	no
ssh_username	Username to use for SSH access to VM	`string`	`"hpc-toolkit-packer"`	no
startup_script	Startup script (as raw string) used to build the custom Linux VM image (overridden by var.startup_script_file if both are set)	`string`	`null`	no
startup_script_file	File path to local shell script that will be used to customize the Linux VM image (overrides var.startup_script)	`string`	`null`	no
state_timeout	The time to wait for instance state changes, including image creation	`string`	`"10m"`	no
subnetwork_name	Name of subnetwork in which to provision image building VM	`string`	n/a	yes
tags	Assign network tags to apply firewall rules to VM instance	`list(string)`	`null`	no
use_iap	Use IAP proxy when connecting by SSH	`bool`	`true`	no
use_os_login	Use OS Login when connecting by SSH	`bool`	`false`	no
windows_startup_ps1	A list of strings containing PowerShell scripts which will customize a Windows VM image (requires WinRM communicator)	`list(string)`	`[]`	no
wrap_startup_script	Wrap startup script with Packer-generated wrapper	`bool`	`true`	no
zone	Cloud zone in which to provision image building VM	`string`	n/a	yes

Outputs

No outputs.
