Adding a new cloud provider
This document is currently focused on setup for Linux-based virtual machines. TODO(klausw): update for Windows.
The cloud provider must provide an automated way to provision and configure cloud resources, including appropriate error reporting and status checking.
A command line tool is generally the preferred option. The benchmarker will automatically log the command lines as they are executed, and users can examine them or reuse them for troubleshooting.
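As a rough illustration of why logged command lines help with troubleshooting, here is a minimal sketch of a wrapper that logs each command before running it. The helper name `run_logged` is hypothetical and is not the benchmarker's own helper; the Python interpreter stands in for a provider CLI so that the sketch runs anywhere.

```python
import logging
import subprocess
import sys

logging.basicConfig(level=logging.INFO)


def run_logged(cmd):
  """Log the exact command line, run it, and return its results.

  Returns a (stdout, stderr, returncode) tuple.
  """
  logging.info('Running: %s', ' '.join(cmd))
  process = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                             stderr=subprocess.PIPE)
  stdout, stderr = process.communicate()
  return stdout.decode(), stderr.decode(), process.returncode


# Stand-in for a provider CLI call that prints the new VM's IP address.
stdout, stderr, retcode = run_logged(
    [sys.executable, '-c', 'print("10.0.0.5")'])
ip_address = stdout.strip()
```

Because the logged line is the full command, a user can paste it into a shell to reproduce a failure by hand.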
Alternatively, a Python API also works, but is more difficult to troubleshoot unless you add detailed logging to ensure that actions are clearly visible. Also, a new cloud provider should not impose additional requirements on users who are not using it, so nonstandard modules currently require lazy loading or a similar technique so that they are only imported when the provider is actually used. (See issue #373.)
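One possible way to lazy-load a provider SDK is to import it inside the code path that needs it. The sketch below is an assumption about how this could look, not the project's actual mechanism; `lazy_import` and `create_vm_via_api` are hypothetical names, and the standard `json` module stands in for a nonstandard SDK.

```python
import importlib


def lazy_import(module_name):
  """Import a module on first use and fail with a clear message if missing."""
  try:
    # import_module caches in sys.modules, so repeated calls are cheap.
    return importlib.import_module(module_name)
  except ImportError:
    raise RuntimeError(
        'The %s module is required for this cloud provider. '
        'Please install it before using the provider.' % module_name)


def create_vm_via_api():
  """Hypothetical provider call; only this code path needs the SDK."""
  sdk = lazy_import('json')  # 'json' stands in for a provider SDK.
  return sdk.dumps({'action': 'create'})
```

Users of other providers never trigger the import, so they do not need the module installed.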
Required tool/API features:
- Create a new VM using a specified boot disk image such as Ubuntu 14.04, and retrieve its assigned IP address(es).
- Create a non-root user (default is perfkit) for Linux VMs:
  - passwordless SSH access using a provided public key
  - passwordless sudo capability
- The guest OS must be able to download data from the Internet, for example to fetch distribution tarballs for benchmarks. Using proxies is supported.
- Destroy a VM.
Optional but useful tool/API features:
- Configure a firewall. (A simple "permit everything" static config also works.)
- Create / attach / destroy additional block storage devices for use as scratch disks. This is optional if the standard boot disk has sufficient free space for running benchmarks.
- Check existence of a specific virtual machine or resource. This is useful for automatically retrying failed creation / deletion requests.
- Use a package manager to add dependencies. Alternatively, a custom base image could be created that contains all needed software.
Briefly, adding support for a new cloud provider in PerfKitBenchmarker requires the following steps:
- Create implementation classes for virtual machine / disk / network resources.
- If needed, add custom package installation support.
- Edit perfkitbenchmarker/benchmark_spec.py to integrate the new implementation.
- Update README.md to document the new provider.
As an example, see this commit which adds support for the DigitalOcean provider.
The cloud provider implementation, including resource management, must be thread safe. PerfKitBenchmarker can execute multiple benchmarks in parallel, including parallel creation and destruction of resources.
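To illustrate the kind of race that thread safety guards against, here is a minimal sketch of a resource shared by several VMs that must only be created once. The class `SharedNetwork` and its attributes are hypothetical and only show one common approach, a lock around the check-then-create step.

```python
import threading


class SharedNetwork(object):
  """Hypothetical network resource shared by several VMs."""

  def __init__(self):
    self._lock = threading.Lock()
    self.created = False
    self.create_calls = 0

  def Create(self):
    # The lock makes check-then-create atomic, so parallel callers cannot
    # both decide that the network still needs to be created.
    with self._lock:
      if not self.created:
        self.create_calls += 1
        self.created = True


network = SharedNetwork()
threads = [threading.Thread(target=network.Create) for _ in range(8)]
for thread in threads:
  thread.start()
for thread in threads:
  thread.join()
```

Without the lock, two threads could both observe `created` as false and issue duplicate creation requests to the provider.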
Deployed VMs and OS images should use default performance-related configuration and options so that results match what customers would typically experience.
It's OK to add support for opt-in tuning options such as machine type specific sysfs configuration, but this should be controlled by a flag that is off by default, for example --rackspace_apply_onmetal_ssd_tuning.
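A minimal sketch of an opt-in tuning flag follows. It uses the standard `argparse` module as a stand-in for the project's own flag library, and the flag name `--mycloud_apply_ssd_tuning` and the step names are hypothetical.

```python
import argparse

parser = argparse.ArgumentParser()
# Hypothetical provider-specific flag; tuning stays off unless requested.
parser.add_argument('--mycloud_apply_ssd_tuning', action='store_true',
                    default=False,
                    help='Apply machine type specific sysfs tuning.')


def post_create_steps(args):
  """Return the setup steps to run after VM creation."""
  steps = ['wait_for_boot']
  if args.mycloud_apply_ssd_tuning:
    steps.append('apply_ssd_tuning')
  return steps


default_steps = post_create_steps(parser.parse_args([]))
tuned_steps = post_create_steps(
    parser.parse_args(['--mycloud_apply_ssd_tuning']))
```

With no flags given, the VM keeps its default configuration, so results reflect what customers would typically see.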
PerfKitBenchmarker is based around a hierarchy of resource classes. Incomplete example for illustration:
BaseResource
│
├── BaseDisk
│ │
│ ├─ StripedDisk
│ ├─ AwsDisk
│ ├─ AzureDisk
│ └─ GceDisk
│
├── BaseFirewall
│
├── BaseNetwork
│
├── BaseVirtualMachine
│ │
│ ├─ AwsVirtualMachine
│ │ ├─ DebianBasedAwsVirtualMachine
│ │ └─ RhelBasedAwsVirtualMachine
│ │
│ └─ GceVirtualMachine
│ ├─ DebianBasedGceVirtualMachine
│ └─ RhelBasedGceVirtualMachine
│
├── AzureAffinityGroup
├── AzureStorageAccount
└── AzureVirtualNetwork
At the most basic level, each resource must support a Create and Delete method to implement a basic life cycle. The default implementations for these methods in BaseResource split this up into individual steps, with built-in retry logic:
    def Create(self):
      self._CreateDependencies()
      self._CreateResource()
      self._PostCreate()

    def Delete(self):
      self._DeleteResource()
      self._DeleteDependencies()

    @vm_util.Retry(retryable_exceptions=(errors.Resource.RetryableCreationError,))
    def _CreateResource(self):
      self._Create()
      try:
        if not self._Exists():
          raise errors.Resource.RetryableCreationError(...)
      except NotImplementedError:
        pass
To follow this life cycle, your subclass should implement the _Create and _Delete methods. _Exists is optional but highly recommended for robustness.
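The following sketch shows how a subclass that implements _Create, _Delete, and _Exists benefits from retries. `SimpleBaseResource` is a simplified stand-in written for this example, not the real BaseResource (it uses a plain loop instead of the retry decorator), and `FlakyCloudDisk` is a hypothetical resource whose first creation attempt silently fails.

```python
class RetryableCreationError(Exception):
  """Raised when a resource does not exist after a creation attempt."""


class SimpleBaseResource(object):
  """Simplified stand-in for BaseResource; not the real implementation."""

  MAX_ATTEMPTS = 3

  def Create(self):
    for attempt in range(self.MAX_ATTEMPTS):
      try:
        self._CreateResource()
        return
      except RetryableCreationError:
        if attempt == self.MAX_ATTEMPTS - 1:
          raise

  def Delete(self):
    self._Delete()

  def _CreateResource(self):
    self._Create()
    try:
      if not self._Exists():
        raise RetryableCreationError('Resource missing after creation.')
    except NotImplementedError:
      pass  # No existence check available; assume creation worked.

  def _Exists(self):
    raise NotImplementedError()


class FlakyCloudDisk(SimpleBaseResource):
  """Hypothetical disk whose first creation request is dropped."""

  def __init__(self):
    self.attempts = 0
    self.exists = False

  def _Create(self):
    self.attempts += 1
    self.exists = self.attempts >= 2  # Simulate one silent failure.

  def _Delete(self):
    self.exists = False

  def _Exists(self):
    return self.exists


disk = FlakyCloudDisk()
disk.Create()
```

Here the existence check is what turns a silent failure into a retry; without _Exists the first failed attempt would go unnoticed.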
If a virtual machine requires additional resources that need to be created and destroyed separately, you should handle them through separate resource classes that are managed as dependencies for the parent resource. This helps ensure that they are consistently cleaned up when appropriate. Examples include IP addresses, networks, or scratch disks that are not automatically included as part of the base VM.
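As a sketch of this dependency pattern, the example below models a VM that owns a separately managed IP address. `SimpleResource` is a simplified stand-in for BaseResource without retry logic, and `StaticIpAddress` and `MyCloudVm` are hypothetical classes.

```python
class SimpleResource(object):
  """Simplified stand-in for BaseResource, without retry logic."""

  def Create(self):
    self._CreateDependencies()
    self._Create()

  def Delete(self):
    self._Delete()
    self._DeleteDependencies()

  def _CreateDependencies(self):
    pass

  def _DeleteDependencies(self):
    pass


class StaticIpAddress(SimpleResource):
  """Hypothetical IP address that must be reserved and released separately."""

  def __init__(self, log):
    self.log = log

  def _Create(self):
    self.log.append('create ip')

  def _Delete(self):
    self.log.append('delete ip')


class MyCloudVm(SimpleResource):
  """Hypothetical VM that manages its IP address as a dependency."""

  def __init__(self, log):
    self.log = log
    self.ip_address = StaticIpAddress(log)

  def _CreateDependencies(self):
    self.ip_address.Create()

  def _Create(self):
    self.log.append('create vm')

  def _Delete(self):
    self.log.append('delete vm')

  def _DeleteDependencies(self):
    self.ip_address.Delete()


events = []
vm = MyCloudVm(events)
vm.Create()
vm.Delete()
```

The dependency is created before the VM and released after it, so cleanup of the IP address does not rely on the caller remembering it.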
PerfKitBenchmarker provides convenient mixins for Debian (Apt-based) and RHEL (Yum-based) installers:
    class DebianBasedGceVirtualMachine(GceVirtualMachine, linux_virtual_machine.DebianMixin):
      pass

    class RhelBasedGceVirtualMachine(GceVirtualMachine, linux_virtual_machine.RhelMixin):
      pass
Adding support for additional installers is more complex. In brief, this would require adding new {name}Install(vm) top-level functions to all supported packages in perfkitbenchmarker/packages/, and then adding a new mixin class to perfkitbenchmarker/linux_virtual_machine.py which uses the corresponding {name}Install(vm) functions.
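The sketch below shows one way such a mixin could dispatch to {name}Install(vm) functions. It is an illustration under assumptions, not the existing mixin code: the `Zypper` installer, the `PACKAGES` registry, and the `FakeVm` class are all hypothetical, and a namespace object stands in for a module under perfkitbenchmarker/packages/.

```python
import types


def _AptInstallFio(vm):
  vm.commands.append('sudo apt-get install -y fio')


def _ZypperInstallFio(vm):
  vm.commands.append('sudo zypper --non-interactive install fio')


# Stand-in for a package module; ZypperInstall is the hypothetical addition
# that every supported package would need for the new installer.
PACKAGES = {
    'fio': types.SimpleNamespace(AptInstall=_AptInstallFio,
                                 ZypperInstall=_ZypperInstallFio),
}


class ZypperMixin(object):
  """Hypothetical mixin that dispatches to the ZypperInstall functions."""

  INSTALLER_NAME = 'Zypper'

  def Install(self, package_name):
    package = PACKAGES[package_name]
    install_fn = getattr(package, self.INSTALLER_NAME + 'Install')
    install_fn(self)


class FakeVm(object):
  """Records commands instead of running them on a remote machine."""

  def __init__(self):
    self.commands = []


class ZypperBasedFakeVm(FakeVm, ZypperMixin):
  pass


vm = ZypperBasedFakeVm()
vm.Install('fio')
```

A package that lacks the new function would fail with an AttributeError at install time, which is why the function has to be added to all supported packages.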
To integrate the new implementation, see the current examples in perfkitbenchmarker/benchmark_spec.py. Briefly:
- import the cloud-specific resource classes
- add new sections to the DEFAULTS and CLASSES dictionaries
- add the new cloud name to the 'cloud' enum flag.
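The steps above might look roughly like the following sketch. The dictionary keys, default values, and the `GetVmClass` helper are assumptions made for illustration; check the existing entries in benchmark_spec.py for the actual structure. `MyCloud` and its classes are hypothetical placeholders.

```python
class DebianBasedMyCloudVirtualMachine(object):
  """Placeholder for a real VM implementation class."""


class RhelBasedMyCloudVirtualMachine(object):
  """Placeholder for a real VM implementation class."""


class MyCloudFirewall(object):
  """Placeholder for a real firewall implementation class."""


MYCLOUD = 'MyCloud'

# Per-cloud default settings (key names are assumptions).
DEFAULTS = {
    MYCLOUD: {
        'image': 'ubuntu-14-04',
        'machine_type': 'standard-1',
        'zone': 'region-1',
    },
}

# Per-cloud implementation classes (key names are assumptions).
CLASSES = {
    MYCLOUD: {
        'virtual_machine': {
            'debian': DebianBasedMyCloudVirtualMachine,
            'rhel': RhelBasedMyCloudVirtualMachine,
        },
        'firewall': MyCloudFirewall,
    },
}

# Stand-in for the values accepted by the 'cloud' enum flag.
VALID_CLOUDS = [MYCLOUD]


def GetVmClass(cloud, os_type):
  """Look up the VM class for a cloud and OS type."""
  if cloud not in VALID_CLOUDS:
    raise ValueError('Unknown cloud: %s' % cloud)
  return CLASSES[cloud]['virtual_machine'][os_type]


vm_class = GetVmClass(MYCLOUD, 'debian')
```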
Lastly, please update the instructions in the top-level README.md file to explain how to use pkb.py for the new cloud provider. This is especially important for required cloud-specific initialization such as setting up accounts and authentication.
You may also want to update the provider contact wiki page.