This repository has been archived by the owner on Sep 12, 2022. It is now read-only.

Blueprint for Atmosphere(2) script, "application_to_provider.py"

Chris Martin edited this page Mar 14, 2017 · 28 revisions

Use case: when adding a provider to an Atmosphere(1) deployment, we need to make existing applications available on the new provider.

This script makes an Application (a.k.a. image) available on a specified new provider by doing any/all of the following as needed:

  • Populates Glance image metadata
  • Transfers image data from existing provider
    • Image data migrated using Glance API or iRODS for Atmosphere(0)
  • (Future, coming soon) If the Application uses an AMI-style image, ensures the kernel (AKI) and ramdisk (ARI) images are also present on the destination provider, and sets the appropriate properties
  • Creates models (ProviderMachine, InstanceSource) in Atmosphere database

Gracefully handles the case where destination provider is already partially populated with image data/metadata (missing information will be added).

Design Assumptions

  • Atmosphere(2) database is the source of truth for image metadata, so every time we touch a Glance image, we overwrite whatever is in Glance database with what Atmosphere(2) knows
  • User/tenant names are consistent across Atmosphere(2) and its providers. In other words, "edwins on provider A is the same project as edwins on provider B"
  • Image UUIDs will be made to match on source and destination provider. If image UUID is already in use on destination provider, that image will be overwritten. This also applies for kernel (AKI) and ramdisk (ARI) images for AMI-style images.

Procedure

Note: the code has evolved and no longer follows these steps exactly. Still keeping this for posterity.

Parse arguments

  • Fetch Application and destination Provider objects from user-provided application ID and provider ID
  • Optionally, fetch source Provider object from user-provided source provider ID
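
A minimal sketch of the argument parsing, assuming `argparse`. The positional names and `--source-provider-id` are hypothetical; only `--ignore-missing-owner` and `--ignore-missing-members` are named elsewhere on this page. Fetching the Application and Provider objects from the parsed IDs is left out because it depends on the Atmosphere models.

```python
import argparse


def build_parser():
    """Build a parser for the inputs listed above (option names are assumed)."""
    parser = argparse.ArgumentParser(
        description="Make an Application available on a new provider")
    # Hypothetical positional names for the two required IDs.
    parser.add_argument("application_id", type=int,
                        help="ID of the Application to migrate")
    parser.add_argument("destination_provider_id", type=int,
                        help="ID of the provider that should receive the image")
    # Optional: hypothetical flag name for the source provider ID.
    parser.add_argument("--source-provider-id", type=int, default=None,
                        help="ID of the provider to copy image data from")
    # These two flags are the ones mentioned in this blueprint.
    parser.add_argument("--ignore-missing-owner", action="store_true",
                        help="fall back to the administrator as image owner")
    parser.add_argument("--ignore-missing-members", action="store_true",
                        help="skip members missing on the destination")
    return parser
```

For example, `build_parser().parse_args(["7", "4", "--ignore-missing-owner"])` yields a namespace with `source_provider_id` left as `None`, which signals that the script should pick a source provider per ApplicationVersion.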

Get Access to Destination Provider

Get an account driver and image manager for destination provider (We don't get these for source provider yet because each ApplicationVersion may have different available source providers)

Get Application-Specific Metadata from Atmosphere(2) and resolve identifiers on destination provider

  • Application owner
    • Determine tenant UUID on destination provider
    • If Application owner is missing from destination cloud, either raise exception or set owner to atmosphere administrator if --ignore-missing-owner set
  • Application members, if image is private
    • Deduplicate Application members and ApplicationVersion members
    • Resolve application member tenant UUIDs on destination provider
    • If Application members are missing from destination cloud, either raise exception or ignore them if --ignore-missing-members set
  • Application tags
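
The owner and member resolution above could look roughly like the sketch below. It assumes the destination provider's tenants are already available as a plain dict mapping project name to tenant UUID; `MissingTenantError`, `resolve_owner`, and `resolve_members` are hypothetical names, not part of Atmosphere.

```python
class MissingTenantError(Exception):
    """Raised when a project has no tenant on the destination provider."""


def resolve_owner(owner_name, dest_tenants, admin_name,
                  ignore_missing_owner=False):
    """Return the destination tenant UUID to use as image owner.

    dest_tenants maps project name -> tenant UUID on the destination provider.
    """
    if owner_name in dest_tenants:
        return dest_tenants[owner_name]
    if ignore_missing_owner:
        # Fall back to the Atmosphere administrator, as with --ignore-missing-owner.
        return dest_tenants[admin_name]
    raise MissingTenantError("owner %r not found on destination" % owner_name)


def resolve_members(app_members, version_members, dest_tenants,
                    ignore_missing_members=False):
    """Deduplicate member names and map them to destination tenant UUIDs."""
    names = sorted(set(app_members) | set(version_members))
    resolved = []
    for name in names:
        if name in dest_tenants:
            resolved.append(dest_tenants[name])
        elif not ignore_missing_members:
            raise MissingTenantError("member %r not found on destination" % name)
        # Otherwise the missing member is silently skipped.
    return resolved
```

Relying on project names as dict keys mirrors the design assumption that user/tenant names are consistent across providers.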

Loop for each ApplicationVersion of the specified Application

Choose a Source Provider (for image data + some metadata) and Get Access

  • If specified by user calling script, ensure this ApplicationVersion has an InstanceSource+ProviderMachine for given source provider
  • If not specified, choose one that is not the destination provider
  • Get an account driver and image manager for source provider
  • Look up image UUID in source provider (ProviderMachine.identifier)
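
As an illustration of the selection rule above, here is a small sketch that works on provider IDs only. `choose_source_provider` is a hypothetical helper; it assumes the caller has already collected the IDs of providers where this ApplicationVersion has an InstanceSource+ProviderMachine.

```python
def choose_source_provider(version_provider_ids, destination_id,
                           requested_id=None):
    """Pick the provider to copy image data from for one ApplicationVersion.

    version_provider_ids: IDs of providers where this version already has an
        InstanceSource + ProviderMachine.
    """
    if requested_id is not None:
        # The caller named a source provider: it must actually hold this version.
        if requested_id not in version_provider_ids:
            raise ValueError(
                "version is not available on provider %s" % requested_id)
        return requested_id
    # Otherwise take the first provider that is not the destination.
    candidates = [pid for pid in version_provider_ids if pid != destination_id]
    if not candidates:
        raise ValueError("no source provider available for this version")
    return candidates[0]
```

Taking the first candidate is an arbitrary choice made for the sketch; the blueprint only requires that the source differ from the destination.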

Get Image-Specific Metadata from Source Provider

  • Get source image metadata (JSON?) from Glance, which contains these things we need (maybe more?):
    • container_format
    • disk_format
    • size
    • checksum

Check for existing ProviderMachine + InstanceSource for ApplicationVersion on destination provider

  • If exists, we don't need to create one
  • If not exists, we will create one later

Create Glance Image (as needed)

  • If this ApplicationVersion already has a ProviderMachine/InstanceSource for destination provider (determined earlier), determine if InstanceSource's identifier exists as a UUID on destination provider
  • If no InstanceSource on destination provider, or InstanceSource's identifier doesn't exist as UUID on destination provider:
    • Create image and keep UUID for later putting into database
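
The decision above reduces to one check. The sketch below assumes the set of image UUIDs on the destination provider has already been listed; `needs_new_glance_image` is a hypothetical name.

```python
def needs_new_glance_image(instance_source_identifier, destination_image_uuids):
    """Return True when a new Glance image must be created on the destination.

    instance_source_identifier: identifier of the existing InstanceSource for
        the destination provider, or None when there is none.
    destination_image_uuids: set of image UUIDs present on the destination.
    """
    if instance_source_identifier is None:
        return True
    return instance_source_identifier not in destination_image_uuids
```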

Migrate Glance Image Data+Metadata

Populate Image Metadata (do this every time)
  • Fields:
    • name (application.name)
    • container_format (from metadata of image in source provider)
    • disk_format (from metadata of image in source provider)
    • visibility (from Application.private)
    • owner UUID (determined earlier in script)
    • tags (determined earlier in script)
    • application_name (application.name)
    • application_version (app_version.name?)
    • application_description (application.description)
    • application_owner (application.created_by.user?)
    • application_tags?
    • application_uuid (str(application.uuid))
    • min_disk?
    • min_ram?
  • Set members UUIDs (determined earlier in script)
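
A sketch of assembling the fields listed above into one dict, using plain dicts as stand-ins for the Application model and the source Glance image. The `"private"`/`"public"` mapping for visibility is an assumption, and the fields still marked with a question mark above (application_owner, application_tags, min_disk, min_ram) are deliberately left out.

```python
def build_image_metadata(app, version_name, source_image, owner_uuid, tags):
    """Assemble the metadata to write to the destination Glance image.

    app and source_image are plain dicts standing in for the Application
    model and the source provider's image metadata.
    """
    return {
        "name": app["name"],
        "container_format": source_image["container_format"],
        "disk_format": source_image["disk_format"],
        # Assumed mapping from Application.private to Glance visibility.
        "visibility": "private" if app["private"] else "public",
        "owner": owner_uuid,
        "tags": sorted(set(tags)),
        "application_name": app["name"],
        "application_version": version_name,
        "application_description": app["description"],
        "application_uuid": str(app["uuid"]),
    }
```

Because Atmosphere is treated as the source of truth, the resulting dict would be written to Glance on every run, overwriting whatever is there.
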
Populate Image Data (do this only if needed)
  • Get checksum of UUID on destination provider, compare with checksum in source provider
    • If they don't match, we need to move data

      • See if we already have a complete local copy of the image in source provider (based on checksum)
        • If not, try downloading until checksum matches (or eventually give up)
      • Upload local copy of image to destination provider, keep trying until checksums match
    • If using iRODS to transfer:

      • Need to collect some new arguments
      • Derive iRODS path from glance location
      • icp file to iRODS collection for new provider
      • Set irods:// URI location for image in glance
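
The download half of the checksum-driven transfer could be sketched as follows. It assumes the Glance checksum is an MD5 hex digest of the image data and that the actual download is supplied as a callable; `md5_of_file`, `local_image_path`, and `ensure_local_copy` are hypothetical helpers, and the staging directory defaults to the system temp directory because the real one is still an open question on this page. The upload side would follow the same retry pattern.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file without loading it all in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def local_image_path(image_uuid, staging_dir=None):
    """Return where the local copy of an image lives (staging dir is assumed)."""
    return os.path.join(staging_dir or tempfile.gettempdir(), image_uuid)


def ensure_local_copy(download, path, expected_checksum, max_attempts=3):
    """Call download(path) until the local file matches expected_checksum.

    Returns the number of downloads performed (0 if a good copy was already
    present) and raises RuntimeError after max_attempts failed downloads.
    """
    downloads = 0
    while not (os.path.exists(path)
               and md5_of_file(path) == expected_checksum):
        if downloads >= max_attempts:
            raise RuntimeError(
                "checksum mismatch after %d downloads" % downloads)
        download(path)
        downloads += 1
    return downloads
```

Checking the existing file before the first download is what lets a rerun skip the transfer when a complete local copy is already present.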

Create/update ProviderMachine and InstanceSource for destination provider

  • If existing ProviderMachine + InstanceSource and we created a new Glance image, update InstanceSource's identifier reflecting new UUID
  • Else,
    • Create new InstanceSource, passing required fields, all of which we now know
      • What to do about created_by_identity when Application owner has no identity on destination cloud? Leave blank?
    • What to do about end_date?

AMI-based images

AMI-based images background

  • Some images are AMI-based, i.e. they have dependent kernel ("AKI") and ramdisk ("ARI") images
  • We know an AMI-based image based on the kernel_id and ramdisk_id custom properties in Glance
  • There is also an "ami" disk_format and container_format but these may not be set properly for all AMI-based images, so we should not trust them
  • Atmosphere(2) doesn't know or care which images are AMI-based

AMI-based images migration strategy

  • When we're migrating an image, see if it is AMI-based, i.e. both kernel_id and ramdisk_id are set (throw an exception if only one is set). If so, ami_image = True
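
A minimal sketch of that check, assuming the image's Glance properties are available as a dict. `is_ami_based` is a hypothetical name, and `ValueError` stands in for whatever exception the script actually raises.

```python
def is_ami_based(image_properties):
    """Return True if the image has both kernel_id and ramdisk_id set.

    Raises ValueError when only one of the two properties is set, since such
    an image cannot be migrated consistently.
    """
    kernel_id = image_properties.get("kernel_id")
    ramdisk_id = image_properties.get("ramdisk_id")
    if bool(kernel_id) != bool(ramdisk_id):
        raise ValueError("image has only one of kernel_id / ramdisk_id set")
    return bool(kernel_id)
```

The check ignores disk_format and container_format on purpose, since these may not be set properly for all AMI-based images.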

When we populate image metadata for AMI-based image:

  • set kernel_id and ramdisk_id to same UUIDs as in source provider
  • disk_format and container_format are always AMI, regardless of what was indicated in the source provider

We also do this for AMI-based images:

  • Confirm that there are AKI/ARI images in source provider which match the AMI image's kernel_id and ramdisk_id
  • Get or create AKI and ARI images in destination provider (with same UUID as source AKI/ARIs)
  • Set metadata for AKI and ARI images in destination provider (how exactly?)
    • Name matches image name in source provider
    • Anything else?
  • migrate_image_data from AKI and ARI in source provider to destination provider

Conversation with Steve 2017-02-27

  • Steve is interested in using iRODS copy to skip glance download/upload, for clouds with iRODS-backed glance. Download/upload not only takes longer, it has also been less reliable with many GBs of image data over HTTPS.

  • Regarding private images and owners/members:

    • In order for private images to work on a new provider, both image ownership and membership must also transfer. Must determine new UUIDs for owner and members on new provider and set accordingly. (Atmosphere should also know this with Application membership and ApplicationVersion membership, to the extent it is used)
    • If owner has identity on destination provider then set owner to that identity, else set owner to atmosphere administrator
    • If image maintainer does not have an identity on destination provider, script should error out unless run with something like --force, in which case the atmosphere administrator will become the owner, and only members with identities on new cloud will be transferred
    • Script needs to be idempotent: if image already exists on destination provider with the same name, then only update ownership / membership.
  • Regarding replication between providers:

    • In the past there has been a master/slave relationship where the glance database has been replicated (at SQL level) from one cloud (e.g. IU Cloud) to another (e.g. TACC Cloud). This isn't feasible going forward because we'll be supporting heterogeneous versions of OpenStack with different glance DB schemas. We can't replicate image UUIDs to the new provider; they will be different.
    • This workflow has previously been sort-of-handled by migrate_instance_task in chromogenic: downloads image from authoring provider, runs a bunch of bash to sanitize user-specific parts, uploads image to destination provider. But Steve and I agree that the workflow needed here is Atmosphere-specific and can reasonably exist as an Atmosphere script.
  • Regarding image properties (a.k.a. "metadata"):

    • Script should have an option to migrate or not migrate properties
    • If we do migrate image properties, kernel_id and ramdisk_id won't apply to new cloud, will need to correct them (more on this later)
    • Steve says tags probably out of sync between Atmosphere and OpenStack because Atmosphere sets tags on image creation but not on image update
  • Regarding Amazon AMI-based images:

    • Example: f77cb286-d5fc-4a3e-b15b-98e5a9dd2a86 on iPlant Tucson Cloud
    • Have kernel_id and ramdisk_id properties. These UUIDs point to Amazon AKI and ARI images (respectively) which are required for the main image to boot
      • Since these UUIDs would be different on new cloud, they would need to be corrected during migration process
    • They may be deceptively labeled in Glance as disk_format of qcow2
    • Team has tried unsuccessfully to "convert" these to qcow2 etc in the past
    • "First pass" of script can ignore these but ultimately we must have a strategy for migrating community AMI-based images and phasing out AMI-based base/featured images

Questions

Answered Questions

  • Does glance.images.update() PUT or PATCH (in REST nomenclature)? Does it matter here?

  • Do we care about image members for public images?

    • No, public images cannot have members per Glance API docs
  • Does image owner (in glance) matter? Do we care if it is 'atmoadmin' in all cases?

    • Yes, community-maintained images will be owned by the maintainer, this is important if image is not public
  • Which image metadata does and doesn't matter to Atmosphere(2)?

    • Atmosphere already knows which metadata matters to it. We should (for example) be able to migrate images without
  • How are tags handled, why are tags in Atmosphere(2) sometimes different than tags stored in Glance metadata?

    • Steve says tags probably out of sync between Atmosphere and OpenStack because Atmosphere sets tags on image creation but not on image update, bug to be fixed
  • Source of truth for image owner/members should be Atmosphere(2) or Glance DB for an existing provider?

    • Atmosphere(2)
  • Is Application.created_by in Atmosphere(2) the image owner?

    • No, use app.created_by_identity.project_name()
  • How to determine Atmosphere(2) administrator account? provider.admin.created_by.username?

    • provider.admin.project_name()
  • Do we care about start and end dates for ApplicationVersion? For InstanceSource? Where do we store start and end dates?

    • Make start and end dates the same as InstanceSource of source provider
  • Ensure that Atmosphere administrator is always an image member?

    • Yes
  • Is there a typical "staging directory" that we use on the Atmosphere server for images?

    • /storage?
  • What to do about https://wiki.openstack.org/wiki/Glance-v2-community-image-visibility-design? This has changed between Havana and Newton

    • Behavior seems less strict than that document, don't worry about it for now?