Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Azure][Storage] Update default storage account naming with subscription id hash #3796

Conversation

landscapepainter
Copy link
Collaborator

@landscapepainter landscapepainter commented Jul 29, 2024

This resolves #3783.

Tested (run the relevant ones):

  • Code formatting: bash format.sh

  • Any manual or new tests for this PR (please specify below)

    • launching on azure blob storage by switching subscription IDs
    1. az login to subscription A
    2. run sky launch azure_blob.yaml with subscription A, which creates/uses default storage account #A
    3. az logout and az login to subscription B
    4. run sky launch azure_blob.yaml with subscription B, whic creates/uses default storage account #B
  • Backward compatibility tests:

    • check if sky managed storages created under old default storage accounts can be fetched after updating to new default storage account name with subscription ID hash in the code base:
    1. create storage under old default storage account without Azure subscription ID hash.
    2. update code so that new default storage account uses Azure subscription ID hash.
      1. If used same storage name from i., then it successfully fetches the sky managed container belonging to the old default storage account.
      2. If used different storage name from i., then it creates a new default storage account with subscription ID hash, and creates the container under it.
  • Smoke tests pytest tests/test_smoke.py::TestStorageWithCredentials --azure

  • pytest tests/test_smoke.py::test_azure_storage_mounts_with_stop --azure

  • sky launch comp_test.yaml --cloud azure -y

comp_test.yaml:

file_mounts:
  #### Azure Blob tests ####
  /az-sky-managed-copy:
    name: az-sky-managed-copy
    source: ~/sky_logs
    store: azure
    mode: COPY

  /az-sky-managed-mount:
    name: az-sky-managed-mount
    source: ~/sky_logs
    store: azure
    mode: MOUNT

  /az-external-private-by-user:
    source: https://mystorageaccount.blob.core.windows.net/az-external-private-by-user-dy
    mode: MOUNT

  /az-external-public-by-user:
    source: https://mystorageaccount.blob.core.windows.net/az-external-public-by-user-dy
    mode: MOUNT

  /az-external-public-not-by-user:
    source: https://azureopendatastorage.blob.core.windows.net/nyctlc
    mode: MOUNT

  #### S3 tests ####
  /s3-sky-managed-copy:
    name: s3-sky-managed-copy
    source: ~/sky_logs
    store: s3
    mode: COPY

  /s3-sky-managed-mount:
    name: s3-sky-managed-mount
    source: ~/sky_logs
    store: s3
    mode: MOUNT

  /s3-external-private-by-user:
    source: s3://s3-external-private-by-user-dy
    mode: MOUNT

  /s3-external-public-by-user:
    source: s3://s3-external-public-by-user-dy
    mode: MOUNT

  /s3-external-public-not-by-user:
    source: s3://digitalcorpora
    mode: MOUNT

  #### GCS tests ####
  /gcs-sky-managed-copy:
    name: gcs-sky-managed-copy
    source: ~/sky_logs
    store: gcs
    mode: COPY

  /gcs-sky-managed-mount:
    name: gcs-sky-managed-mount
    source: ~/sky_logs
    store: gcs
    mode: MOUNT

  /gcs-external-private-by-user:
    source: gs://gcs-external-private-by-user-dy
    mode: MOUNT

  /gcs-external-public-by-user:
    source: gs://gcs-external-public-by-user-dy
    mode: MOUNT

  /gcs-external-public-not-by-user:
    source: gs://gcp-public-data-sentinel-2
    mode: MOUNT

workdir: ~/yaml

setup: |
  echo hi

run: |
  # Show workdir
  ls -l .
  # Show private/public storage copy/mount
  ls -l /s3-sky-managed-copy
  ls -l /s3-sky-managed-mount
  ls -l /s3-external-private-by-user
  ls -l /s3-external-public-by-user
  ls -l /s3-external-public-not-by-user
  ls -l /gcs-sky-managed-copy
  ls -l /gcs-sky-managed-mount
  ls -l /gcs-external-private-by-user
  ls -l /gcs-external-public-by-user
  ls -l /gcs-external-public-not-by-user
  ls -l /az-sky-managed-copy
  ls -l /az-sky-managed-mount
  ls -l /az-external-private-by-user
  ls -l /az-external-public-by-user
  ls -l /az-external-public-not-by-user
  # Write files on a mounted storage with access
  date > /s3-sky-managed-mount/hellotest.txt
  date > /s3-external-private-by-user/hellotest.txt
  date > /s3-external-public-by-user/hellotest.txt
  date > /gcs-sky-managed-mount/hellotest.txt
  date > /gcs-external-private-by-user/hellotest.txt
  date > /gcs-external-public-by-user/hellotest.txt
  date > /az-sky-managed-mount/hellotest.txt
  date > /az-external-private-by-user/hellotest.txt
  date > /az-external-public-by-user/hellotest.txt
  # Confirm file was written
  cat /s3-sky-managed-mount/hellotest.txt
  cat /s3-external-private-by-user/hellotest.txt
  cat /s3-external-public-by-user/hellotest.txt
  cat /gcs-sky-managed-mount/hellotest.txt
  cat /gcs-external-private-by-user/hellotest.txt
  cat /gcs-external-public-by-user/hellotest.txt
  cat /az-sky-managed-mount/hellotest.txt
  cat /az-external-private-by-user/hellotest.txt
  cat /az-external-public-by-user/hellotest.txt

@landscapepainter
Copy link
Collaborator Author

@cblmemo @romilbhardwaj @Michaelvll ran additional backwards compatibility test and noted at the PR description at the top. This is ready for a look!

Copy link
Collaborator

@romilbhardwaj romilbhardwaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sky/adaptors/azure.py Outdated Show resolved Hide resolved
# subscription, storage account names must be globally unique across all of
# Azure users. Hence, the storage account name includes the subscription
# hash as well to ensure its uniqueness.
DEFAULT_STORAGE_ACCOUNT_NAME = 'sky{region}{user_hash}{subscription_hash}'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like azure allows upto 24 characters for account name. With a 4 char for subscription hash, 8 chars for user hash and 3 chars for sky, we are getting dangerously to the limit (9 characters left). E.g., using region southcentralus (14 chars) may cause Azure to return an error.

Can you check if we need to shorten it? If so, perhaps we can truncate the user hash and subscription has to 2 characters each?

Copy link
Collaborator Author

@landscapepainter landscapepainter Aug 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this. We indeed have to shorten this. It seems like the longest region name is australiasoutheast with 18 chars. So keeping subscription hash and user hash to 2 characters each would not work as the length becomes 25 with the prefix, sky. Also using 1 character for either subscription hash or user hash seemed to be concerning as well. So I hashed the region and kept it to have length of 4. So this default storage account name will have a fixed length of 19 characters now.

@landscapepainter
Copy link
Collaborator Author

@romilbhardwaj This is ready for another look!

Copy link
Collaborator

@romilbhardwaj romilbhardwaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @landscapepainter. LGTM if tests pass.

@landscapepainter landscapepainter added this pull request to the merge queue Aug 20, 2024
Merged via the queue into skypilot-org:master with commit 2691d4b Aug 20, 2024
20 checks passed
@landscapepainter landscapepainter deleted the azure-storage-account-subscription-fix branch August 20, 2024 01:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Azure blob] Storage account already exists in a different subscription
2 participants