-
Notifications
You must be signed in to change notification settings - Fork 77
feat(workflows-local-runner-unifiedstorage): update start container script to reslove symlink and use its target for mount #709
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…cript to reslove symlink and use its target for mount
Were you able to verify a successful workflow run after the local runner started? |
yeah please see the "How Has This Been Tested?" section in pr description |
@@ -7,6 +7,8 @@ is_s3_storage=${1:-"1"} # Default to 1 (Git storage) if no parameter is passed | |||
if [ "$is_s3_storage" -eq 0 ]; then | |||
PROJECT_DIR="$HOME/shared" | |||
echo "Project is using S3 storage, project directory set to: $PROJECT_DIR" | |||
MOUNT_DIR=$(readlink -f $PROJECT_DIR) # get the symlink source |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we not have unit tests for this file ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have unit tests for this file here: https://github.com/aws/sagemaker-distribution/blob/main/test/test_artifacts/v2/scripts/run_sagemaker_workflows_tests.sh
basically checks if healthchecker + airflow APIs are running (containers are up successfully) so should cover the e2e experience
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to check if the symlink source was used for mounting the local runner in the unit test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you mean to check the /mnt/custom-file-system/s3/shared
is the actual path used for mount in local runner?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, basically ensuring that we are mounting the symlink source of "$HOME/shared"
Description
With unified storage projects, Workflows local runner is going to mount on the new shared directory /home/sagemaker-user/shared/. As the
/home/sagemaker-user/shared/
is a symlink created from/mnt/custom-file-systems/..
(s3fs mounted) and symlinks are currently rejected by LL docker proxy for mount, so we've decided to go with option1 proposed in this doc as a short-term solution:https://quip-amazon.com/PGgoAw5bTv5N/1-Pager-LL-docker-proxy-change-for-SMUS-workflows-local-sidecar-container, that is the local runner would use the resloved symlink target for mount.
Type of Change
Release Information
Does this change need to be included in patch version releases? By default, any pull requests will only be added to the next SMD image minor version release once they are merged in template folder. Only critical bug fix or security update should be applied to new patch versions of existed image minor versions.
If yes, please explain why:
[Explain the criticality of this change and why it should be included in patch releases]
How Has This Been Tested?
tested in SMUS with updated scripts + docker-compose file, created
/home/sagemaker-user/d/c
symlinked from/home/sagemaker-user/c
restart the local runner container and was able to see all 3 container started successfully:
Checklist:
Test Screenshots (if applicable):
Related Issues
[Link any related issues here]
Additional Notes
[Any additional information that might be helpful for reviewers]