-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add reusable workflow to build openedx images #1
Changes from all commits
e8aecee
a934cdf
b3f0c8b
1fdf90e
926ef7a
98a2821
39fd6a0
142d336
19bfd1b
934e261
acb78ae
90db1b4
e7054ad
89726a6
35495fc
1dddf54
05e8b01
6e56a6e
55095fd
3c70de2
13da45d
b4c9f68
7c370f9
23da84a
7cbd2e6
a43ca13
abf7ca7
5765f2b
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,171 @@ | ||
name: Picasso V2 | ||
|
||
on: | ||
workflow_call: | ||
inputs: | ||
STRAIN_REPOSITORY: | ||
description: 'The repository for strains to checkout. It should follow the format: ORGANIZATION/REPO' | ||
required: true | ||
type: string | ||
STRAIN_REPOSITORY_BRANCH: | ||
description: 'The branch of the repository to checkout' | ||
required: true | ||
type: string | ||
STRAIN_PATH: | ||
description: 'The path within the repository for strains' | ||
required: true | ||
type: string | ||
SERVICE: | ||
description: 'The service name to build' | ||
required: true | ||
type: string | ||
secrets: | ||
DOCKERHUB_USERNAME: | ||
description: 'DockerHub username for login' | ||
required: true | ||
DOCKERHUB_PASSWORD: | ||
description: 'DockerHub password for login' | ||
required: true | ||
SSH_PRIVATE_KEY: | ||
description: 'Service user SSH key for repository checkout' | ||
required: true | ||
|
||
jobs: | ||
build: | ||
runs-on: ubuntu-latest | ||
|
||
steps: | ||
- name: Login to DockerHub | ||
uses: docker/login-action@v3 | ||
with: | ||
username: ${{ secrets.DOCKERHUB_USERNAME }} | ||
password: ${{ secrets.DOCKERHUB_PASSWORD }} | ||
|
||
- name: Checkout strains repository | ||
uses: actions/checkout@v4 | ||
with: | ||
repository: ${{ inputs.STRAIN_REPOSITORY }} | ||
ref: ${{ inputs.STRAIN_REPOSITORY_BRANCH }} | ||
ssh-key: ${{ secrets.SSH_PRIVATE_KEY }} | ||
|
||
- name: Set TVM project name and version | ||
shell: bash | ||
working-directory: ${{ inputs.STRAIN_PATH }} | ||
run: | | ||
if [ ! -e "./config.yml" ]; then | ||
echo "ERROR: file config.yml doesn't exist" | ||
exit 1 # terminate and indicate error | ||
fi | ||
|
||
# Initialize variables | ||
strain_name="" | ||
strain_tutor_version="" | ||
|
||
# Read config.yml line by line | ||
while IFS= read -r line || [[ -n $line ]]; do | ||
if [[ "$line" == *":"* ]]; then | ||
name=$(echo "$line" | cut -d ":" -f1) | ||
value=$(echo "$line" | cut -d ":" -f2- | awk '{$1=$1};1' | tr -d '\r') | ||
|
||
case $name in | ||
*"TUTOR_APP_NAME"*) | ||
strain_name=$value | ||
;; | ||
*"TUTOR_VERSION"*) | ||
strain_tutor_version=$value | ||
;; | ||
esac | ||
fi | ||
done < "config.yml" | ||
|
||
# Check if variables are set | ||
if [ -z "$strain_name" ]; then | ||
echo "ERROR: TUTOR_APP_NAME not found in the given config.yml" >&2 | ||
exit 1 | ||
fi | ||
|
||
if [ -z "$strain_tutor_version" ]; then | ||
echo "ERROR: TUTOR_VERSION not found in the given config.yml" >&2 | ||
exit 1 | ||
fi | ||
|
||
echo "TUTOR_APP_NAME=${strain_name}" >> $GITHUB_ENV | ||
echo "TUTOR_VERSION=${strain_tutor_version}" >> $GITHUB_ENV | ||
|
||
- name: Install TVM | ||
shell: bash | ||
working-directory: ${{ inputs.STRAIN_PATH }} | ||
run: | | ||
pip install git+https://github.com/eduNEXT/tvm.git | ||
|
||
- name: Configure TVM project | ||
shell: bash | ||
working-directory: ${{ inputs.STRAIN_PATH }} | ||
run: | | ||
tvm install $TUTOR_VERSION | ||
tvm project init $TUTOR_APP_NAME $TUTOR_VERSION | ||
|
||
# This command copies all the files and folders | ||
# from the repository STRAIN_PATH to the recently above created TVM PROJECT folder | ||
cp -r $(ls --ignore=$TUTOR_APP_NAME) $TUTOR_APP_NAME/ | ||
|
||
- name: Enable distro plugin | ||
shell: bash | ||
working-directory: ${{ inputs.STRAIN_PATH }}/${{ env.TUTOR_APP_NAME }} | ||
run: | | ||
. .tvm/bin/activate | ||
|
||
major_version=$(pip show tutor | grep Version | awk '{print $2}' | cut -d'.' -f1) | ||
distro_version=$( | ||
git ls-remote --tags https://github.com/eduNEXT/tutor-contrib-edunext-distro \ | ||
| grep -E "$major_version\\.[0-9]+\\.[0-9]+$" \ | ||
| tail -n 1 \ | ||
| awk -F'/' '{print $NF}' \ | ||
) | ||
|
||
pip install git+https://github.com/eduNEXT/tutor-contrib-edunext-distro@$distro_version | ||
tutor plugins enable distro | ||
Comment on lines
+115
to
+127
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [comment to reviewers]: I removed these lines from the original script since we're not using the plugin index to install distro:
Please let me know if you think otherwise. |
||
|
||
- name: Setup SSH agent for private repositories cloning | ||
uses: webfactory/[email protected] | ||
with: | ||
ssh-private-key: ${{ secrets.SSH_PRIVATE_KEY }} | ||
|
||
- name: Add GitHub to known hosts | ||
shell: bash | ||
run: | | ||
ssh-keyscan github.com >> ~/.ssh/known_hosts | ||
|
||
- name: Execute extra commands | ||
shell: bash | ||
working-directory: ${{ inputs.STRAIN_PATH }}/${{ env.TUTOR_APP_NAME }} | ||
run: | | ||
. .tvm/bin/activate | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [comment to reviewers]: Which environment variables does this script set? I think we can store those in the Github Environment, so there's no need to activate the tvm environment before running each step. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree that we could find a workaround to avoid activating the environment in each step, but I think we can leave it as an improvement for another PR |
||
tutor distro run-extra-commands | ||
|
||
- name: Prepare docker if building MFE | ||
if: ${{ inputs.SERVICE == 'mfe' }} | ||
shell: bash | ||
run: | | ||
echo "[worker.oci]" > buildkit.toml | ||
echo "max-parallelism = 2" >> buildkit.toml | ||
docker buildx create --use --node=max2cpu --driver=docker-container --config=./buildkit.toml | ||
Comment on lines
+150
to
+152
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [note to author]: add a comment explaining why we're doing this. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does this connection error with the parallel build also occur in the Jenkins pipeline? Could this potentially slow down build times or impact us in other ways? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I asked internally about the Jenkins agent's configuration but am still waiting for an answer so I'll ask again. I'll post here once I know more. And yes, limiting parallelism will slow down the building process compared to the default buildkit configuration. Here's a successful build with this buildkit config, which takes about 20 minutes. I'll compare this time with our current building time for MFEs in Jenkins. I was also thinking about taking this to the Tutor Users' Group meeting because it seems like it hasn't been addressed upstream yet. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree with you that this is not a blocker for the MVP. I opened this issue so we can work on it later: #2 |
||
|
||
- name: Build service image with no cache | ||
shell: bash | ||
working-directory: ${{ inputs.STRAIN_PATH }}/${{ env.TUTOR_APP_NAME }} | ||
env: | ||
SERVICE: ${{ inputs.SERVICE }} | ||
run: | | ||
. .tvm/bin/activate | ||
tutor config save | ||
tutor images build $SERVICE --no-cache | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [note to author]: I don't think we need the --no-cache argument. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could you explain why you think we wouldn't need this? Especially considering that Picasso builds the images with the same --no-cache argument, and we want to compare the build times as fairly and accurately as possible. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Of course. Our current jenkins setup uses dynamically provisioned agents, each with a max number of concurrent builds. As I understand, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @Alec4r @MaferMazu, please correct me if I'm wrong. Thanks! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We can decide it after get some numbers and make some tests with the MVP, but I agree with you so far. |
||
|
||
- name: Push service image to DockerHub | ||
shell: bash | ||
working-directory: ${{ inputs.STRAIN_PATH }}/${{ env.TUTOR_APP_NAME }} | ||
env: | ||
SERVICE: ${{ inputs.SERVICE }} | ||
run: | | ||
. .tvm/bin/activate | ||
tutor images push $SERVICE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[comment to reviewers]: this is a private repository, so only repositories from the edunext organization can use the workflow, see this documentation for more details. According to our previous discussions, the plan for this workflow is also to be used by other organizations. Which means this repository should be public.
Is that feasible for this repository? Considering that the Jenkins pipeline implementation is private as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At first glance, I don't see any major issues with making this repo public. I would like to hear from those who worked on the Jenkins implementation to see if this was previously discussed and to understand if there was any specific reason for keeping the implementation private. It might be simply that they didn't see a need for it to be public since Jenkins allows access to the service for as many people as needed without requiring access to the script repository (?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry, I should've added this context to my previous comment but I forgot. I created the dedalo-scripts repository and also committed the first changes. I didn't know at the time whether it would be a long-term solution or what implementations would be hosted there, so I used private as the default visibility setting. Now, I'd argue against this decision. The dedalo-scripts repository looks like a team-specific solution, but it shouldn't be, mainly because it's solving the tech-wide problem of how to build Open edX images.
Knowing this, this repo should be public regardless of the decisions made about the visibility of dedalo-scripts.