Add envfile rotation DAG #4954
Here are my suggestions for testing:
Okay, thanks! I'll go with the specific-connection approach first, I do remember now using that to test the RDS snapshot rotation DAG before.
Full-stack documentation: https://docs.openverse.org/_preview/4954 Please note that GitHub pages takes a little time to deploy newly pushed code, if the links above don't work or you see old versions, wait 5 minutes and try again. You can check the GitHub pages deployment action list to see the current status of the deployments.
Based on the contributor urgency of this PR, the following reviewers are being gently reminded to review this PR: @AetherUnbound Excluding weekend days, this PR was ready for review 9 day(s) ago. PRs labelled with contributor urgency are expected to be reviewed within 3 weekday(s). @sarayourfriend, if this PR is not ready for a review, please draft it to prevent reviewers from getting further unnecessary pings.
I've tested it, and it worked as described and expected. Great!
I was wondering how you set up the AWS connection. I used my own credentials, but I think there should be an alternative with more specific permissions and limitations, as you describe.
Co-authored-by: Krystle Salazar <[email protected]>
I used my regular credentials as well. By the way, @krysal I'm unable to merge this PR due to permissions, it seems, so I'll have to ask you to do it. After merging, you'll need to enable the DAG in Airflow. However, before enabling it, you'll need to add IAM permissions for the environments buckets to the Airflow execution role in I suggest letting it do a "dry run" first, by leaving
Fixes
Fixes https://github.com/WordPress/openverse-infrastructure/issues/968 by @sarayourfriend
Description
Adds a DAG to rotate the envfiles in the environment files buckets. The DAG finds the 3 most recent envfiles for each service, and then deletes everything else in the buckets. It will only do a dry-run unless the `ENABLE_S3_ENVFILE_DELETION` Airflow variable is set to True. Otherwise, it just detects and lists the files it will delete.
Testing Instructions
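As a mental model for what the dry run should show during testing, here is a minimal sketch of the "identify stale files, then delete only if explicitly enabled" flow. It is not the DAG's actual code: the function signatures, the `delete` callback, and the example keys (including the hashed file name) are hypothetical, and only the task names and the `ENABLE_S3_ENVFILE_DELETION` flag come from this PR.

```python
def identify_stale_envfiles(all_keys, in_use_keys):
    """Return bucket keys that no launch template or task definition references."""
    # Sorted so the logged list is stable between runs.
    return sorted(set(all_keys) - set(in_use_keys))


def delete_stale_envfiles(stale_keys, enable_deletion=False, delete=None):
    """Log every stale key; call `delete` only when deletion is enabled.

    `enable_deletion` stands in for the ENABLE_S3_ENVFILE_DELETION Airflow
    variable; `delete` is a hypothetical callback taking one key.
    """
    deleted = []
    for key in stale_keys:
        if enable_deletion and delete is not None:
            delete(key)
            deleted.append(key)
            print(f"Deleted {key}")
        else:
            print(f"Dry run: would delete {key}")
    return deleted


# Hypothetical keys: one unhashed file and one hashed file that is in use.
bucket_keys = ["api/web/.env", "api/web/abc123.env"]
in_use = ["api/web/abc123.env"]

stale = identify_stale_envfiles(bucket_keys, in_use)
delete_stale_envfiles(stale)  # dry run by default, nothing is removed
```

With the flag left unset, nothing is removed and every candidate is only logged, which is the behaviour to confirm in the steps below.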
The DAG optionally uses a separate AWS connection ID to facilitate testing.
To set up for testing, set up an AWS connection with the ability to describe launch templates, describe launch template versions, describe task definitions, and list bucket objects. (I will share specific IAM permissions later to help reviewers get this, if they need it). Set that to a new AWS connection (call it whatever you want, I used `aws_s3_envfile_rotation`), and then set the name of that connection to a new Airflow variable `S3_ENVFILE_ROTATION_AWS_CONN_ID`. Do not set `ENABLE_S3_ENVFILE_DELETION` when you are testing as you just want a dry run, which is the default behaviour.
Then enable the DAG and let it run (visit the DAG's page here: http://localhost:9090/dags/rotate_envfiles/grid?search=rotate_envfiles). You should see it log all the currently-in-use environment files in the two tasks to detect them from launch templates and task definitions (one task for each). In the logs for `identify_stale_envfiles`, you will see it find some production envfiles that are in the format `{service}/{container}/.env`, that is, specifically without the hash in the final key portion. I created these when I was working on the feature, before I decided to use the hashed file names. They are safe to delete, and will be the first ones that get cleaned up by the live DAG when we enable it after deploying it. The DAG considers these stale because there are no launch templates or task definitions referencing them. Neat!
Critically, you should see none of the in-use environment files considered stale by `identify_stale_envfiles`. The `delete_stale_envfiles` task will log all the files it would have deleted, had this not been a dry run. Here you can again confirm that you only see it list those that I mentioned above without the hash in the key.
Checklist
`Update index.md`).
`main`) or a parent feature branch.
`ov just catalog/generate-docs` for catalog PRs) or the media properties generator (`ov just catalog/generate-docs media-props` for the catalog or `ov just api/generate-docs` for the API) where applicable.
Developer Certificate of Origin