Improve retention policy and management of node and ci images #767
Labels
lifecycle/frozen
Indicates that an issue or PR should not be auto-closed due to staleness.
triage/accepted
Indicates an issue is ready to be actively worked on.
Current Situation
Currently, the retention policy keeps only the last 5 images, which is not safe: nothing prevents the image that is actively in use from being rotated out. Taking a new image is a manual process, and there is little visibility into which image is currently in use. Only a Jenkins Admin can see and change the image used for CI, so someone with rights to trigger the build can start it without knowing that it might erase the actively used image from OpenStack.
This problem also applies to node images. However, everyone currently has visibility into both Artifactory and the dev-env code, allowing them to see what image is used and understand how a new trigger will affect it.
What needs to be fixed
To address these issues, we need to ensure that the actively used image is never deleted. We also need a way to verify, through testing, that a new image works properly before it replaces the active one. Additionally, any change to an image build should be testable in the PR before merging.
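To make the first requirement concrete, here is a minimal sketch of a retention step that keeps the newest images but never selects the active one for deletion. The `Image` type, the `images_to_delete` function, and the `keep_last` parameter are hypothetical names for illustration; how images are actually listed and deleted in OpenStack or Artifactory is left out.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class Image:
    # Hypothetical record; real image metadata would come from the image store.
    name: str
    created_at: datetime


def images_to_delete(images: List[Image], active_name: str, keep_last: int = 5) -> List[Image]:
    """Return the images that fall outside the newest `keep_last`,
    excluding the active image regardless of its age."""
    newest_first = sorted(images, key=lambda img: img.created_at, reverse=True)
    expired = newest_first[keep_last:]
    return [img for img in expired if img.name != active_name]
```

With this shape, triggering a new build can no longer remove the image CI depends on, even if that image is older than the retention window.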
Potential solution
A potential solution could involve having the active image with a separate naming convention from the candidate images. For promotion, there would be a pipeline that takes a candidate image as input, runs tests on it, and if the tests pass, automatically changes the active image to the candidate.
Note: Jenkins also offers an Artifactory plugin that supports promotion logic out of the box, which could be investigated. It is not clear whether the same exists for the OpenStack plugin.
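The promotion flow described above could be sketched roughly as follows. This is only an illustration under assumptions: the `candidate-` prefix, the `promote` function, and the `run_tests` and `set_active` callbacks are hypothetical, and stand in for whatever the pipeline would really use to test an image and switch the active one.

```python
from typing import Callable

# Hypothetical naming convention that separates candidates from the active image.
CANDIDATE_PREFIX = "candidate-"


def promote(
    candidate: str,
    run_tests: Callable[[str], bool],
    set_active: Callable[[str], None],
) -> bool:
    """Promote a candidate image to active only if its tests pass.

    Returns True when the candidate was promoted, False when tests failed.
    """
    if not candidate.startswith(CANDIDATE_PREFIX):
        raise ValueError(f"not a candidate image: {candidate!r}")
    if not run_tests(candidate):
        # Tests failed: the current active image stays untouched.
        return False
    set_active(candidate)
    return True
```

Passing the test and switch steps in as callbacks keeps the sketch independent of Jenkins, Artifactory, or OpenStack specifics.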
By implementing these changes, we can increase the reliability and safety of our image retention process, improve coordination among team members triggering builds, and reduce the risk of active image overwriting and build failures. Testing new images before they become active will ensure they are reliable and functional, providing a smoother and more predictable CI/CD process.