Retagger improvements #3408
Labels
componenet/registries
epic/2023registries
kind/epic
needs/refinement
team/honeybadger
As part of our registries efforts, we want to improve retagger as well. This ticket captures more details about that task.
How retagger currently works
Retagger is a Go CLI and a huge CircleCI build pipeline. In general, there are 2 modes, executed in the CI pipeline one after the other:

1. Images listed in `skopeo-*.yaml` files. They are synced in the following way:
   1. `retagger filter` is invoked for each `skopeo-*.yaml` file. Internally, this `retagger` command works as follows:
      - It runs `skopeo sync` with the single `skopeo-*.yaml` file in dry mode. This lists all the tags available for all the images in the source repository (so it "expands" semver ranges and tags with regular expressions into specific tags).
      - The result is written to `skopeo-*.yaml.filtered`.
   2. `retagger sync` is run for that `*.filtered` file to do the actual synchronization.

   The split into `filter` then `sync` above comes, AFAIR, from two sources, the main one being that some time ago it was impossible to run `sync` on semver ranges, and we had to discover and list the images before we were able to use `skopeo` for sync. There was something about better performance as well, but I'm not sure that argument still holds now that semver ranges are supported in skopeo.
2. Customized images. The idea is that if someone wants an upstream image, but with some minor change (like adding `USER <UID>`), `retagger` can do this as well: it will build the image, then upload it to a target registry. It is important to note that the majority of "custom images" actually only include an image rename. The main reason for that seems to be to avoid confusion about what the image is (if we replicate `bitnami/postgresql` into `giantswarm/postgres`, we no longer know which build of postgres it is, so to avoid that we rename the image to `giantswarm/bitnami-postgres`; this is no longer needed, see "Current problems" below). The rest of the customized images actually mutate the source image.

   The build works like this, in a loop for each image in `customized-images.yaml`: `retagger` creates a `Dockerfile` that has `FROM <src image>` set using the `customized-images.yaml` entry, then runs `docker build + tag` and `docker push` on it.

Current problems
1. `retagger` synchronizes all the images into the target repository `giantswarm/*`, so the original repository name is lost; for example, the `postgresql/postgresql` image from Docker Hub becomes `giantswarm/postgresql`. This was necessary when we were replicating images from Docker Hub to our own Docker Hub account, as all of Docker Hub is available under a single domain. Unfortunately, this creates a problem: by looking at images in a registry, it is impossible to tell whether a specific image is really custom and built by Giant Swarm, or a verbatim copy from upstream.
2. `retagger` is building images. This is a problem on a complexity level (a lot has to be set up and working for `retagger` to do that), and also on a logical level (retagger is supposed to synchronize images, not build them). It is also important to note that right now this "customized images" support is the only real difference between `retagger` and plain upstream `skopeo`. If we stop doing this, we don't need retagger at all. Also, this build process makes it harder to prepare images for the security extensions we have in mind (signing and clear ownership).

Desired state
- Retire `retagger` and replace it with vanilla upstream `skopeo`.
- Keep the original repository name: the `postgresql/postgresql` image from Docker Hub synchronized to `gsoci.azurecr.io` becomes `gsoci.azurecr.io/postgresql/postgresql` and not `gsoci.azurecr.io/giantswarm/postgresql`.

Migration plan
- `skopeo sync` runs
- `skopeo copy` runs - necessary renames
- `skopeo` for them
- `customized-images.yaml` into a new repo with everything already (mostly) set up for building the image (should this be a single repo for all the images?). Make sure repos have owners and Renovate for automated image updates.
- `customized-images.yaml`
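
To make the naming change behind the "necessary renames" concrete, here is a minimal Go sketch that contrasts the current target naming (everything flattened under `giantswarm/`) with the desired one (source repository path preserved). This is not retagger or skopeo code: the function names `currentTarget` and `desiredTarget` are hypothetical, and using `gsoci.azurecr.io` as the registry for both schemes is an assumption made only for comparison.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// currentTarget is a hypothetical helper that mimics the naming described
// under "Current problems": only the last element of the source repository
// path survives, and the image lands under giantswarm/.
func currentTarget(registry, srcRepo string) string {
	return registry + "/giantswarm/" + path.Base(srcRepo)
}

// desiredTarget is a hypothetical helper for the naming described under
// "Desired state": the full source repository path is kept.
func desiredTarget(registry, srcRepo string) string {
	return registry + "/" + strings.TrimPrefix(srcRepo, "/")
}

func main() {
	const registry = "gsoci.azurecr.io" // target registry named in this issue

	for _, repo := range []string{"postgresql/postgresql", "bitnami/postgresql"} {
		fmt.Printf("%s\n  current: %s\n  desired: %s\n",
			repo, currentTarget(registry, repo), desiredTarget(registry, repo))
	}
}
```

Under the flattened scheme both source repositories map to the same target name, which is why renames such as `giantswarm/bitnami-postgres` were needed in the first place; with the source path preserved, the two images stay distinct without any rename.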
Tasks