Pipeline Fundamentals
This document is an exploration of moving towards a generic, container-based pipeline.
This design is heavily influenced by the open source drone.io project. Note that although the configuration is based on drone, we intend to continue to build on Jenkins using the existing pipeline code as a base.
Goals:
- self-onboarding
- good documentation and examples
- self-documenting config
- easy to reason about
- easy to test in isolation
- able to extend without modifying core pipeline code
- multiple people / groups can extend
- new deployment options
- not too tied to jenkins
A pipeline is a list of steps to build, test, and deploy code.
Each step is self-contained and implemented as a container. The pipeline workflow is responsible for executing each step, but does not need to understand the step internals.
If a step returns a non-zero exit code, the pipeline aborts and returns a failure status.
All steps share the same workspace, but otherwise are not connected.
All of the stage logic for building, testing, deploying, etc. will be moved out of the pipeline and into containers.
The pipeline is just a generic container execution engine. It has no expectation of which steps are present or what they are named. The pipeline will be the only component that interfaces with Jenkins (to schedule stage execution).
Each individual stage container will have no interface with Jenkins or other steps. They will share a workspace, but otherwise there is no communication between these components. We may add a mechanism for sharing state in the future (see below), but in general each step is a generic container step.
When adding new types of builds, deployments, or other steps (e.g. copying config from DUDE to SCCS), all of the logic will be implemented in containers. The existing pipeline code won't need to be updated.
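Purely as an illustration (the actual engine will remain the existing Jenkins pipeline code, and the helper and step names below are made up), the contract can be sketched in shell: run each step's commands in its own container against a shared workspace, and stop at the first non-zero exit code:
#!/bin/sh
# Illustrative sketch only -- not the real implementation.
# Each step runs in its own container, shares the same workspace, and a
# non-zero exit code aborts the whole pipeline (via set -e).
set -e

WORKSPACE="$(pwd)"

run_step() {
  step_name="$1"; step_image="$2"; step_commands="$3"
  echo "--- running step: ${step_name}"
  docker run --rm \
    -v "${WORKSPACE}:/workspace" -w /workspace \
    --entrypoint /bin/sh "${step_image}" -c "${step_commands}"
}

run_step jar-build maven:3.6.1-jdk-12 "mvn clean package"
run_step unit-test maven:3.6.1-jdk-12 "mvn test"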
Note that the example below explicitly spells out some configuration we could continue to provide standard conventions and defaults for (see below).
pipeline:
  appName: open-source-poet
  appVersion:
    master: 2.4.2
    feature/jdk_add_tmorootca: jdk_rootca.2.4.1
  environment:
    # user specific environment variables to pass to all steps
    # also available in pipeline config
    # none of these names are special or understood by the pipeline
    # they can reference the standard environment variables (see below)
    IMAGE_NAME: ${PIPELINE_APP_NAME}/${FULL_BRANCH_NAME_WITHOUT_FORWARD_SLASH}
    BUILD_TAG: ${PIPELINE_APP_VERSION}.${PIPELINE_BUILD_NUMBER}
    BUILD_CONTAINERS: registry.hub.docker.com/library
  steps:
    - name: jar-build
      image: ${BUILD_CONTAINERS}/maven:3.6.1-jdk-12
      commands:
        - mvn -s /apps/tools/.m2/settings.xml clean package
    - name: docker-image-build
      image: ${BUILD_CONTAINERS}/docker
      commands:
        - docker build -t ${IMAGE_NAME} .
    - name: docker-image-test
      image: ${BUILD_CONTAINERS}/docker-compose
      commands:
        - cd src/test/docker
        - chmod +x run-tests.sh
        - ENABLE_JENKINS_WORKAROUNDS=1 ./run-tests.sh ${IMAGE_NAME}
    - name: docker-image-publish
      image: ${BUILD_CONTAINERS}/docker
      environment:
        repo: ${IMAGE_NAME}
        tags:
          - latest
          - ${BUILD_TAG}
        registries:
          - registry: registry.hub.docker.com/library
            cred: docker_registry
          - registry: registry.hub.docker.com/library
            registryPath: scc-docker-release-local
            cred: svc_scc_prd_cicd
    - name: slack
      image: ${BUILD_CONTAINERS}/slack
      environment:
        CHANNEL: my_room
      when:
        # by default steps will only execute while `PIPELINE_STATUS` is `success`,
        # but we can override
        status: [ success, failure ]
Note that the above example explicitly spells out configuration elements that are performed automatically and standardized by convention in the current pipeline (e.g. passing the image name and tags explicitly to the docker build and publish steps).
We can still continue to standardize on these conventions and provide them as defaults.
Since there's nothing special about the build steps, and we can provide any container, we can still build these defaults into a specific container.
For example, in the `jenkins/build-container/docker` container, we could default both the image name and the list of tags to our current defaults.
Reducing the above example to:
- name: docker-image-publish
  image: ${BUILD_CONTAINERS}/docker
  environment:
    REGISTRIES:
      ...
i.e. remove/default the repo to `${PIPELINE_APP_NAME}/${FULL_BRANCH_NAME_WITHOUT_FORWARD_SLASH}`, and remove/default the tags to `latest` and `${PIPELINE_APP_VERSION}.${PIPELINE_BUILD_NUMBER}`.
Note that there's still nothing special about this step or container; it's just defaulting config based on environment variables.
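A minimal sketch of what that defaulting could look like inside the container (the variable names, the whitespace-separated tags format, and the omission of registry handling are all assumptions for illustration):
#!/bin/sh
# Hypothetical entrypoint for a jenkins/build-container/docker publish container.
# It only falls back to our conventional defaults when the step config does not
# provide repo/tags; the pipeline itself knows nothing about these conventions.
set -e

REPO="${repo:-${PIPELINE_APP_NAME}/${FULL_BRANCH_NAME_WITHOUT_FORWARD_SLASH}}"
TAGS="${tags:-latest ${PIPELINE_APP_VERSION}.${PIPELINE_BUILD_NUMBER}}"

for tag in ${TAGS}; do
  docker tag "${REPO}" "${REPO}:${tag}"
  docker push "${REPO}:${tag}"
done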
As mentioned, each step is just a container with configuration.
Drone makes a distinction between a "build step" and a "plugin": "build steps" require a command or list of commands, while plugins do not (they have an explicit entrypoint defined).
I'm not sure we necessarily need to make the same distinction, but for the purposes of this document we will continue to use their terminology.
As mentioned, build steps are just containers plus a list of arbitrary commands to execute. The commands are executed using the workspace as the working directory.
pipeline:
  steps:
    - name: build
      image: maven:3.5-jdk-8
      commands:
        # just separated for an example
        - mvn compile
        - mvn test
The above commands are converted to a simple shell script. The commands in the above example are roughly converted to the below script:
#!/bin/sh
set -e
mvn compile
mvn test
TODO: can we also `set -u`, `set -o pipefail`?
The above shell script is then executed as the docker entrypoint. The below docker command is an (incomplete) example of how the script is executed:
docker run --entrypoint=build.sh maven:3.5-jdk-8
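A slightly fuller (still simplified) invocation would also mount the shared workspace and run the generated script from there; the paths here are assumptions, not the actual layout:
docker run --rm \
  -v /var/jenkins/workspace/my-job:/workspace \
  -w /workspace \
  --entrypoint=/workspace/.pipeline/build.sh \
  maven:3.5-jdk-8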
Step-containers are arbitrary docker containers that perform tasks in a pipeline. They can be used to deploy code, publish artifacts, send notifications, etc. Some useful examples can be found under the Step-Containers section. See the drone plugin marketplace for more out-of-the-box examples.
Example slack notification plugin:
pipeline:
  steps:
    - name: notify
      image: plugins/slack
      environment:
        ROOM: my_room
In addition to the standard build environment variables, step-containers can be configured by adding an `environment` section to the step, with arbitrary plugin-specific configuration. These elements are passed on to the plugin via environment variables. Only simple key/value pairs are supported, but you can pass complex configuration by either JSON-serializing the values or using multi-line YAML strings.
Example:
pipeline:
  steps:
    - name: my-step
      image: plugins/my_step
      environment:
        GREETING: hello
        JSON_OBJECT: >
          {
            "title": "${PIPELINE_APP_NAME} | ${PIPELINE_BRANCH} | ${PIPELINE_APP_VERSION}.${PIPELINE_BUILD_NUMBER}",
            "title_link": "${PIPELINE_RUN_DISPLAY_URL}",
            "fields": [
              {
                "title": "Docker images published",
                "value": "${DOCKER_REGISTRY}/${IMAGE_NAME}:${IMAGE_TAG}",
                "short": false
              }
            ]
          }
The step-container environment would then contain:
GREETING=hello
JSON_OBJECT={ "title": "${PIPELINE_APP_NAME} | ${PIPELINE_BRANCH} | ${PIPELINE_APP_VERSION}.${PIPELINE_BUILD_NUMBER}", "title_link": "${PIPELINE_RUN_DISPLAY_URL}", "fields": [ { "title": "Docker images published", "value": "${DOCKER_REGISTRY}/${IMAGE_NAME}:${IMAGE_TAG}", "short": false } ] }
No assumptions are made or required about step-container implementation.
This example section is borrowed from drone's Example Bash Plugin. It's included to show that step-containers can be simple (just call a program or script) and don't need pipeline dependencies.
This section provides a brief tutorial for creating a webhook step-container, using simple shell scripting, to make HTTP requests during the build pipeline. The below example demonstrates how we might configure a webhook step-container in the YAML file:
pipeline:
  steps:
    - name: webhook
      image: foo/webhook
      environment:
        URL: http://foo.com
        METHOD: post
        BODY: |
          Hello world
Create a simple shell script that invokes curl using the YAML configuration parameters, which are passed to the script as environment variables.
#!/bin/sh
# send the configured HTTP request using the step's environment variables
curl \
  -X "${METHOD}" \
  -d "${BODY}" \
  "${URL}"
Create a Dockerfile that adds your shell script to the image, and configures the image to execute your shell script as the main entrypoint.
FROM alpine
ADD script.sh /bin/
RUN chmod +x /bin/script.sh
RUN apk -Uuv add curl ca-certificates
ENTRYPOINT /bin/script.sh
Build and publish your step-container to the Docker registry.
docker build -t foo/webhook .
docker push foo/webhook
We can use conditions based on branch, event, or status, implemented as needed.
E.g.
when:
  status: [ failure, success ]

when:
  branch: master

when:
  event: [push, pull_request, tag]
See more details and examples here.
The POET pipeline may also want to support conditional execution based on environment variables, to support things like conditional deployments or conditional config pushes.
when:
  environment:
    DEPLOY_CONFIG: true
The complete list of variables is published here for reference.
We can continue to add to this list. All environment variables should be namespaced with `PIPELINE_`.
As mentioned, there's nothing special about build steps or step-containers. Any user could use any arbitrary image in their pipeline. It still makes sense for us to provide containers for common uses; some useful examples can be found under the Step-Containers section, as mentioned earlier.
Each step is executed using a shared docker volume.
For the first revision, this will be the only shared state between steps. See Sharing Data below, for more.
Besides sharing the source and any artifacts between steps, we can utilize the workspace for ad-hoc data sharing between plugins.
E.g. if we needed to dynamically generate a list of tags for a docker image, we could do something like:
pipeline:
  steps:
    - name: generate-tag-list
      image: ubuntu:14.04
      commands:
        - echo -e "1.0\n1.1\n1.2\n" > .tags
    - name: publish-docker-image
      image: build-containers/docker
      tag-input-file: .tags
I.e. allow the step-containers to take input via file, and generate the config file in a previous step. Obviously this requires some coordination between the steps.
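For example, the publish step-container could read the file produced by the previous step; the `TAG_INPUT_FILE` variable and the loop below are a hypothetical sketch of how `build-containers/docker` might handle it:
#!/bin/sh
# Hypothetical sketch: tag and push the image once for every tag listed in the
# file that the generate-tag-list step wrote into the shared workspace.
set -e

TAG_FILE="${TAG_INPUT_FILE:-.tags}"

while read -r tag; do
  [ -z "${tag}" ] && continue
  docker tag "${IMAGE_NAME}" "${IMAGE_NAME}:${tag}"
  docker push "${IMAGE_NAME}:${tag}"
done < "${TAG_FILE}"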
We can pursue more formalized/generic mechanisms for sharing data (see below in Future Work), but I believe we can push this out.
Note: this section is a bit rough and could use improvement/feedback/suggestions.
Credentials will be provided to containers via environment variables.
POET pipeline could support:
- jenkins credentials
- vault credentials
- SCCS config/credentials
Should we also support pulling arbitrary config from SCCS?
Eventually, for AWS, we may want to support AWS Systems Manager Parameter Store and AWS Secrets Manager.
To connect to Vault or SCCS, you will need Vault or SCCS credentials.
The most straightforward way seems to be to keep those credentials in Jenkins. Alternatively, we could come up with a private/public key or shared-secret encryption scheme, but then those keys would need to be in Jenkins.
If we store the SCCS or Vault passwords in the Jenkins credential store, then that's just one managed secret per team. The team can then manage their own secrets in Vault or SCCS.
More generally, we could say that we allow multiple secrets providers and that there's an initialization order such that later secrets providers can use previously provided secrets to bootstrap.
As mentioned, we will have multiple implementations.
We could implement these as docker images... I wonder if we should implement them as "internal" step-containers to start? I.e. Groovy/Java code that follows an interface?
secrets:
  providers:
    - name: jenkins
      ...
    - name: vault
      environment:
        user: ...
        ...
      secrets:
        - source: my/vault/path/docker_username
          target: DOCKER_USERNAME
        - source: my/vault/path/slack_bot_token
          target: SLACK_TOKEN
    - name: sccs
      environment:
        application: ...
        profile: ...
        ...
    - name: sccs_prod
      environment:
        application: ...
        profile: prod
        ...
We may also want to allow multiple instances with different configs? (i.e. two SCCS data sources)?
steps:
  - name: notify
    image: plugins/slack
    environment:
      room: sccs_dev
    secrets: [ SLACK_TOKEN, ... ]
steps:
  - name: notify
    image: plugins/slack
    environment:
      room: sccs_dev
    secrets:
      - source: SLACK_PROD_TOKEN
        target: SLACK_TOKEN
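From inside the step-container this is just an ordinary environment variable; a hypothetical `plugins/slack` entrypoint might use it like this (the Slack API call is illustrative, not a spec for the real container):
#!/bin/sh
# Hypothetical plugins/slack entrypoint: SLACK_TOKEN arrives the same way no matter
# which provider (jenkins, vault, sccs) supplied it.
set -e
: "${SLACK_TOKEN:?SLACK_TOKEN must be supplied via the secrets config}"

curl -s -X POST https://slack.com/api/chat.postMessage \
  -H "Authorization: Bearer ${SLACK_TOKEN}" \
  -H "Content-type: application/json" \
  -d "{\"channel\": \"${room}\", \"text\": \"pipeline notification\"}"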
For providing secrets to each stage, we will use the mechanism above.
For pulling the step/plugin images to run each stage, do we need a way to provide credentials for those repos? Or will Jenkins have them? We may want to have a more global credential mechanism...
- Jenkins provides a mechanism to restrict its secrets to user/job
  - Do we want to do that for external secrets?
  - Would it just be a config thing? How else would we enforce it?
- Do we want to be able to restrict secrets for certain events?
  - E.g. a malicious PR could expose secrets (e.g. a new script that adds `echo $MY_SECRET`)
  - Probably not a big deal for internal use cases?
The logic could be added to a new container. The nice part is that it would not affect the existing pipeline logic, config format, etc.