Pipeline Fundamentals

This document explores moving toward a generic, container-based pipeline.

This design is heavily influenced by the open source drone.io project. Note that although the configuration is based on drone, we intend to continue to build on Jenkins using the existing pipeline code as a base.

Goals

  • self-onboarding
    • good documentation and examples
      • self-documenting config
    • easy to reason about
  • easy to test in isolation
  • able to extend without modifying core pipeline code
    • multiple people / groups can extend
    • new deployment options
  • not too tied to jenkins

Pipeline

A pipeline is a list of steps to build, test, and deploy code.

Each step is self-contained and implemented as a container. The pipeline workflow is responsible for executing each step, but does not need to understand the step internals.

If a step returns a non-zero exit code, the pipeline aborts and returns a failure status.

All steps share the same workspace, but are otherwise not connected.
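
The execution engine itself can stay very small. A minimal sketch, assuming a shell-based driver; the mount point and the step image names here are assumptions, not decisions:

#!/bin/sh
# Run each step container in order; abort on the first non-zero exit code.
set -e
for step in jar-build docker-image-build docker-image-test; do
  docker run --rm \
    -v "${WORKSPACE}:/workspace" -w /workspace \
    "steps/${step}"
done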

Logic Separation

(Figure: container separation diagram)

All of the stage logic for building, testing, deploying, etc. will be moved out of the pipeline and into containers.

The pipeline is just a generic container execution engine. The pipeline has no expectation of which steps are present or what they are named. The pipeline will be the only component that interfaces with jenkins (to schedule stage execution).

Each individual stage container will have no interface with jenkins or other steps. They will share a workspace, but otherwise there is no communication between these components. We may add a mechanism for sharing state in the future (see below), but in general each step is a generic container step.

When adding new types of builds, deployments, or other steps (e.g. copying config from DUDE to SCCS), all of the logic will be implemented in containers. The existing pipeline code won't need to be updated.

Example pipeline config

Note that this example deliberately spells out some configuration for which we could continue to provide standard conventions and defaults (see below).

pipeline:
  appName: open-source-poet
  appVersion:
    master: 2.4.2
    feature/jdk_add_tmorootca: jdk_rootca.2.4.1
  environment:
    # user specific environment variables to pass to all steps
    # also available in pipeline config
    # none of these names are special or understood by the pipeline
    # they can reference the standard environment variables (see below)
    IMAGE_NAME: ${PIPELINE_APP_NAME}/${FULL_BRANCH_NAME_WITHOUT_FORWARD_SLASH}
    BUILD_TAG: ${PIPELINE_APP_VERSION}.${PIPELINE_BUILD_NUMBER}
    BUILD_CONTAINERS: registry.hub.docker.com/library

  steps:
    - name: jar-build
      image: ${BUILD_CONTAINERS}/maven:3.6.1-jdk-12
      commands:
        - mvn -s /apps/tools/.m2/settings.xml clean package

    - name: docker-image-build
      image: ${BUILD_CONTAINERS}/docker
      commands:
        - docker build -t ${IMAGE_NAME} .

    - name: docker-image-test
      image: ${BUILD_CONTAINERS}/docker-compose
      commands:
        - cd src/test/docker
        - chmod +x run-tests.sh
        - ENABLE_JENKINS_WORKAROUNDS=1 ./run-tests.sh ${IMAGE_NAME}

    - name: docker-image-publish
      image: ${BUILD_CONTAINERS}/docker
      environment:
        repo: ${IMAGE_NAME}
        tags:
          - latest
          - ${BUILD_TAG}
        registries:
          - registry: registry.hub.docker.com/library
            cred: docker_registry
          - registry: registry.hub.docker.com/library
            registryPath: scc-docker-release-local
            cred: svc_scc_prd_cicd

    - name: slack
      image: ${BUILD_CONTAINERS}/slack
      environment:
        CHANNEL: my_room
      when:
        # by default steps will only execute while `PIPELINE_STATUS` is `success`, but we
        # can override
        status: [ success, failure ]

Conventions, standardization and defaults

Note that the above example specifically calls out configuration elements that the current pipeline performs automatically and standardizes by convention (e.g. passing the image name and tags explicitly to the docker build and publish steps).

We can continue to standardize on these conventions and provide them as defaults.

Since there's nothing special about the build steps, and we can provide any container, we can still build these defaults into a specific container.

For example, in the jenkins/build-container/docker container, we could default both image name and list of tags to our current defaults.

Reducing the above example to:

- name: docker-image-publish
  image: ${BUILD_CONTAINERS}/docker
  environment:
    REGISTRIES:
      ...

I.e. the repo defaults to `${PIPELINE_APP_NAME}/${FULL_BRANCH_NAME_WITHOUT_FORWARD_SLASH}` and the tags default to `latest` and `${PIPELINE_APP_VERSION}.${PIPELINE_BUILD_NUMBER}`.

Note that there's still nothing special about this step or container; it just defaults config based on environment variables.
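
For example, a defaulting entrypoint in the docker publish container might look like the below. This is a hypothetical sketch; the REPO/TAGS variable names are assumptions:

#!/bin/sh
# Fall back to the conventional defaults when no override is provided.
REPO="${REPO:-${PIPELINE_APP_NAME}/${FULL_BRANCH_NAME_WITHOUT_FORWARD_SLASH}}"
TAGS="${TAGS:-latest ${PIPELINE_APP_VERSION}.${PIPELINE_BUILD_NUMBER}}"
for tag in ${TAGS}; do
  docker tag "${REPO}" "${REPO}:${tag}"
  docker push "${REPO}:${tag}"
done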

Steps

As mentioned, each step is just a container with configuration.

Drone makes a distinction between a "build step" and a "plugin": "build steps" require a command or list of commands, while plugins do not (they have an explicit entrypoint defined).

I'm not sure we necessarily need to make the same distinction, but for the purposes of this document we will continue to use their terminology.

Build Steps

As mentioned, build steps are just containers with a list of arbitrary commands to execute. The commands are executed using the workspace as the working directory.

pipeline:
  steps:
    - name: build
      image: maven:3.5-jdk-8
      commands:
        # just separated for an example
        - mvn compile
        - mvn test

The commands are converted to a simple shell script; the commands in the example above roughly become the following script:

#!/bin/sh
set -e

mvn compile
mvn test

TODO: can we also `set -u` and `set -o pipefail`? (Note that `pipefail` is not supported by every `/bin/sh` implementation.)

The above shell script is then executed as the docker entrypoint. The below docker command is an (incomplete) example of how the script is executed:

docker run --entrypoint=build.sh maven:3.5-jdk-8
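
A fuller (still hypothetical) invocation would mount the shared workspace, pass the step environment, and point the entrypoint at the generated script; the paths and file names here are assumptions:

docker run --rm \
  -v "${WORKSPACE}:/workspace" -w /workspace \
  --env-file step.env \
  --entrypoint=/workspace/.pipeline/build.sh \
  maven:3.5-jdk-8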

Step-Containers

Step-containers are arbitrary docker containers that perform tasks in a pipeline. They can be used to deploy code, publish artifacts, send notifications, etc. Some useful examples can be found under the Step-Containers section. See the drone plugin marketplace for more out-of-the-box examples.

Example slack notification plugin:

pipeline:
  steps:
    - name: notify
      image: plugins/slack
      environment:
        ROOM: my_room

Configuration

In addition to the standard build environment variables, step-containers can be configured by adding an `environment` section to the step with arbitrary, plugin-specific configuration. These elements will be passed to the plugin via environment variables. Only simple key/value pairs are supported, but you can pass complex configuration by either JSON-serializing the values or using multi-line YAML strings.

Example:

pipeline:
  steps:
    - name: my-step
      image: plugins/my_step
      environment:
        GREETING: hello
        JSON_OBJECT: >
          {
            "title": "${PIPELINE_APP_NAME} | ${PIPELINE_BRANCH} | ${PIPELINE_APP_VERSION}.${PIPELINE_BUILD_NUMBER}",
            "title_link": "${PIPELINE_RUN_DISPLAY_URL}",
            "fields": [
              {
                "title": "Docker images published",
                "value": "${DOCKER_REGISTRY}/${IMAGE_NAME}:${IMAGE_TAG}",
                "short": false
              }
            ]
          }

The step-container environment would then contain:

GREETING=hello
JSON_OBJECT={ "title": "${PIPELINE_APP_NAME} | ${PIPELINE_BRANCH} | ${PIPELINE_APP_VERSION}.${PIPELINE_BUILD_NUMBER}", "title_link": "${PIPELINE_RUN_DISPLAY_URL}", "fields": [ { "title": "Docker images published", "value": "${DOCKER_REGISTRY}/${IMAGE_NAME}:${IMAGE_TAG}", "short": false } ] }

Example Bash step-container

No assumptions are made or required about step-container implementation.

This example section is borrowed from drone's Example Bash Plugin. It's included to show that step-containers can be simple (just call a program or script) and don't need pipeline dependencies.

This provides a brief tutorial for creating a webhook step-container that uses simple shell scripting to make an HTTP request during the build pipeline. The example below demonstrates how we might configure a webhook step-container in the YAML file:

pipeline:
  steps:
    - name: webhook
      image: foo/webhook
      environment:
        URL: http://foo.com
        METHOD: post
        BODY: |
          Hello world

Create a simple shell script that invokes curl using the YAML configuration parameters, which are passed to the script as environment variables.

#!/bin/sh

# quote the parameters so multi-word values (e.g. BODY) are passed intact
curl \
  -X "${METHOD}" \
  -d "${BODY}" \
  "${URL}"

Create a Dockerfile that adds your shell script to the image, and configures the image to execute your shell script as the main entrypoint.

FROM alpine
ADD script.sh /bin/
RUN chmod +x /bin/script.sh
RUN apk -Uuv add curl ca-certificates
# exec form, so the script runs directly rather than wrapped in a shell
ENTRYPOINT ["/bin/script.sh"]

Build and publish your step-container to the Docker registry.

docker build -t foo/webhook .
docker push foo/webhook

Conditional execution

We can support conditional execution based on branch, event, and status, implemented as needed.

E.g.

# run on success or failure
when:
  status: [ failure, success ]

# run only on the master branch
when:
  branch: master

# run for specific event types
when:
  event: [push, pull_request, tag]

See the drone documentation on conditions for more details and examples.

The POET pipeline may also want to support conditional execution based on environment variables, to support things like conditional deployments or conditional config pushes.

when:
  environment:
    DEPLOY_CONFIG: true
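
Inside the engine, such a condition could reduce to a simple comparison before the step is scheduled. A hypothetical sketch (run_step is an illustrative placeholder, not existing pipeline code):

# Skip the step unless the configured variable matches.
# run_step is an illustrative placeholder.
if [ "${DEPLOY_CONFIG:-}" = "true" ]; then
  run_step deploy-config
fi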

Standard Environment Variables

The complete list of variables is published here for reference. We can continue to add to this list. All environment variables should be namespaced with `PIPELINE_`.
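
For illustration, a step might see values like the following (only variable names already used in this document; the concrete values are made up):

PIPELINE_APP_NAME=open-source-poet
PIPELINE_BRANCH=master
PIPELINE_APP_VERSION=2.4.2
PIPELINE_BUILD_NUMBER=42
PIPELINE_STATUS=success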

"Standard" step-containers

As mentioned, there's nothing special about build steps or step-containers; any user can use any arbitrary image in their pipeline. It still makes sense for us to provide containers for common uses. Some useful examples can be found under the Step-Containers section, as mentioned earlier.

Workspace

Each step is executed using a shared docker volume.

For the first revision, this will be the only shared state between steps. See Sharing Data below, for more.
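
One way to implement this, sketched under the assumption that the engine drives docker directly (the volume naming is illustrative):

# Create one named volume per build and mount it into every step.
docker volume create "workspace-${PIPELINE_BUILD_NUMBER}"
docker run --rm \
  -v "workspace-${PIPELINE_BUILD_NUMBER}:/workspace" -w /workspace \
  "${BUILD_CONTAINERS}/maven:3.6.1-jdk-12"
docker volume rm "workspace-${PIPELINE_BUILD_NUMBER}"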

Sharing data via the workspace

Besides sharing the source and any artifacts between steps, we can utilize the workspace for ad-hoc data sharing between plugins.

E.g. if we needed to dynamically generate a list of tags for a docker image, we could do something like:

pipeline:
  steps:
    - name: generate-tag-list
      image: ubuntu:14.04
      commands:
        - printf "1.0\n1.1\n1.2\n" > .tags
    - name: publish-docker-image
      image: build-containers/docker
      tag-input-file: .tags

I.e. allow step-containers to take input via a file, and generate that file in a previous step. Obviously this requires some coordination between steps.
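
On the consuming side, the publish container could read the file line by line. A hypothetical sketch (the file name is taken from the config above):

#!/bin/sh
# Tag and push the image once per line of the shared .tags file.
while read -r tag; do
  [ -n "${tag}" ] || continue
  docker tag "${IMAGE_NAME}" "${IMAGE_NAME}:${tag}"
  docker push "${IMAGE_NAME}:${tag}"
done < .tags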

We can pursue more formalized/generic mechanisms for sharing data (see below in Future Work), but I believe we can push this out.

Secrets/Credentials

Note: this section is a bit rough and could use improvement/feedback/suggestions.

Credentials will be provided to containers via environment variables.

POET pipeline could support:

  • jenkins credentials
  • vault credentials
  • SCCS config/credentials

Should we also support pulling arbitrary config from SCCS?

Eventually for AWS, we may want to support AWS Systems Manager Parameter Store, AWS Secrets Manager.

Secrets bootstrapping

To connect to Vault or SCCS, you will need Vault or SCCS credentials.

The most straightforward way seems to be to keep those credentials in jenkins. Alternatively, we could come up with a private/public key or shared secret encryption scheme, but then those keys would need to be in jenkins.

If we store the SCCS or Vault passwords in the jenkins credential store, then that's just one managed secret per team. The team can then manage their own secrets in vault or SCCS.

More generally, we could say that we allow multiple secrets providers and that there's an initialization order such that later secrets providers can use previously provided secrets to bootstrap.
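
As config, that bootstrapping order might look like the below; the provider and secret names are illustrative only:

secrets:
  providers:
    # providers initialize in order; later providers may reference
    # secrets exposed by earlier ones
    - name: jenkins
      secrets:
        - source: team-vault-token
          target: VAULT_TOKEN
    - name: vault
      environment:
        token: ${VAULT_TOKEN}  # bootstrapped from jenkins above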

Providing secrets

As mentioned, we will have multiple implementations.

We could implement these as docker images... I wonder if we should implement them as "internal" step-containers to start? i.e. groovy/java code that follows an interface?

secrets:
  providers:
    - name: jenkins
      ...
    - name: vault
      environment:
        user: ...
        ...
      secrets:
        - source: my/vault/path/docker_username
          target: DOCKER_USERNAME
        - source: my/vault/path/slack_bot_token
          target: SLACK_TOKEN
    - name: sccs
      environment:
        application: ...
        profile: ...
        ...
    - name: sccs_prod
      environment:
        application: ...
        profile: prod
        ...

We may also want to allow multiple instances of a provider with different configs (e.g. two SCCS data sources)?

Using secrets

  # expose secrets to the step by name
  steps:
    - name: notify
      image: plugins/slack
      environment:
        room: sccs_dev
      secrets: [ SLACK_TOKEN, ... ]

  # or map a secret onto a different name for the step
  steps:
    - name: notify
      image: plugins/slack
      environment:
        room: sccs_dev
      secrets:
        - source: SLACK_PROD_TOKEN
          target: SLACK_TOKEN

Docker registry credentials for build/plugin image pulls

For providing secrets to each stage, we will use the mechanism above.

For pulling the step/plugin images to run each stage, do we need a way to provide credentials to those repos? Or will jenkins have them? We may want to have a more global credential mechanism?...

Restricting secrets?

  • Jenkins provides a mechanism to restrict its secrets to user/job
    • Do we want to do that for external secrets?
      • Would it just be a config thing? How else would we enforce?
  • Do we want to be able to restrict secrets for certain events?
    • E.g. a malicious PR could expose secrets
      • e.g. a new script that adds echo $MY_SECRET
        • probably not a big deal for internal use cases?

Appendix

"Real World" Examples

conducktor/helm deploy

The logic could be added to a new container. The nice part is that it would not affect the existing pipeline logic, config format, etc.
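
As config, a hypothetical helm deploy step could look like the below; the image name and settings are assumptions:

- name: helm-deploy
  image: ${BUILD_CONTAINERS}/helm
  environment:
    CHART: ./chart
    RELEASE: ${PIPELINE_APP_NAME}
    NAMESPACE: ${PIPELINE_APP_NAME}-dev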