Add containerfile for resalloc and add agent spawner to hub
Signed-off-by: Siteshwar Vashisht <[email protected]>
siteshwar committed Apr 3, 2024
1 parent 834f284 commit abbbb2c
Showing 17 changed files with 393 additions and 18 deletions.
20 changes: 19 additions & 1 deletion .github/workflows/build-and-push.yml
@@ -4,8 +4,10 @@ on:
push:
branches:
- main
# TODO: REMOVE THIS BEFORE MERGING.
- add-resalloc-container
jobs:
build-and-push:
build-and-push-osh-hub:
runs-on: ubuntu-20.04
steps:
- name: Build and push to quay.io registry
@@ -17,5 +19,21 @@ jobs:
registry_token: ${{ secrets.REGISTRY_TOKEN }}
dockerfile: "containers/hub.Containerfile"
docker_context: "containers"
# TODO: Rename this image to `fedora-osh-hub` (or osh-hub).
# Keep the name consistent with container name in OpenShift.
image_name: "ocp"
tag: "staging"
build-and-push-resalloc-server:
runs-on: ubuntu-20.04
steps:
- name: Build and push to quay.io registry
uses: sclorg/build-and-push-action@v4
with:
registry: "quay.io"
registry_namespace: "openscanhub-fedora-infra"
registry_username: ${{ secrets.REGISTRY_LOGIN }}
registry_token: ${{ secrets.REGISTRY_TOKEN }}
dockerfile: "containers/resalloc-server.Containerfile"
docker_context: "containers"
image_name: "resalloc-server"
tag: "staging"
3 changes: 3 additions & 0 deletions containers/configs/aws-config
@@ -0,0 +1,3 @@
[default]
region = us-west-2
output = table
1 change: 0 additions & 1 deletion containers/configs/osh-hub-httpd.conf
@@ -1,4 +1,3 @@
# TODO: Set a servername for staging and production servers.
# ServerName *
WSGISocketPrefix /var/run/wsgi

64 changes: 64 additions & 0 deletions containers/configs/resalloc-agent-spawner-config.yaml
@@ -0,0 +1,64 @@
# Configuration for resalloc-agent-spawner.service. YAML format.
# Specify groups of agents within the `agent_groups` section that agent spawner
# should take care of.

agent_groups:
workers:
# These commands are executed in the background async as they may take
# quite some time to process. If `cmd_prepare` fails (exit non-zero), the
# agent is immediately marked for removal. `cmd_terminate` exit status is
# just ignored (we need to remove the agent no matter what).

# Prepare the agent. Variable $AGENT_SPAWNER_RESOURCE_DATA (base64 encoded)
# is provided in the script environment.
cmd_prepare: /usr/bin/osh-worker-manager --create-worker $(echo "$AGENT_SPAWNER_RESOURCE_DATA" | base64 -d | head -1) &> /var/log/resalloc-agent-spawner/$$-preparation.log
# cmd_prepare: echo "$RESALLOC_RESOURCE_DATA" | base64 -d | head -1 &> /tmp/$$-preparation.log

# Prepare the agent for termination. Upon finishing this command, the
# resalloc resource ticket is closed and the resource deallocated.
cmd_terminate: /usr/bin/osh-worker-manager --delete-worker $(echo "$AGENT_SPAWNER_RESOURCE_DATA" | base64 -d | head -1) &> /var/log/resalloc-agent-spawner/$$-deletion.log
# cmd_terminate: /sbin/osh-resalloc --delete-worker "$(echo "$RESALLOC_RESOURCE_DATA" | base64 -d | head -1)"

# The following commands are executed synchronously by the agent spawner
# daemon (polling). Please keep them super fast to avoid overall system
# halt!

# The `cmd_converge_to` needs to print an integer (the currently ideal
# number of agents to converge to) onto stdout.
cmd_converge_to: /usr/bin/osh-worker-manager --workers-needed

# Agents may decide to stop themselves. This hook is used to detect
# such a case -> if exit status 0 is returned, agent is going to be
# terminated (cmd_terminate is called against it).
cmd_check_finished: /usr/bin/osh-worker-manager --check-finished $(echo "$AGENT_SPAWNER_RESOURCE_DATA" | base64 -d | head -1) &> /var/log/resalloc-agent-spawner/$$-check-finished.log

# Some agents might be expected to run long-term (or indefinitely). This
# hook helps us to politely ask the agent whether it is OK to terminate.
# Returning exit status 1 means the agent can not be terminated.
# Returning 0 means that the agent was prepared for termination, and
# this has to be removed now. This is useful for gently downsizing
# the agent count while converging to `cmd_converge_to`.
#
# TODO: Implement `try_release` hook. Here are a few notes about it:
# - The `try_release` hook would be called on agents until enough of them have been removed to match the ideal number of running agents.
# - Setting max_load to 0 unconditionally would cause issues with preallocated agents, as they may be removed before being used. The
# idea of using resalloc-agent-spawner would be useless in that case, as we can directly use resalloc without it.
# - We should investigate if there may be an actual race condition in kobo, that stops us from deleting the node in `try_release` hook.
cmd_try_release: /bin/false

# List of resalloc tags to use while requesting tickets
tags:
- arch_x86_64

# Note that we use the 'request_survives_server_restart' resalloc client option,
# so the resalloc server must be running to avoid the overall system hang!
resalloc_server: "http://resalloc-server:49100"

# Where to log events.
logfile: /var/log/resalloc-agent-spawner/agent-spawner.log

# How to connect to redis-db. By default connects to 127.0.0.1:6379.
#redis_db: null
redis_host: redis
#redis_port: null
#redis_password: null
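
The `cmd_prepare`, `cmd_terminate` and `cmd_check_finished` hooks above all recover their argument the same way: they decode the base64 payload in `$AGENT_SPAWNER_RESOURCE_DATA` and keep only its first line. The following is a minimal sketch of that step that runs outside the spawner. The function name `first_resource_line` and the payload contents (`worker-1.example.com` plus a second line) are made up for illustration; the real payload format depends on what the resalloc pool's allocation script prints.

```shell
#!/usr/bin/env bash
# Sketch only: the payload below is a hypothetical stand-in for what the
# agent spawner would export; real resource data comes from resalloc.

# Print the first line of the base64-encoded resource data passed as $1.
first_resource_line() {
    printf '%s' "$1" | base64 -d | head -n 1
}

# Simulate the environment variable that the hooks read.
AGENT_SPAWNER_RESOURCE_DATA="$(printf 'worker-1.example.com\nextra-line\n' | base64)"

first_resource_line "$AGENT_SPAWNER_RESOURCE_DATA"
```

The hooks in the config pass the decoded value through an unquoted `$(...)`, so a first line containing whitespace would be split into several arguments. If that can happen, quoting the substitution, as the commented-out `cmd_terminate` line does, would likely be the safer form.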
52 changes: 39 additions & 13 deletions containers/hub.Containerfile
@@ -1,50 +1,76 @@
# Builds should be available on https://quay.io/organization/openscanhub-fedora-infra/ocp
FROM registry.access.redhat.com/ubi9/httpd-24
# TODO: Check if we can use ubi.
# resalloc dependencies require a subscription.
# FROM registry.access.redhat.com/ubi9/httpd-24
FROM quay.io/sclorg/httpd-24-c9s

USER 0

RUN dnf install -y dnf-plugins-core https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm
RUN dnf install -y epel-release

RUN dnf config-manager --set-enabled crb extras-common

# enable installation of gettext message objects
RUN rm /etc/rpm/macros.image-language-conf

# TODO: How to enable installation of a specific commit?
RUN dnf copr enable -y @openscanhub/devel
# RUN dnf copr enable -y @openscanhub/devel
# TODO: REMOVE `packit/openscanhub-openscanhub-234` WHEN UPSTREAM PR IS MERGED:
# https://github.com/openscanhub/openscanhub/pull/234
RUN dnf copr enable -y packit/openscanhub-openscanhub-234

# Keep these here as a reference for debugging in the future.
# RUN dnf copr enable -y praiskup/resalloc
# RUN dnf copr enable -y @copr/copr
RUN dnf config-manager --add-repo https://pagure.io/fedora-infra/ansible/raw/main/f/files/common/epel9.repo

# TODO: This would install osh-hub configurations from the `hub-conf-devel` package. How to install
# non-devel configurations for fedora infrastructure?
# TODO: There may be a race condition here, as it installs latest `osh-hub` package, that may have
# been built after a specific commit.
RUN dnf install -y osh-hub osh-hub-conf-devel openssl krb5-workstation
# tzdata is a dependency for django
RUN dnf install -y osh-hub osh-hub-conf-devel openssl krb5-workstation tzdata

RUN dnf install -y resalloc-agent-spawner osh-worker-manager

# TODO: Shall `/var/log/osh/` be a persistent path? Shall this log be redirected to another logging
# service like splunk?
RUN touch /var/log/osh/hub/hub.log && chown :root /var/log/osh/hub/hub.log
# TODO: Set correct permissions on below files.
RUN chown -R :root /var/log/osh/hub /var/lib/osh/ /opt/app-root/ /var/run/
# TODO: Shall /var/lib/osh be a persistent path? Remove chmod command for it?
RUN chown -R :root /var/log/osh/hub /opt/app-root/ /var/run/
RUN chmod -R g+rw /var/log/osh/hub /opt/app-root/ /var/run/

# TODO: Disable `mod_security` in apache httpd.
# TODO: Maybe enable `mod_security` in apache httpd.
# This is a temporary workaround to allow large report uploads from worker to the hub.
# This change should be revisited before moving to production.
RUN rm -f /etc/httpd/conf.d/mod_security.conf

# Run a dummy SMTP server in background
RUN python3 -m smtpd -n -c DebuggingServer localhost:8025 >> /tmp/emails.log &

EXPOSE 8080
EXPOSE 8443

# Custom `redhat.css` changes color of the header to blue.
# This should be removed when https://issues.redhat.com/browse/OSH-198 is resolved.
COPY configs/redhat.css /usr/lib/python3.9/site-packages/osh/hub/static-assets/css/redhat.css
COPY configs/redhat.css /usr/lib/python3.9/site-packages/osh/hub/static/css/redhat.css
# TODO: These files should be copied at container runtime and not build time.
# COPY configs/settings_local.ocp.py /usr/lib/python3.9/site-packages/osh/hub/settings_local.py
# COPY configs/osh-hub-httpd.conf /etc/httpd/conf.d/osh-hub-httpd.conf

COPY configs/resalloc-agent-spawner-config.yaml /etc/resalloc-agent-spawner/config.yaml
RUN mkdir /var/log/resalloc-agent-spawner
RUN chmod g+rwx /var/log/resalloc-agent-spawner

COPY scripts/run_hub.sh /run_hub.sh

# Keep these here as a reference for debugging in the future.
# RUN dnf install -y python3-ipdb
# COPY worker.py /usr/lib/python3.9/site-packages/resalloc_agent_spawner/worker.py
# RUN chmod g+rwx /usr/lib/python3.9/site-packages/resalloc_agent_spawner/worker.py
# RUN chown :root /usr/bin/osh-worker-manager
# RUN chmod g+rwx /usr/bin/osh-worker-manager

###########
RUN chmod a+x /run_hub.sh
# This is for backward compatibility. Remove this?
USER 1001

# Commands in this path are used by osh-worker-manager
ENV PATH=/sbin:/bin:/usr/sbin:/usr/bin
CMD /run_hub.sh
52 changes: 52 additions & 0 deletions containers/resalloc-server.Containerfile
@@ -0,0 +1,52 @@
# TODO: Check if we can use ubi.
# resalloc dependencies require a subscription.
# FROM registry.access.redhat.com/ubi9/ubi
FROM quay.io/centos/centos:stream9
USER 0

RUN dnf install -y dnf-plugins-core
RUN dnf config-manager --set-enabled crb extras-common

RUN dnf install -y epel-release

# RUN dnf copr enable -y praiskup/resalloc
# RUN dnf copr enable -y @copr/copr
RUN dnf config-manager --add-repo https://pagure.io/fedora-infra/ansible/raw/main/f/files/common/epel9.repo

# Deployment instructions are described here
# https://github.com/praiskup/resalloc/blob/master/docs/start-resalloc-server.txt

RUN dnf install -y ansible \
findutils \
openssh-clients \
postgresql \
python3-psycopg2 \
resalloc \
resalloc-aws \
resalloc-server \
vim

# copy filesystem setup
# COPY files/ /

# `/etc/resallocserver/` belongs to `resalloc` group and is not readable by `root`.
RUN chgrp -R root /etc/resallocserver/ /var/log/resallocserver /var/lib/resallocserver
RUN chmod -R g+rwx /etc/resallocserver/ /var/log/resallocserver /var/lib/resallocserver

# Set up aws configurations. Ssh key would be mounted from OpenShift.
RUN mkdir /var/lib/resallocserver/.aws /var/lib/resallocserver/.ssh
COPY configs/aws-config /var/lib/resallocserver/.aws/config

# `server.yaml` and `pools.yaml` are mounted through OpenShift configurations.
RUN rm -f /etc/resallocserver/server.yaml /etc/resallocserver/pools.yaml

# RUN git clone --branch openscanhub-fedora-infra https://pagure.io/forks/svashisht/rhcopr/vm-provisioning.git /etc/resallocserver/vm-provisioning
COPY vm-provisioning /etc/resallocserver/vm-provisioning
# RUN git clone --branch main https://github.com/siteshwar/openscanhub-deployment-configs.git /etc/resallocserver/openscanhub-deployment-configs
# COPY openscanhub-deployment-configs /etc/resallocserver/openscanhub-deployment-configs

USER 1001
ENV HOME=/var/lib/resallocserver
ENV CONFIG_DIR=/etc/resallocserver
CMD /usr/bin/resalloc-server
# CMD sleep inf
18 changes: 15 additions & 3 deletions containers/scripts/run_hub.sh
@@ -9,20 +9,19 @@ if [[ -e /src ]]; then
# rm -f /etc/httpd/conf.d/ssl.conf

for _ in $(seq 100); do
# TODO: Put hostname of the db01 server here
pg_isready -h fedora-osh-db && break
sleep 0.5
done

popd
else
# We are in OpenShift deployment. Copy configuration files from persistent storage.
# We are in the OpenShift deployment. Copy configuration files from persistent storage.
# Probably set them up through Ansible
# curl -o /mnt/osh-configs/osh-hub-httpd.conf https://raw.githubusercontent.com/siteshwar/openscanhub-deployment-configs/main/fedora-infra/osh-hub-httpd.conf
# curl -o /mnt/osh-configs/settings_local.ci.py https://raw.githubusercontent.com/siteshwar/openscanhub-deployment-configs/main/fedora-infra/settings_local.ci.py
# cp /mnt/osh-configs/settings_local.ci.py /usr/lib/python3.9/site-packages/osh/hub/settings_local.py
# cp /mnt/osh-configs/osh-hub-httpd.conf /etc/httpd/conf.d/osh-hub-httpd.conf
echo "Do nothing here!"
echo "We are in OpenShift. Container should be already configured through Ansible."
# cp /etc/keytabs/service.keytab
fi

Expand All @@ -40,6 +39,19 @@ if [ "$(/usr/lib/python3.9/site-packages/osh/hub/manage.py dumpdata scan.MockCon
/usr/lib/python3.9/site-packages/osh/hub/scan/fixtures/initial_data.json
fi

if [[ -e /src ]]; then
echo "Worker should be started by podman-compose"
else
# We are in the OpenShift deployment.

# TODO: https://github.com/openscanhub/fedora-infra/issues/12
# Run a dummy SMTP server in background
python3 -m smtpd -n -c DebuggingServer localhost:8025 &> /tmp/emails.log &

echo "Running resalloc-agent-spawner in the background. Logs will appear in /var/log/resalloc-agent-spawner/agent-spawner.log"
# This process is watched over by OpenShift (kubernetes) liveness probe.
/usr/bin/resalloc-agent-spawner &
fi
/usr/bin/run-httpd
# Leave it here for debugging in the future.
# sleep inf
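
The `pg_isready` loop in the `/src` branch of this script stops after 100 attempts without reporting anything, and the script then carries on whether or not the database answered. Below is a sketch of a variant that makes the failure visible. `wait_for` is a hypothetical helper, not part of this commit, and a stub function stands in for `pg_isready -h fedora-osh-db` so the example runs anywhere.

```shell
#!/usr/bin/env bash
# Sketch only: wait_for is a hypothetical helper, not part of run_hub.sh.

# wait_for ATTEMPTS DELAY CMD [ARGS...]
# Runs CMD until it succeeds, at most ATTEMPTS times, sleeping DELAY seconds
# after each failed attempt. Returns 0 on success, 1 if every attempt failed.
wait_for() {
    local attempts="$1" delay="$2"
    shift 2
    local i
    for ((i = 0; i < attempts; i++)); do
        "$@" && return 0
        sleep "$delay"
    done
    return 1
}

# Stub standing in for `pg_isready -h fedora-osh-db`: succeeds on the third call.
calls=0
ready_on_third_call() {
    calls=$((calls + 1))
    [ "$calls" -ge 3 ]
}

if wait_for 5 0.1 ready_on_third_call; then
    echo "ready after $calls attempts"
else
    echo "gave up after $calls attempts" >&2
fi
```

In `run_hub.sh` the equivalent call would be along the lines of `wait_for 100 0.5 pg_isready -h fedora-osh-db || exit 1`. Whether the hub should abort or continue when the database never becomes ready is a judgment call for the deployment.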
7 changes: 7 additions & 0 deletions containers/vm-provisioning/ansible/playbook.yml
@@ -0,0 +1,7 @@
- hosts: all
become: true
vars:
ansible_ssh_user: ec2-user
roles:
- osh-worker-role

Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
osh-worker-role
=========

This role is used to set up OpenScanHub workers.

License
-------

GPL-3.0-or-later
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
galaxy_info:
author: Siteshwar Vashisht
description: This role is used to set up OpenScanHub workers.
company: Red Hat

issue_tracker_url: https://github.com/openscanhub/fedora-infra

# Choose a valid license ID from https://spdx.org - some suggested licenses:
# - BSD-3-Clause (default)
# - MIT
# - GPL-2.0-or-later
# - GPL-3.0-only
# - Apache-2.0
# - CC-BY-4.0
license: GPL-3.0-or-later

min_ansible_version: 2.1

#
# Provide a list of supported platforms, and for each platform a list of versions.
# If you don't wish to enumerate all versions for a particular platform, use 'all'.
# To view available platforms and versions (or releases), visit:
# https://galaxy.ansible.com/api/v1/platforms/
#
# platforms:
# - name: Fedora
# versions:
# - all
# - 25
# - name: SomePlatform
# versions:
# - all
# - 1.0
# - 7
# - 99.99

galaxy_tags: []
# List tags for your role here, one per line. A tag is a keyword that describes
# and categorizes the role. Users find roles by searching for tags. Be sure to
# remove the '[]' above, if you add tags to this list.
#
# NOTE: A tag is limited to a single word comprised of alphanumeric characters.
# Maximum 20 tags per role.

dependencies: []
# List your role dependencies here, one per line. Be sure to remove the '[]' above,
# if you add dependencies to this list.