Plan B for parallel join ( try to get a certificate before joining ) #10

Open
wants to merge 39 commits into
base: main

GinoHarlos
Contributor

Plan B for parallel join ( try to get a certificate before joining )

Please make sure you considered the following things

Link to the issue in Bugzilla

Description of the changes

Please describe the changes with a few sentences.

Motivation

  • Ensure that the "Univention Corporate Server - Container Mode" can still be deployed in the future and minimize default live patching ( reduces the risk of a missing certificate ).

Description

When many directory nodes or managed nodes start at the same time, many certificate requests arrive within a short period, and a join may fail. To prevent this, the container tries to obtain a certificate before joining the primary directory node. This is particularly useful with the container option dcwait and with docker-compose.

@GinoHarlos
Contributor Author

Preparation for UCS 5.2-0, fix some issues and add systemd OnFailure option

Motivation ( preparation for UCS 5.2-0 with backwards compatibility )

  • fix latest version stability and switch to ucs-releases.json with fallback to GitHub default branch
  • add systemd unit option OnFailure for firstboot and recreate ( super useful when starting from very old containers with active latest option )
  • new development/test version ( bootstrap.sh.json )
  • fix missing GPG keys for APT ( bootstrap.sh and inside the latest loop "00-aA-APT-SOURCES-LIST-Aa-00" too )
  • fix restore/recreate ( slapadd will have an empty /etc/ldap/slapd.d directory )
  • fix ntpsec service unit ( ntpsec would like to run as root inside the container )
  • fix nslcd service unit ( needs to be restarted )
  • fix man-db service unit ( systemd PrivateTemp )
  • fix docker in docker ( add systemd reset-failed )
  • fix minimal postfix support ( regenerate postfix lookup tables )
  • fix dns/forwarder and ignore loopback addresses ( docker's internal DNS server 127.0.0.11 is not reliable enough as a forwarder )
  • fix dbus/avahi and be nice to systemd-logind.service for UCS >= 5.0-0 ( failed to start dbus-org.freedesktop.login1.service )
  • fix named with samba as backend ( docker vs. podman "named: the working directory is not writable" set systemd WorkingDirectory to /var/cache/bind )
    ( docker has permission to write files into the "/" directory with the systemd Group as bind )
  • add switching appcenter repository url for testing environment
    ( Is appcenter-test.software-univention.de really active and usable for the test environment? )
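
The loopback filtering for dns/forwarder listed above could look roughly like the following sketch. It is not the code from this pull request: the function name `filter_forwarders` and the resolv.conf-style input file are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not taken from this pull request):
# print the name servers found in a resolv.conf-style file and
# skip loopback addresses (127.0.0.0/8 and ::1), such as Docker's
# internal DNS server 127.0.0.11.
filter_forwarders() {
  awk '$1 == "nameserver" && $2 !~ /^127\./ && $2 != "::1" { print $2 }' \
    "${1:-/etc/resolv.conf}"
}

# Usage (prints one candidate forwarder per line):
#   filter_forwarders /etc/resolv.conf
```

Only the addresses that survive the filter would then be candidates for the forwarder setting; how they are written to the configuration is left out here on purpose.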

Documentation changes

  • fix missing command in basic examples by adding a fallback to watch -n 1 docker exec ${FQDN} systemctl status univention-container-mode-firstboot.service ...
  • fix latest version description
  • add description for systemd unit option OnFailure for firstboot and recreate ( univention-container-mode-firstboot-on-failure.service and univention-container-mode-recreate-on-failure.service )

Additional notes

With the "fix latest version" patch it is possible to start a container with UCS 4.4-0, and it will update/upgrade itself to version 5.2-0 ( univention-corporate-server:4.4-0-test ) or 5.0-3 ( univention-corporate-server:4.4-0 ) with the default mirror URL.

docker \
  run \
    --detach \
    --env DEBUG=TRUE \
    --cap-add SYS_ADMIN \
    --volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
    --restart unless-stopped \
    --tmpfs /tmp:exec \
    --tmpfs /run/lock \
    --tmpfs /run \
    --hostname dc.ucs.example \
    --name dc.ucs.example \
      docker.io/ginoharlos/univention-corporate-server:4.4-0
watch -n 1 docker exec dc.ucs.example systemctl status univention-container-mode-firstboot.service univention-container-mode-firstboot-on-failure.service
...
● univention-container-mode-firstboot.service - Univention container mode firstboot
   Loaded: loaded (/lib/systemd/system/univention-container-mode-firstboot.service; enabled; vendor preset: enabled)
  Drop-In: /usr/lib/systemd/system/univention-container-mode-firstboot.service.d
           └─override.conf
...
● univention-container-mode-firstboot-on-failure.service - Univention container mode firstboot
   Loaded: loaded (/lib/systemd/system/univention-container-mode-firstboot-on-failure.service; static; vendor preset: enabled)
  Drop-In: /usr/lib/systemd/system/univention-container-mode-firstboot-on-failure.service.d
           └─override.conf
docker \
  exec dc.ucs.example \
    systemd-analyze blame | grep -E -- univention-container-mode
...
   13min 20.335s univention-container-mode-firstboot.service
docker \
  exec dc.ucs.example /bin/bash -c 'systemd-detect-virt --container && univention-check-join-status && \
    ucr search --brief ^repository/.*/server ^version ^appcenter/apps ^appcenter/docker$'
...
docker
Joined successfully
appcenter/docker: false
repository/app_center/server: appcenter.software-univention.de
repository/mirror/server: <empty>
repository/online/component/.*/server: <empty>
repository/online/server: https://updates.software-univention.de/
version/erratalevel: 648
version/patchlevel: 3
version/version: 5.0

@CLAassistant

CLAassistant commented May 19, 2023

CLA assistant check
All committers have signed the CLA.

@GinoHarlos
Contributor Author

Add pre installed role container images, including non-container apps from the Univention App Center, replacing the clean environment documentation and Dockerfile

Motivation

  • Add new container images with a pre installed role: primary, replica or backup directory node, or managed node.

Documentation changes

  • del clean environment documentation
  • add pre installed role container images ( build with docker or podman, including non-container apps )
  • add new LATEST option value for pre installed role container images

Dockerfile changes

  • del clean environment Dockerfile ( clean.environment.Dockerfile )
  • add pre installed role Dockerfile ( pre.installed.role.Dockerfile )

Description

With minimal changes to the firstboot deployment, this adds a "fork" with a pre installed system role to save time. The default firstboot needs 15 to 20 minutes for the first start/boot. The same run with a pre installed container image takes less than 10 minutes.

  • fix 00-aA-APT-USER-AGENT-Aa-00, 00-aA-DEPENDENCIES-Aa-00, 41-setup-pre-certificate, 52-setup-join, 70-docker-in-docker, 70-ssh-server, 99-system-cleanup and utils ( the APT User Agent gets a new flag )
  • fix recreate/00-aA-APT-SOURCES-LIST-Aa-00, recreate/00-aA-RESTORE-Aa-00 and recreate/52-setup-restore ( allow default to upgrade the system before restoring, only for UCS > 5.0-0 )
  • fix univention container mode service units ( switch to ConditionPathExists=/etc/machine.secret for backup, joined and init )
  • add {join,recreate}/{30,70}-ssh-server-keys ( be sure to have ssh private and public keys, also very useful for every normal UCS container mode image )
  • add {join,recreate}/99-zZ-UPGRADE-LATEST-Zz-99 ( LATEST=SKIPP is an alternative option to LATEST=TRUE and will trigger the univention upgrade process for UCS > 5.0-0 )
for dc in pdc sdc bdc mdc; do docker exec ${dc}.ucs.example systemd-analyze blame | grep -E -- univention-container-mode-pre-installed-role.service; done
...
    6min 12.302s univention-container-mode-pre-installed-role.service ( primary directory node )
    5min 17.301s univention-container-mode-pre-installed-role.service ( replica directory node )
      6min 529ms univention-container-mode-pre-installed-role.service ( backup directory node )
    4min 27.356s univention-container-mode-pre-installed-role.service ( managed node )
...
  • fix utils - UniventionAddApp - for container build process to add non-container apps ( direct support for Active Directory-compatible Domain Controller )
for dc in pdc sdc bdc mdc; do docker exec ${dc}.ucs.example systemd-analyze blame | grep -E -- univention-container-mode-pre-installed-role.service; done
...
   11min 38.519s univention-container-mode-pre-installed-role.service ( primary directory node --ad-dc-- included )
    8min 19.857s univention-container-mode-pre-installed-role.service ( replica directory node --ad-dc-- included )
     9min 9.014s univention-container-mode-pre-installed-role.service ( backup directory node  --ad-dc-- included )
    4min 49.985s univention-container-mode-pre-installed-role.service ( managed node )
...

Additional notes

A slimified container image takes 300 to 400 MB of space; a normal image takes 50 to 100 MB on top ( gzipped tar archive on the container registry ).

BugFixes

  • fix systemd OnFailure for restore/recreate, firstboot and pre installed role(s)
  • fix systemd firstboot and recreate service units for requested RESTART option ( skip univention-check-join-status )
  • fix setup pre certificate for pre installed roles ( generate host certificate only for a valid or empty server role )
  • fix the stability of join process ( allow it to fail once, at most twice )
  • fix the stability of automated setup ( counter overflow in utils )
  • fix slimify and prevent the reinstalling of all packages ( save time and avoid unnecessary reinstallations )
  • fix minimal postfix mail server support ( missing lookup tables )
  • fix nameserver and forwarder ( be sure to overwrite any old nameservers and forwarders )
  • fix the stability of namespace error detection ( be nice to systemd-logind, this will benefit cifs/dbus and avahi )
  • fix app center test repository mirror ( ignore gpg for all.tar ... ucr set appcenter/index/verify=false )
  • fix restore/recreate stability and add system upgrade process ( one step further to automatic system restore/recreate )
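
The "allow it to fail once, at most twice" idea for the join process can be illustrated with a small retry wrapper. This is only a sketch under assumptions: the names `retry` and `RETRY_DELAY` are hypothetical and do not come from the scripts in this pull request.

```shell
#!/bin/sh
# Hypothetical retry wrapper (names are illustrative only).
# Usage: retry <max_attempts> <command> [args...]
# Runs the command until it succeeds or max_attempts is reached.
retry() {
  max_attempts=$1
  shift
  attempt=1
  while :; do
    # Success on any attempt ends the loop immediately.
    if "$@"; then
      return 0
    fi
    # Give up once the allowed number of attempts is used up.
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    # Pause between attempts; RETRY_DELAY is an assumed knob (seconds).
    sleep "${RETRY_DELAY:-5}"
  done
}

# Usage: tolerate two failures, i.e. three attempts in total.
#   retry 3 some_join_step
```

With three attempts, a step that fails once or twice still ends successfully, while a third failure is reported to the caller.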

CleanUp

  • del {join,recreate}/{40,60}-setup-*-certificate-patch ( no longer needed, Plan B for parallel join works very well )

@GinoHarlos
Contributor Author

ToDo item systemd -- fix private tmp and restrict namespace

Motivation

  • One point less on the ToDo list and a clean start of systemd service units.

Documentation changes

  • add a comment for hugepages volume ( docker run ... --volume /dev/hugepages:/dev/hugepages ... )
  • update container build and run with podman ( podman run ... --systemd true ... since version 3.1.0 )

Description

Depending on how the container is started and which privileges it has, more and more individual systemd service units fail. This is now detected at every container start and repaired, either gently or forcibly.

journalctl --full --all --no-pager --no-hostname --boot | awk '/^.*systemd\[1\]\:.*(NAMESPACE|BPF.*cgroup.*failed.*)$/{ print $0 }'
... ${unit} Attaching egress BPF program to cgroup ... ${unit} failed ...
... ${unit} Main process exited, code=exited, status=226/NAMESPACE
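
To act on such messages, the affected unit names have to be pulled out of the journal lines first. The following sketch shows one way to do that. It assumes lines of the form `... systemd[1]: <unit>: Main process exited, ... status=226/NAMESPACE`, and the function name `namespace_failed_units` is hypothetical, not part of this pull request.

```shell
#!/bin/sh
# Hypothetical helper: read journal lines on stdin and print the unique
# unit names whose main process exited with status=226/NAMESPACE.
# Assumes the unit name is the field right after "systemd[1]:".
namespace_failed_units() {
  awk '/systemd\[1\]:.*status=226\/NAMESPACE/ {
         for (i = 1; i < NF; i++)
           if ($i == "systemd[1]:") {
             unit = $(i + 1)
             sub(/:$/, "", unit)   # strip the trailing colon
             print unit
           }
       }' | sort -u
}

# Usage:
#   journalctl --full --all --no-pager --no-hostname --boot | namespace_failed_units
```

What is done with the resulting list (for example a drop-in override per unit) is a separate decision and is not shown here.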

@GinoHarlos
Contributor Author

Preparation for UCS 5.2-0 and fix some issues

Motivation

  • update to new version ( bootstrap.sh.json and docs )
  • fix systemd detect new machine, be sure to remove log, identity and key files ( Dockerfile )
  • fix slimify mode and allow a new path ( /usr/share/doc/univention*/changelog.gz )
  • fix pre installed role and be sure to skip the first and second dist-upgrade
  • fix systemd private tmp and restrict namespace detection
  • fix systemd service unit firstboot ( univention-container-mode-firstboot )
  • fix restore/recreate ( cups )
  • fix and prepare sso web service for UCS >= 5.1-0 ( umc/web/sso/enabled=false )
  • fix and prepare sso certificate
  • fix and prepare systemd sssd service unit
  • fix and prepare systemd wait online service unit ( we don't need this service unit in the container )
  • fix and prepare systemd default services and timers ( add inetd, slapd and ntpsec )
  • prepare restore/recreate for container-based apps
  • fix running under Docker >= 25.0.0 by changing control groups mount
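
A host-side check for the Docker >= 25.0.0 case could be sketched as follows. This only covers the version comparison, not the control groups mount change itself; the function name `version_ge` is hypothetical, and the sketch assumes a `sort` that supports `-V` (GNU coreutils).

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is greater than or equal
# to version $2. Relies on "sort -V" (version sort, GNU coreutils).
version_ge() {
  # The smaller of the two versions sorts first; if that is $2,
  # then $1 >= $2.
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage:
#   if version_ge "$(docker version --format '{{.Server.Version}}')" 25.0.0; then
#     echo "Docker >= 25.0.0 detected"
#   fi
```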

Documentation changes

  • update to new version 5.0-7
  • add a comment for appcenter volume
  • add a comment for Docker >= 25.0.0
  • add a comment for keycloak app

Known issues

  • you can't really restore/recreate with container-based apps from the Univention App Center

@GinoHarlos
Contributor Author

Preparation for UCS 5.2-0, fix some issues and improve stability

Motivation

  • fix debootstrap usr-is-merged ( E: Couldn't find these debs: usr-is-merged )
  • fix Dockerfiles ( missing packages in pre installed roles and code syntax )
  • fix systemd on failure and set a nice start condition ( old systemd versions don't accept empty 'WantedBy' )
  • fix restore/recreate LDAP ( loop databases from static and/or dynamic config )
  • fix restore/recreate dns ( restart any name server, frontend and backend )
  • fix restore/recreate ssh ( cleanup known host public key cache )
  • fix restore/recreate ( force cleanup to get a fresh backup )
  • fix restore/recreate ( disable certificates json file )
  • fix restore/recreate ( restore local secrets and try to backup the old one )
  • add restore/recreate for PostgreSQL ( dump databases )
  • add restore/recreate for container-based apps like keycloak

Documentation changes

  • add a comment for sso restore/recreate ( saml or keycloak )

Dockerfile changes

  • fix missing packages in pre installed roles

@GinoHarlos
Contributor Author

Preparation for UCS 5.2-0, move the debootstrap to the docker build process and fix some issues

Motivation

  • fix restore/recreate ssl ( add keycloak/server/sso/fqdn )
  • fix slimify mode ( save another megabyte )
  • del bootstrap.sh and move debootstrap to docker build ( Dockerfile )

Documentation changes

  • update to new version 5.0-8
  • update docker and podman build ( the image univention-corporate-server-debootstrap is no longer needed )
  • update and complete the container name with univention namespace ( univention-corporate-server is not an official image )

Dockerfile changes

  • add debootstrap
  • fix slimify

@GinoHarlos
Contributor Author

Update documentation to next patchlevel release

Documentation changes

  • update to new version 5.0-9

@GinoHarlos
Contributor Author

Preparation for UCS 5.2-0 and fix some issues

Motivation

  • fix repository public key: add the new public key type, whose name now contains the major and minor version
  • fix timezone data: the entries in the /etc/timezone and /etc/locale.conf files were not consistent

@GinoHarlos
Contributor Author

Update documentation to next minor release

Documentation changes

  • update to new version 5.2-0
