SteamOS auto repair process

ProfessorKaos64 edited this page Jul 12, 2016 · 10 revisions

About

If the system repeatedly has trouble starting, or experiences abrupt shutdowns, this script will likely run the next time you start your system. Technically speaking, if the greeter service (lightdm) fails to start, systemd invokes the steamos-autorepair script through a dedicated service unit. A list of services on SteamOS can be found here.

The main autorepair script attempts to:

  • Finish dpkg configurations that may be broken/unfinished (dpkg --configure -a)
  • Fix unfinished/broken packages with apt-get -f -y install
  • Update Plymouth (the graphical boot screen) with a status message and progress
  • Rebuild DKMS modules (such as the NVIDIA drivers)

Script outputs updated: 20160711

What triggers this

The SteamOS recovery process involves 3 key files (as far as is known at the moment):

/lib/systemd/system/lightdm.service.d/steamos-autorecover.conf

This drop-in is what triggers the repair unit. If the lightdm service enters a failed state, the OnFailure= setting tells systemd to start the steamos-autorepair service.

[Unit]
OnFailure=steamos-autorepair.service
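
To check which unit a drop-in like this hands off to, the OnFailure= line can be read straight from the file. This is only a minimal sketch: the helper name onfailure_targets is made up for illustration, and it assumes the drop-in lives at the path given above.

```shell
#!/bin/bash
# Hypothetical helper (not part of SteamOS): print the OnFailure= targets
# declared in a systemd drop-in file.
onfailure_targets() {
    # $1: path to a .conf drop-in
    sed -n 's/^OnFailure=//p' "$1"
}

dropin=/lib/systemd/system/lightdm.service.d/steamos-autorecover.conf
if [ -r "$dropin" ]; then
    onfailure_targets "$dropin"
fi

# On a running system, systemd can also report the merged result itself:
#   systemctl cat lightdm.service
#   systemctl show -p OnFailure lightdm.service
```

If the drop-in is not present (for example on a non-SteamOS machine), the sketch prints nothing.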

steamos-autorepair.service

This is the static systemd unit file. When the lightdm drop-in above activates this service, it runs /usr/bin/steamos-autorepair.sh.

[Unit]
Description=SteamOS Autorepair

[Service]
ExecStart=/usr/bin/steamos-autorepair.sh
Type=oneshot
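
To see whether this unit actually ran, its journal entries can be queried by unit name. The sketch below is an assumption-laden convenience wrapper, not something shipped with SteamOS: the function name autorepair_log is hypothetical, and looking at an earlier boot only works if the journal is stored persistently.

```shell
#!/bin/bash
# Hypothetical helper: show journal entries for the repair unit.
# $1 is the boot offset passed to journalctl -b (0 = current boot, -1 = previous).
autorepair_log() {
    journalctl --no-pager -b "${1:-0}" -u steamos-autorepair.service
}
```

Because the repair script ends with a reboot, a completed repair will normally be in the previous boot's log, so autorepair_log -1 is likely the more useful call.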

/usr/bin/steamos-autorepair.sh

This is the repair process itself.

#!/bin/bash

# 10s is the time window where systemd stops trying to restart a service
sleep 15

# if lightdm is not running after 15s, it's not a random crash, but many
# otherwise nothing to do, systemd will call us again if it crashes more
if pidof -x lightdm > /dev/null
then
    exit 0
fi

# can't have this be a dependency of our unit or it'll trigger too early
service plymouth-reboot start

plymouth display-message --text="SteamOS is attempting to recover from a fatal error"
plymouth system-update --progress=10
dpkg --configure -a
apt-get -f -y install
plymouth system-update --progress=50

#
# force rebuild dkms modules
#
dkms_modules=`find /usr/src -maxdepth 2 -name dkms.conf`
arr=($dkms_modules)
let prog=50

# compute how far to move the progress bar for each module
let delta="50/${#arr[@]}"

for i in $dkms_modules
do
  module_name=`grep ^PACKAGE_NAME $i | cut -d= -f2 | tr -d \"`
  module_version=`grep ^PACKAGE_VERSION $i | cut -d= -f2 | tr -d \"`

  dkms remove $module_name/$module_version --all
  dkms build -m $module_name -v $module_version
  dkms install -m $module_name -v $module_version
  let prog="$prog + $delta"
  plymouth system-update --progress=$prog
done

plymouth system-update --progress=100
plymouth display-message --text="Recovery complete, restarting..."

sleep 1

reboot
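
The DKMS loop in the script does two small things that are easy to try in isolation: it pulls PACKAGE_NAME and PACKAGE_VERSION out of each dkms.conf, and it splits the remaining 50% of the progress bar evenly across the modules. The following is a rough sketch of both, not the script itself. The function names, the module name and the version are made up, and the zero-module guard is an addition (the script divides by the module count directly).

```shell
#!/bin/bash
# Read one key (e.g. PACKAGE_NAME) from a dkms.conf and strip double quotes,
# using the same grep | cut | tr pipeline as the repair script.
dkms_field() {
    # $1: path to dkms.conf, $2: key name
    grep "^$2" "$1" | cut -d= -f2 | tr -d '"'
}

# Per-module progress step: the remaining 50% divided by the module count.
# The guard against a count of zero is an addition in this sketch.
progress_step() {
    if [ "$1" -gt 0 ]; then
        echo $(( 50 / $1 ))
    else
        echo 0
    fi
}

# Demo against a throwaway dkms.conf with a made-up module name and version.
conf=$(mktemp)
printf 'PACKAGE_NAME="example-module"\nPACKAGE_VERSION="1.2.3"\n' > "$conf"
echo "$(dkms_field "$conf" PACKAGE_NAME)/$(dkms_field "$conf" PACKAGE_VERSION)"
echo "step for 3 modules: $(progress_step 3)"
rm -f "$conf"
```

Since the division is integer division, three modules advance the bar by 16 each (to 98), and the script's final plymouth system-update --progress=100 closes the gap.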