Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Runbook: Failed Transition Recovery

Status: living

Use this when mode status reports failed-transition, a degraded state, or a post-action observation mismatch.

Inspect State

mode status
mode current --refresh
sudo cat /run/mode-controller/last-transition.json
sudo cat /run/mode-controller/last-guards.json
journalctl -u 'mode-controller@*' -b

Classify the Failure

Common buckets:

  • guard policy block, such as active audio or unsafe user jobs
  • guard execution error, such as missing nvidia-smi
  • graphical session did not terminate
  • GPU release predicate did not pass
  • vLLM failed to start or stop
  • target isolation stopped required services
  • post-action observation remained conflicted

Recover to Desktop

If the system is not in the middle of an active transition:

sudo mode request desktop
mode status

Success requires observer confirmation, not just successful systemd commands.

If desktop recovery fails:

  • inspect journalctl -b
  • inspect display-manager/session logs
  • stop compute-only services manually only if their ownership is clear
  • consider rebooting to the v1 boot default, desktop

Record Evidence

For every failure worth keeping:

  • final mode status
  • last transition JSON
  • last guards JSON
  • relevant systemd unit status
  • whether rollback restored desktop
  • whether the failure suggests runtime switching is insufficient

Repeated GPU release or desktop restoration failures should trigger specialisation/reboot-mediated compute evaluation.