Runbook: Failed Transition Recovery
Status: living
Use this when mode status reports failed-transition, a degraded state, or a
post-action observation mismatch.
Inspect State
mode status
mode current --refresh
sudo cat /run/mode-controller/last-transition.json
sudo cat /run/mode-controller/last-guards.json
journalctl -u 'mode-controller@*' -b
Classify the Failure
Common buckets:
- guard policy block, such as active audio or unsafe user jobs
- guard execution error, such as missing
nvidia-smi - graphical session did not terminate
- GPU release predicate did not pass
- vLLM failed to start or stop
- target isolation stopped required services
- post-action observation remained conflicted
Recover to Desktop
If the system is not in the middle of an active transition:
sudo mode request desktop
mode status
Success requires observer confirmation, not just successful systemd commands.
If desktop recovery fails:
- inspect
journalctl -b - inspect display-manager/session logs
- stop compute-only services manually only if their ownership is clear
- consider rebooting to the v1 boot default,
desktop
Record Evidence
For every failure worth keeping:
- final
mode status - last transition JSON
- last guards JSON
- relevant systemd unit status
- whether rollback restored desktop
- whether the failure suggests runtime switching is insufficient
Repeated GPU release or desktop restoration failures should trigger specialisation/reboot-mediated compute evaluation.