First Bring-Up Checklist

This is the shortest practical path to getting the first live build onto a real NixOS machine.

It assumes:

  • this repo is available on the target machine
  • the target machine is the intended workstation host
  • the current v1 policy remains:
    • boot default = desktop
    • studio-local is an overlay on desktop
    • vLLM is compute-only when explicitly enabled

1. Put the Repo on the Target Machine

git clone <repo-url> /path/to/dubnium
cd /path/to/dubnium

If the repo is already local:

cd /path/to/dubnium

2. Generate Real Hardware Configuration

The scaffold currently contains a placeholder hardware file.

On the target NixOS machine:

sudo nixos-generate-config --dir ./hosts/workstation

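This should populate:

  • hosts/workstation/hardware-configuration.nix

If the directory has no configuration.nix, nixos-generate-config also writes a template one there. This checklist only relies on hardware-configuration.nix.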

Review that file and make sure:

  • it matches the actual boot disk/filesystem layout
  • the existing import structure in hosts/workstation/default.nix is still intact
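
The review above can be backed by a quick mechanical check. The following is a minimal sketch, not part of the repo: check_hw_config is a hypothetical helper, and it assumes default.nix refers to the generated file by its file name.

```shell
# Hypothetical helper: sanity-check the generated hardware file.
# $1 = host directory, e.g. ./hosts/workstation
check_hw_config() {
    hw_dir=$1
    hw_file="$hw_dir/hardware-configuration.nix"

    # The generated file must exist and be non-empty.
    if [ ! -s "$hw_file" ]; then
        echo "missing or empty: $hw_file" >&2
        return 1
    fi

    # Assumption: default.nix mentions the file by name in its imports.
    if ! grep -qF 'hardware-configuration.nix' "$hw_dir/default.nix" 2>/dev/null; then
        echo "default.nix does not reference hardware-configuration.nix" >&2
        return 1
    fi

    echo "ok: $hw_file"
}
```

Usage: check_hw_config ./hosts/workstation. It does not replace reading the file, since it cannot tell whether the disk layout is correct.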

3. Review Host-Specific Settings Before First Build

Check hosts/workstation/default.nix.

Important values to confirm:

  • networking.hostName
  • dubnium.hardware.presentGpus
  • dubnium.hardware.displayGpu
  • dubnium.hardware.computeGpus
  • dubnium.vllm.enable
  • dubnium.vllm.model
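
To find these values quickly, a small grep wrapper can list every line that mentions them. This is a sketch only: show_host_settings is a hypothetical name, and it matches on the short option names so that both dotted and nested attribute forms are caught.

```shell
# Hypothetical helper: print, with line numbers, every line of a host file
# that mentions one of the settings listed above.
# $1 = path to the host file, e.g. hosts/workstation/default.nix
show_host_settings() {
    grep -nE 'hostName|presentGpus|displayGpu|computeGpus|vllm' "$1"
}
```

Usage: show_host_settings hosts/workstation/default.nix. A setting with no matching line is either left at its module default or defined elsewhere.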

Current intended first live model:

  • Qwen/Qwen2.5-Coder-14B-Instruct

Current intended first hardware phase:

  • planned architecture: 2 GPUs
  • currently present: GPU 0
  • compute GPU set: [ 0 ]

4. Build Without Switching First

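Build first without activating anything. A successful build leaves a result symlink in the current directory and changes nothing on the running system: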

sudo nixos-rebuild build --flake .#workstation

If this fails:

  • fix Nix evaluation issues first
  • do not jump into switch

Common first-failure areas:

  • hardware configuration mismatch
  • NVIDIA options
  • package evaluation problems
  • typos in host-local settings
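
The build-before-switch rule can be made explicit with a small gate. A minimal sketch, assuming nothing beyond the build command shown above: build_gate and the NIXOS_REBUILD override are hypothetical names introduced here, and the override exists only so the wrapper can be exercised without a real rebuild.

```shell
# Hypothetical wrapper: print the switch command only after the build passes.
# $1 = flake reference (default: .#workstation)
# NIXOS_REBUILD may point at a different binary; it defaults to nixos-rebuild.
build_gate() {
    bg_flake=${1:-".#workstation"}
    bg_cmd=${NIXOS_REBUILD:-nixos-rebuild}

    if "$bg_cmd" build --flake "$bg_flake"; then
        echo "build ok: next run sudo nixos-rebuild switch --flake $bg_flake"
    else
        echo "build failed: fix evaluation errors before switching" >&2
        return 1
    fi
}
```

The wrapper never runs switch itself; that stays a deliberate manual step.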

5. Switch to the New Configuration

If the build succeeds:

sudo nixos-rebuild switch --flake .#workstation

6. Verify Core Pieces After Switch

Check the mode CLI:

mode status
mode current
mode desired

Check runtime state files:

sudo ls -la /run/mode-controller
sudo cat /run/mode-controller/desired
sudo cat /run/mode-controller/current
sudo cat /run/mode-controller/capability-placement.json
sudo cat /run/mode-controller/hardware-topology.json
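
Instead of reading each file by hand, a loop can report which of them are missing. A sketch only: check_state_files is a hypothetical helper, and it needs to run in a root shell if the directory is not readable by the current user.

```shell
# Hypothetical helper: report which expected state files are present.
# $1 = state directory (default: /run/mode-controller)
# Returns 0 when all four files exist, 1 otherwise.
check_state_files() {
    sf_dir=${1:-/run/mode-controller}
    sf_missing=0

    for sf_name in desired current capability-placement.json hardware-topology.json; do
        if [ -e "$sf_dir/$sf_name" ]; then
            echo "present: $sf_name"
        else
            echo "MISSING: $sf_name"
            sf_missing=1
        fi
    done

    return "$sf_missing"
}
```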

Check systemd units:

systemctl status desktop.target
systemctl status compute.target
systemctl status studio-local-policy.service
systemctl status audio-priority.service
systemctl status vllm.service

Notes:

  • vllm.service should not be active in desktop
  • with default workstation settings, vllm.service should not exist until dubnium.vllm.enable = true
  • studio-local-policy.service and audio-priority.service should not be active unless studio-local is requested
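
The notes above translate into simple state assertions. The following sketch uses systemctl is-active, which prints a one-word state such as active or inactive; expect_unit_state is a hypothetical helper name.

```shell
# Hypothetical helper: compare a unit's state with the expected one.
# $1 = unit name, $2 = expected state as printed by `systemctl is-active`
expect_unit_state() {
    # is-active exits non-zero for anything but "active", so ignore the status
    # and compare the printed state instead.
    eu_actual=$(systemctl is-active "$1" 2>/dev/null) || true

    if [ "$eu_actual" = "$2" ]; then
        echo "ok: $1 is $eu_actual"
    else
        echo "unexpected: $1 is ${eu_actual:-unknown}, wanted $2"
        return 1
    fi
}
```

In desktop mode, for example: expect_unit_state vllm.service inactive, then the same for studio-local-policy.service and audio-priority.service.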

7. Test desktop -> studio-local

sudo mode request studio-local
mode status
systemctl status studio-local-policy.service
systemctl status audio-priority.service

Expected result:

  • current mode becomes studio-local
  • studio-local-policy.service is active
  • audio-priority.service is active

Then return:

sudo mode request desktop
mode status
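
A transition may not complete instantly, so polling is more reliable than a single status check. This is a sketch under a stated assumption: it assumes mode current prints only the mode name on one line, which is not confirmed here. wait_for_mode and its 30-second default are hypothetical.

```shell
# Hypothetical helper: poll until the current mode matches the target.
# Assumption: `mode current` prints just the mode name.
# $1 = target mode, $2 = timeout in seconds (default: 30)
wait_for_mode() {
    wm_target=$1
    wm_left=${2:-30}

    while :; do
        if [ "$(mode current 2>/dev/null)" = "$wm_target" ]; then
            echo "converged: $wm_target"
            return 0
        fi
        if [ "$wm_left" -le 0 ]; then
            echo "timed out waiting for $wm_target" >&2
            return 1
        fi
        sleep 1
        wm_left=$((wm_left - 1))
    done
}
```

Usage: sudo mode request studio-local, then wait_for_mode studio-local. The same helper applies to the compute and desktop transitions.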

8. Test desktop -> compute

Before testing:

  • close REAPER
  • avoid active audio work
  • avoid long-running foreground development jobs
  • seed the local model bundle from USB
  • set dubnium.vllm.enable = true and rebuild if this test should exercise the vLLM service

Then:

sudo mode request compute
mode status
systemctl status compute.target
systemctl status vllm.service

Expected result:

  • graphical session is terminated
  • system converges to compute
  • if vLLM is enabled, vllm.service is started by compute.target

Important caveat:

Seed the local model bundle from USB before the first compute transition. If the bundle is absent, vLLM should fail clearly rather than relying on a first-run network download.
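
A pre-flight check before requesting compute can catch an unseeded bundle early. This is only a sketch: the bundle location is not specified in this checklist, so the directory is passed in as an argument, and the check for config.json assumes the bundle follows the usual Hugging Face snapshot layout. check_model_bundle is a hypothetical helper.

```shell
# Hypothetical pre-flight: confirm a model bundle directory looks seeded.
# $1 = bundle directory (location not defined here; pass your own)
# Assumption: a seeded bundle contains a non-empty config.json.
check_model_bundle() {
    mb_dir=$1

    if [ -z "$mb_dir" ] || [ ! -d "$mb_dir" ]; then
        echo "bundle directory not found: ${mb_dir:-<unset>}" >&2
        return 1
    fi
    if [ ! -s "$mb_dir/config.json" ]; then
        echo "no config.json in $mb_dir: bundle looks unseeded" >&2
        return 1
    fi

    echo "bundle present: $mb_dir"
}
```

A pass here only shows that something was copied; it does not verify that the weights are complete.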


9. Test compute -> desktop

sudo mode request desktop
mode status
systemctl status vllm.service

Expected result:

  • vllm.service is stopped
  • system converges back to desktop

10. If Something Fails

Check:

mode status
sudo cat /run/mode-controller/last-transition.json
sudo cat /run/mode-controller/last-guards.json
journalctl -u 'mode-controller@*' -b
journalctl -u vllm.service -b

The most useful buckets for a first diagnosis:

  • guard blocked transition
  • graphical session did not terminate cleanly
  • GPU did not look released
  • vLLM service failed to start
  • model/runtime/CUDA issue
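
When reporting a failure, it helps to capture both state records in one place. A minimal sketch: collect_mode_diagnostics is a hypothetical helper that only reads the two JSON files named above and needs a root shell if they are not world-readable.

```shell
# Hypothetical helper: print the transition and guard records as one report.
# $1 = state directory (default: /run/mode-controller)
collect_mode_diagnostics() {
    dg_dir=${1:-/run/mode-controller}

    for dg_name in last-transition.json last-guards.json; do
        echo "== $dg_name =="
        if [ -r "$dg_dir/$dg_name" ]; then
            cat "$dg_dir/$dg_name"
        else
            echo "(not readable or absent)"
        fi
    done
}
```

Redirect the output to a file and append the two journalctl commands above to get a complete record of the failed transition.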

11. First Successful Milestone

You should consider first bring-up successful when all of the following are true:

  • nixos-rebuild switch --flake .#workstation succeeds
  • mode status works
  • desktop -> studio-local -> desktop works
  • desktop -> compute -> desktop works
  • last-transition.json and last-guards.json contain enough detail to diagnose a failed transition
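
The milestone list can be tracked with a tiny pass/fail runner. This is a sketch, not an existing tool: run_check and checks_failed are hypothetical names, and each check is whatever command you decide represents that bullet.

```shell
# Hypothetical checklist runner: record PASS/FAIL for each named check.
checks_failed=0

# $1 = label, remaining arguments = command to run
run_check() {
    rc_name=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS  $rc_name"
    else
        echo "FAIL  $rc_name"
        checks_failed=$((checks_failed + 1))
    fi
}
```

For example: run_check "mode status works" mode status, followed by run_check "transition record exists" test -s /run/mode-controller/last-transition.json. Call run_check directly rather than inside a command substitution, or the failure counter will not be updated in the calling shell.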

At that point, the next iteration is:

  • tighten NVIDIA/vLLM runtime behavior
  • improve observe-current
  • tune audio-priority.service
  • refine slice policy
  • add second GPU when ready