Dubnium Documentation

Use this site as the operator entrypoint for installing, bringing up, and understanding the Dubnium workstation.

Primary target: the NixOS host named workstation. WSL is a headless validation target for shared modules and docs, not the deployed workstation.

Choose Your Path

Installing a workstation: start with Fresh Install and Custom Installer USB.
Bringing up an existing workstation: use First Bring-Up before transition testing.
Enabling local inference: seed the model with Model Seeding, then follow vLLM Runtime.
Working on persistent context: start with Persistent Context Memory Architecture, Memory Service, and the Memory Governance Contract.
Validating in WSL: use WSL Documentation Boundary and WSL Bring-Up.
Managing flake inputs: use Dubctl Flake Input Manager.

Current Defaults

dubnium.vllm.enable = false: vLLM is opt-in for explicit compute testing.
dubnium.plano.enable = false: Plano routing is opt-in until its runtime is installed and validated.
Runtime and user secrets stay outside Nix source. See Runtime Secrets.
User-level Home Manager configuration comes from external/dotfiles.
Generated documentation is committed under web/docs/.
Flake input operations use dubctl, exposed as nix run .#dubctl and installed by default on the workstation.

Install Source Contract

Installer media labels are DUB-ISO and DUB-SEED.
Install bootstrap uses local source from media or checkout state, not an install-time GitHub token.
Post-install source reconciliation is explicit. See Post-Install Source Reconciliation.

Ownership Boundaries

Owner	Responsibility
Dubnium	NixOS system config, workstation services, install media, runtime units
Dotfiles	Home Manager user config, user shell, user-level tool configuration
Runtime secret provider	Host and user secrets outside the Git-tracked Nix source
Model/router repos	Client policy, routing schemas, and model-router behavior

Sanity Checks

nix flake check
sudo nixos-rebuild build --flake .#workstation
mdbook build

When building docs from Windows, run mdbook build inside the NixOS WSL distro with mdbook and mdbook-mermaid in the shell.

Known Warnings

mdbook-mermaid may warn about a minor mdBook version mismatch; that warning is non-fatal when the HTML backend finishes successfully.
vLLM runtime setup should avoid broad PyTorch, audio, JAX, or TPU extras unless they are explicitly required.

Local Inference

Model Seeding
vLLM Runtime
Plano Routing Gateway

Memory System

Persistent Context Memory Architecture
vLLM Persistent Memory Prototype
Memory Service
Memory Data Model Specification
Memory Governance Contract
Anthesis Memory Envelope Examples
vLLM Memory Phase 1 Plan
Memory Phase 2: Governed Structured Memory

Architecture

Architecture Overview
Control Plane
Runtime Behavior
Diagrams

External Sources

ryjen/dotfiles feat/nix-migration is checked out at external/dotfiles and owns user-level Home Manager configuration.

Decisions

ADR-0001: Runtime Switching First
ADR-0002: Studio-Local Is a Desktop Overlay
ADR-0003: vLLM Is Compute-Only in V1
ADR-0004: Boot Defaults to Desktop
ADR-0005: k3s Stays Stable Across Modes in V1
ADR-0006: Tailscale Platform Connectivity
ADR-0007: WSL Is a Headless Validation Target
ADR-0008: Seed Local vLLM Model Bundles
ADR-0009: Manage Runtime Secrets Outside Nix Source
ADR-0010: Keep Persistent Memory Separate From vLLM Runtime
ADR-0011: External Ownership Boundaries

Runbooks

Custom Installer USB
First Bring-Up
Fresh Install
Post-Install Source Reconciliation
Laboratory Bootstrap
Model Seeding
Runtime Secrets
vLLM Runtime
vLLM Persistent Memory Prototype
Memory Service
Tailscale
Transition Testing
Failed Transition Recovery
Dubctl Flake Input Manager

WSL

WSL Documentation Boundary
Build Installer Artifacts From WSL
WSL Bring-Up
ADR-0007: WSL Is a Headless Validation Target

Runbook: Fresh Install

Status: living

Use this when installing Dubnium from a NixOS live USB onto a fresh machine.

Primary checklist:

../fresh-install-checklist.md

Key Rules

Decide disk layout before writing partitions.
After booting from the USB installer, verify the tools needed to inspect or extract the prepared repo source are available.
Because the Dubnium repository is private, use the custom installer USB as the preferred source path instead of assuming live GitHub access will work.
The custom installer USB bakes a source export into the live image; use unpack-dubnium to extract it to ~/local/src/dubnium.
Carry the materialized model bundle on DUB-SEED media, so first boot does not depend on a model-provider download. With the current raw-image installer flow, DUB-SEED is separate writable media; see Custom Installer USB.
Generate hosts/workstation/hardware-configuration.nix from the real target mount layout.
After first boot, reconcile install-time source changes into a normal Git checkout before treating them as repo history.
Review host options before install.
Boot into a desktop-default system first.
Validate mode status before testing transitions.

First Boot Expectations

current mode should classify as desktop
the selected user’s Home Manager configuration should be present from the Dubnium dotfiles profile
vLLM should not be active
studio-local overlay services should not be active unless requested
/run/mode-controller should exist

Do not start compute testing until the desktop baseline is observable and repeatable.

Custom Installer Quick Path

If booted from the Dubnium custom installer USB:

install-dubnium-from-usb

The one-shot command partitions and formats the selected disk, unpacks the baked source snapshot, generates the workstation hardware config, and runs nixos-install.

By default it then sets the normal user’s password inside the installed system with passwd. Use --password-mode hash for the older host-local hash flow, or --password-mode skip when another login path already exists.

With no arguments it prints lsblk output, prompts for the target whole disk, defaults to btrfs, copies the install snapshot into the installed system, and requires a final y/N confirmation unless --yes is passed.

Manual path:

unpack-dubnium
cd ~/local/src/dubnium

Then follow the fresh-install checklist from the partitioning step onward and install with:

sudo nixos-install --flake .#workstation

After first boot, restore the selected model seed from the USB model bundle as described in Model Seeding.

If the install used the custom source snapshot or another export without .git history, follow Post-Install Source Reconciliation before committing or pushing install-time changes.

Runbook: First Bring-Up

Status: living

Use this when the target machine already runs NixOS or can build/switch from the repo.

Primary checklist:

../first-bring-up-checklist.md

Success Criteria

nixos-rebuild build --flake .#workstation succeeds.
nixos-rebuild switch --flake .#workstation succeeds.
configctl doctor succeeds.
mode status, mode current, and mode desired work.
/run/mode-controller exists and contains live state files.
desktop.target and compute.target exist.
vLLM is inactive in desktop.
studio-local can be requested and removed as a desktop overlay.
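The criteria above can be gathered into a small pass/fail runner. This is a minimal sketch, assuming a POSIX shell; run_check and FAILED are hypothetical names and not part of the Dubnium tooling. The commands in the wiring comment come from the list above.

```shell
# Hypothetical helper: run one named check and record the result.
FAILED=0

run_check() {
  # $1 is a label; the remaining arguments are the command to run.
  label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    printf 'ok    %s\n' "$label"
  else
    printf 'FAIL  %s\n' "$label"
    FAILED=$((FAILED + 1))
  fi
}

# Example wiring on the workstation (commands taken from the list above):
#   run_check "doctor"        configctl doctor
#   run_check "mode status"   mode status
#   run_check "state dir"     test -d /run/mode-controller
#   run_check "vllm inactive" sh -c '! systemctl is-active --quiet vllm.service'
#   [ "$FAILED" -eq 0 ]
```

Running every check before reporting gives one view of which failure bucket applies, instead of stopping at the first failing command.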

Immediate Failure Buckets

generated hardware configuration does not match the host
NVIDIA/CUDA evaluation or runtime issue
graphical target/session mismatch
mode controller tools not installed
observer reports false success or conflicting state

If mode state looks wrong, prefer fixing observation before adding transition logic.

Runbook: Custom Installer USB

Status: living

Use this when installing Dubnium from private installer media without relying on GitHub credentials during the live install.

The current Dubnium installer flow writes the custom ISO to one physical USB stick as a raw disk image, matching Rufus “DD image mode” behavior:

dubnium-installer.iso -> whole USB disk

The installer image bakes an exported source snapshot of this repo and the external/dotfiles submodule into the live system. The snapshot excludes .git directories, so it is source content rather than a Git working copy with history. Treat the USB as private media because it contains the private Dubnium source.

Model seed bundles are separate from raw USB writing. Put the materialized bundle on separate media, or build it into a future image format explicitly.

What This Provides

no GitHub token during install
no install-time private GitHub clone
git, jq, rsync, vim, and install helpers in the live environment
unpack-dubnium, which unpacks the baked source snapshot to:

~/local/src/dubnium

raw whole-disk USB writing for the custom installer ISO

Build The Installer ISO

Before baking, make sure the repo and submodule state are intentionally clean or intentionally staged. The flake source snapshot only sees tracked files.

git status --short
git -C external/dotfiles status --short

scripts/build-installer-iso.sh \
  --iso ./dubnium-installer.iso

By default the script also ensures that the current Dubnium default seed bundle exists for separate seed media. This step is idempotent. The seed contract is model-agnostic: the seed must be a materialized model directory with config.json and SHA256SUMS.

Detection first checks DUBNIUM_SEED_MODEL, then common paths beside the repo for the current default bundle.

Use --seed-model to override detection, --no-seed-download to require a pre-existing bundle, or --no-seed-model to build installer-only media.
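The detection order described above could look roughly like the sketch below. It is an illustration only: detect_seed_model is a hypothetical function, and the two candidate directories are assumed locations, not the script's actual search list. Only DUBNIUM_SEED_MODEL, config.json, and SHA256SUMS come from this runbook.

```shell
# Hypothetical sketch of the detection order; the candidate directories
# below are illustrative and not the script's actual search list.
detect_seed_model() {
  # $1: repo directory, $2: bundle directory name
  repo_dir=$1
  bundle_name=$2

  # 1. An explicit DUBNIUM_SEED_MODEL wins when it points at a directory.
  if [ -n "${DUBNIUM_SEED_MODEL:-}" ] && [ -d "$DUBNIUM_SEED_MODEL" ]; then
    printf '%s\n' "$DUBNIUM_SEED_MODEL"
    return 0
  fi

  # 2. Otherwise try paths beside the repo (assumed locations).
  for candidate in \
    "$repo_dir/../models/$bundle_name" \
    "$repo_dir/../$bundle_name"
  do
    # Accept a candidate only when it satisfies the seed contract.
    if [ -f "$candidate/config.json" ] && [ -f "$candidate/SHA256SUMS" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done

  return 1
}
```

Checking the environment variable first keeps an explicit operator choice from being shadowed by whatever happens to sit next to the checkout.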

The script is a wrapper around this build:

nix --extra-experimental-features 'nix-command flakes' \
  build .#nixosConfigurations.installer.config.system.build.isoImage

The ISO appears under:

result/iso/

The ISO build uses Nix’s flake source snapshot and bakes that source into the installer image. This is an export-style payload: no .git directories and no Git history.

Create A Standalone Git Export Payload

If you want a source artifact separate from the ISO, use the git-export helper:

scripts/export-installer-source.sh dubnium-installer-source.tar.gz

The helper requires the main repo and external/dotfiles submodule to be clean. It uses git archive for both sources and writes a payload shaped like:

dubnium/
└── external/
    └── dotfiles/

This payload is useful for inspection, offline transfer, or alternate installer media. The custom ISO still bakes its own payload from the same flake source that Nix evaluates.

Verify The Baked Payload

The built payload should contain the workstation host, dotfiles submodule source, and USB helpers:

payload="$(find /nix/store -maxdepth 1 -name '*-dubnium-installer-source.tar.gz' | head -n 1)"

tar -tzf "$payload" | grep -E \
  '^dubnium/(flake.nix|hosts/workstation/default.nix|external/dotfiles/flake.nix|scripts/build-installer-iso.sh|scripts/export-installer-source.sh|scripts/write-installer-usb.ps1|scripts/write-installer-usb.sh)$'

if tar -tzf "$payload" | grep -q '/\.git/'; then
  echo "unexpected .git directory in payload"
  exit 1
fi

Prepare The USB From Windows PowerShell

After building dubnium-installer.iso, use this helper only when preparing the USB from Windows PowerShell. It writes the ISO bytes directly to the whole USB disk, like Rufus DD image mode:

.\scripts\write-installer-usb.ps1 `
  -IsoPath .\dubnium-installer.iso `
  -DiskNumber 7 `
  -ExpectedFriendlyName "USB SanDisk 3.2Gen1"

The script refuses to continue unless the selected disk is the expected USB device. It overwrites the whole disk with the ISO image. -SeedModelPath is intentionally rejected in raw mode because there is no separate writable seed partition to copy into.

After writing, eject and reinsert the USB if Windows does not refresh the new ISO layout immediately. Verify the installer media from whichever drive letter Windows assigns:

Test-Path I:\EFI\BOOT\BOOTX64.EFI
Test-Path I:\nix-store.squashfs
Get-Volume -DriveLetter I

Prepare The USB From macOS Or Linux

The Bash helper performs the same raw whole-disk image write on macOS or Linux. Pass the whole USB disk, not a partition.

Linux example:

lsblk -o NAME,SIZE,MODEL,TRAN,TYPE,MOUNTPOINTS

scripts/write-installer-usb.sh \
  --iso ./dubnium-installer.iso \
  --disk /dev/sdX \
  --expected SanDisk

macOS example:

diskutil list
diskutil info /dev/diskN

scripts/write-installer-usb.sh \
  --iso ./dubnium-installer.iso \
  --disk /dev/diskN \
  --expected SanDisk

The script refuses to write non-removable media, requires the selected device identity to contain --expected when provided, and asks for y at the Proceed? [y/N]: prompt before erasing the disk unless --yes is passed.
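The two guards can be sketched as follows for Linux. This is a minimal illustration, not the helper's actual code: is_removable_disk and identity_contains are hypothetical names, and the check relies on the sysfs removable flag under /sys/block. Some USB enclosures report 0 there, so a real guard may need more signals. macOS needs a different source such as diskutil info.

```shell
# Sketch of the two guards, assuming Linux sysfs. The second argument to
# is_removable_disk (sysfs root) exists only so the check can be tested.
is_removable_disk() {
  # $1: whole-disk device such as /dev/sdX; $2: sysfs root (default /sys)
  name=${1##*/}
  flag_file="${2:-/sys}/block/$name/removable"
  # Partitions have no entry directly under /sys/block, so they fail here.
  [ -r "$flag_file" ] && [ "$(cat "$flag_file")" = "1" ]
}

identity_contains() {
  # $1: device identity string (for example the lsblk MODEL column)
  # $2: expected substring; an empty value skips the check
  case $1 in
    *"$2"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A caller would run both checks before the Proceed? prompt and stop if either fails.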

Optional One-Shot Wrappers

The older wrappers still exist for convenience, but they are not the preferred boundary:

.\scripts\build-installer-usb.ps1 `
  -DiskNumber 7 `
  -ExpectedFriendlyName "USB SanDisk 3.2Gen1"

bash scripts/build-installer-usb.sh \
  --disk /dev/sdX \
  --expected SanDisk

Use the Bash one-shot path only when the whole USB disk is visible inside the Linux environment.

Seamless USB Acceptance Check

Before leaving the build machine, verify the USB contains everything needed for a token-free install:

EFI/BOOT/BOOTX64.EFI
nix-store.squashfs
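On Linux or macOS the same two files can be checked from a mounted USB. A minimal sketch; check_installer_media is a hypothetical helper and the mount point is whatever path the system assigned.

```shell
check_installer_media() {
  # $1: mount point of the written USB
  status=0
  for rel in EFI/BOOT/BOOTX64.EFI nix-store.squashfs; do
    if [ ! -f "$1/$rel" ]; then
      printf 'missing: %s\n' "$rel" >&2
      status=1
    fi
  done
  return "$status"
}

# Example: check_installer_media /run/media/$USER/DUB-ISO
```

The function reports every missing file before returning, so one run shows the full gap.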

The install path should not require:

a GitHub token
a private SSH key
a Hugging Face download during install when separate model seed media is used
copying model weights into the Dubnium Git tree

Keep the USB physically private. It contains private source code in the installer payload.

Add The Model Seed Bundle

Do not copy the raw Hugging Face cache directory as the seed. The cache uses refs, blobs, snapshots, and symlinks. Seed media should contain a normal local model bundle.

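Use separate writable media for the model seed bundle. Mount that media and copy a materialized model directory. The example below runs inside WSL, where the Windows drive E: is mounted through drvfs; on other systems, mount the media in the usual way and adjust the paths: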

sudo mkdir -p /mnt/e
sudo mount -t drvfs E: /mnt/e

sudo mkdir -p /mnt/e/models
sudo rsync -a --info=progress2 \
  /path/to/selected-model-bundle/ \
  /mnt/e/models/selected-model-bundle/

Expected seed path:

models/selected-model-bundle/

Lightweight bundle check:

test -f /mnt/e/models/selected-model-bundle/config.json
test -f /mnt/e/models/selected-model-bundle/SHA256SUMS

See Model Seeding for creating the bundle and checksum manifest.

Install From The USB

Boot the target machine from the USB. Prefer the UEFI entry for the Dubnium installer USB.

For the guarded one-shot path, run the helper with no arguments:

install-dubnium-from-usb

The helper prints lsblk, prompts for the target whole disk, and then prompts for install options. Defaults are btrfs for the root filesystem, dubnium for the Home Manager machine profile, passwd for password setup, and copying the install snapshot to /root/dubnium-install-snapshot in the installed system.

This command erases the selected whole disk, unpacks the baked source snapshot, generates hosts/workstation/hardware-configuration.nix, and runs:

sudo nixos-install --flake .#workstation

Options:

--dry-run prints the plan without touching disks.
--user USER writes hosts/workstation/user.nix before install.
--home-profile dubnium|technetium selects the Home Manager machine profile that installs the matching ~/.config/hypr/adopted.d/machine.conf.
--password-mode hash writes a host-local initial password hash before install; --password-mode skip applies when another login path already exists. The default passwd mode sets the password inside the installed system after nixos-install.
--no-copy-source skips preserving the install snapshot for post-install reconciliation.

The one-shot command still prints the plan and requires final confirmation:

Proceed? [y/N]:

Use --yes only for rehearsed installs where the disk identity was already verified.

Manual path:

In the live installer terminal:

unpack-dubnium
cd ~/local/src/dubnium

Confirm the baked source exists:

test -f flake.nix
test -f hosts/workstation/default.nix
test -f external/dotfiles/flake.nix

Then continue the fresh-install flow from the local checkout:

sudo nixos-install --flake .#workstation

The workstation target imports the Dubnium Home Manager module from external/dotfiles, so the Dubnium dotfiles profile is applied to the selected normal user as part of the system install.

Install For Another User

To choose the installed normal user, create hosts/workstation/user.nix in the unpacked source before running nixos-install:

{
  dubnium.user.name = "alice";
  dubnium.user.description = "Example User";
}

Then install normally:

sudo nixos-install --flake .#workstation

The same Dubnium Home Manager profile is applied to the selected user. The profile source lives in the dotfiles submodule, but the username and home directory are supplied by dubnium.user.name.

unpack-dubnium --user USER only changes where the source is unpacked in the live installer session. It does not change the installed NixOS user; use dubnium.user.name for that.
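On the manual path, the user.nix file shown above can be written from the live shell. A minimal sketch; write_user_nix is a hypothetical helper, and it writes the values verbatim, so avoid double quotes, backslashes, and ${ in them.

```shell
write_user_nix() {
  # $1: unpacked source directory, $2: user name, $3: description
  # Values are written verbatim into Nix string literals.
  cat > "$1/hosts/workstation/user.nix" <<EOF
{
  dubnium.user.name = "$2";
  dubnium.user.description = "$3";
}
EOF
}

# Example: write_user_nix ~/local/src/dubnium alice "Example User"
```

The guarded one-shot installer covers the same step through --user USER.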

After First Boot

After the installed system boots, seed the vLLM model store from the bundle on separate seed media and verify the checksum manifest before starting compute mode. See Model Seeding for the exact restore commands.

What Not To Put On The USB

Avoid storing:

long-lived private SSH keys
reusable GitHub credentials
generated age identity files
decrypted SOPS files
model weights inside the Git repo or ISO payload
raw Hugging Face cache directories as the seed shape

The source snapshot and a separate materialized model bundle are enough for this installer flow.

Model Seeding

Dubnium keeps model weights out of Git and out of the Nix store. Nix owns the runtime policy and vLLM service definition; model bytes are runtime data under /var/lib/dubnium/models.

The workstation configuration selects a vLLM model, but the USB seed format does not depend on one specific model. Use the configured model’s local bundle name where the examples say selected-model-bundle.

The installed workstation serves a local model bundle from:

/var/lib/dubnium/models/selected-model-bundle

This avoids depending on the Hugging Face hub cache layout at runtime. The USB seed carries a normal directory of model files plus a checksum manifest.

Runtime Model Store

Dubnium creates:

/var/lib/dubnium/models

The workstation vLLM service passes this local path to vllm serve:

/var/lib/dubnium/models/selected-model-bundle

Do not commit model weights to the Dubnium repo. Do not put model weights inside the Nix store or custom ISO payload.

USB Seed Layout

Use a stable USB layout so the same seed can be used during fresh install, recovery, or rebuild:

DUB-SEED/
└── models/
    └── selected-model-bundle/
        ├── config.json
        ├── generation_config.json
        ├── model-00001-of-000NN.safetensors
        ├── model-00002-of-000NN.safetensors
        ├── model.safetensors.index.json
        ├── tokenizer.json
        ├── tokenizer_config.json
        ├── vocab.json
        ├── merges.txt
        ├── LICENSE
        ├── README.md
        └── SHA256SUMS

The exact file set may vary by model revision, but the directory must be a materialized model snapshot, not a Hugging Face refs / blobs / snapshots cache tree.
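The distinction between a materialized bundle and a hub cache tree can be checked mechanically. A minimal sketch under the rules stated above; is_materialized_bundle is a hypothetical name, and rejecting any symlink is an assumption that goes slightly further than the text.

```shell
is_materialized_bundle() {
  # $1: candidate bundle directory
  dir=$1
  [ -f "$dir/config.json" ] || return 1
  # A hub cache entry has refs/, blobs/, and snapshots/ at its top level.
  for cache_dir in refs blobs snapshots; do
    if [ -d "$dir/$cache_dir" ]; then
      return 1
    fi
  done
  # Materialized bundles hold regular files, not symlinks into blobs/.
  if [ -n "$(find "$dir" -type l -print | head -n 1)" ]; then
    return 1
  fi
  return 0
}
```

Run it against the seed directory before writing SHA256SUMS, so the manifest never describes a cache-shaped tree.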

Create A Local Bundle

If the source already exists as a normal model directory, copy it directly to the seed partition:

mkdir -p /run/media/$USER/DUB-SEED/models
rsync -a --info=progress2 \
  /path/to/selected-model-bundle/ \
  /run/media/$USER/DUB-SEED/models/selected-model-bundle/

Preferred source: a materialized model directory from a trusted local store or previously prepared artifact. Do not make the fresh install depend on Hugging Face availability.

Legacy fallback: if the only available source is an existing Hugging Face cache on the build machine, materialize the current snapshot once by following symlinks. This is a build-machine preparation step, not an install-time dependency:

MODEL_CACHE=/var/lib/vllm/.cache/huggingface/hub/models--OWNER--MODEL
REVISION="$(cat "$MODEL_CACHE/refs/main")"

mkdir -p /run/media/$USER/DUB-SEED/models/selected-model-bundle
rsync -aL --info=progress2 \
  "$MODEL_CACHE/snapshots/$REVISION/" \
  /run/media/$USER/DUB-SEED/models/selected-model-bundle/

Then create the checksum manifest:

cd /run/media/$USER/DUB-SEED/models/selected-model-bundle
find . -type f ! -name SHA256SUMS -print0 \
  | sort -z \
  | xargs -0 sha256sum \
  > SHA256SUMS

Seed From USB

After the workstation has booted into NixOS and the USB is mounted, copy the bundle into the Dubnium model store:

sudo mkdir -p /var/lib/dubnium/models
sudo rsync -a --info=progress2 \
  /run/media/$USER/DUB-SEED/models/selected-model-bundle/ \
  /var/lib/dubnium/models/selected-model-bundle/
sudo chown -R root:root /var/lib/dubnium/models/selected-model-bundle

Adjust the mount path if the USB is mounted somewhere else.

Verify the checksum manifest:

cd /var/lib/dubnium/models/selected-model-bundle
sudo sha256sum -c SHA256SUMS

Then verify the local model path exists:

test -f /var/lib/dubnium/models/selected-model-bundle/config.json
test -f /var/lib/dubnium/models/selected-model-bundle/model.safetensors.index.json
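sha256sum -c only verifies files named in the manifest, so a file added after SHA256SUMS was written passes unnoticed. The sketch below lists such files. unlisted_files is a hypothetical helper; it assumes GNU coreutils and file names without newlines or backslashes.

```shell
unlisted_files() {
  # $1: bundle directory containing SHA256SUMS
  (
    cd "$1" || exit 1
    manifest_list=$(mktemp)
    actual_list=$(mktemp)
    # Manifest lines look like "<hash>  ./path"; strip the hash and separator.
    sed 's/^[0-9a-f]* [ *]//' SHA256SUMS | LC_ALL=C sort > "$manifest_list"
    find . -type f ! -name SHA256SUMS | LC_ALL=C sort > "$actual_list"
    # comm -13 prints lines that appear only in the second input.
    LC_ALL=C comm -13 "$manifest_list" "$actual_list"
    rm -f "$manifest_list" "$actual_list"
  )
}

# Example: unlisted_files /var/lib/dubnium/models/selected-model-bundle
```

Empty output means every file in the bundle is covered by the manifest.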

Acceptance Check

After seeding, switch to compute only when normal bring-up preconditions are satisfied:

sudo mode request compute
systemctl status vllm.service
journalctl -u vllm.service -b

The first start should load the local model path. If vLLM tries to fetch model files from the network, the model argument or bundle location is wrong.

Runbook: vLLM Runtime

Status: living

Use this when Dubnium’s NixOS configuration manages vllm.service, but the vLLM Python/CUDA runtime is installed outside the Nix store.

NixOS owns:

vllm.service
/var/lib/vllm
/var/lib/dubnium/models
CUDA_VISIBLE_DEVICES
ai.dubnium
Tailscale-only firewall exposure

The external runtime owns:

/var/lib/vllm/venv
Python, PyTorch, vLLM, and CUDA wheel packages inside that venv

This keeps rebuilds fast and avoids compiling PyTorch, CUDA, CuPy, MAGMA, OpenCV CUDA, or vLLM during nixos-rebuild.

Scope

This runbook covers the current hybrid-Nix phase. NixOS is authoritative for the service contract, host alias, firewall exposure, users, directories, environment, and health checks. The Python/CUDA package runtime is mutable operator-managed state under /var/lib/vllm/venv.

A pure-Nix vLLM runtime is a separate later phase. That phase should be treated as build-infrastructure work: it likely needs a dedicated CUDA builder, an Attic/Cachix/nix-serve cache, or an upstream Nixpkgs packaging path that avoids rebuilding the full CUDA/PyTorch/vLLM stack on every workstation.

Preconditions

the host has been switched to a Dubnium generation with dubnium.vllm.runtime = "external"
uv is available in the operator shell
NVIDIA GPU access works on the host
model weights are already seeded under /var/lib/dubnium/models

Check GPU visibility first:

nvidia-smi
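The tool preconditions can be checked in one pass before starting. A minimal sketch; missing_tools is a hypothetical helper that only tests whether each name resolves on PATH, not whether the tool works.

```shell
missing_tools() {
  # Print each named tool that is not on PATH.
  # The exit status is non-zero if any tool is missing.
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
      status=1
    fi
  done
  return "$status"
}

# Example: missing_tools uv nvidia-smi
```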

1. Create The Runtime Directory

sudo install -d -m 0755 -o root -g root /var/lib/vllm
sudo install -d -m 0755 -o root -g root /var/lib/dubnium/models

The NixOS module also declares these directories. These commands are safe to run before or after nixos-rebuild switch.

2. Install vLLM Into The Managed venv

Create a fresh venv:

sudo uv venv --python /run/current-system/sw/bin/python3.12 --python-preference only-system /var/lib/vllm/venv

Install vLLM with CUDA/PyTorch wheels selected by uv:

sudo env UV_TORCH_BACKEND=auto uv pip install --python /var/lib/vllm/venv/bin/python vllm

This is intentionally the only default install command. Do not install audio, JAX, TPU, or broad framework extras during workstation bring-up. In particular, avoid commands that reinstall torchvision, torchaudio, or jax unless a specific workload requires them and the host has enough memory to resolve, download, install, and import that dependency set. The default Dubnium vLLM path is text inference against a local model bundle.

The upstream vLLM GPU install docs recommend uv pip install vllm --torch-backend=auto so uv can select the PyTorch backend from the installed CUDA driver. If that flag is not supported by the installed uv, use the environment variable form above or update uv.

If the installed uv supports newer PyTorch backends, use a specific CUDA backend that matches the host driver. For CUDA 13.0:

sudo uv pip install --python /var/lib/vllm/venv/bin/python --torch-backend=cu130 vllm

Some packaged uv versions may not list cu130 yet. On those versions, keep the default install command above, or upgrade uv to a version that supports the host CUDA backend. Do not use a broad PyTorch-family reinstall as a workstation bring-up workaround; it can pull optional packages such as torchaudio and exceed available memory.

If PyTorch CUDA selection is wrong after the default install, recreate the venv and rerun the vLLM install with a supported UV_TORCH_BACKEND or --torch-backend value rather than layering more framework packages into the same environment.

Host config adds the venv’s PyTorch and NVIDIA wheel library directories to LD_LIBRARY_PATH. That is required because the external venv is outside the Nix store and vLLM’s CUDA extension must be able to find libtorch, libcudart, and the CUDA wheel libraries at runtime.
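To see which library directories exist in a given venv, a listing like the sketch below can help when debugging loader errors. It is not how the host config computes the value. venv_cuda_lib_path is a hypothetical helper, and the torch/lib and nvidia/<component>/lib layout is an assumption about how the wheels unpack; verify it against the real venv.

```shell
venv_cuda_lib_path() {
  # $1: venv root, for example /var/lib/vllm/venv
  # Assumes the usual wheel layout: <site-packages>/torch/lib and
  # <site-packages>/nvidia/<component>/lib.
  result=""
  for dir in \
    "$1"/lib/python*/site-packages/torch/lib \
    "$1"/lib/python*/site-packages/nvidia/*/lib
  do
    # Unmatched globs stay literal, so skip anything that is not a directory.
    [ -d "$dir" ] || continue
    result="${result:+$result:}$dir"
  done
  printf '%s\n' "$result"
}

# Example: venv_cuda_lib_path /var/lib/vllm/venv | tr ':' '\n'
```

Compare the output with the LD_LIBRARY_PATH entry printed by systemctl show vllm.service -p Environment --value.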

The service also sets CC to Nix’s C compiler wrapper. Triton may compile a small runtime helper during vLLM startup even when vLLM itself is installed in the external venv.

Keep dubnium.vllm.runtime = "package" available for the future pure-Nix phase, but do not use it for this external-runtime path.

3. Verify The Runtime

Check the executable:

/var/lib/vllm/venv/bin/vllm --version

Check CUDA through PyTorch:

/var/lib/vllm/venv/bin/python -c "import torch; print(torch.cuda.is_available())"

Expected:

True

If this prints False, fix the venv/PyTorch/CUDA wheel selection before debugging Dubnium’s systemd service.

4. Verify The Local Model Bundle

Dubnium keeps model weights out of Git and out of the Nix store. The vLLM service should point at a local model bundle.

MODEL_DIR=/var/lib/dubnium/models/qwen2.5-coder-14b-instruct

If the model bundle was seeded from removable media, verify that the local bundle exists:

test -f "$MODEL_DIR/config.json"
test -f "$MODEL_DIR/model.safetensors.index.json" || test -f "$MODEL_DIR/model.safetensors"

If SHA256SUMS exists, verify it:

cd "$MODEL_DIR"
sudo sha256sum -c SHA256SUMS

If vLLM tries to download model files on first start, the configured model path or local bundle is wrong.
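One way to catch a wrong model argument before the first start is to confirm that the unit's ExecStart names the local bundle. A minimal sketch; uses_local_model is a hypothetical helper, and it assumes the bundle path appears as its own space-separated argument in the ExecStart text.

```shell
uses_local_model() {
  # $1: ExecStart text, $2: expected local bundle directory
  # Pad with spaces so the path must match a whole argument.
  case " $1 " in
    *" $2 "*|*" $2/ "*) return 0 ;;
    *) return 1 ;;
  esac
}

# On the host:
#   uses_local_model "$(systemctl show vllm.service -p ExecStart --value)" "$MODEL_DIR"
```

A hub-style identifier such as OWNER/MODEL in place of the local path fails this check, which is the case that would trigger a download.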

5. Start The Service

Start compute mode or restart the service directly:

sudo systemctl start compute.target
sudo systemctl restart vllm.service

Inspect service state:

systemctl status vllm --no-pager
journalctl -u vllm -n 100 --no-pager
systemctl show vllm.service -p ExecStart --value
systemctl show vllm.service -p Environment --value

If /var/lib/vllm/venv/bin/vllm does not exist or is not executable, vllm.service should fail before startup with an executable check error. That means the NixOS service contract is present but the external runtime has not been installed yet.

6. Verify The API

From the Dubnium host:

getent hosts ai.dubnium
curl http://ai.dubnium:8000/v1/models

From another tailnet machine:

curl http://<dubnium-tailnet-name>:8000/v1/models

ai.dubnium is host-local unless the tailnet DNS or client hosts file also maps that name to the Dubnium node’s Tailscale IP.

References

vLLM GPU installation docs: https://docs.vllm.ai/en/latest/getting_started/installation/gpu/
Model seeding policy: ADR-0008
Tailscale exposure: Tailscale

Plano routing gateway

Dubnium owns the system/runtime side of Plano. User-level client configuration lives in ryjen/dotfiles through Home Manager modules.

Boundary

Dubnium
  systemd service lifecycle
  compute target integration
  vLLM/Ollama local model endpoint
  ai.slice placement
  runtime state under /var/lib and /var/cache

ryjen/dotfiles
  Home Manager user config
  ~/.config/planoai/dubnium.yaml
  ~/.config/model-router/profiles/local-first-dev.yaml
  shell environment and helper scripts

ryjen/model-router
  source policy schemas
  route-decision record semantics
  governance-oriented model-router design

Service model

The Plano workload module is defined at:

modules/workloads/plano.nix

It creates:

plano.service

When enabled, the service is attached to:

compute.target
ai.slice

It is intentionally disabled by default in hosts/workstation/default.nix.

Defaults

dubnium.plano = {
  enable = false;
  runtime = "external";
  externalExecutable = "/var/lib/plano/venv/bin/planoai";
  host = "127.0.0.1";
  port = 12000;
  localBaseUrl = "http://127.0.0.1:8000/v1";
  exposeOnTailscale = false;
};

The default local model endpoint assumes vllm.service is serving an OpenAI-compatible API on port 8000.

Enablement

Enable once the Plano executable exists:

dubnium.plano.enable = true;

For the current external runtime default, verify:

test -x /var/lib/plano/venv/bin/planoai

If Plano becomes available as a Nix package or overlay, switch to:

dubnium.plano = {
  enable = true;
  runtime = "package";
  package = pkgs.<plano-package>;
};

Validation

Build the workstation target without activating it:

sudo nixos-rebuild build --flake .#workstation

Then inspect the generated unit:

systemctl cat plano.service

When enabled and in compute mode:

sudo mode request compute
systemctl status vllm.service
systemctl status plano.service

Check the gateway endpoint:

curl http://127.0.0.1:12000

The exact health endpoint may differ depending on Plano’s runtime API.

Security notes

Keep exposeOnTailscale = false until the gateway behavior is validated
Do not store cloud provider secrets in the generated config
Prefer environment files managed by sops-nix or another host secret provider
Treat Plano as routing infrastructure, not an authorization layer
Privacy and route policy belong above the gateway in model-router/Anthesis semantics

Failure behavior

The service fails closed if the configured Plano executable is missing because ExecStartPre checks that the executable exists.
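The fail-closed guard amounts to a pre-start check like the sketch below. This illustrates the pattern and is not the unit's actual ExecStartPre command; require_executable is a hypothetical name.

```shell
require_executable() {
  # $1: path to the configured executable
  # Directories also pass -x, so reject them explicitly.
  if [ ! -x "$1" ] || [ -d "$1" ]; then
    printf 'executable check failed: %s\n' "$1" >&2
    return 1
  fi
}

# Example: require_executable /var/lib/plano/venv/bin/planoai
```

A non-zero status from the pre-start step stops the unit before the main process is launched.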

Fallback between models must not bypass privacy, budget, safety, or approval failures. Those are policy failures, not operational retry events.
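The split between policy failures and operational failures can be sketched as a classifier. The kind names below are illustrative labels, not Plano or model-router identifiers, and is_retryable_failure is a hypothetical function.

```shell
is_retryable_failure() {
  # $1: failure kind. The kind names are illustrative labels only.
  case $1 in
    # Policy failures: never fall back to another model.
    privacy|budget|safety|approval) return 1 ;;
    # Operational failures: fallback is allowed.
    timeout|connection|overload) return 0 ;;
    # Unknown kinds fail closed.
    *) return 1 ;;
  esac
}
```

Treating unknown kinds as non-retryable keeps a new policy category from being retried by accident.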

Persistent Context Memory Architecture

Status: planning

This document describes the long-term persistent context memory architecture for Dubnium’s local vLLM runtime.

Goals

The architecture should:

support long-lived conversational and agentic workflows
preserve low-latency vLLM inference characteristics
separate inference runtime concerns from memory persistence
expose enough structure for replay, audit, and policy enforcement
operate efficiently on constrained local GPU hardware
leave room for Anthesis-style governed agent systems

Future Governance Boundary

A future governance layer remains external to this memory/runtime architecture.

The memory/runtime layer stores, retrieves, summarizes, compacts, and serves context. It records structured metadata and lifecycle events so another layer can inspect, constrain, attest, or replay behavior later.

The future governance layer evaluates policy, provenance, trust, retention, audit, and replay concerns. This document does not define that governance authority.

Dubnium memory/runtime layer
    = stores, retrieves, summarizes, compacts, and serves context

Future governance layer
    = evaluates policy, provenance, trust, retention, audit, and replay concerns

Design implication: memory records, artifacts, retrieval events, and runtime transitions must be structured and externally observable, but vLLM, vector stores, artifact stores, and MemGPT-style runtimes must not depend directly on a future governance substrate.

Core Principle

vLLM is the inference runtime.

Persistent memory is a separate subsystem.

Do not persist transformer KV state as durable memory. KV state can remain an inference optimization inside vLLM. Durable memory must be reconstructable from stored events, summaries, artifacts, metadata, and retrieval records.

flowchart TD
    U[User or Agent] --> O[Orchestrator]
    O --> W[Working Context Buffer]
    O --> R[Retriever]
    O --> T[Task State Store]
    R --> V[(Vector Store)]
    R --> M[(Structured Memory Store)]
    O --> L[vLLM]
    L --> S[Summarizer]
    S --> E[Embedding Pipeline]
    E --> V
    S --> M

Layers

Inference

Responsibilities:

token generation
batching
prefix caching
streaming
model lifecycle management

Recommended components:

Component	Recommendation
Inference runtime	vLLM
Primary models	Qwen, DeepSeek, Llama-family
Embeddings	bge-small or nomic-embed
Quantization	AWQ or GPTQ initially

Inference nodes should remain stateless where possible. Durable memory logic does not belong inside inference workers.

Working Context

Working context maintains immediate conversational and task continuity.

It contains recent messages, tool outputs, current objectives, active plans, and unresolved references.

Storage options:

Option	Use
Redis	fast transient sessions
SQLite	single-user local setups
Postgres	unified durable stack

Recommended strategy:

keep the last N conversational turns verbatim
keep a rolling summary for older turns
keep external references outside the prompt
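The recommended strategy can be sketched as a small buffer. This is a minimal illustration, not the Dubnium implementation: `WorkingContext`, `max_turns`, and the placeholder `summarize` function are assumed names, and a real summarizer would call a model instead of truncating text.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WorkingContext:
    """Keeps the last `max_turns` turns verbatim and folds older turns into a summary."""

    max_turns: int = 4
    rolling_summary: str = ""
    turns: deque = field(default_factory=deque)

    def add_turn(self, role: str, text: str) -> None:
        self.turns.append((role, text))
        # Evict the oldest turns once the verbatim window is full.
        while len(self.turns) > self.max_turns:
            old_role, old_text = self.turns.popleft()
            self.rolling_summary = summarize(self.rolling_summary, old_role, old_text)

    def render(self) -> str:
        """Return the summary plus verbatim turns; external references stay out."""
        lines = []
        if self.rolling_summary:
            lines.append(f"[summary] {self.rolling_summary}")
        lines.extend(f"{role}: {text}" for role, text in self.turns)
        return "\n".join(lines)


def summarize(previous: str, role: str, text: str, limit: int = 200) -> str:
    """Placeholder summarizer (assumption): appends a truncated note to the summary."""
    note = f"{role} said: {text[:40]}"
    combined = f"{previous}; {note}" if previous else note
    return combined[-limit:]
```

Bounding both the verbatim window and the summary length keeps the prompt contribution of working context predictable on constrained hardware.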

Episodic Memory

Episodic memory stores meaningful historical interactions, such as debugging sessions, deployment history, design discussions, incidents, and user preferences.

Example shape:

{
  "id": "uuid",
  "timestamp": "ISO8601",
  "session_id": "uuid",
  "memory_type": "episodic",
  "summary": "Condensed interaction summary",
  "importance": 0.82,
  "ttl": null,
  "source": "conversation",
  "provenance": {
    "model": "qwen",
    "extractor_version": "1"
  }
}

Semantic Memory

Semantic memory stores normalized stable facts and reusable knowledge: infrastructure topology, user preferences, architecture decisions, project conventions, and coding standards.

Semantic memory is not raw transcript storage.

Instead of storing “user mentioned NixOS several times”, store:

{
  "fact": "Primary workstation uses NixOS",
  "confidence": 0.94,
  "scope": "personal-preference"
}

Task State

Task state is active execution state, not conversational memory.

Examples:

queued work
workflow checkpoints
active RFC generation
agent plans
unresolved actions
execution graphs

Task state should be strongly structured. Do not embed executable workflow state inside vector stores.

Component	Recommendation
Structured store	Postgres
Queueing	RabbitMQ or Redis Streams
Workflow engine	Temporal later

Retrieval

Retrieval responsibilities:

semantic search
scoped retrieval
ranking
filtering
relevance compression

flowchart LR
    Q[Query] --> E[Embed Query]
    E --> S[Vector Search]
    S --> R[Re-ranker]
    R --> C[Context Builder]

Retrieval constraints:

Constraint	Example
Session scope	only current project
TTL	exclude expired memories
Agent boundary	isolate agents
Recency weighting	prioritize recent events

The orchestrator constrains retrieval scope and memory assembly. Future governance can inspect the retrieval event stream and stored metadata, but the retriever must remain useful without embedding a governance engine.

Minimal Stack

Concern	Technology
Inference	vLLM
Structured data	Postgres
Vector search	pgvector
Session cache	Redis
Object storage	local filesystem first, MinIO later
Queueing	Redis Streams first, RabbitMQ later

Artifact And Binary Memory

Artifacts and memory are distinct concepts.

Concept	Meaning
Memory	semantic or cognitive abstraction
Artifact	raw external object
Evidence	immutable referenced source
Context	transient prompt state
Knowledge	validated normalized facts

Raw binaries should not be first-class prompt memory. Instead, binaries remain externalized; semantic extraction feeds retrieval systems; agents retrieve references and derived context; and multimodal inference runs only on demand.

Initial artifact types:

Type	Examples
Images	screenshots, whiteboards, diagrams
Documents	PDFs, Office docs
Audio	recordings, meetings
Video	demos, walkthroughs
Source bundles	archives, repos
Logs	runtime and system logs
Structured data	CSV, JSON, YAML

flowchart TD
    A[Artifact Upload] --> B[Object Storage]
    A --> C[Extraction Pipeline]
    C --> D[OCR]
    C --> E[Captioning]
    C --> F[Metadata Extraction]
    C --> G[Embedding Generation]
    D --> H[Semantic Records]
    E --> H
    F --> H
    G --> H
    H --> I[(Vector Store)]
    H --> J[(Structured Metadata Store)]

Artifact metadata should include content hashes, storage URIs, MIME type, derived captions or OCR, embedding references, provenance, trust hints, and sensitivity hints.
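A metadata record along these lines could be built as follows. This is a sketch under stated assumptions: `describe_artifact`, the `storage_root` default, and the field names are hypothetical, and derived captions, OCR text, and embedding references are left empty for later pipeline stages to fill.

```python
import hashlib
import mimetypes


def describe_artifact(filename: str, data: bytes,
                      storage_root: str = "file:///var/lib/dubnium/artifacts") -> dict:
    """Build a metadata record; the raw bytes are addressed by hash, not embedded."""
    sha256 = hashlib.sha256(data).hexdigest()
    # guess_type returns (type, encoding); type is None for unknown names.
    mime_type, _ = mimetypes.guess_type(filename)
    return {
        "sha256": sha256,
        "storage_uri": f"{storage_root}/{sha256}",  # assumed content-addressed layout
        "mime_type": mime_type or "application/octet-stream",
        "size_bytes": len(data),
        "derived": {"caption": None, "ocr_text": None},
        "embedding_ref": None,
        "provenance": {"origin": "upload"},
        "trust_hint": "unverified",
        "sensitivity_hint": "internal",
    }
```

Addressing the stored object by its content hash gives deduplication for free and lets a later layer verify that the bytes behind a reference have not changed.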

Binary artifacts create operational risk: screenshots can contain credentials, EXIF metadata can leak location, visual data can be sensitive, retrieved artifacts can amplify exposure, and malicious files can poison extraction pipelines. Controls for these risks belong in the external governance/security layer, but the memory layer must expose enough metadata and hooks for them.

Multimodal Retrieval

For normal text prompts, retrieve captions, OCR text, semantic embeddings, metadata, and artifact references rather than injecting raw binaries.

When multimodal reasoning is required:

Semantic retrieval locates relevant artifacts.
Artifact references are resolved.
Binaries are attached to VLM requests.
Multimodal inference runs on demand.

Candidate model classes:

Model	Purpose
Qwen-VL	local multimodal reasoning
CLIP or SigLIP	image-text embeddings
Whisper	audio transcription
OCR pipelines	document extraction

OCI-Compatible Future

Dubnium should stay compatible with OCI-style cognition and artifact distribution.

OCI registries are a strong long-term fit for content addressing, distribution, deduplication, signing, provenance layering, immutable references, artifact versioning, and registry federation.

Candidate future artifact classes:

Artifact class	Example
Model artifacts	GGUF, safetensors
Embedding indexes	vector snapshots
Prompt bundles	governed prompts and system policies
Memory bundles	exported episodic memory sets
Workflow definitions	agent workflows
Execution traces	replayable sessions
Multimodal artifacts	image, document, and audio evidence
Tool contracts	MCP capability manifests

Long-term direction:

OCI artifact
    = versioned governed cognition object

This allows Dubnium to evolve toward replayable cognition, portable agent state, attestable workflows, signed memory exports, reproducible multimodal sessions, and distributed cognition registries without coupling cognition storage to one database implementation.

MemGPT-Style Runtime Evolution

MemGPT-style runtimes remain an incremental upgrade path after the persistent memory substrate is stable. Current Letta documentation describes this lineage as agents with in-context core memory, recall memory, archival memory, and self-editing memory tools.

Do not couple Dubnium directly to Letta or MemGPT internals early. Define stable interfaces first:

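from typing import Any, Protocol


class MemoryRuntime(Protocol):
    """Interface sketch only; argument and return shapes are deliberately left open."""

    def retrieve(self, *args: Any, **kwargs: Any) -> Any: ...
    def summarize(self, *args: Any, **kwargs: Any) -> Any: ...
    def compact(self, *args: Any, **kwargs: Any) -> Any: ...
    def promote(self, *args: Any, **kwargs: Any) -> Any: ...
    def classify(self, *args: Any, **kwargs: Any) -> Any: ...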

Evolution path:

Phase	Capability
1	governed retrieval with explicit schemas
2	rolling summaries, compaction, and bounded working context
3	reflection, summarization loops, memory promotion, relevance scoring
4	adaptive retrieval, workflow-aware recall, retrieval planning
5	portable cognitive runtime artifacts and OCI-packaged memory overlays

Preserve the distinction between runtime cognition and durable external state. MemGPT-style runtimes should remain replaceable, capability-scoped, inspectable, and externally configurable.

Phases

Phase 1: Minimal Viable Memory

Deliver durable conversation storage, semantic retrieval, basic summarization, Postgres plus pgvector, an embedding pipeline, retrieval API, and rolling conversation summaries.

Phase 2: Structured Memory

Deliver episodic and semantic separation, retrieval filtering, scoped namespaces, metadata tagging, and confidence scoring.

Phase 3: Multi-Agent Coordination

Deliver isolated agent memory, shared collaborative memory, workflow continuity, capability-scoped retrieval, memory federation, execution checkpoints, and task orchestration.

Non-Goals

Avoid initially:

serialized GPU KV persistence
distributed GPU cache coherence
infinite-context simulation
recurrent-memory transformer experimentation
fully autonomous self-modifying memory

These add substantial complexity and operational instability.

First Milestone

Build a local prototype with:

vLLM
Qwen coder model
Postgres
pgvector
Redis
bge-small embeddings
retrieval middleware
rolling summaries

Then validate latency, retrieval quality, memory drift, and hallucinated recall before expanding into multi-agent memory systems.

Runbook: vLLM Persistent Memory Prototype

Status: planning

Use this when designing or validating a Dubnium memory subsystem around the local vLLM runtime.

vLLM owns inference. The memory subsystem owns persistence, retrieval, summarization, compaction, artifact references, and replay inputs. Do not make durable memory depend on serialized transformer KV state.

Scope

This runbook covers the first prototype milestone:

durable conversation and event storage
rolling summaries
embeddings for retrieval
scoped retrieval
externally observable metadata on every stored memory
bounded prompt assembly for vLLM

It does not cover multi-agent federation, distributed workflow engines, cryptographic memory attestation, or a pure-Nix packaging path for all services. It also does not adopt Letta or another MemGPT-style agent framework in the first milestone; those belong after the local storage, retrieval, and governance contracts are proven.

Future governance remains external to this runbook. The prototype records metadata and lifecycle events so a later governance substrate can inspect, constrain, attest, or replay behavior, but the prototype does not implement the governance authority itself.

Target Shape

flowchart TD
    U[User or Agent] --> O[Orchestrator]
    O --> W[Working Context]
    O --> R[Retriever]
    O --> T[Task State]
    R --> V[(pgvector)]
    R --> M[(Postgres Memory Tables)]
    O --> L[vLLM]
    L --> S[Summarizer]
    S --> E[Embedding Worker]
    E --> V
    S --> M

Prototype Components

Use conservative local services first:

Concern	Prototype choice
Inference	existing `vllm.service`
Structured store	Postgres
Vector search	pgvector
Working context	Redis or Postgres
Queueing	Redis Streams initially
Object storage	local filesystem first, MinIO later
Embeddings	bge-small or nomic-embed

Keep large artifacts outside prompt assembly. Store references to files, logs, and generated outputs, then retrieve and compress only the relevant excerpts.

Data Classes

Working context is transient session state: recent messages, current objective, active plan, unresolved references, and recent tool outputs.

Episodic memory records meaningful historical interactions, such as debugging sessions, deployment history, design discussions, and operational incidents.

Semantic memory records normalized facts, preferences, project conventions, infrastructure topology, and architecture decisions. Do not treat raw transcripts as semantic memory.

Task state records active workflow state: queued work, checkpoints, execution graphs, pending validations, and unresolved actions.

Metadata records where a memory came from, how trusted it appears, how sensitive it appears, how long it should live, and which scopes may retrieve it. A later governance layer can evaluate that metadata, but the Phase 1 memory service only records and exposes it.

Minimum Schema Direction

The first schema should keep memory objects and embeddings separate so memory metadata can evolve without rewriting vector payloads.

Suggested tables:

sessions
memories
memory_embeddings
tasks
artifacts
provenance

Each memory row should include:

{
  "id": "uuid",
  "session_id": "uuid",
  "memory_type": "episodic",
  "summary": "Condensed interaction summary",
  "scope": "project:dubnium",
  "importance": 0.82,
  "confidence": 0.76,
  "sensitivity": "internal",
  "validation_status": "unverified",
  "ttl": null,
  "source": "conversation",
  "created_at": "ISO8601",
  "provenance": {
    "origin": "agent",
    "model": "qwen",
    "extractor_version": "1"
  }
}

Retrieval Contract

The retriever should take a scoped request from the orchestrator and return scoped context candidates, not final prompts.

Required filters:

project or session scope
agent namespace
TTL expiration
recency

Recommended ranking inputs:

vector similarity
keyword match
recency
importance
source authority
validation status

The context builder should compress results before prompt assembly and preserve citations, artifact references, retrieval event ids, or memory ids so a response can be audited later.

Storage Path

Capture a conversation, tool event, task event, or artifact reference.
Classify the event and reject data that should not become durable memory.
Redact secrets and sensitive payloads.
Summarize the event into a typed memory candidate.
Attach provenance, sensitivity, scope, confidence, and retention metadata.
Embed the memory summary.
Store structured memory and vector data.
Schedule expiration or revalidation when retention metadata requires it.
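The classification, redaction, summarization, and metadata steps above can be sketched as one function. Everything here is illustrative: the regex patterns are a deliberately small sample rather than a complete redaction rule set, the `heartbeat` event kind is a hypothetical example of data that should not be persisted, and truncation stands in for a real summarizer. Embedding, storage, and expiry scheduling are omitted.

```python
import re
import uuid
from datetime import datetime, timezone
from typing import Optional

# Illustrative patterns only; a real redactor needs a broader, tested rule set.
SECRET_PATTERNS = [
    re.compile(r"(?i)\b(password|token|api[_-]?key|secret)\s*[=:]\s*\S+"),
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
]


def redact(text: str) -> str:
    """Replace secret-looking spans with a fixed marker."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


def to_memory_candidate(event: dict, scope: str) -> Optional[dict]:
    """Classify, redact, summarize, and tag one event; returns None when rejected."""
    text = event.get("text", "").strip()
    if not text or event.get("kind") == "heartbeat":
        return None  # Rejected: nothing worth keeping as durable memory.
    clean = redact(text)
    return {
        "id": str(uuid.uuid4()),
        "memory_type": "episodic",
        "summary": clean[:8000],  # Truncation stands in for a summarizer.
        "scope": scope,
        "sensitivity": "internal",
        "confidence": 0.0,
        "validation_status": "unverified",
        "ttl": None,
        "source": event.get("kind", "conversation"),
        "created_at": datetime.now(timezone.utc).isoformat(),
        "provenance": {"origin": event.get("origin", "agent")},
    }
```

Redacting before summarizing means the summarizer never sees the secret, so it cannot paraphrase it into the stored summary.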

Retrieval Path

Receive a query and current task scope from the orchestrator.
Embed the query.
Search the vector index, applying any structured filters.
Apply scope, TTL, and sensitivity filters before re-ranking.
Re-rank by relevance, recency, importance, and source hints.
Compress selected context.
Return context candidates with ids, scope, and provenance.
Assemble the final vLLM prompt outside the retriever.
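A filter-then-rank pass over already-embedded memories might look like the sketch below. The score weights (0.7, 0.2, 0.1), the 30-day recency decay, the 500-character compression cut, and the in-memory list of dicts are all assumptions for illustration; a real retriever would push the vector search into pgvector.

```python
import math
from datetime import datetime, timezone


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_candidates(query_vec, memories, scope, allowed_sensitivity, now, limit=8):
    """Filter first, then re-rank; returns candidates, never a prompt."""
    visible = [
        m for m in memories
        if m["scope"] == scope
        and m["sensitivity"] in allowed_sensitivity
        and (m["ttl"] is None or m["ttl"] > now)
    ]

    def score(m):
        age_days = (now - m["created_at"]).total_seconds() / 86400
        recency = math.exp(-age_days / 30)  # assumed 30-day decay constant
        return 0.7 * cosine(query_vec, m["vector"]) + 0.2 * m["importance"] + 0.1 * recency

    ranked = sorted(visible, key=score, reverse=True)[:limit]
    # Compress by truncation and keep ids and provenance for later audit.
    return [
        {"id": m["id"], "summary": m["summary"][:500], "scope": m["scope"],
         "provenance": m["provenance"]}
        for m in ranked
    ]
```

Filtering before ranking guarantees that an out-of-scope or expired memory can never win on similarity alone.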

Validation Checks

Before treating the prototype as useful, test:

latency impact on vLLM request path
recall quality for prior sessions
false recall and hallucinated-memory rate
memory poisoning resistance
prompt-injection persistence resistance
cross-project and cross-agent isolation
secret redaction before storage
TTL expiration and revalidation behavior
replay from stored events and memory ids

Acceptance Criteria

The first milestone is complete when:

vLLM can answer with retrieved context without changing vllm.service
memory storage survives service restart
retrieval can be scoped to one project
expired or sensitive memories are excluded from prompt assembly
summaries can be traced back to source events or artifacts
a replay can reconstruct which memories were available to a response

Artifact Handling

Artifacts are not memory. Store raw binaries outside prompts and retrieve derived context by default:

captions
OCR text
extracted metadata
embeddings
content hashes
artifact references

Use on-demand multimodal inference only when a task needs the binary itself. The retrieval result should carry an artifact reference rather than copying the artifact into ordinary text prompt memory.

Incremental Upgrade: MemGPT / Letta

After the Phase 1 substrate is stable, evaluate MemGPT-style self-editing memory as an orchestration-layer upgrade. Use current Letta documentation when testing concrete framework integration; reserve “MemGPT” for the research pattern unless a legacy component explicitly uses that name.

The evaluation should answer:

whether Letta can use Dubnium’s Postgres/pgvector-backed memory stores without bypassing scope, sensitivity, TTL, validation, or provenance filters
whether agent-managed memory edits can be audited and replayed
whether archival and recall memory operations can preserve Dubnium memory ids and source lineage
whether the framework can call local vLLM without requiring model-hosted memory persistence
whether rejected, expired, or sensitive memories stay out of generated prompts

Do not adopt the framework if it requires storing ungoverned transcripts, credentials, or tool outputs in durable memory.

References

ADR-0010
Persistent Context Memory Architecture
ADR-0003
ADR-0008
ADR-0009
vLLM Runtime

Runbook: Memory Service

Status: prototype

Use this after explicitly setting dubnium.memory.enable = true for the workstation host. The memory service is intentionally opt-in during Phase 1 so first bring-up does not automatically start additional persistent services.

The memory service is the local persistent context substrate for Dubnium. It does not govern agent behavior by itself. Anthesis or another orchestrator should authorize retrieval, inspect provenance, and decide whether retrieved memory may be injected into an agent prompt.

Service Boundary

Anthesis / orchestrator
  -> Dubnium memory API
  -> Postgres + pgvector
  -> Redis working context / queue substrate
  -> vLLM prompt assembly outside the memory service

The API must remain bound to 127.0.0.1 for the Phase 1 prototype.

Service Impact

Enabling dubnium.memory starts additional local services:

postgresql.service
redis-dubnium-memory.service
dubnium-memory-api.service

It also runs packaged memory-service migrations before the API starts. Validate the package and module evaluation before enabling this on the bare-metal workstation target.

Enable Locally

The default workstation target keeps the memory service disabled. Enable it through a host-local override such as hosts/workstation/user.nix:

{
  dubnium.memory = {
    enable = true;
    api.host = "127.0.0.1";
    api.port = 8090;
    retention.defaultTtlDays = null;
  };
}

Then build before switching:

nix --extra-experimental-features "nix-command flakes" build .#memory-service
sudo nixos-rebuild build --flake .#workstation

Verify Disabled Default

Without a host-local override, the workstation target should keep the prototype disabled:

nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.dubnium.memory.enable

Expected:

false

Verify Enabled Configuration

After enabling through hosts/workstation/user.nix, verify:

nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.dubnium.memory.enable
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.dubnium.memory.api.host
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.services.postgresql.enable
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.services.redis.servers.dubnium-memory.enable

Expected:

true
"127.0.0.1"
true
true

Verify Services

After switching an enabled configuration:

systemctl status postgresql
systemctl status redis-dubnium-memory
systemctl status dubnium-memory-api
ai-memory health

Expected health response:

{
  "status": "ok"
}

Raw HTTP is also available for debugging:

curl http://127.0.0.1:8090/healthz

Scope Convention

Use explicit scope prefixes for new memory rows:

personal:
project:
session:
agent:
workflow:

Examples:

project:dubnium
session:11111111-1111-4111-8111-111111111111
agent:anthesis-reviewer
workflow:memory-phase-2

The current implementation provides advisory scope helpers. Full runtime enforcement is intentionally deferred until existing callers and examples are migrated.
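An advisory helper for this convention could look like the following. It is a hypothetical sketch, not the helper shipped in the memory service; `parse_scope` and `check_scope` are assumed names, and requiring a non-empty name after the prefix is an assumption.

```python
SCOPE_PREFIXES = ("personal", "project", "session", "agent", "workflow")


def parse_scope(scope: str) -> tuple[str, str]:
    """Split 'prefix:name' and check the prefix against the documented list."""
    prefix, sep, name = scope.partition(":")
    if not sep or prefix not in SCOPE_PREFIXES or not name:
        raise ValueError(f"unrecognized scope: {scope!r}")
    return prefix, name


def check_scope(scope: str) -> list[str]:
    """Advisory check: return warnings instead of raising."""
    try:
        parse_scope(scope)
    except ValueError as exc:
        return [str(exc)]
    return []
```

Returning warnings rather than raising matches the advisory stance: callers can log the problem without rejecting rows from code that has not been migrated yet.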

CLI Smoke Test

Store one memory:

ai-memory store --file docs/examples/memory-store-request.json

Retrieve scoped memory:

ai-memory retrieve \
  --query "What is Dubnium memory for?" \
  --scope project:dubnium \
  --require-verified \
  --purpose review \
  --actor-type agent \
  --actor-id anthesis-reviewer \
  --envelope-id env-manual-smoke-test

Inspect retrieval events:

ai-memory events

Expire old memories:

ai-memory expire --now 2026-05-28T00:00:00Z

Pass the API URL explicitly when needed:

ai-memory --url http://127.0.0.1:8090 health

API Smoke Test

The CLI is preferred for operator use. Raw HTTP examples are kept for debugging and automation parity.

Store one memory:

curl -sS http://127.0.0.1:8090/memory/store \
  -H 'Content-Type: application/json' \
  -d @docs/examples/memory-store-request.json

Retrieve scoped memory:

curl -sS http://127.0.0.1:8090/memory/retrieve \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "What is Dubnium memory for?",
    "scope": "project:dubnium",
    "allowed_sensitivity": ["internal"],
    "require_verified": true,
    "limit": 8
  }'
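For automation, the same retrieve call can be made from Python with only the standard library. This is a sketch: the function names are assumptions, the base URL is the address used in this runbook, and the response is assumed to be a JSON object whose exact shape is not specified here.

```python
import json
import urllib.request


def build_retrieve_request(query: str, scope: str,
                           base_url: str = "http://127.0.0.1:8090") -> urllib.request.Request:
    """Build the POST request without sending it, so it can be inspected or logged."""
    body = {
        "query": query,
        "scope": scope,
        "allowed_sensitivity": ["internal"],
        "require_verified": True,
        "limit": 8,
    }
    return urllib.request.Request(
        f"{base_url}/memory/retrieve",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def retrieve(query: str, scope: str) -> dict:
    """Send the request; requires the memory API to be running locally."""
    with urllib.request.urlopen(build_retrieve_request(query, scope), timeout=10) as resp:
        return json.load(resp)
```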

Inspect retrieval events:

curl -sS http://127.0.0.1:8090/memory/retrieval-events

Retrieval Behavior

Normal retrieval excludes memory when:

scope does not match the request
sensitivity is not explicitly allowed
require_verified is true and memory is not verified
memory is expired by TTL
memory has validation_status = rejected

Rejected memory is excluded even when require_verified = false. Audit retrieval of rejected memory is future work and should use a separate endpoint or explicit audit mode.
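The exclusion rules above reduce to a single predicate. The sketch below assumes that scope matching is exact string equality and that memories and requests are plain dicts; the service's actual matching logic may differ.

```python
from datetime import datetime, timezone


def is_retrievable(memory: dict, request: dict, now: datetime) -> bool:
    """Apply the five exclusion rules for normal retrieval."""
    if memory["scope"] != request["scope"]:
        return False
    if memory["sensitivity"] not in request.get("allowed_sensitivity", ["internal"]):
        return False
    if memory["validation_status"] == "rejected":
        return False  # Excluded regardless of require_verified.
    if request.get("require_verified", False) and memory["validation_status"] != "verified":
        return False
    ttl = memory.get("ttl")
    if ttl is not None and ttl <= now:
        return False  # Expired by TTL.
    return True
```

Checking `rejected` before `require_verified` makes the rule order explicit: a rejected memory never passes, whatever the request asks for.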

Security Checks

API binds to 127.0.0.1
raw vLLM remains separate from durable memory
memory rows include scope, sensitivity, validation status, source, and provenance
expired memories are excluded from retrieval
rejected memories are excluded from normal retrieval
sensitive memories are excluded unless explicitly allowed
retrieval events record returned memory ids and artifact ids
logs must not contain raw token-like values
prompt assembly must happen outside the memory service

Anthesis Governance Hook

Phase 1 does not implement Anthesis directly. The intended integration contract is:

Anthesis classifies the task and authorizes retrieval scope
Anthesis calls /memory/retrieve with explicit scope, allowed_sensitivity, and require_verified
The memory service returns memories plus a retrieval event id
Anthesis records the retrieval event, memory ids, provider decision, and prompt assembly in an execution envelope
Anthesis decides whether retrieved memory may enter the model context

Memory may inform an agent, but governance decides whether it is allowed to do so.

Troubleshooting

journalctl -u dubnium-memory-api -b
journalctl -u postgresql -b
journalctl -u redis-dubnium-memory -b

Common failure buckets:

database role or socket mismatch
pgvector extension unavailable for the selected Postgres package
migration failure
API accidentally bound to a non-local address
malformed JSON payload
scope mismatch during retrieval

Validation Before Merge

git diff --check
nix --extra-experimental-features "nix-command flakes" build .#memory-service
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.dubnium.memory.enable
pytest pkgs/memory-service/tests

Default workstation expectation before opt-in:

false

If the full workstation build fails on host-specific hardware configuration, report that separately from the memory service package/module validation.

Memory Data Model Specification

Status: draft

This document is the canonical data model and requirements specification for the Dubnium memory service prototype. It reconciles the architecture direction, API/domain models, and current Postgres migration.

Implementation references:

pkgs/memory-service/src/dubnium_memory/models.py
pkgs/memory-service/src/dubnium_memory/embeddings.py
pkgs/memory-service/src/dubnium_memory/migrations/001_initial.sql
pkgs/memory-service/src/dubnium_memory/migrations/002_pgvector_embeddings.sql

Goals

The data model must support:

durable episodic, semantic, and working memory records
scoped retrieval for projects, sessions, and agents
externalized artifacts and evidence references
retrieval event capture for audit and replay
metadata needed by a future external governance layer
local Postgres and pgvector evolution without coupling to vLLM internals

Non-Goals

The data model does not define:

transformer KV-cache persistence
prompt assembly format
future governance authority behavior
autonomous memory mutation rules
object storage implementation details
a Letta or MemGPT internal schema

Trust Boundary

All stored content is untrusted when it enters the system and when it is retrieved later. This includes user input, agent output, model-generated summaries, tool output, artifact-derived text, and database rows.

Boundary requirements:

validate API payloads before constructing domain objects
redact secret-like values before persistence
use parameterized SQL for all request-derived values
keep secrets out of logs and durable memory summaries
store enough provenance, validation, sensitivity, scope, and TTL metadata for external policy systems to inspect later
return retrieval candidates and identifiers, not assembled prompts

Domain Objects

Memory

Memory is a normalized semantic or episodic record. It is not raw transcript storage and should not contain binary artifact data.

Required fields:

Field	Type	Requirement
`id`	UUID	Stable identifier generated before persistence
`memory_type`	enum	One of `working`, `episodic`, `semantic`
`summary`	string	Non-empty, max 8000 chars, redacted before persistence
`scope`	string	Non-empty, max 256 chars
`source`	string	Non-empty source label, max 128 chars
`provenance`	object	JSON object, empty object allowed

Optional or defaulted fields:

Field	Type	Default	Requirement
`session_id`	UUID or null	null	References `sessions.id` when durable
`importance`	float	0.0	Range 0.0 to 1.0
`confidence`	float	0.0	Range 0.0 to 1.0
`sensitivity`	string	`internal`	Non-empty, max 64 chars
`validation_status`	enum	`unverified`	One of `unverified`, `verified`, `rejected`
`ttl`	timestamp or null	null	Expired records are excluded from retrieval and may be removed
`artifact_refs`	list	empty	Each artifact scope must match memory scope

Durable table: memories.

Current gap: artifact refs are represented in domain/API objects but are not yet persisted as a relationship table.
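The field requirements in the two tables above can be expressed as a validating dataclass. This is an illustrative sketch, not the contents of `models.py`; `artifact_refs` is omitted for brevity, and the error messages are assumptions.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

MEMORY_TYPES = {"working", "episodic", "semantic"}
VALIDATION_STATUSES = {"unverified", "verified", "rejected"}


@dataclass
class Memory:
    memory_type: str
    summary: str
    scope: str
    source: str
    provenance: dict = field(default_factory=dict)
    id: uuid.UUID = field(default_factory=uuid.uuid4)
    session_id: Optional[uuid.UUID] = None
    importance: float = 0.0
    confidence: float = 0.0
    sensitivity: str = "internal"
    validation_status: str = "unverified"
    ttl: Optional[datetime] = None

    def __post_init__(self) -> None:
        if self.memory_type not in MEMORY_TYPES:
            raise ValueError("memory_type must be working, episodic, or semantic")
        if self.validation_status not in VALIDATION_STATUSES:
            raise ValueError("validation_status must be unverified, verified, or rejected")
        # String fields share the same rule: non-empty, bounded length.
        for name, limit in (("summary", 8000), ("scope", 256),
                            ("source", 128), ("sensitivity", 64)):
            value = getattr(self, name)
            if not value or len(value) > limit:
                raise ValueError(f"{name} must be non-empty and at most {limit} chars")
        for name in ("importance", "confidence"):
            if not 0.0 <= getattr(self, name) <= 1.0:
                raise ValueError(f"{name} must be between 0.0 and 1.0")
```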

Retrieved Memory

Retrieved memory is a context candidate returned by retrieval. It must contain only the fields needed by callers to decide whether and how to assemble context.

Fields:

id
summary
scope
sensitivity
validation_status
provenance
artifact_refs

Retrieval responses must not construct prompts. Prompt assembly remains outside the memory service.

Retrieve Request

Retrieve requests define caller intent and visibility constraints.

Fields:

Field	Type	Default	Requirement
`query`	string	none	Non-empty, max 4000 chars
`scope`	string	none	Non-empty, max 256 chars
`allowed_sensitivity`	string list	`["internal"]`	Must not be empty
`require_verified`	bool	false	Filters to verified memories when true
`limit`	int	8	Range 1 to 32
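Applying these defaults and bounds to a decoded JSON payload might look like the sketch below. `parse_retrieve_request` is a hypothetical name and the error messages are assumptions; only the field names, defaults, and limits come from the table.

```python
def parse_retrieve_request(payload: dict) -> dict:
    """Apply the documented defaults and bounds to a decoded JSON payload."""
    query = payload.get("query")
    scope = payload.get("scope")
    if not isinstance(query, str) or not 0 < len(query) <= 4000:
        raise ValueError("query must be a non-empty string of at most 4000 chars")
    if not isinstance(scope, str) or not 0 < len(scope) <= 256:
        raise ValueError("scope must be a non-empty string of at most 256 chars")
    allowed = payload.get("allowed_sensitivity", ["internal"])
    if (not isinstance(allowed, list) or not allowed
            or not all(isinstance(item, str) for item in allowed)):
        raise ValueError("allowed_sensitivity must be a non-empty list of strings")
    require_verified = payload.get("require_verified", False)
    if not isinstance(require_verified, bool):
        raise ValueError("require_verified must be a boolean")
    limit = payload.get("limit", 8)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= 32:
        raise ValueError("limit must be an integer from 1 to 32")
    return {
        "query": query,
        "scope": scope,
        "allowed_sensitivity": allowed,
        "require_verified": require_verified,
        "limit": limit,
    }
```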

Retrieval Event

Retrieval events record what was available to a caller at retrieval time.

Fields:

Field	Type	Requirement
`id`	UUID	Generated for each retrieval
`scope`	string	Request scope
`query`	string	Request query
`returned_memory_ids`	UUID list	Ordered returned memory ids
`returned_artifact_ids`	UUID list	Artifact ids referenced by returned memories
`created_at`	timestamp	Durable database timestamp

Durable table: retrieval_events.

Replay requirements:

preserve returned memory ids
preserve returned artifact ids
preserve query and scope
preserve timestamp
later replay surfaces should reconstruct candidate availability from these identifiers and persisted records
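A replay surface could reconstruct candidate availability from a stored event roughly as follows. This is a sketch under assumptions: `replay_candidates` is a hypothetical name, and persisted records are modelled as a dict keyed by memory id.

```python
def replay_candidates(event: dict, memories_by_id: dict) -> dict:
    """Reconstruct which memories a retrieval event made available, in returned order."""
    found, missing = [], []
    for memory_id in event["returned_memory_ids"]:
        record = memories_by_id.get(memory_id)
        if record is None:
            missing.append(memory_id)  # For example, removed after TTL expiry.
        else:
            found.append(record)
    return {
        "event_id": event["id"],
        "scope": event["scope"],
        "query": event["query"],
        "created_at": event["created_at"],
        "memories": found,
        "missing_memory_ids": missing,
    }
```

Reporting missing ids separately keeps a replay honest: it shows that a memory was available at retrieval time even if the record has since been removed.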

Artifact Reference

Artifact refs are lightweight pointers from memory records to external evidence. They do not embed raw binary content.

Fields:

Field	Type	Requirement
`id`	UUID	Artifact identifier
`scope`	string	Must match containing memory scope
`sha256`	string	Content hash
`storage_uri`	string	External storage pointer
`artifact_type`	string	Type such as `image`, `document`, `log`

Durable table: artifacts.

Current gap: memory-to-artifact relationship persistence is not implemented.

Embedding

Embeddings are model-specific vector representations. They are separate from memory records so memory facts remain portable across embedding model changes.

Fields:

Field	Type	Requirement
`model`	string	Non-empty, max 128 chars
`dimensions`	int	Positive
`vector`	float list	Length must match `dimensions`

Current durable table: memory_embeddings.

Current durable fields:

memory_id
embedding_model
embedding_ref
embedding
embedding_dimensions
created_at

The current implementation can persist embedding references and pgvector values for a memory. The application service can embed stored summaries when configured with an embedder and an embedding-capable store. The Postgres store can query vectors behind the storage boundary.

Session

Sessions group conversational or agentic work under a scope.

Durable table: sessions.

Fields:

id
scope
created_at

Current gap: session creation and lookup APIs are not implemented.

Task State

Task state is active execution state, not memory. It should remain structured and queryable instead of being embedded in vector stores.

Durable table: tasks.

Fields:

id
scope
status
state
created_at
updated_at

Current gap: task-state domain objects and APIs are not implemented.

Provenance

Provenance records attach lineage to one memory, artifact, or retrieval event.

Durable table: provenance.

Fields:

id
memory_id
artifact_id
retrieval_event_id
source_identity
source_event
created_at

Constraint: exactly one of memory_id, artifact_id, or retrieval_event_id must be set.
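At the application layer, the exactly-one constraint can be checked before a write. The helper below is a hypothetical sketch; the column names come from the field list above, while the function name and error message are assumptions.

```python
from typing import Optional


def provenance_target(memory_id: Optional[str] = None,
                      artifact_id: Optional[str] = None,
                      retrieval_event_id: Optional[str] = None) -> tuple[str, str]:
    """Return (column, id) for the single populated target, or raise."""
    targets = {
        "memory_id": memory_id,
        "artifact_id": artifact_id,
        "retrieval_event_id": retrieval_event_id,
    }
    populated = [(name, value) for name, value in targets.items() if value is not None]
    if len(populated) != 1:
        raise ValueError(f"expected exactly one target, got {len(populated)}")
    return populated[0]
```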

Current gap: provenance has initial schema support but no write path beyond memory-local JSON metadata.

Durable Tables

Table	Purpose	Status
`sessions`	Session metadata	Schema only
`memories`	Normalized memory records	Implemented for store/retrieve/expire
`memory_embeddings`	Embedding references and vectors	Implemented for persistence
`tasks`	Active workflow state	Schema only
`artifacts`	Externalized artifact metadata	Schema only
`retrieval_events`	Retrieval audit/replay records	Implemented for retrieval event persistence
`provenance`	Lineage records	Schema only

API Requirements

The API boundary must:

reject non-JSON write requests
reject oversized payloads
validate UUIDs, timestamps, enum values, scores, and bounds
redact secret-like values before storing memory summaries
return JSON errors without stack traces
expose retrieval events for local replay/audit inspection
keep durable storage implementation behind the application service contract
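The first few boundary checks can be sketched as a framework-independent function. The 64 KiB limit, the status codes, and the error identifiers are assumptions for illustration; the service may use different values.

```python
import json

MAX_BODY_BYTES = 64 * 1024  # assumed limit, not taken from the service


def decode_write_request(content_type: str, raw_body: bytes) -> tuple[int, dict]:
    """Return (status, payload or error body) without leaking exception details."""
    # Ignore parameters such as '; charset=utf-8' when comparing the media type.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "application/json":
        return 415, {"error": "unsupported_media_type"}
    if len(raw_body) > MAX_BODY_BYTES:
        return 413, {"error": "payload_too_large"}
    try:
        payload = json.loads(raw_body)
    except (json.JSONDecodeError, UnicodeDecodeError):
        # A fixed error code is returned instead of the exception text.
        return 400, {"error": "malformed_json"}
    if not isinstance(payload, dict):
        return 400, {"error": "expected_json_object"}
    return 200, payload
```

Checking the size before parsing avoids spending work on a payload that will be rejected anyway.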

Retrieval Requirements

Retrieval must filter by:

scope
allowed sensitivity
validation status when require_verified is true
TTL expiration

Retrieval should rank by:

lexical or vector relevance
importance
confidence
recency

The current implementation filters by scope, sensitivity, verification, and TTL, and ranks by lexical matching, vector relevance (in the Postgres store), importance, and confidence. Recency ranking is future work.

Evolution Requirements

Future changes should preserve:

vLLM runtime statelessness
memory/runtime separation from governance authority
external artifact references instead of binary prompt memory
replayable retrieval events
replaceable embedding providers
MemGPT/Letta integration above Dubnium memory APIs, not as source of truth

Durable storage, redaction, retrieval filters, provenance, expiration, and replay evidence must all pass local validation before autonomous memory writes are added.

Memory Governance Contract

Status: draft

This contract defines how orchestrators such as Anthesis may request memory from the Dubnium memory service without delegating governance authority to the memory service itself.

Boundary

Anthesis / orchestrator
  - classifies task risk
  - authorizes memory scope
  - chooses sensitivity filters
  - decides whether retrieved memory enters prompt context
  - records execution envelope

Dubnium memory service
  - stores memories
  - filters by scope, sensitivity, verification, rejection, and TTL
  - returns retrieval candidates
  - records retrieval events and metadata

The memory service must not assemble final prompts or decide whether a memory is safe to inject into an agent context.

Scope Convention

Memory scopes should use one of these prefixes:

personal:
project:
session:
agent:
workflow:

Examples:

project:dubnium
session:11111111-1111-4111-8111-111111111111
agent:anthesis-reviewer
workflow:memory-phase-2

The current scope helper is advisory. Runtime enforcement may be added after existing callers and examples are fully migrated.

Retrieval Request

A retrieval request may include governance metadata in addition to the Phase 1 filters.

{
  "query": "What changed in memory phase 2?",
  "scope": "project:dubnium",
  "allowed_sensitivity": ["internal"],
  "require_verified": true,
  "limit": 8,
  "purpose": "review",
  "requester": {
    "actor_type": "agent",
    "actor_id": "anthesis-reviewer"
  },
  "envelope_id": "env-20260528-001"
}

Required Fields

Field	Meaning
`query`	Retrieval query text
`scope`	Retrieval boundary, such as `project:dubnium`

Optional Fields

Field	Meaning	Default
`allowed_sensitivity`	Sensitivity labels allowed in results	`["internal"]`
`require_verified`	Whether only verified memory may return	`false`
`limit`	Maximum memory candidates	`8`
`purpose`	Orchestrator purpose: `ask`, `plan`, `patch`, `review`, `test`	omitted
`requester`	Actor requesting retrieval	omitted
`envelope_id`	Upstream Anthesis execution envelope id	omitted
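The two field tables can be read as a normalization step applied before retrieval. The following sketch assumes a plain-dict request and a hypothetical `normalize_request` helper; it illustrates the defaults in the tables and is not the service's validation code.

```python
from typing import Any

REQUIRED_FIELDS = ("query", "scope")
DEFAULTS: dict[str, Any] = {
    "allowed_sensitivity": ["internal"],
    "require_verified": False,
    "limit": 8,
}
# Optional fields with no default stay absent when the caller omits them.
OPTIONAL_NO_DEFAULT = ("purpose", "requester", "envelope_id")


def normalize_request(raw: dict[str, Any]) -> dict[str, Any]:
    missing = [name for name in REQUIRED_FIELDS if not raw.get(name)]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    request = {name: raw[name] for name in REQUIRED_FIELDS}
    for name, default in DEFAULTS.items():
        value = raw.get(name, default)
        # Copy lists so callers cannot mutate the shared DEFAULTS table.
        request[name] = list(value) if isinstance(value, list) else value
    for name in OPTIONAL_NO_DEFAULT:
        if name in raw:
            request[name] = raw[name]
    return request
```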

Retrieval Event

Every retrieval returns an event.

{
  "id": "uuid",
  "scope": "project:dubnium",
  "query": "What changed in memory phase 2?",
  "returned_memory_ids": ["uuid"],
  "returned_artifact_ids": [],
  "metadata": {
    "allowed_sensitivity": ["internal"],
    "require_verified": true,
    "limit": 8,
    "purpose": "review",
    "requester": {
      "actor_type": "agent",
      "actor_id": "anthesis-reviewer"
    },
    "envelope_id": "env-20260528-001"
  }
}

The event is an audit hook. It is not proof that the memory entered a prompt. Anthesis must separately record prompt assembly and provider execution in its own envelope.
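As a rough illustration of the event shape above, the sketch below builds an event from a request and the returned ids. `build_retrieval_event` and `METADATA_KEYS` are hypothetical names, and generating the id with `uuid4` is an assumption; the service may assign ids differently.

```python
import uuid
from typing import Any

# Request keys copied into event metadata, following the event example above.
METADATA_KEYS = (
    "allowed_sensitivity", "require_verified", "limit",
    "purpose", "requester", "envelope_id",
)


def build_retrieval_event(
    request: dict[str, Any],
    memory_ids: list[str],
    artifact_ids: list[str],
) -> dict[str, Any]:
    return {
        "id": str(uuid.uuid4()),
        "scope": request["scope"],
        "query": request["query"],
        "returned_memory_ids": list(memory_ids),
        "returned_artifact_ids": list(artifact_ids),
        # Optional request fields that were omitted stay omitted in the metadata.
        "metadata": {k: request[k] for k in METADATA_KEYS if k in request},
    }
```

The event records what was returned, not what was used, which matches the audit-hook role described above.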

Normal Retrieval Rules

Normal retrieval excludes memory when:

scope does not match the request
sensitivity is not explicitly allowed
require_verified is true and memory is not verified
memory is expired by TTL
memory has validation_status = rejected

Rejected memory is excluded even when require_verified = false.

Audit retrieval of rejected memory is future work and should use a separate endpoint or explicit audit mode.

Memory Promotion

Memory should move through explicit states:

working -> episodic -> semantic -> repo doc / ADR / runbook

Promotion rules:

working memory may be generated inside a session
episodic memory must summarize a meaningful event or task
semantic memory must represent a stable fact, decision, convention, or invariant
repo docs, ADRs, and runbooks remain higher-authority than memory rows
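One way to keep promotion explicit is to allow only a single forward step at a time. The sketch below assumes that rule, which the contract does not state, and uses a hypothetical `repo_doc` label to stand in for repo docs, ADRs, and runbooks.

```python
# Ordered promotion path from the text above; "repo_doc" is a stand-in label.
PROMOTION_ORDER = ["working", "episodic", "semantic", "repo_doc"]


def can_promote(current: str, target: str) -> bool:
    """Allow only a single forward step along the promotion path (assumed rule)."""
    if current not in PROMOTION_ORDER or target not in PROMOTION_ORDER:
        return False
    return PROMOTION_ORDER.index(target) == PROMOTION_ORDER.index(current) + 1
```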

Rejection Reasons

Memory candidates should be rejected or marked rejected when they contain:

secret-like content that redaction could not confidently sanitize
cross-scope contamination
unsupported or hallucinated claims
stale facts
prompt-injection residue
weak or missing provenance

Rejected memory must not appear in normal retrieval paths.

Anthesis Envelope Handoff

Anthesis should record:

retrieval request
retrieval event id
returned memory ids
returned artifact ids
prompt assembly decision
provider decision
model/provider response
validation result

The memory service only supplies retrieval candidates and metadata. Governance remains external.

Anthesis Memory Envelope Examples

Status: draft

This document shows how Dubnium memory retrieval evidence should appear inside an Anthesis execution envelope. It is intentionally contract-only: Dubnium does not implement Anthesis runtime orchestration here.

Boundary

Dubnium memory service
  - stores memories
  - filters retrieval candidates
  - records retrieval events
  - returns memory ids, artifact ids, and retrieval metadata

Anthesis
  - authorizes retrieval
  - assembles prompts
  - decides whether retrieved memory may be used
  - records provider decisions
  - records validation results

The memory service retrieval event proves that memory was fetched. It does not prove that memory entered the model prompt. Anthesis must record the prompt assembly decision separately.

Envelope Fragment

A governed Anthesis execution envelope should include a memory section shaped like this:

{
  "memory": {
    "retrieval_request": {
      "query": "What is the current Dubnium memory boundary?",
      "scope": "project:dubnium",
      "allowed_sensitivity": ["internal"],
      "require_verified": true,
      "limit": 8,
      "purpose": "review",
      "requester": {
        "actor_type": "agent",
        "actor_id": "anthesis-reviewer"
      },
      "envelope_id": "env-20260528-001"
    },
    "retrieval_event": {
      "id": "22222222-2222-4222-8222-222222222222",
      "scope": "project:dubnium",
      "query": "What is the current Dubnium memory boundary?",
      "returned_memory_ids": [
        "11111111-1111-4111-8111-111111111111"
      ],
      "returned_artifact_ids": [],
      "metadata": {
        "allowed_sensitivity": ["internal"],
        "require_verified": true,
        "limit": 8,
        "purpose": "review",
        "requester": {
          "actor_type": "agent",
          "actor_id": "anthesis-reviewer"
        },
        "envelope_id": "env-20260528-001"
      }
    },
    "prompt_assembly_decision": {
      "used_memory_ids": [
        "11111111-1111-4111-8111-111111111111"
      ],
      "excluded_memory_ids": [],
      "decision": "used",
      "reason": "Verified internal project memory matched the authorized scope and review purpose."
    }
  }
}

Provider Decision Fragment

Memory evidence should sit beside, not inside, the provider decision.

{
  "provider_decision": {
    "selected_provider": "vllm.local",
    "selected_model": "qwen2.5-coder-14b-instruct",
    "provider_class": "local",
    "cloud_escalation_allowed": false,
    "reason": "Review task used verified internal project memory and did not require external context."
  }
}

Validation Fragment

Validation should explicitly tie output review to the memory/context decision.

{
  "validation": {
    "status": "passed",
    "checks": [
      {
        "name": "memory_scope",
        "status": "passed",
        "details": "All retrieved memory was scoped to project:dubnium."
      },
      {
        "name": "rejected_memory_exclusion",
        "status": "passed",
        "details": "No rejected memories were returned or used."
      },
      {
        "name": "prompt_assembly_recorded",
        "status": "passed",
        "details": "Used and excluded memory ids were recorded."
      }
    ]
  }
}

Non-Use Case

If memory is retrieved but not used, Anthesis should record that explicitly:

{
  "memory": {
    "retrieval_event_id": "22222222-2222-4222-8222-222222222222",
    "returned_memory_ids": [
      "11111111-1111-4111-8111-111111111111"
    ],
    "returned_artifact_ids": [],
    "prompt_assembly_decision": {
      "used_memory_ids": [],
      "excluded_memory_ids": [
        "11111111-1111-4111-8111-111111111111"
      ],
      "decision": "excluded",
      "reason": "Memory was relevant but unverified; task required verified memory."
    }
  }
}
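The used and excluded lists in both cases can be derived from the returned ids. In the sketch below, `prompt_assembly_decision` is a hypothetical helper, and treating partial use as `"used"` is an assumption; the examples above only show the all-used and all-excluded cases.

```python
def prompt_assembly_decision(
    returned_ids: list[str],
    used_ids: list[str],
    reason: str,
) -> dict:
    unknown = [i for i in used_ids if i not in returned_ids]
    if unknown:
        # Using a memory that retrieval never returned would break replay.
        raise ValueError(f"used ids were not returned by retrieval: {unknown}")
    excluded = [i for i in returned_ids if i not in used_ids]
    return {
        "used_memory_ids": list(used_ids),
        "excluded_memory_ids": excluded,
        "decision": "used" if used_ids else "excluded",
        "reason": reason,
    }
```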

Rejected Memory Case

Rejected memory should not appear in normal retrieval events. If a future audit mode retrieves rejected memory, the envelope must make the audit mode explicit:

{
  "memory_audit": {
    "mode": "audit_rejected_memory",
    "normal_prompt_use_allowed": false,
    "retrieved_rejected_memory_ids": [
      "33333333-3333-4333-8333-333333333333"
    ],
    "reason": "Operator audit of previously rejected cross-scope memory."
  }
}

Audit-mode retrieval is future work. Normal prompt assembly must not use rejected memory.

Minimum Envelope Requirements

For any Anthesis-governed run that uses Dubnium memory, record:

retrieval request
retrieval event id
returned memory ids
returned artifact ids
prompt assembly decision
used memory ids
excluded memory ids
provider decision
validation result

This creates a replayable boundary between retrieval, prompt assembly, provider execution, and validation.
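A completeness check over these requirements could be sketched as follows. The dotted paths follow the envelope fragments in this document, but placing `provider_decision` and `validation` at the top level beside `memory` is an assumption, and `missing_envelope_fields` is a hypothetical helper.

```python
from typing import Any

# Dotted paths taken from the envelope fragments in this document.
REQUIRED_PATHS = (
    "memory.retrieval_request",
    "memory.retrieval_event.id",
    "memory.retrieval_event.returned_memory_ids",
    "memory.retrieval_event.returned_artifact_ids",
    "memory.prompt_assembly_decision.used_memory_ids",
    "memory.prompt_assembly_decision.excluded_memory_ids",
    "provider_decision",
    "validation",
)


def missing_envelope_fields(envelope: dict[str, Any]) -> list[str]:
    missing = []
    for path in REQUIRED_PATHS:
        node: Any = envelope
        for key in path.split("."):
            # Stop at the first absent key and report the whole path.
            if not isinstance(node, dict) or key not in node:
                missing.append(path)
                break
            node = node[key]
    return missing
```

An empty result means every required field is present; it says nothing about whether the recorded values are correct.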

vLLM Memory Phase 1 Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Build a minimal local persistent memory prototype around Dubnium’s existing vLLM service without coupling durable memory to transformer KV state.

Architecture: vLLM remains the inference runtime. A separate memory workload provides Postgres/pgvector storage, optional Redis working context, summarization and embedding workers, and a scoped retrieval API that an orchestrator can use before calling vLLM. A future governance layer remains external; Phase 1 records metadata and lifecycle events but does not implement the governance authority.

Tech Stack: NixOS modules, Postgres, pgvector, Redis, Python service code, pytest, systemd services.

Scope

This plan implements the Phase 1 prototype described in ADR-0010 and vLLM Persistent Memory Prototype.

Do not implement multi-agent federation, Temporal, MinIO, cryptographic attestation, a production policy DSL, or durable KV-cache persistence in this phase.

Do not implement Letta or another MemGPT-style framework in Phase 1. Keep it as an incremental upgrade candidate after storage, retrieval filters, redaction, provenance, and replay checks are stable.

Do not implement MinIO, OCI artifact publishing, VLM artifact resolution, or binary artifact extraction in Phase 1. Store artifact references and metadata only where needed; binary artifact pipelines are a later architecture phase.

Trust Boundaries

Risk: medium.

Attacker-controlled inputs include user prompts, agent messages, model output, tool output, retrieved artifacts, imported documents, and model-generated summaries. Treat all of them as untrusted before storage and before prompt assembly.

The Phase 1 implementation must enforce:

validation at API boundaries
scoped retrieval before prompt assembly
redaction before durable storage
TTL filtering
sensitivity metadata and filters
provenance on every memory row and artifact reference
retrieval event logging for later replay
no secret values in logs or memory payloads

Planned Files

Create:

modules/workloads/memory.nix: NixOS workload module for Postgres, pgvector, Redis, memory API, and workers.
pkgs/memory-service/default.nix: package the local Python memory service.
pkgs/memory-service/pyproject.toml: Python package metadata.
pkgs/memory-service/src/dubnium_memory/__init__.py: package marker.
pkgs/memory-service/src/dubnium_memory/api.py: HTTP API boundary and input validation.
pkgs/memory-service/src/dubnium_memory/config.py: environment parsing.
pkgs/memory-service/src/dubnium_memory/db.py: database connection and migrations runner.
pkgs/memory-service/src/dubnium_memory/models.py: typed request and memory models.
pkgs/memory-service/src/dubnium_memory/filters.py: retrieval scope, TTL, and sensitivity filters.
pkgs/memory-service/src/dubnium_memory/redaction.py: secret and sensitive payload redaction.
pkgs/memory-service/src/dubnium_memory/retrieval.py: scoped query and ranking logic.
pkgs/memory-service/src/dubnium_memory/storage.py: memory persistence.
pkgs/memory-service/src/dubnium_memory/workers.py: summarization and embedding worker entrypoints.
pkgs/memory-service/migrations/001_initial.sql: schema for sessions, memories, embeddings, tasks, artifacts, retrieval events, and provenance.
pkgs/memory-service/tests/test_filters.py: retrieval filter tests.
pkgs/memory-service/tests/test_redaction.py: redaction tests.
pkgs/memory-service/tests/test_storage.py: storage contract tests.
pkgs/memory-service/tests/test_retrieval.py: retrieval query contract tests.
docs/runbooks/memory-service.md: operator runbook for the prototype.

Modify:

modules/dubnium/options.nix: add dubnium.memory options and assertions.
hosts/workstation/default.nix: import and enable the memory workload for the workstation only after the module evaluates.
flake.nix: expose the memory-service package.
docs/README.md: link the memory service runbook.
docs/SUMMARY.md: link the memory service runbook.

Implementation Tasks

Task 1: Add Memory Options

Files:

Modify: modules/dubnium/options.nix

Step 1: Add a disabled-by-default dubnium.memory option set

Add this next to the existing dubnium.vllm and dubnium.k3s options:

memory = {
  enable = mkEnableOption "persistent memory services for local vLLM orchestration";

  api = {
    host = mkOption {
      type = types.str;
      default = "127.0.0.1";
      description = "Host address bound by the Dubnium memory API.";
    };

    port = mkOption {
      type = types.port;
      default = 8090;
      description = "Port bound by the Dubnium memory API.";
    };
  };

  database = {
    name = mkOption {
      type = types.str;
      default = "dubnium_memory";
      description = "Postgres database used by the Dubnium memory subsystem.";
    };

    user = mkOption {
      type = types.str;
      default = "dubnium_memory";
      description = "Postgres role used by the Dubnium memory service.";
    };
  };

  redis = {
    enable = mkOption {
      type = types.bool;
      default = true;
      description = "Whether Redis is enabled for transient working context and worker queues.";
    };
  };

  retention = {
    defaultTtlDays = mkOption {
      type = types.nullOr types.int;
      default = null;
      description = "Default TTL in days for memory objects without an explicit TTL.";
    };
  };
};

Step 2: Add assertions for safe local defaults

Add these to the existing assertions list:

{
  assertion = (!config.dubnium.memory.enable) || (config.dubnium.memory.api.host == "127.0.0.1");
  message = "dubnium.memory.api.host must stay local-only for the Phase 1 prototype";
}
{
  assertion =
    (config.dubnium.memory.retention.defaultTtlDays == null)
    || (config.dubnium.memory.retention.defaultTtlDays > 0);
  message = "dubnium.memory.retention.defaultTtlDays must be positive when set";
}

Step 3: Verify option evaluation

Run:

nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.dubnium.memory.enable

Expected:

false

Task 2: Package The Memory Service Skeleton

Files:

Create: pkgs/memory-service/default.nix
Create: pkgs/memory-service/pyproject.toml
Create: pkgs/memory-service/src/dubnium_memory/__init__.py
Create: pkgs/memory-service/src/dubnium_memory/config.py
Create: pkgs/memory-service/src/dubnium_memory/api.py
Modify: flake.nix

Step 1: Create package metadata

Create pkgs/memory-service/pyproject.toml:

[project]
name = "dubnium-memory"
version = "0.1.0"
description = "Local persistent memory service for Dubnium vLLM orchestration"
requires-python = ">=3.12"
dependencies = [
  "fastapi",
  "pydantic",
  "psycopg[binary]",
  "uvicorn",
]

[project.scripts]
dubnium-memory-api = "dubnium_memory.api:main"

Step 2: Create the Nix package

Create pkgs/memory-service/default.nix:

{ python312Packages }:

python312Packages.buildPythonApplication {
  pname = "dubnium-memory";
  version = "0.1.0";
  pyproject = true;

  src = ./.;

  build-system = [
    python312Packages.setuptools
    python312Packages.wheel
  ];

  dependencies = [
    python312Packages.fastapi
    python312Packages.pydantic
    python312Packages.psycopg
    python312Packages.uvicorn
  ];
}

Step 3: Add minimal app entrypoint

Create pkgs/memory-service/src/dubnium_memory/__init__.py:

"""Dubnium persistent memory service."""

Create pkgs/memory-service/src/dubnium_memory/config.py:

from pydantic import BaseModel


class Settings(BaseModel):
    database_url: str
    host: str = "127.0.0.1"
    port: int = 8090

Create pkgs/memory-service/src/dubnium_memory/api.py:

import os

from fastapi import FastAPI
import uvicorn

from dubnium_memory.config import Settings


app = FastAPI(title="Dubnium Memory API")


@app.get("/healthz")
def healthz() -> dict[str, str]:
    return {"status": "ok"}


def settings_from_env() -> Settings:
    return Settings(
        database_url=os.environ["DATABASE_URL"],
        host=os.environ.get("DUBNIUM_MEMORY_HOST", "127.0.0.1"),
        port=int(os.environ.get("DUBNIUM_MEMORY_PORT", "8090")),
    )


def main() -> None:
    settings = settings_from_env()
    uvicorn.run(app, host=settings.host, port=settings.port)

Step 4: Expose the package from the flake

Modify flake.nix under packages.${system}:

memory-service = pkgs.callPackage ./pkgs/memory-service { };

Step 5: Verify package build

Run:

nix --extra-experimental-features "nix-command flakes" build .#memory-service

Expected:

result/bin/dubnium-memory-api exists

Task 3: Add Schema And Storage Contracts

Files:

Create: pkgs/memory-service/migrations/001_initial.sql
Create: pkgs/memory-service/src/dubnium_memory/models.py
Create: pkgs/memory-service/src/dubnium_memory/storage.py
Create: pkgs/memory-service/tests/test_storage.py

Step 1: Create the first migration

Create pkgs/memory-service/migrations/001_initial.sql:

CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS sessions (
  id uuid PRIMARY KEY,
  scope text NOT NULL,
  created_at timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE IF NOT EXISTS memories (
  id uuid PRIMARY KEY,
  session_id uuid REFERENCES sessions(id),
  memory_type text NOT NULL CHECK (memory_type IN ('working', 'episodic', 'semantic')),
  summary text NOT NULL,
  scope text NOT NULL,
  importance double precision NOT NULL DEFAULT 0.0,
  confidence double precision NOT NULL DEFAULT 0.0,
  sensitivity text NOT NULL DEFAULT 'internal',
  validation_status text NOT NULL DEFAULT 'unverified',
  ttl timestamptz,
  source text NOT NULL,
  provenance jsonb NOT NULL DEFAULT '{}'::jsonb,
  created_at timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE IF NOT EXISTS memory_embeddings (
  memory_id uuid PRIMARY KEY REFERENCES memories(id) ON DELETE CASCADE,
  embedding vector(384) NOT NULL,
  model text NOT NULL,
  created_at timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE IF NOT EXISTS tasks (
  id uuid PRIMARY KEY,
  scope text NOT NULL,
  status text NOT NULL,
  state jsonb NOT NULL DEFAULT '{}'::jsonb,
  created_at timestamptz NOT NULL DEFAULT now(),
  updated_at timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE IF NOT EXISTS artifacts (
  id uuid PRIMARY KEY,
  scope text NOT NULL,
  uri text NOT NULL,
  media_type text,
  sensitivity text NOT NULL DEFAULT 'internal',
  provenance jsonb NOT NULL DEFAULT '{}'::jsonb,
  created_at timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE IF NOT EXISTS provenance (
  id uuid PRIMARY KEY,
  memory_id uuid REFERENCES memories(id) ON DELETE CASCADE,
  source_identity text NOT NULL,
  source_event jsonb NOT NULL,
  created_at timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE IF NOT EXISTS retrieval_events (
  id uuid PRIMARY KEY,
  scope text NOT NULL,
  query text NOT NULL,
  returned_memory_ids uuid[] NOT NULL DEFAULT '{}',
  returned_artifact_ids uuid[] NOT NULL DEFAULT '{}',
  created_at timestamptz NOT NULL DEFAULT now()
);

CREATE INDEX IF NOT EXISTS memories_scope_created_at_idx
  ON memories (scope, created_at DESC);

CREATE INDEX IF NOT EXISTS memories_ttl_idx
  ON memories (ttl);

Step 2: Define typed storage input

Create pkgs/memory-service/src/dubnium_memory/models.py:

from datetime import datetime
from typing import Literal
from uuid import UUID

from pydantic import BaseModel, Field


MemoryType = Literal["working", "episodic", "semantic"]
ValidationStatus = Literal["unverified", "verified", "rejected"]


class MemoryIn(BaseModel):
    id: UUID
    session_id: UUID | None = None
    memory_type: MemoryType
    summary: str = Field(min_length=1, max_length=8000)
    scope: str = Field(min_length=1, max_length=256)
    importance: float = Field(default=0.0, ge=0.0, le=1.0)
    confidence: float = Field(default=0.0, ge=0.0, le=1.0)
    sensitivity: str = Field(default="internal", max_length=64)
    validation_status: ValidationStatus = "unverified"
    ttl: datetime | None = None
    source: str = Field(min_length=1, max_length=128)
    provenance: dict

Step 3: Implement storage with parameterized SQL

Create pkgs/memory-service/src/dubnium_memory/storage.py:

from psycopg import Connection
from psycopg.types.json import Jsonb

from dubnium_memory.models import MemoryIn


def store_memory(conn: Connection, memory: MemoryIn) -> None:
    params = memory.model_dump()
    # psycopg 3 does not adapt plain dicts; wrap provenance for the jsonb column.
    params["provenance"] = Jsonb(params["provenance"])
    conn.execute(
        """
        INSERT INTO memories (
          id, session_id, memory_type, summary, scope, importance, confidence,
          sensitivity, validation_status, ttl, source, provenance
        )
        VALUES (
          %(id)s, %(session_id)s, %(memory_type)s, %(summary)s, %(scope)s,
          %(importance)s, %(confidence)s, %(sensitivity)s, %(validation_status)s,
          %(ttl)s, %(source)s, %(provenance)s
        )
        """,
        params,
    )

Step 4: Add a storage test

Create pkgs/memory-service/tests/test_storage.py:

from uuid import uuid4

import pytest
from pydantic import ValidationError

from dubnium_memory.models import MemoryIn


def test_memory_requires_summary() -> None:
    payload = {
        "id": uuid4(),
        "memory_type": "episodic",
        "summary": "",
        "scope": "project:dubnium",
        "source": "conversation",
        "provenance": {"origin": "test"},
    }

    # An empty summary violates min_length=1 and must fail validation.
    with pytest.raises(ValidationError, match="summary"):
        MemoryIn(**payload)

Task 4: Add Redaction And Retrieval Filters

Files:

Create: pkgs/memory-service/src/dubnium_memory/redaction.py
Create: pkgs/memory-service/src/dubnium_memory/filters.py
Create: pkgs/memory-service/tests/test_redaction.py
Create: pkgs/memory-service/tests/test_filters.py

Step 1: Implement conservative redaction

Create pkgs/memory-service/src/dubnium_memory/redaction.py:

import re


SECRET_PATTERNS = [
    re.compile(r"(?i)(api[_-]?key|token|secret|password)\s*[:=]\s*([^\s]+)"),
]


def redact_text(value: str) -> str:
    redacted = value
    for pattern in SECRET_PATTERNS:
        redacted = pattern.sub(r"\1=[REDACTED]", redacted)
    return redacted

Step 2: Test redaction

Create pkgs/memory-service/tests/test_redaction.py:

from dubnium_memory.redaction import redact_text


def test_redacts_api_key_like_values() -> None:
    text = "OPENAI_API_KEY=sk-test-value"

    assert redact_text(text) == "OPENAI_API_KEY=[REDACTED]"
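To show what the pattern from Step 1 does and does not catch, the sketch below repeats it so the example runs on its own. The sample strings are invented for illustration.

```python
import re

# Same pattern as redaction.py, repeated so this example is self-contained.
SECRET_PATTERN = re.compile(
    r"(?i)(api[_-]?key|token|secret|password)\s*[:=]\s*([^\s]+)"
)


def redact_text(value: str) -> str:
    return SECRET_PATTERN.sub(r"\1=[REDACTED]", value)


# Key/value pairs are redacted, and the separator is normalized to "=".
print(redact_text("password: hunter2 and token=abc123"))
# prints password=[REDACTED] and token=[REDACTED]

# A bearer header has no matching key name, so it passes through unchanged.
print(redact_text("Authorization: Bearer abc.def"))
# prints Authorization: Bearer abc.def
```

The second case shows why this pattern alone is not sufficient: credentials that do not follow a recognized `key=value` shape are stored as-is, which is the situation the "redaction could not confidently sanitize" rejection reason covers.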

Step 3: Implement retrieval filtering

Create pkgs/memory-service/src/dubnium_memory/filters.py:

from datetime import datetime, timezone
from typing import TypedDict


class MemoryCandidate(TypedDict):
    id: str
    scope: str
    sensitivity: str
    validation_status: str
    ttl: datetime | None


def is_retrievable(
    memory: MemoryCandidate,
    *,
    scope: str,
    allowed_sensitivity: set[str],
    require_verified: bool,
) -> bool:
    if memory["scope"] != scope:
        return False
    if memory["sensitivity"] not in allowed_sensitivity:
        return False
    if require_verified and memory["validation_status"] != "verified":
        return False
    if memory["ttl"] is not None and memory["ttl"] <= datetime.now(timezone.utc):
        return False
    return True

Step 4: Test scope and TTL enforcement

Create pkgs/memory-service/tests/test_filters.py:

from datetime import datetime, timedelta, timezone

from dubnium_memory.filters import is_retrievable


def test_rejects_cross_scope_memory() -> None:
    memory = {
        "id": "m1",
        "scope": "project:other",
        "sensitivity": "internal",
        "validation_status": "verified",
        "ttl": None,
    }

    assert not is_retrievable(
        memory,
        scope="project:dubnium",
        allowed_sensitivity={"internal"},
        require_verified=True,
    )


def test_rejects_expired_memory() -> None:
    memory = {
        "id": "m1",
        "scope": "project:dubnium",
        "sensitivity": "internal",
        "validation_status": "verified",
        "ttl": datetime.now(timezone.utc) - timedelta(days=1),
    }

    assert not is_retrievable(
        memory,
        scope="project:dubnium",
        allowed_sensitivity={"internal"},
        require_verified=True,
    )

Task 5: Add Retrieval API Boundary

Files:

Modify: pkgs/memory-service/src/dubnium_memory/api.py
Create: pkgs/memory-service/src/dubnium_memory/retrieval.py
Create: pkgs/memory-service/tests/test_retrieval.py

Step 1: Add request and response models

Add to models.py:

class RetrieveRequest(BaseModel):
    query: str = Field(min_length=1, max_length=4000)
    scope: str = Field(min_length=1, max_length=256)
    allowed_sensitivity: list[str] = Field(default_factory=lambda: ["internal"])
    require_verified: bool = False
    limit: int = Field(default=8, ge=1, le=32)


class RetrievedMemory(BaseModel):
    id: UUID
    summary: str
    scope: str
    sensitivity: str
    validation_status: ValidationStatus
    provenance: dict

Step 2: Implement retrieval query contract

Create pkgs/memory-service/src/dubnium_memory/retrieval.py:

from psycopg import Connection
from psycopg.rows import dict_row

from dubnium_memory.models import RetrieveRequest, RetrievedMemory


def retrieve_memories(conn: Connection, request: RetrieveRequest) -> list[RetrievedMemory]:
    # dict_row returns each row as a column-name mapping; the default row
    # factory returns tuples, which model_validate cannot consume.
    with conn.cursor(row_factory=dict_row) as cur:
        rows = cur.execute(
            """
            SELECT id, summary, scope, sensitivity, validation_status, provenance
            FROM memories
            WHERE scope = %(scope)s
              AND sensitivity = ANY(%(allowed_sensitivity)s)
              AND (%(require_verified)s = false OR validation_status = 'verified')
              AND (ttl IS NULL OR ttl > now())
            ORDER BY importance DESC, created_at DESC
            LIMIT %(limit)s
            """,
            request.model_dump(),
        ).fetchall()
    return [RetrievedMemory.model_validate(row) for row in rows]

Step 3: Add API endpoint

Add to api.py:

from dubnium_memory.models import RetrieveRequest, RetrievedMemory


@app.post("/memory/retrieve")
def retrieve(request: RetrieveRequest) -> list[RetrievedMemory]:
    raise NotImplementedError("database connection wiring is added in the service module task")

Keep this endpoint local-only until the database dependency is wired. Do not expose it on the network in Phase 1.

Task 6: Add NixOS Workload Module

Files:

Create: modules/workloads/memory.nix
Modify: hosts/workstation/default.nix

Step 1: Create the workload module

Create modules/workloads/memory.nix:

{ lib, config, pkgs, ... }:
let
  cfg = config.dubnium.memory;
  memoryPackage = pkgs.callPackage ../../pkgs/memory-service { };
in
{
  config = lib.mkIf cfg.enable {
    services.postgresql = {
      enable = true;
      extensions = ps: [ ps.pgvector ];
      ensureDatabases = [ cfg.database.name ];
      ensureUsers = [
        {
          name = cfg.database.user;
          ensureDBOwnership = true;
        }
      ];
    };

    services.redis.servers.dubnium-memory = lib.mkIf cfg.redis.enable {
      enable = true;
      bind = "127.0.0.1";
      port = 6379;
    };

    systemd.services.dubnium-memory-api = {
      description = "Dubnium persistent memory API";
      wantedBy = [ "multi-user.target" ];
      after = [ "postgresql.service" ];
      requires = [ "postgresql.service" ];
      environment = {
        DUBNIUM_MEMORY_HOST = cfg.api.host;
        DUBNIUM_MEMORY_PORT = toString cfg.api.port;
        DATABASE_URL = "postgresql:///${cfg.database.name}?host=/run/postgresql";
      };
      serviceConfig = {
        Type = "simple";
        ExecStart = "${memoryPackage}/bin/dubnium-memory-api";
        Restart = "always";
        RestartSec = "5s";
        NoNewPrivileges = true;
        PrivateTmp = true;
        ProtectHome = true;
        Slice = "platform.slice";
      };
    };
  };
}

Step 2: Import the module without enabling it

Modify hosts/workstation/default.nix imports:

../../modules/workloads/memory.nix

Do not set dubnium.memory.enable = true until package build and module eval pass.

Step 3: Verify disabled module eval

Run:

nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.systemd.services.dubnium-memory-api.enable

Expected: evaluation should fail with a missing-attribute error for dubnium-memory-api, because the service is not defined while dubnium.memory.enable = false.

Task 7: Enable Prototype Locally

Files:

Modify: hosts/workstation/default.nix

Step 1: Enable the memory workload

Add under dubnium:

memory = {
  enable = true;
  api.host = "127.0.0.1";
  api.port = 8090;
};

Step 2: Verify generated service contracts

Run:

nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.services.postgresql.enable
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.services.redis.servers.dubnium-memory.enable
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.systemd.services.dubnium-memory-api.environment.DUBNIUM_MEMORY_HOST

Expected:

true
true
"127.0.0.1"

Task 8: Add Operator Runbook

Files:

Create: docs/runbooks/memory-service.md
Modify: docs/README.md
Modify: docs/SUMMARY.md

Step 1: Create the runbook

Create docs/runbooks/memory-service.md with:

# Runbook: Memory Service

Status: prototype

Use this after `dubnium.memory.enable = true`.

## Verify Services

```bash
systemctl status postgresql
systemctl status redis-dubnium-memory
systemctl status dubnium-memory-api
curl http://127.0.0.1:8090/healthz
```

Expected:

{"status":"ok"}

## Security Checks

- the API binds to 127.0.0.1
- memories include scope, sensitivity, validation status, and provenance
- expired memories are not returned
- sensitive memories are not returned unless explicitly allowed
- retrieval events are logged with memory ids and artifact references
- logs do not contain raw token-like values


Step 2: Link the runbook

Add Memory Service to the Runbooks lists in docs/README.md and docs/SUMMARY.md.

Step 3: Build docs

Run:

mdbook build

Expected: docs build succeeds. Generated web/docs changes may be reverted if the review scope is source docs only.

Final Verification

Before committing Phase 1 implementation:

git diff --check
nix --extra-experimental-features "nix-command flakes" build .#memory-service
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.dubnium.memory.enable
pytest pkgs/memory-service/tests
mdbook build

If a full workstation build still fails on the known placeholder hardware configuration, report that separately from targeted memory-module evaluation.

Follow-Up: MemGPT-Style Agent Upgrade

After Phase 1 is stable, create a separate ADR or spike plan for evaluating Letta, the maintained framework that continues the MemGPT lineage. That spike should be read-only against existing memory rows at first, then test controlled agent-managed memory writes only after external governance hooks and replay evidence are in place.

Follow-Up: Artifact And OCI Architecture

After Phase 1 is stable, create a separate implementation plan for artifact handling. That work should start with filesystem content-addressed storage and metadata extraction, then evaluate MinIO and OCI-style exported cognition artifacts only after memory rows, retrieval events, and artifact references have stable ids.

Memory Phase 2: Governed Structured Memory

Status: planning

Phase 1 prepared the local memory substrate: package, API, Postgres/pgvector schema, Redis support, retrieval events, tests, and an opt-in workstation runbook.

Phase 2 makes memory useful for governed agent workflows without turning the memory service into the governance authority.

Goal

Build a structured, policy-aware memory layer that Anthesis or another orchestrator can govern explicitly.

The Phase 2 target is:

Anthesis decides what memory may be used.
Dubnium stores, retrieves, filters, and records memory events.
vLLM remains the inference runtime.

Non-Goals

Do not implement these in Phase 2:

autonomous self-editing memory
global always-on personal memory injection
durable transformer KV-cache persistence
multi-agent memory federation
Temporal or complex workflow orchestration
MinIO or OCI memory bundles
raw artifact extraction pipelines
public or Tailscale-exposed memory API
Anthesis itself inside the Dubnium memory service

Boundary

flowchart TD
    A[Anthesis / Orchestrator] --> B[Memory Policy Decision]
    B --> C[Dubnium Memory API]
    C --> D[(Postgres)]
    C --> E[(pgvector)]
    C --> F[(Redis)]
    C --> G[Retrieval Event]
    G --> A
    A --> H[Execution Envelope]
    A --> I[vLLM / Agent Prompt]

Dubnium must expose enough structure for Anthesis to audit and replay memory use, but Dubnium must not silently decide that retrieved memory belongs in a prompt.

Phase 2 Capabilities

1. Memory Namespaces

Add explicit namespace concepts on top of the existing scope field.

Suggested namespace shape:

personal:<name>
project:<repo-or-system>
session:<uuid>
agent:<agent-id>
workflow:<workflow-id>

The existing scope field can remain the primary filter, but Phase 2 should document and validate accepted scope patterns.
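
A minimal sketch of scope validation, assuming every scope is one of the prefixes above followed by a non-empty identifier. The `parse_scope` name and the permitted identifier characters are illustrative assumptions, not part of the existing API.

```python
import re
import uuid

# Prefixes taken from the suggested namespace shape.
SCOPE_PREFIXES = ("personal", "project", "session", "agent", "workflow")

# Assumption: identifiers are limited to letters, digits, '.', '_' and '-'.
_SCOPE_RE = re.compile(r"^(?P<prefix>[a-z]+):(?P<ident>[A-Za-z0-9._-]+)$")


def parse_scope(scope: str) -> tuple[str, str]:
    """Split a scope into (prefix, identifier) or raise ValueError."""
    match = _SCOPE_RE.match(scope)
    if match is None or match["prefix"] not in SCOPE_PREFIXES:
        raise ValueError(f"unsupported scope: {scope!r}")
    prefix, ident = match["prefix"], match["ident"]
    if prefix == "session":
        # session:<uuid> must carry a parseable UUID; uuid.UUID raises
        # ValueError on malformed input.
        uuid.UUID(ident)
    return prefix, ident
```

A helper like this can be called from tests first and wired into request handling later, once existing callers use the accepted patterns.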

2. Memory Classes

Keep the current memory types:

working
episodic
semantic

Add operational guidance:

Type	Meaning	Default retention
`working`	transient task/session context	short TTL
`episodic`	event/session summaries	medium or explicit TTL
`semantic`	normalized stable facts/decisions	long-lived but reviewable

Semantic memory should require stronger provenance and confidence than working memory.

3. Governance Metadata

Each memory row already carries sensitivity, validation_status, source, provenance, and ttl. Phase 2 should standardize expected provenance fields.

Recommended provenance shape:

{
  "origin": "agent|operator|system|import",
  "source_uri": "optional source reference",
  "source_event_id": "optional event id",
  "extractor": "manual|summary-worker|agent",
  "extractor_version": "1",
  "governance": "manual|anthesis|none",
  "envelope_id": "optional Anthesis envelope id"
}
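
The recommended shape could be checked with a small validator such as the sketch below. It assumes that `origin`, `extractor`, and `governance` are required while the remaining fields are optional strings; the source does not state which fields are mandatory, and `provenance_problems` is a hypothetical helper name.

```python
# Allowed values copied from the recommended provenance shape.
ALLOWED_VALUES = {
    "origin": {"agent", "operator", "system", "import"},
    "extractor": {"manual", "summary-worker", "agent"},
    "governance": {"manual", "anthesis", "none"},
}
# Assumption: these fields may be absent, but must be strings when present.
OPTIONAL_STRING_FIELDS = (
    "source_uri",
    "source_event_id",
    "extractor_version",
    "envelope_id",
)


def provenance_problems(provenance: dict) -> list[str]:
    """Return human-readable problems; an empty list means the shape is valid."""
    problems = []
    for field, allowed in ALLOWED_VALUES.items():
        value = provenance.get(field)
        if not isinstance(value, str) or value not in allowed:
            problems.append(f"{field}: expected one of {sorted(allowed)}, got {value!r}")
    for field in OPTIONAL_STRING_FIELDS:
        value = provenance.get(field)
        if value is not None and not isinstance(value, str):
            problems.append(f"{field}: expected a string, got {type(value).__name__}")
    return problems
```

Returning a list of problems rather than raising keeps the check usable for reporting before any enforcement is switched on.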

4. Retrieval Policy Contract

Add a policy-facing retrieval request contract:

{
  "query": "string",
  "scope": "project:dubnium",
  "allowed_sensitivity": ["internal"],
  "require_verified": false,
  "limit": 8,
  "purpose": "ask|plan|patch|review|test",
  "requester": {
    "actor_type": "human|agent|system",
    "actor_id": "string"
  },
  "envelope_id": "optional Anthesis envelope id"
}

The existing API can continue accepting the Phase 1 shape, but Phase 2 should add optional fields and preserve backward compatibility.
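
One way to keep Phase 1 clients working is to give every new field a default of `None`. The sketch below is illustrative only: which fields belong to the Phase 1 shape, the default values (borrowed from the example request), and the `RetrievalRequest` and `parse_request` names are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional

PURPOSES = {"ask", "plan", "patch", "review", "test"}


@dataclass(frozen=True)
class RetrievalRequest:
    # Assumed Phase 1 fields.
    query: str
    scope: str
    allowed_sensitivity: tuple[str, ...] = ("internal",)
    require_verified: bool = False
    limit: int = 8
    # Phase 2 additions: optional, so older payloads still parse.
    purpose: Optional[str] = None
    requester: Optional[dict] = None
    envelope_id: Optional[str] = None


def parse_request(payload: dict) -> RetrievalRequest:
    """Build a request from a JSON-decoded payload, tolerating missing Phase 2 keys."""
    purpose = payload.get("purpose")
    if purpose is not None and purpose not in PURPOSES:
        raise ValueError(f"unknown purpose: {purpose!r}")
    return RetrievalRequest(
        query=payload["query"],
        scope=payload["scope"],
        allowed_sensitivity=tuple(payload.get("allowed_sensitivity", ("internal",))),
        require_verified=bool(payload.get("require_verified", False)),
        limit=int(payload.get("limit", 8)),
        purpose=purpose,
        requester=payload.get("requester"),
        envelope_id=payload.get("envelope_id"),
    )
```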

5. Retrieval Event Completeness

Retrieval events should eventually record:

query
scope
returned memory ids
returned artifact ids
allowed sensitivities
require_verified
requester
purpose
envelope id
timestamp

This is the key replay hook for Anthesis.
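
As a rough illustration, an event carrying the fields listed above could be assembled as follows. The key names and the `build_retrieval_event` function are hypothetical; the actual column or JSON layout is left to the implementation tasks.

```python
from datetime import datetime, timezone


def build_retrieval_event(request: dict, memory_ids: list, artifact_ids: list) -> dict:
    """Capture what was asked, by whom, under which policy, and what was returned."""
    return {
        "query": request["query"],
        "scope": request["scope"],
        "memory_ids": list(memory_ids),
        "artifact_ids": list(artifact_ids),
        "allowed_sensitivity": list(request.get("allowed_sensitivity", [])),
        "require_verified": bool(request.get("require_verified", False)),
        # Governance metadata is optional and stays None for Phase 1 callers.
        "requester": request.get("requester"),
        "purpose": request.get("purpose"),
        "envelope_id": request.get("envelope_id"),
        # Timezone-aware UTC timestamp in ISO 8601 form.
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```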

6. Memory Promotion

Add an explicit promotion workflow:

working -> episodic -> semantic -> repo doc / ADR / runbook

Rules:

working memory can be generated freely inside a session
episodic memory requires summarization and provenance
semantic memory requires confidence, review status, and scope
repo docs remain the highest-authority source for durable project truth
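
The rules above can be read as a gate in front of each promotion step. The sketch below assumes promotion advances one class at a time and that a row is a plain dict; the field names `summary` and `review_status` are hypothetical stand-ins for whatever the schema actually stores.

```python
PROMOTION_ORDER = ("working", "episodic", "semantic")

# Fields a row must carry before it may enter the target class.
REQUIRED_FIELDS = {
    "episodic": ("summary", "provenance"),
    "semantic": ("confidence", "review_status", "scope"),
}


def can_promote(row: dict, target: str) -> tuple[bool, str]:
    """Return (allowed, reason) for promoting `row` to the `target` class."""
    current = row.get("type")
    if current not in PROMOTION_ORDER or target not in PROMOTION_ORDER:
        return False, "unknown memory type"
    # Assumption: no skipping, e.g. working cannot jump straight to semantic.
    if PROMOTION_ORDER.index(target) != PROMOTION_ORDER.index(current) + 1:
        return False, "promotion must advance exactly one class"
    missing = [name for name in REQUIRED_FIELDS[target] if row.get(name) in (None, "")]
    if missing:
        return False, "missing: " + ", ".join(missing)
    return True, "ok"
```

Promotion from semantic memory into a repo doc, ADR, or runbook is a human review step and is deliberately not modeled here.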

7. Memory Rejection

Add a clear rejection path:

candidate memory -> rejected -> never retrieved unless explicitly requested for audit

Rejection reasons should include:

secret-like content
cross-scope contamination
hallucinated or unsupported claim
stale fact
prompt-injection residue
unsupported provenance
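
A sketch of how the rejection path might look in code, assuming rejection is recorded as `validation_status == "rejected"` on the row. That storage choice, the enum value spellings, and the `retrievable` helper are illustrative assumptions.

```python
from enum import Enum


class RejectionReason(str, Enum):
    SECRET_LIKE_CONTENT = "secret-like-content"
    CROSS_SCOPE_CONTAMINATION = "cross-scope-contamination"
    UNSUPPORTED_CLAIM = "hallucinated-or-unsupported-claim"
    STALE_FACT = "stale-fact"
    PROMPT_INJECTION_RESIDUE = "prompt-injection-residue"
    UNSUPPORTED_PROVENANCE = "unsupported-provenance"


def retrievable(rows: list[dict], include_rejected_for_audit: bool = False) -> list[dict]:
    """Drop rejected rows unless the caller explicitly asks for them for audit."""
    return [
        row
        for row in rows
        if include_rejected_for_audit or row.get("validation_status") != "rejected"
    ]
```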

8. Prompt Assembly Boundary

The memory service should never return a final prompt. It should return candidates and event metadata.

The orchestrator owns:

prompt assembly
context ordering
final redaction
policy enforcement
provider selection
execution envelope capture

Implementation Tasks

Task 1: Add Governance-Oriented Request Metadata

Files:

pkgs/memory-service/src/dubnium_memory/models.py
pkgs/memory-service/src/dubnium_memory/serialization.py
pkgs/memory-service/tests/test_models.py
pkgs/memory-service/tests/test_api.py

Add optional fields to retrieval requests:

purpose
requester
envelope_id

Keep them optional so Phase 1 clients do not break.

Task 2: Extend Retrieval Events

Files:

pkgs/memory-service/src/dubnium_memory/migrations/003_retrieval_event_metadata.sql
pkgs/memory-service/src/dubnium_memory/postgres.py
pkgs/memory-service/tests/test_migrations.py
pkgs/memory-service/tests/test_postgres.py

Add either nullable per-field metadata columns or a single metadata jsonb column to retrieval_events.

Recommended initial shape:

ALTER TABLE retrieval_events
  ADD COLUMN IF NOT EXISTS metadata jsonb NOT NULL DEFAULT '{}'::jsonb;

This avoids premature schema churn while keeping replay metadata available.

Task 3: Add Scope Validation Helpers

Files:

pkgs/memory-service/src/dubnium_memory/scopes.py
pkgs/memory-service/tests/test_scopes.py

Add validation for scope prefixes:

personal:
project:
session:
agent:
workflow:

Do not enforce globally until existing tests and callers are migrated.

Task 4: Add Promotion/Rejection Contract Docs

Files:

docs/specs/memory-governance-contract.md
docs/runbooks/memory-service.md

Document:

memory promotion rules
rejection reasons
semantic memory expectations
Anthesis envelope handoff

Task 5: Add Policy Examples

Files:

docs/examples/memory-policy.project-dubnium.json
docs/examples/memory-retrieval-request.json
docs/examples/memory-retrieval-event.json

These examples should be data contracts, not active enforcement.

Acceptance Criteria

Phase 2 is complete when:

retrieval requests can carry optional governance metadata
retrieval events preserve that metadata for replay
scope conventions are documented and testable
memory promotion/rejection rules are documented
Anthesis can use memory ids and retrieval event ids in execution envelopes
no prompt assembly happens inside the memory service
memory remains opt-in on the workstation host

Risks

Risk	Mitigation
Memory poisoning	require scope, provenance, validation status, and retrieval event logging
Cross-project leakage	enforce scoped retrieval and explicit sensitivity filters
Silent context injection	keep prompt assembly outside memory service
Governance coupling	expose metadata; let Anthesis decide policy
Schema churn	prefer additive migrations and metadata JSON for early governance fields
Stale semantic facts	use confidence, validation status, TTL, and promotion workflow

Recommended First PR

The first Phase 2 PR should be small:

Add metadata jsonb to retrieval_events
Add optional retrieval request metadata fields
Preserve metadata in retrieval event responses
Add tests
Add governance contract docs

Do not add Anthesis runtime wiring yet.

Architecture Overview

Status: living

This is the arc42-lite entrypoint for Dubnium. It describes the system shape, constraints, building blocks, runtime behavior, deployment view, and current risks without replacing lower-level implementation docs.

Purpose

Dubnium is a policy-driven NixOS workstation and AI node. It supports multiple host-local operational contracts on one physical machine:

desktop: interactive Hyprland workstation and development mode.
studio-local: conditional low-latency audio overlay on desktop.
compute: headless throughput-oriented AI/platform mode.

The architecture exists to make mode transitions explicit, observable, guard-driven, auditable, and reversible.

Constraints

Desired state is not current state.
Current state must be derived from runtime observation.
Runtime reconciliation is mandatory for mode changes.
systemd targets, services, and slices are the enforcement mechanism.
Runtime switching comes before NixOS specialisations.
studio-local is conditional and must not dominate the architecture.
Host-local modes must remain separate from capability placement.
Failure, degraded, and blocked states must be modeled explicitly.

System Context

Actors and adjacent systems:

Local operator: requests mode changes, checks status, recovers failures.
NixOS host: owns systemd enforcement, hardware, services, and runtime state.
GPU/display/audio hardware: shared resources with conflicting latency and throughput requirements.
vLLM: compute workload, active only in compute for v1.
k3s: platform workload, stable across modes for v1.
Possible external studio host: future placement for audio/studio capability.

Building Blocks

Nix flake: declares the host configuration and packaged tools.
modules/dubnium: mode policy, options, targets, slices, controller units, state files, and guard installation.
modules/workloads: workload-specific service definitions such as Hyprland, audio, NVIDIA, vLLM, and k3s.
mode CLI: operator surface for requests, status, desired/current state, and explanation.
Reconciler: privileged transition executor.
Observer: evidence-based classifier for current mode.
Guards: small checks that return pass, policy block, or execution error.
systemd: target, service, slice, and cgroup enforcement layer.

Runtime View

All mode changes follow the same control-loop shape:

Authorize the request.
Write desired state.
Acquire the controller lock.
Observe current state from runtime facts.
Validate target and capability placement.
Run transition guards.
Execute bounded actions through systemd and helper scripts.
Re-observe.
Classify success, degraded state, blocked state, or failure.
Write transition and guard records.

Success is never inferred from attempted actions. Success requires post-transition observation that satisfies the target mode predicates.

Deployment View

Primary deployment target:

one x86_64-linux NixOS workstation host named workstation
Hyprland desktop
NVIDIA/CUDA runtime
planned dual-GPU topology, with hardware-tolerant transitional config
vLLM model/cache state outside the Nix store
k3s control-node duties

Runtime state:

live state under /run/mode-controller
future persistent audit history under /var/lib/mode-controller, or under /persist/var/lib/mode-controller once impermanence lands

Cross-Cutting Concerns

Safety: guards block destructive transitions and distinguish policy blocks from execution errors.
Observability: status must show desired state, observed state, conflicts, guard failures, and latest transition result.
Auditability: every reconciliation attempt should produce structured records.
Resource ownership: GPU, CPU, memory, I/O, audio, AI, and platform planes must not silently overlap in conflicting ways.
Security: unprivileged users must not forge desired/current state or transition success.

Current Risks

NVIDIA/Wayland GPU release may not be reliable enough for runtime-only compute promotion.
Mixed runtime states may confuse a shell observer unless conflicts are handled conservatively.
systemctl isolate can stop required services if target dependencies are not explicit enough.
Rollback must prove restored desktop behavior through observation, not just successful systemd commands.

Control Plane

Status: living

The control plane reconciles requested mode intent with runtime facts. It is a local privileged authority, not a convenience shell script.

Authority Model

V1 decision:

transition execution is privileged
the initial operator path is sudo mode request <mode> or a root-owned mode-controller@.service
unprivileged users must not mutate observed state or forge transition success

Future options:

polkit-mediated request path
local service endpoint
richer automation integration

Those options should not be added until the root/sudo path proves the control loop on the target host.

State Model

Live state lives under /run/mode-controller:

desired
current
lock
last-transition.json
last-guards.json
capability-placement.json
hardware-topology.json

Persistent transition history lives under:

/var/lib/mode-controller/events.jsonl

The file is append-only. Each line is one JSON event emitted by the reconciler. Initial event types:

transition

Initial event fields:

timestamp
requested
prior
final
success
reason
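
A minimal sketch of an append-only writer for this file, using exactly the initial event fields. The function name is hypothetical, and the real reconciler may be a shell script rather than Python.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_transition_event(
    path: Path, requested: str, prior: str, final: str, success: bool, reason: str
) -> dict:
    """Append one transition event as a single JSON line and return it."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "requested": requested,
        "prior": prior,
        "final": final,
        "success": success,
        "reason": reason,
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "a" only ever appends, so earlier events are never rewritten.
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(event, sort_keys=True) + "\n")
    return event
```

On the workstation the path would be /var/lib/mode-controller/events.jsonl, which requires the privileges the reconciler already runs with.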

This event stream is intended to become the basis for:

audit history
degraded transition diagnosis
future reconciliation analytics
operator replay/debug tooling
higher-level memory/context systems

V1 accepts plain-text desired and current files for the first bootable milestone. The hardening path is either:

migrate to desired.json and current.json, or
explicitly document the plain-text files as stable interface and keep structured metadata in transition records.

When impermanence is introduced, the persistent event path can be mapped to:

/persist/var/lib/mode-controller/events.jsonl

Reconciliation Sequence

Every requested transition follows this sequence:

acquire lock
observe current state
validate requested target
validate capability placement
run guards
execute bounded actions
re-observe
classify final state
record guard, action, timing, and outcome data
release lock

If target predicates fail after mutation, the controller must attempt rollback, classify a degraded state, or report failed-transition.
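
The sequence can be sketched as a single function with the observer, guards, and executor injected as callables. This is an illustration of the control flow only: it uses an in-process `threading.Lock` as a stand-in for the controller lock, omits capability-placement validation, rollback, and record writing, and all names are hypothetical.

```python
import threading
from typing import Callable, Iterable

VALID_TARGETS = {"desktop", "studio-local", "compute"}


def reconcile(
    target: str,
    observe: Callable[[], str],
    guards: Iterable[Callable[[str, str], int]],
    execute: Callable[[str], None],
    lock: threading.Lock,
) -> dict:
    """Run one guarded transition and classify the outcome from re-observation."""
    if not lock.acquire(blocking=False):
        return {"outcome": "blocked", "reason": "lock-held"}
    try:
        prior = observe()
        if target not in VALID_TARGETS:
            return {"outcome": "blocked", "reason": "invalid-target", "prior": prior}
        for guard in guards:
            code = guard(prior, target)
            if 10 <= code <= 19:
                return {"outcome": "blocked", "reason": f"guard-exit-{code}", "prior": prior}
            if code != 0:
                return {"outcome": "execution-error", "reason": f"guard-exit-{code}", "prior": prior}
        execute(target)
        final = observe()  # success is decided by observation, not by the attempt
        if final == target:
            outcome = "success"
        elif final.startswith("degraded-"):
            outcome = "degraded"
        else:
            outcome = "failed-transition"
        return {"outcome": outcome, "prior": prior, "final": final}
    finally:
        lock.release()
```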

Observer Contract

The observer must derive current state from evidence only. It must not trust desired state as proof of success.

Required output fields for JSON mode:

{
  "observed_state": "desktop",
  "confidence": "high",
  "degraded": false,
  "signals": {},
  "conflicts": [],
  "timestamp": "..."
}

Required signal families:

graphical session presence
compositor/display-manager state
compute.target
vllm.service
studio-local-policy.service
PipeWire/JACK/REAPER indicators
GPU process/VRAM evidence when available
controller lock/transition marker
latest failed transition marker

Conservative rule: report unknown, transitioning, degraded-*, or failed-transition instead of pretending a stable target has been reached.
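
A sketch of a conservative classifier that produces the required output fields. The signal key names, the precedence of the checks, and the `"low"` confidence value are assumptions made for illustration; the source only fixes the output field names and the rule that ambiguous evidence must not be reported as a stable mode.

```python
from datetime import datetime, timezone


def classify(signals: dict) -> dict:
    """Derive an observed state from evidence only; never consult desired state."""
    graphical = signals.get("graphical_session")
    compute = signals.get("compute_target")
    conflicts: list[str] = []

    if signals.get("controller_lock"):
        state = "transitioning"
    elif signals.get("failed_transition_marker"):
        state = "failed-transition"
    elif graphical is None or compute is None:
        state = "unknown"  # missing evidence is never treated as a stable mode
    elif graphical and compute:
        conflicts.append("graphical session and compute.target both active")
        state = "unknown"
    elif compute:
        state = "compute"
    elif graphical:
        if signals.get("vllm_service"):
            conflicts.append("vllm.service active outside compute")
            state = "degraded-desktop"
        elif signals.get("studio_local_policy"):
            state = "studio-local"
        else:
            state = "desktop"
    else:
        state = "unknown"

    stable = state in ("desktop", "studio-local", "compute")
    return {
        "observed_state": state,
        "confidence": "high" if stable else "low",
        "degraded": state.startswith("degraded-"),
        "signals": dict(signals),
        "conflicts": conflicts,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```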

Guard Contract

Guards are small deterministic checks with stable exit classes:

0     pass
10-19 policy block
20+   execution error

Initial guard set:

check_target_reachable
check_audio_idle
check_graphical_session_terminable
check_gpu_display_released
check_vllm_drainable
check_compute_capability_local
check_studio_capability_local
check_memory_headroom
check_persistence_paths_ready

Each guard should emit a reason code and evidence payload suitable for mode explain and transition logs.
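
The exit classes map onto a small classifier. In the sketch below, codes 1-9 and any other unlisted code are treated as execution errors; that is a conservative assumption, since the contract above does not define them. The helper names are hypothetical.

```python
import subprocess


def classify_exit(code: int) -> str:
    """Map a guard exit code onto the stable exit classes."""
    if code == 0:
        return "pass"
    if 10 <= code <= 19:
        return "policy-block"
    # 20+ by contract; 1-9 and signal exits (negative) by assumption.
    return "execution-error"


def run_guard(argv: list[str], timeout: float = 10.0) -> str:
    """Run one guard command and classify its result."""
    try:
        completed = subprocess.run(argv, timeout=timeout, check=False)
    except (OSError, subprocess.TimeoutExpired):
        # A guard that cannot start or does not finish could not run reliably.
        return "execution-error"
    return classify_exit(completed.returncode)
```

A caller would pass the guard's path, for example `run_guard(["check_audio_idle"])`, and block the transition on anything other than `"pass"`.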

Failure Semantics

Blocked transition:

a guard returns a policy block
desired state may remain requested
current state must not be rewritten to target

Execution error:

a guard or action could not run reliably
target should not be considered safe

Degraded state:

system is usable but does not satisfy all target guarantees
must be surfaced directly in status

Failed transition:

no stable or acceptable degraded contract could be established
rollback failed or final observation remained unsafe/conflicted

Runtime Behavior

Status: living

This document describes how Dubnium behaves while switching between host-local modes.

Modes

desktop

Intent:

interactive workstation and development mode

Expected runtime facts:

graphical session available
ordinary audio available
display GPU protected for UI
vLLM inactive in v1
k3s may remain active with bounded platform pressure

studio-local

Intent:

low-latency local audio profile when studio capability remains on this host

V1 representation:

overlay on desktop
studio-local-policy.service
audio-priority.service
no first-class studio-local.target

Expected runtime facts:

graphical session available
audio-priority policy active
AI suppressed or inactive
heavy background pressure reduced

compute

Intent:

headless throughput mode for AI/platform work

Expected runtime facts:

graphical session absent or non-authoritative
compute target active
vLLM active when enabled
AI resources assigned according to configured compute GPU profile
k3s remains active with mode-appropriate platform budget

Supported V1 Transitions

desktop -> studio-local
studio-local -> desktop
desktop -> compute
compute -> desktop

A studio-local -> compute request should route through desktop (studio-local -> desktop, then desktop -> compute) unless a future transition contract explicitly allows direct promotion.
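
The supported transitions and the routing rule can be expressed as a small lookup. This is a sketch; `plan_route` is a hypothetical helper, and only the studio-local -> compute route through desktop is taken from the text.

```python
SUPPORTED = {
    ("desktop", "studio-local"),
    ("studio-local", "desktop"),
    ("desktop", "compute"),
    ("compute", "desktop"),
}

# Requests that are not direct transitions but may be routed via another mode.
ROUTED_VIA = {("studio-local", "compute"): "desktop"}


def plan_route(source: str, target: str) -> list[tuple[str, str]]:
    """Return the ordered list of transitions needed, or raise ValueError."""
    if source == target:
        return []
    if (source, target) in SUPPORTED:
        return [(source, target)]
    via = ROUTED_VIA.get((source, target))
    if via is not None:
        return [(source, via), (via, target)]
    raise ValueError(f"unsupported transition: {source} -> {target}")
```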

desktop -> studio-local

Actions:

validate studio capability is local
stop vLLM if active
verify or isolate desktop.target
start studio-local-policy.service
start audio-priority.service
re-observe

Success predicates:

observer reports studio-local
graphical session is available
studio policy marker is active
audio-priority overlay is active
vLLM is inactive

studio-local -> desktop

Actions:

stop audio-priority.service
stop studio-local-policy.service
isolate or verify desktop.target
re-observe

Success predicates:

observer reports desktop
studio policy marker is inactive
audio-priority overlay is inactive
graphical session remains available

desktop -> compute

Actions:

observe source state
validate local compute capability
check audio idle
check graphical session is terminable
notify or terminate graphical session when configured
wait for session exit
check GPU display release predicate
stop studio-local overlay services if active
isolate compute.target
start or verify vllm.service
re-observe

Success predicates:

observer reports compute
graphical session is absent or non-authoritative
compute target is active
vLLM is active when enabled
GPU ownership evidence satisfies compute profile

Acceptable degraded compute examples:

vLLM active on a reduced GPU profile while meeting minimum compute policy
non-critical desktop service remains but does not conflict with compute
residual display allocation is below configured threshold

Failed transition examples:

source cannot be classified
audio guard blocks transition
graphical session cannot terminate
GPU release predicate returns execution error or unsafe conflict
compute target starts but observer remains conflicted

compute -> desktop

Actions:

observe source state
check vLLM drainability
stop vllm.service
isolate desktop.target
start or verify graphical/session path
re-observe

Success predicates:

observer reports desktop
vLLM is inactive
graphical session is available
no compute-only conflict remains

Rollback must be validated through the same post-action observation rules.
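
Success predicates lend themselves to a table of named checks evaluated against observed facts. The sketch below covers the two compute-related transitions; the fact key names are hypothetical, and missing evidence is deliberately counted as a failed predicate.

```python
from typing import Callable

Predicate = tuple[str, Callable[[dict], bool]]

SUCCESS_PREDICATES: dict[tuple[str, str], list[Predicate]] = {
    ("desktop", "compute"): [
        ("observer reports compute", lambda f: f.get("observed_state") == "compute"),
        ("graphical session absent or non-authoritative",
         lambda f: f.get("graphical_authoritative") is False),
        ("compute target active", lambda f: f.get("compute_target") is True),
        # Only required when vLLM is enabled in configuration.
        ("vLLM active when enabled",
         lambda f: not f.get("vllm_enabled", False) or f.get("vllm_active") is True),
        ("GPU ownership satisfies compute profile", lambda f: f.get("gpu_profile_ok") is True),
    ],
    ("compute", "desktop"): [
        ("observer reports desktop", lambda f: f.get("observed_state") == "desktop"),
        ("vLLM inactive", lambda f: f.get("vllm_active") is False),
        ("graphical session available", lambda f: f.get("graphical_session") is True),
        ("no compute-only conflict remains", lambda f: not f.get("compute_conflicts")),
    ],
}


def failed_predicates(transition: tuple[str, str], facts: dict) -> list[str]:
    """Return the names of predicates that the observed facts do not satisfy."""
    return [name for name, check in SUCCESS_PREDICATES[transition] if not check(facts)]
```

The same function can validate a rollback by evaluating the predicates of the mode being restored against a fresh observation.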

ConfigCTL Home Layering Implementation Plan

Purpose

configctl is a generic home-configuration reconciliation CLI.

Dubnium may package and invoke it, but the CLI must not be Dubnium-specific. It should be usable on:

Dubnium bare metal
laptops
WSL
future NixOS machines
CI dry-run environments

Dubnium remains responsible for machine policy, runtime modes, services, and local AI infrastructure. configctl owns layered home configuration reconciliation.

Core Model

Per-tool home configuration is organized into ownership layers:

~/.config/<tool>/
├── managed.*      # generated by Home Manager/dotfiles; never edit directly
├── local.*        # machine-specific; never automatically promoted
├── custom.d/      # user-authored promotion candidates
└── adopted.d/     # fragments already promoted or represented by managed config

Ownership rules:

managed.*    -> governed source of truth
local.*      -> machine-specific, ignored by promotion
custom.d/*   -> promotion candidates
adopted.d/*  -> archived/adopted fragments, ignored during normal load
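
The ownership rules amount to a classification of paths relative to ~/.config/<tool>/. A minimal sketch, with a hypothetical `classify_layer` helper and illustrative layer labels:

```python
from pathlib import PurePosixPath


def classify_layer(relative_path: str) -> str:
    """Classify a path relative to the tool's config directory into a layer."""
    parts = PurePosixPath(relative_path).parts
    if not parts:
        return "unknown"
    head = parts[0]
    if head == "custom.d" and len(parts) > 1:
        return "promotion-candidate"
    if head == "adopted.d" and len(parts) > 1:
        return "adopted"
    # managed.* and local.* are single files directly under the tool directory.
    if len(parts) == 1 and head.startswith("managed."):
        return "managed"
    if len(parts) == 1 and head.startswith("local."):
        return "local"
    return "unknown"
```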

Initial CLI Surface

Implemented commands:

configctl status [tool]
configctl doctor
configctl init <tool>
configctl promote <tool> <fragment>
configctl reconcile [tool]

Phase 0 — Documentation and Skeleton

Status: complete.

Tasks:

document the per-tool layering contract
add configctl package scaffold
add initial configctl script
expose configctl from the Dubnium flake packages
install configctl on the workstation target

Phase 1 — Local Layer Initialization

Goal: safe scaffolding of layer directories.

Status: complete.

Commands:

configctl init hypr
configctl init git
configctl init nvim
configctl init zsh

Behavior:

create custom.d/
create adopted.d/
create the tool-appropriate local.* file
do not overwrite existing files
do not modify managed files
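
The non-destructive behavior can be sketched as below. This is not the configctl implementation; the `init_layers` name is hypothetical and the caller supplies the tool-appropriate local file name, because the mapping from tool to file name is not specified here.

```python
from pathlib import Path


def init_layers(tool_dir: Path, local_name: str) -> list[Path]:
    """Create missing layer directories and the local file; return what was created."""
    created = []
    for sub in ("custom.d", "adopted.d"):
        directory = tool_dir / sub
        if not directory.exists():
            directory.mkdir(parents=True)
            created.append(directory)
    local_file = tool_dir / local_name
    try:
        # Mode "x" fails if the file already exists, so nothing is overwritten.
        local_file.open("x", encoding="utf-8").close()
        created.append(local_file)
    except FileExistsError:
        pass
    return created
```

Running it a second time returns an empty list, which makes the operation safe to repeat. Managed files are never touched because the function only creates the three named paths.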

Phase 2 — Status and Doctor

Goal: inspect local layer state without mutating anything.

Status: complete.

configctl status [tool] reports:

local layer presence
custom fragment count
adopted fragment count
missing expected directories
unpromoted files in custom.d/

configctl doctor reports:

whether essential tools (git, find) are available
whether the dotfiles repo is found
whether XDG state/cache/data roots exist

Phase 3 — Promote

Goal: move local configuration fragments into the dotfiles repository.

Status: complete.

configctl promote <tool> <fragment>:

identifies the fragment in custom.d/
copies it to the equivalent path in external/dotfiles/files/home/
stages the file in the dotfiles git repository

Promotion remains review-gated via Git (operator must commit and push).

Phase 4 — Reconcile

Goal: detect drift between local overlays and the dotfiles repository.

Status: initial version complete.

configctl reconcile [tool]:

compares custom.d/ locally with the dotfiles repository
reports files present in dotfiles but missing locally (suggesting a sync or adoption)
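
The drift report described above is essentially a set difference over relative file paths. A sketch under the assumption that both sides are plain directories; the function names are hypothetical, and the dotfiles-side directory would be resolved by the caller.

```python
from pathlib import Path


def fragment_names(directory: Path) -> set[str]:
    """Collect relative POSIX paths of all files under a directory."""
    if not directory.is_dir():
        return set()
    return {
        path.relative_to(directory).as_posix()
        for path in directory.rglob("*")
        if path.is_file()
    }


def missing_locally(local_custom: Path, dotfiles_custom: Path) -> list[str]:
    """List fragments present in the dotfiles repository but absent from local custom.d/."""
    return sorted(fragment_names(dotfiles_custom) - fragment_names(local_custom))
```

Comparing names only is the cheapest useful check. Detecting content drift would need hashes, which is what the future adoption manifest is meant to track.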

Future Phases

Adoption Manifest: track promoted fragments by hash across machines.
Governance Integration: link promotion to review workflows.
Cleanup: automated garbage collection of adopted fragments.

Non-Goals

configctl should not:

replace Home Manager
replace Git
replace NixOS modules
become Dubnium-specific
silently promote local configuration
automatically delete user-authored fragments without an adopted/archive path
treat runtime state as governed configuration

Diagrams

Status: living

These diagrams use a C4-inspired structure plus state/runtime views.

System Context

flowchart LR
    Operator[Local operator]
    Host[Dubnium NixOS host]
    GPUs[Display and compute GPUs]
    Audio[Audio interface]
    Studio[Optional external studio host]
    Micrantha[Micrantha / k3s workloads]
    Models[Local model bundles / runtime data]

    Operator -->|mode request/status| Host
    Host --> GPUs
    Host --> Audio
    Host -->|future placement| Studio
    Host --> Micrantha
    Host --> Models

Container View

flowchart TD
    CLI[mode CLI]
    Controller[mode-controller]
    Observer[observe-current]
    Guards[guard scripts]
    Systemd["systemd targets/services/slices"]
    Workloads["Hyprland, audio, vLLM, k3s"]
    Runtime["/run/mode-controller"]
    Audit["/var/lib/mode-controller"]

    CLI --> Runtime
    CLI --> Controller
    Controller --> Observer
    Controller --> Guards
    Controller --> Systemd
    Systemd --> Workloads
    Observer --> Systemd
    Observer --> Runtime
    Controller --> Runtime
    Controller --> Audit

Mode State View

stateDiagram-v2
    [*] --> bootstrapping
    bootstrapping --> desktop: boot default

    desktop --> studioLocal: request studio-local
    studioLocal --> desktop: request desktop

    desktop --> transitioning: request compute
    compute --> transitioning: request desktop

    transitioning --> desktop: observed desktop
    transitioning --> compute: observed compute
    transitioning --> studioLocal: observed studio-local
    transitioning --> degradedDesktop: partial desktop
    transitioning --> degradedCompute: partial compute
    transitioning --> failedTransition: unsafe/conflicted

    degradedDesktop --> desktop: reconcile
    degradedCompute --> compute: reconcile
    failedTransition --> desktop: rollback succeeds

Reconciliation Sequence

sequenceDiagram
    participant U as Operator
    participant C as mode CLI
    participant R as Reconciler
    participant O as Observer
    participant G as Guards
    participant S as systemd

    U->>C: mode request compute
    C->>R: start mode-controller@compute
    R->>R: acquire lock
    R->>O: observe current
    O-->>R: desktop with evidence
    R->>G: run transition guards
    G-->>R: pass/block/error results
    R->>S: terminate session / isolate target / start services
    R->>O: re-observe
    O-->>R: compute or degraded/failed
    R-->>C: transition result
    C-->>U: status

Rolling Implementation Design

Status: living draft

This file captures the current implementation design for Dubnium as a rolling reference. It should be updated as hardware facts, control-plane contracts, and mode-transition behavior are validated on the real host.

Documentation framework:

architecture docs live under docs/architecture/
accepted decisions live under docs/decisions/
operator procedures live under docs/runbooks/
this file remains the rolling synthesis, gap register, and implementation backlog

Architecture Summary

Dubnium is a NixOS host that must behave as one physical machine with multiple operational contracts:

desktop: normal Hyprland workstation/dev mode. GUI and ordinary audio are active. The display GPU is protected. AI is off or tightly bounded in v1.
studio-local: conditional low-latency audio profile. It is a policy overlay on desktop, not the center of the architecture. If studio/audio moves to a Mac mini, the host-local state machine should still make sense.
compute: headless throughput mode. GUI is absent or non-authoritative. vLLM and platform workloads may use more of the machine, including both GPUs when present.

The key design rule is that desired state and current state are different things:

Desired state is operator or automation intent, written under /run/mode-controller.
Current state is observation-derived from runtime facts, not copied from desired state.
A reconciler moves the system toward desired state through guarded transitions.
systemd targets, services, and slices are the enforcement layer.
Transitions must be bounded, logged, idempotent, and able to report blocked, degraded, or failed outcomes explicitly.

The normative source is the Dubnium control-plane specification. Desired state is authoritative intent, current state is observer output, no transition runs without a lock, and success requires post-action re-observation. The local docs and current repo scaffold already align with the main direction: runtime switching first, no specialisations yet, desktop.target and compute.target as first-class targets, studio-local as a desktop overlay, vLLM compute-only in v1, and k3s stable across modes.

Gaps / Risks

The goal is to keep this section operational. Items should either be resolved for v1, converted into implementation work, or left as explicit open questions with an owner before the first live build.

Contradictions to Resolve

Resolved for v1:

Topic	Decision	Follow-up
`studio-local.target` vs overlay	Do not create a first-class `studio-local.target` in v1. Use `studio-local-policy.service` and `audio-priority.service` as a desktop overlay.	Update older checklist wording when touching that file.
Root-on-RAM / impermanence	Defer Root-on-RAM, `/persist`, Home Manager, sops-nix, and impermanence until the base bootable control loop works.	Keep persistent path design compatible with adding `/persist` later.
`modectl` vs `mode`	Keep the local command name `mode`.	Treat `modectl` in upstream notes as an older name unless a rename is explicitly requested.
Desktop AI vs compute-only vLLM	Keep vLLM compute-only in v1.	Revisit bounded desktop AI only after reliable `desktop <-> compute` transitions.
Maintenance mode	Do not implement maintenance mode in the first milestone.	Reserve state names and avoid enum designs that make maintenance hard to add later.

Open compatibility item:

Desired/current state format remains plain text in the current scaffold. This is acceptable for the first bootable milestone only if transition records carry structured metadata. The next hardening pass should move toward desired.json and current.json, or explicitly document why the plain-text files remain the stable interface.

Missing Decisions

Resolved for v1:

Decision	V1 stance
Authority model	Require privileged transition execution. The initial operator path is `sudo mode request <mode>` or root-owned `mode-controller@.service`. Unprivileged users must not be able to forge desired/current state or transition success.
Reboot policy	Boot normalizes to `desktop`. Do not replay last desired mode across reboot in v1.
vLLM service shape	Use one `vllm.service`, compute-only. Keep the controller and options shaped so `vllm@compute.service` can replace it later.
k3s lifecycle	Keep `k3s.service` stable across modes in v1. Express mode pressure through `platform.slice` budgets before adding start/stop behavior.

Still open before live compute testing:

Open item	Concrete next step
GPU release predicate	Define a target-host predicate using `loginctl`, compositor absence, `nvidia-smi` process evidence, and an acceptable residual VRAM threshold. Record both pass and indeterminate outcomes.
Degraded thresholds	Define `degraded-compute` as safe but incomplete compute operation, such as vLLM active on a reduced GPU profile or residual non-critical display allocation below the configured threshold. Define `failed-transition` for unsafe, conflicting, or unclassified post-action states.
Persistent audit location	Choose `/var/lib/mode-controller/events.jsonl` now, with an option to move it under `/persist/var/lib/mode-controller/events.jsonl` when impermanence lands.
k3s compute policy	Decide whether v1 only changes `platform.slice` weights or also applies k3s labels/taints for workload intensity. Do not do both until there is a real workload that needs it.

Risky Assumptions

Risk	Failure mode	Mitigation
NVIDIA/Wayland GPU release is sticky	Compute promotion terminates the GUI but leaves display GPU allocations or ambiguous CUDA/display ownership.	Treat GPU release as an observation predicate, not an assumption. Add bounded timeout, residual threshold, and escalation criteria for specialization/reboot-mediated compute.
`systemctl isolate compute.target` stops too much	Important baseline services disappear because target dependencies are incomplete.	Keep `compute.target` minimal and explicitly list required base services. Test with `systemctl list-dependencies compute.target` before live switching.
Shell observer misclassifies mixed states	Status reports `compute` while GUI, audio, or conflicting services are still active.	Prefer `unknown`, `transitioning`, `degraded-*`, or `failed-transition` over false success. Add JSON evidence output and snapshot tests.
Rollback does not restore a usable desktop	`desktop.target` starts but graphical session/audio/display remain broken.	Make rollback success require post-rollback observation, not just successful systemctl commands. Record degraded desktop if partially restored.
`/run` loses state on reboot	Recent desired/current files disappear and audit history is lost.	Keep live lock/current/desired in `/run`; write transition history to `/var/lib/mode-controller/events.jsonl` before introducing impermanence.

Gap Closure Backlog

These are the smallest useful implementation/doc tasks to close the current gaps without broadening scope:

Update older checklist references so studio-local is consistently described as a desktop overlay, not a v1 target.
Add a short docs/control-plane-decisions.md or extend this file with a dated decision log for authority model, reboot policy, vLLM shape, and audit location.
Define the exact observe-current --json schema before adding more transition logic.
Define the GPU release predicate in docs, then implement it in check_gpu_display_released.
Add persistent audit output to /var/lib/mode-controller/events.jsonl.
Add observer classifications for degraded-compute, degraded-desktop, and failed-transition before relying on rollback.
Keep k3s mode behavior limited to platform.slice weights until a concrete platform workload proves that labels, taints, or service restarts are needed.

Proposed Repo Structure

Use the existing scaffold and keep it simple:

.
├── flake.nix
├── hosts/
│   └── workstation/
│       ├── default.nix
│       └── hardware-configuration.nix
├── modules/
│   ├── dubnium/
│   │   ├── default.nix
│   │   ├── options.nix
│   │   ├── state.nix
│   │   ├── targets.nix
│   │   ├── slices.nix
│   │   ├── services.nix
│   │   ├── controller.nix
│   │   └── guards.nix
│   └── workloads/
│       ├── hyprland.nix
│       ├── audio.nix
│       ├── nvidia.nix
│       ├── vllm.nix
│       └── k3s.nix
├── pkgs/
│   └── mode-tools.nix
├── scripts/
│   ├── mode
│   ├── reconcile
│   ├── observe-current
│   ├── lib.sh
│   └── guards/
│       ├── check_audio_idle
│       ├── check_gpu_display_released
│       ├── check_graphical_session_terminable
│       ├── check_vllm_drainable
│       ├── check_compute_capability_local
│       ├── check_studio_capability_local
│       ├── check_memory_headroom
│       └── check_persistence_paths_ready
└── docs/

Flake Design

nixosConfigurations.workstation imports hosts/workstation/default.nix.
nixosModules.default exposes the Dubnium module.
packages.x86_64-linux.mode-tools packages the CLI, observer, reconciler, and guards.
Add home-manager, sops-nix, and impermanence later only when the base transition loop is proven.

Module Layout

options.nix: all host policy knobs: default mode, GPU topology, vLLM model/profile, studio placement, slice weights.
state.nix: creates /run/mode-controller, writes generated topology and placement files, initializes boot default.
targets.nix: defines desktop.target and compute.target; no v1 studio-local.target.
slices.nix: defines interactive.slice, ai.slice, platform.slice.
services.nix: marker/policy services like studio-local-policy.service, audio-priority.service, mode-observe.service.
controller.nix: mode-controller@.service, boot normalization unit, permissions.
guards.nix: installs guard scripts and documents exit-code contract.
workloads/*.nix: workload-specific units, not mode policy.

systemd Targets and Dependencies

desktop.target
  Wants=graphical.target
  After=graphical.target

compute.target
  Conflicts=graphical.target desktop.target
  Wants=vllm.service
  After=multi-user.target network-online.target

For studio-local, use:

studio-local-policy.service
  Type=oneshot
  RemainAfterExit=true
  Slice=interactive.slice

audio-priority.service
  Type=oneshot
  RemainAfterExit=true
  ExecStart=systemctl set-property --runtime ...
  ExecStop=reset slice weights

Slice Structure

interactive.slice: Hyprland/session-adjacent services, audio priority policy, desktop-critical work.
ai.slice: vLLM and future AI workloads.
platform.slice: k3s and platform/background services.
Optional later: maintenance.slice if maintenance mode becomes real.

Service Layout

vllm.service: compute-only in v1, Slice=ai.slice, WantedBy=compute.target, persistent model/cache path outside the Nix store.
k3s.service: stable across modes in v1, Slice=platform.slice; mode differences are resource budgets/policy, not start/stop.
Hyprland/display stack: owned by normal graphical/session machinery; desktop.target should depend on it but not become a giant desktop controller.
Audio/PipeWire: normal desktop user services; studio-local only applies priority policy and blocks compute promotion when active audio is detected.

Control Plane Shape

Mode CLI

mode status
mode request <desktop|studio-local|compute>
mode reconcile [--target <mode>]
mode current [--refresh] [--json]
mode desired
mode dry-run <mode>
mode explain [<mode>]

Recommended additions after the first scaffold:

mode guards <target>
mode history
mode last-transition
mode doctor

mode request should be synchronous in v1: return success only after post-transition observation satisfies the target. Otherwise it should return non-zero and show the failed or blocking reason.
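
One way to get the synchronous behavior described above is to poll the observer after the action until it reports the target or a bounded number of attempts runs out. The sketch below is an assumption about shape only: wait_for_mode, its arguments, and the POLL_INTERVAL variable are hypothetical names, and the observer is any command that prints the observed mode.

```shell
# Poll an observer until it reports the requested mode or attempts run out.
# Hypothetical helper: "$3" names any command that prints the observed mode,
# for example a wrapper around `mode current`.
wait_for_mode() {
  local target attempts observer observed i
  target="$1"; attempts="$2"; observer="$3"
  observed="unknown"; i=0
  while [ "$i" -lt "$attempts" ]; do
    observed="$("$observer")"
    if [ "$observed" = "$target" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "${POLL_INTERVAL:-1}"
  done
  # Report what was seen instead of claiming the target was reached.
  echo "not reached: wanted '$target', observed '$observed'" >&2
  return 1
}
```

The non-zero return carries the last observed state, so the caller can show the failed or blocking reason.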

Observer / Classifier

The observer should be conservative and evidence-first. It should inspect:

active graphical sessions via loginctl
compositor/display-manager state
compute.target and vllm.service
studio-local-policy.service
PipeWire/JACK/REAPER indicators
NVIDIA process/VRAM evidence where available
controller lock/transition marker
last failed transition marker

Output should support plain mode for scripts and JSON for status/debug:

{
  "observed_state": "desktop",
  "confidence": "high",
  "degraded": false,
  "signals": {
    "graphical_session_active": true,
    "compute_target_active": false,
    "vllm_active": false,
    "studio_policy_active": false
  },
  "conflicts": [],
  "timestamp": "..."
}

Classification rule: if signals conflict, report transitioning, degraded-*, or failed-transition; do not pretend the desired target was reached.
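
The classification rule can be sketched as a pure function over the signals in the JSON example. The mapping below is an assumption made for illustration, not the observer's defined behavior: classify_observed_state is a hypothetical name, the fifth argument stands in for the controller lock/transition marker, and which degraded-* label each conflict gets is a guess that the real predicate should replace.

```shell
# Classify the observed state from boolean signals ("true" or "false").
# Args: graphical_session_active compute_target_active vllm_active
#       studio_policy_active transition_lock_held
# Hypothetical mapping: the choice of degraded-* label per conflict is assumed.
classify_observed_state() {
  local graphical compute vllm studio lock
  graphical="$1"; compute="$2"; vllm="$3"; studio="$4"; lock="$5"

  # A held lock means a transition is still in flight.
  if [ "$lock" = true ]; then echo transitioning; return 0; fi

  # Clean desktop or studio-local: graphical session, no compute, no vLLM.
  if [ "$graphical" = true ] && [ "$compute" = false ] && [ "$vllm" = false ]; then
    if [ "$studio" = true ]; then echo studio-local; else echo desktop; fi
    return 0
  fi

  # Clean compute: compute target active, no graphical session, no overlay.
  if [ "$graphical" = false ] && [ "$compute" = true ] && [ "$studio" = false ]; then
    echo compute; return 0
  fi

  # Anything else is conflicting evidence; never report a clean target.
  if [ "$graphical" = true ]; then echo degraded-desktop
  elif [ "$compute" = true ]; then echo degraded-compute
  else echo failed-transition
  fi
}
```

Keeping the classifier free of side effects makes it easy to test against recorded signal sets before it is wired to loginctl or systemctl.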

Guard Layout

Guards are standalone scripts or subcommands.
Exit codes:
- 0: pass
- 10-19: policy block
- 20+: execution/check error
Each guard emits structured JSON or stable key/value output.
Guards should check one thing each.
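
The exit-code contract above can be sketched as a small classifier. The names classify_guard_exit and run_guard are hypothetical. Codes 1-9 are not assigned by the contract, so treating them as errors is an assumption of this sketch.

```shell
# Map a guard exit code onto the documented contract:
# 0 = pass, 10-19 = policy block, 20+ = execution/check error.
# Codes 1-9 are unassigned; this sketch assumes they count as errors.
classify_guard_exit() {
  local code
  code="$1"
  if [ "$code" -eq 0 ]; then
    echo pass
  elif [ "$code" -ge 10 ] && [ "$code" -le 19 ]; then
    echo policy-block
  else
    echo error
  fi
}

# Run a guard command and print its classification.
run_guard() {
  local status
  status=0
  "$@" || status=$?
  classify_guard_exit "$status"
}
```

Separating policy blocks from errors lets the controller report "blocked by policy" differently from "the check itself failed".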

Initial guard set:

check_audio_idle: REAPER/PipeWire/JACK activity blocks compute.
check_graphical_session_terminable: pre-action check before killing GUI.
check_gpu_display_released: post-action validation after GUI teardown.
check_vllm_drainable: compute -> desktop.
check_compute_capability_local: placement check.
check_studio_capability_local: blocks studio-local if externalized.
check_memory_headroom: avoids launching compute under obvious pressure.
check_persistence_paths_ready: model store/runtime paths exist and are writable.

First Milestone

The smallest bootable milestone should be narrower than “all modes implemented.”

Goal: boot the flake-managed workstation into desktop, expose the control plane, and prove an observable/auditable desktop baseline before deep workload switching.

Generate real hardware config into hosts/workstation/hardware-configuration.nix.
Confirm host options:
- dubnium.boot.defaultMode = "desktop"
- dubnium.hardware.presentGpus
- dubnium.hardware.displayGpu
- dubnium.hardware.computeGpus
- vLLM disabled or compute-only
- studio placement set to local only if local audio is still intended

Build without switching:

sudo nixos-rebuild build --flake .#workstation

Switch only after evaluation succeeds:

sudo nixos-rebuild switch --flake .#workstation

Verify boot/control-plane files:

mode status
mode current
mode desired
sudo ls -la /run/mode-controller

Verify systemd skeleton:

systemctl status desktop.target
systemctl status compute.target
systemctl status studio-local-policy.service
systemctl status audio-priority.service
systemctl status vllm.service

Prove observer honesty:
- In desktop, mode current should say desktop.
- vllm.service should be inactive.
- studio-local-policy.service should be inactive unless requested.
- If evidence conflicts, status should show conflict/degraded/failed rather than silently reporting success.

Test the safe overlay first:

sudo mode request studio-local
mode status
sudo mode request desktop
mode status

Only then test desktop -> compute with vLLM either disabled, stubbed, or known-good:

sudo mode request compute
mode status
sudo mode request desktop
mode status

Milestone success criteria:
- The machine boots from the flake.
- mode status/current/desired work.
- Desired/current separation is visible.
- The controller lock prevents concurrent transitions.
- Guard failures are reported distinctly from execution errors.
- desktop -> studio-local -> desktop works as an overlay.
- desktop -> compute -> desktop either works or fails with a clear guard/action/post-observation reason.
- No failed transition is reported as a successful target mode.
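
The lock criterion above could be met with a directory-based lock, since mkdir either creates the directory or fails. This is a sketch under assumptions: acquire_lock and release_lock are hypothetical names, the lock path in the example is illustrative, and stale-lock recovery is not covered.

```shell
# Take a lock by creating a directory: mkdir either creates it or fails,
# so two concurrent callers cannot both succeed.
acquire_lock() {
  mkdir "$1" 2>/dev/null
}

# Release the lock taken by acquire_lock.
release_lock() {
  rmdir "$1"
}

# Example with a hypothetical lock path:
# if acquire_lock /run/mode-controller/transition.lock.d; then
#   sudo mode reconcile
#   release_lock /run/mode-controller/transition.lock.d
# else
#   echo "another transition is in progress" >&2
# fi
```

Because the lock lives under /run, it disappears on reboot, which matches the boot-normalizes-to-desktop policy.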

The next milestone after that should be a real desktop <-> compute control loop with vLLM active, structured audit records, rollback to desktop, and explicit degraded-compute thresholds.

System Implementation Plan

Status: living plan

This plan is for implementing Dubnium on the actual workstation host. It expands the short bring-up checklist into a cautious, evidence-driven rollout. The goal is not to turn everything on at once. The goal is to prove one layer at a time: hardware facts, Nix evaluation, boot baseline, observer honesty, overlay mode, compute mode, rollback, then hardening.

Current V1 Assumptions

These assumptions come from the current repo configuration and should be confirmed before the first live switch:

Area	Current assumption
Host flake target	`.#workstation`
Hostname	`dubnium-workstation`
Boot default	`desktop`
Studio placement	`local`
`studio-local` representation	desktop overlay using `studio-local-policy.service` and `audio-priority.service`
vLLM lifecycle	compute-only in v1
vLLM model	`Qwen/Qwen2.5-Coder-14B-Instruct`
Current GPU phase	planned 2 GPUs, currently present `[ 0 ]`
Display GPU	`0`
Compute GPUs	`[ 0 ]` until second GPU is present
k3s	disabled in current host config
Bootloader	systemd-boot with EFI variable access
Runtime state	`/run/mode-controller`

Do not proceed to live transition testing until the hardware facts are confirmed against the actual host.

Phase 0: Safety and Ground Truth

Objective: know enough about the machine to avoid destructive or confusing changes.

0.1 Confirm Installation Path

Decide which path applies:

existing NixOS machine: use nixos-rebuild build then switch
fresh install from live USB: use the fresh install runbook first
non-NixOS current OS: do not use this plan directly until disk/install strategy is decided

Exit criteria:

install path is explicit
target disk and boot mode are known if fresh installing
rollback access path is known

0.2 Confirm Remote/Recovery Access

Before switching system configuration:

ip addr
systemctl status sshd

Confirm:

local keyboard/display access works
SSH is enabled or a local console is available
you know how to select an older NixOS generation at boot
important local data is backed up

Failure mode to avoid:

switching into a broken graphical/session state with no recovery path

0.3 Capture Hardware Facts

Run on the target host:

lspci -nn | grep -E 'VGA|3D|Audio|USB'
nvidia-smi
lsblk -f
findmnt
bootctl status

Record:

actual GPU count
which GPU drives display
GPU PCI IDs
NVIDIA driver visibility through nvidia-smi
boot disk/filesystem layout
EFI/systemd-boot status
audio interface and whether REAPER/local studio is still needed on-host

Exit criteria:

dubnium.hardware.presentGpus matches real visible GPUs
dubnium.hardware.displayGpu matches the display path
dubnium.hardware.computeGpus only references present GPUs
bootloader assumptions match the host
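
The GPU exit criteria amount to a subset check: every display or compute index must appear in the present list. The helper below is a hypothetical sketch for checking recorded facts by hand. The module's own assertions remain the source of truth.

```shell
# Check that every listed GPU index is in the present set.
# $1: space-separated present GPU indices; remaining args: indices to check.
all_present() {
  local present gpu
  present=" $1 "; shift
  for gpu in "$@"; do
    case "$present" in
      *" $gpu "*) ;;
      *) echo "GPU $gpu is not present" >&2; return 1 ;;
    esac
  done
  return 0
}

# Example: one GPU present, used for both display and compute.
# all_present "0" 0      # display GPU
# all_present "0" 0      # compute GPUs
```

With presentGpus at `[ 0 ]`, a computeGpus value of `[ 0 1 ]` fails this check until the second GPU is installed.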

0.4 Decide First Compute Profile

For first live validation, choose the least surprising compute profile:

with one GPU: compute may terminate the desktop and use GPU 0
with two GPUs: compute can target both GPUs, but only after single-GPU behavior is proven
vLLM should stay compute-only

If VRAM is tight, add vLLM guardrails before compute testing:

dubnium.vllm.extraArgs = [
  "--max-model-len" "8192"
  "--gpu-memory-utilization" "0.70"
  "--enforce-eager"
];

Do not add desktop AI in the first rollout.

0.5 Seed Local Model Bundle

Preferred path:

copy the selected materialized model bundle from the Dubnium USB seed into /var/lib/dubnium/models
keep model weights out of Git and out of the Nix store

See docs/runbooks/model-seeding.md for the exact operator flow.

Phase 1: Repo and Host Configuration Review

Objective: make the flake match the real system before any switch.

1.1 Generate Hardware Configuration

On the target NixOS machine:

sudo nixos-generate-config --dir ./hosts/workstation

Review:

root filesystem and boot filesystem entries
EFI mount point
generated hardware imports
NVIDIA-related hardware detection

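Do not preserve the placeholder hardware file if it does not match the target. Note that nixos-generate-config --dir also writes a configuration.nix into that directory when none exists; this host is defined in default.nix, so remove the stray file rather than committing it.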

1.2 Review Host Config

Inspect:

sed -n '1,220p' hosts/workstation/default.nix

Confirm or update:

networking.hostName
bootloader settings
services.openssh.enable
dubnium.capabilityPlacement.studio
dubnium.vllm.enable
dubnium.vllm.model
dubnium.vllm.extraArgs
dubnium.hardware.presentGpus
dubnium.hardware.displayGpu
dubnium.hardware.computeGpus
dubnium.k3s.enable

Recommended first-system stance:

keep boot.defaultMode = "desktop"
keep enableDesktopProfile = false
keep k3s.enable = false until mode control is proven
keep computeGpus = [ 0 ] if only one GPU is currently installed

1.3 Confirm Module Assertions

The module already asserts:

display GPU must be present
desktop AI GPUs must be present
compute GPUs must be present
vLLM package and model must be set when vLLM is enabled

These assertions are useful. If they fail, fix the host facts rather than bypassing them.

Exit criteria:

host config expresses real hardware, not planned hardware
planned hardware is represented only in plannedGpuCount
actual services enabled match the first rollout scope

Phase 2: Build Without Switching

Objective: prove Nix evaluation and build before mutating the live system.

Run:

sudo nixos-rebuild build --flake .#workstation

If it fails, classify the failure:

hardware config mismatch
unfree/NVIDIA package issue
vLLM package evaluation issue
missing module import
syntax or option error

Do not run switch until build succeeds.

Useful follow-up checks:

nix flake check
nix build .#packages.x86_64-linux.mode-tools

Exit criteria:

flake builds successfully
mode-tools package builds
no host option assertion is failing

Phase 3: First Switch to Desktop Baseline

Objective: switch only into the safe desktop-default posture.

Run:

sudo nixos-rebuild switch --flake .#workstation

Immediately check:

hostname
mode status
mode current
mode desired
sudo ls -la /run/mode-controller
systemctl status desktop.target
systemctl status compute.target
systemctl status studio-local-policy.service
systemctl status audio-priority.service
systemctl status vllm.service

Expected:

host boots or remains usable
desired mode is desktop
current mode is desktop, or a clearly explained non-desktop state
vllm.service is inactive in desktop
studio-local-policy.service is inactive
audio-priority.service is inactive
/run/mode-controller exists

If mode current reports compute or studio-local unexpectedly, stop and fix observation before testing transitions.

Exit criteria:

desktop baseline is usable
mode CLI works
observer output matches visible reality

Phase 4: Control-Plane Inspection Before Transitions

Objective: prove the controller can explain the system before it mutates the system.

Run:

mode status
mode current --refresh
mode current --json
mode explain desktop
mode explain studio-local
mode explain compute
sudo cat /run/mode-controller/capability-placement.json
sudo cat /run/mode-controller/hardware-topology.json

Check that the JSON/evidence shape is useful enough to diagnose:

graphical session active or not
studio policy active or not
compute target active or not
vLLM active or not
last transition status

If mode current --json is too thin, harden observer output before running compute transitions. The observer is the foundation of safe switching.

Exit criteria:

status output distinguishes desired and current
current state is derived from facts
hardware and placement files match host configuration

Phase 5: Test `desktop -> studio-local -> desktop`

Objective: prove the low-risk overlay path before terminating the GUI for compute.

Run:

sudo mode request studio-local
mode status
systemctl status studio-local-policy.service
systemctl status audio-priority.service
systemctl show interactive.slice -p CPUWeight -p IOWeight
systemctl show ai.slice -p CPUWeight -p IOWeight
systemctl show platform.slice -p CPUWeight -p IOWeight

Expected:

observed mode becomes studio-local
studio-local-policy.service is active
audio-priority.service is active
interactive slice weights are raised
AI/platform slice weights are lowered
vLLM remains inactive

Return to desktop:

sudo mode request desktop
mode status
systemctl status studio-local-policy.service
systemctl status audio-priority.service
systemctl show interactive.slice -p CPUWeight -p IOWeight
systemctl show ai.slice -p CPUWeight -p IOWeight
systemctl show platform.slice -p CPUWeight -p IOWeight

Expected:

observed mode becomes desktop
overlay services are inactive
slice weights return to baseline

Exit criteria:

overlay activation and cleanup are repeatable
observer accurately distinguishes desktop and studio-local
failure records are useful if a command fails

Phase 6: Pre-Compute Guard Validation

Objective: test compute guards without trusting the full transition yet.

Before running a real compute transition:

mode status
systemctl status vllm.service
loginctl list-sessions

Manually confirm:

no active REAPER project
no live audio session you care about
no long-running foreground job
model store path has enough space
vLLM model choice fits current GPU memory plan

Run or inspect the guards directly if they are exposed through the CLI. If they are not yet exposed, use the existing transition path cautiously and rely on /run/mode-controller/last-guards.json for the results.

Compute should be blocked when:

audio is active
graphical session is not terminable
memory headroom is insufficient
target is not reachable
required persistence paths are missing

Exit criteria:

you know which guards are hard blocks
guard failures are visible in last-guards.json
no guard silently assumes success

Phase 7: First `desktop -> compute` Transition

Objective: prove one real promotion into compute, accepting that the first attempt may reveal NVIDIA/session behavior.

Preconditions:

desktop baseline has already been verified
studio overlay path has already been verified
no critical local work is running
local console or SSH recovery is available

Run:

sudo mode request compute

Then inspect:

mode status
systemctl status compute.target
systemctl status vllm.service
loginctl list-sessions
nvidia-smi
sudo cat /run/mode-controller/last-transition.json
sudo cat /run/mode-controller/last-guards.json
journalctl -u 'mode-controller@*' -b
journalctl -u vllm.service -b

Expected success:

observed mode is compute
graphical session is absent or non-authoritative
compute.target is active
vllm.service is active if enabled
GPU process evidence matches compute expectations
transition record says success

Acceptable first degraded outcomes:

vLLM starts, but only on a reduced GPU profile
residual display allocation remains below a documented threshold
a non-critical desktop unit remains active without resource conflict

Hard failures:

observer cannot classify final state
audio or GUI conflict remains
GPU release is indeterminate
vLLM fails repeatedly and prevents the compute contract from being met
rollback cannot restore desktop

If the transition fails, do not keep retrying blindly. Diagnose the first failed predicate.

Phase 8: First `compute -> desktop` Return

Objective: prove rollback/restoration before treating compute as usable.

Run:

sudo mode request desktop

Then inspect:

mode status
systemctl status desktop.target
systemctl status vllm.service
loginctl list-sessions
nvidia-smi

Expected:

observed mode is desktop
vllm.service is inactive
graphical session path is usable
audio returns to ordinary desktop behavior
no compute-only state remains authoritative

If desktop is only partially restored, classify the result as degraded and fix the observer/controller before more compute testing.

Exit criteria:

one complete desktop -> compute -> desktop loop works or fails with a clear documented reason
rollback is evidence-backed

Phase 9: Repeatability and Soak

Objective: distinguish a one-time success from a reliable operating model.

Repeat:

sudo mode request studio-local
sudo mode request desktop
sudo mode request compute
sudo mode request desktop

For each run, record:

final mode status
transition duration
guard output
whether GPU release was clean
whether desktop restoration was clean
whether vLLM startup was reliable

Minimum repeatability bar before broader usage:

3 clean studio overlay round trips
3 clean compute round trips
no false-success observer classifications
no unexplained stale locks
no manual cleanup needed between runs
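
The repeatability bar can be tracked with a small counter that runs a round trip several times and reports how many runs exited cleanly. This is a sketch: count_clean_runs is a hypothetical helper, and a zero exit status is only a proxy for "clean", so observer output and guard records still need review for each run.

```shell
# Run a command a fixed number of times and print how many runs exited 0.
# $1: number of runs; remaining args: the command to run.
count_clean_runs() {
  local runs clean i
  runs="$1"; shift
  clean=0; i=0
  while [ "$i" -lt "$runs" ]; do
    if "$@"; then
      clean=$((clean + 1))
    fi
    i=$((i + 1))
  done
  echo "$clean"
}

# Example: three compute round trips, as in the minimum bar above.
# count_clean_runs 3 sh -c 'sudo mode request compute && sudo mode request desktop'
```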

Phase 10: Hardening Backlog

Only after the first transition loop is proven, prioritize hardening in this order:

Richer observe-current --json evidence and conflicts.
Persistent audit log at /var/lib/mode-controller/events.jsonl.
Explicit GPU release predicate and thresholds.
Degraded state classification for desktop and compute.
Guard CLI surface such as mode guards <target>.
vLLM runtime guardrails and model store persistence.
k3s enablement and platform.slice policy.
Optional impermanence and /persist mapping.
Bounded desktop AI after second GPU and stable transitions.
Specialisation evaluation only if runtime switching fails repeatedly.

Stop Conditions

Stop implementation and return to planning if any of these occur:

the observer reports false success
desktop cannot be restored through the controller
GPU release is repeatedly indeterminate
target isolation stops recovery-critical services
vLLM causes repeated OOM or driver instability
failures require undocumented manual cleanup

The correct response to any stop condition is not more automation. First improve observation, logs, predicates, and rollback.

Evidence to Keep

For each major milestone, keep the following:

mode status
mode current --json
sudo cat /run/mode-controller/last-transition.json
sudo cat /run/mode-controller/last-guards.json
systemctl status desktop.target compute.target vllm.service
nvidia-smi
journalctl -u 'mode-controller@*' -b

For repeated failures, copy the relevant evidence into an issue, planning note, or future runbook update before changing more code.
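
Evidence capture is easier to repeat when each command's output is written to a file. The helper below is a sketch: capture_evidence, the directory layout, and the file naming are assumptions, not part of mode-tools.

```shell
# Run a command and save its combined output under an evidence directory.
# Hypothetical helper: the directory layout and file naming are illustrative.
# $1: evidence directory; $2: name for the output file; rest: the command.
capture_evidence() {
  local dir name
  dir="$1"; name="$2"; shift 2
  mkdir -p "$dir"
  # Keep the output even when the command fails; a failure is evidence too.
  "$@" > "$dir/$name.txt" 2>&1 || echo "exit status: $?" >> "$dir/$name.txt"
}

# Example for one milestone (directory name is hypothetical):
# capture_evidence ./evidence/phase-7 mode-status mode status
# capture_evidence ./evidence/phase-7 nvidia-smi nvidia-smi
```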

External Sources

Dotfiles

Dubnium uses the external/dotfiles checkout for user-level Home Manager configuration.

Repository: ryjen/dotfiles
Branch: feat/nix-migration
Local path: external/dotfiles

The submodule contract is declared in .gitmodules.

ADR-0001: Runtime Switching First

Status: accepted

Context

Dubnium needs to move between interactive desktop behavior and headless compute behavior. NixOS specialisations may eventually provide stronger separation, but they require reboot-mediated workflows and would slow down early validation.

Decision

Use runtime switching first. Implement mode changes through a local reconciliation loop using systemd targets, services, slices, guards, and post-action observation.

Do not introduce NixOS specialisations in v1.

Consequences

Rebootless switching can be validated early.
The observer and guard layer must be conservative.
GPU release reliability becomes a live risk.
Specialisations remain an escalation path if runtime switching proves too brittle.

Escalation Criteria

Reconsider specialisations or reboot-mediated compute if:

display GPU release remains unreliable after bounded iteration
compute promotion frequently lands in degraded or ambiguous states
desktop restoration is unreliable
kernel/module settings diverge materially between modes

ADR-0002: Studio-Local Is a Desktop Overlay

Status: accepted

Context

The host may support local low-latency audio work, but studio capability may move to an external Mac mini or another host. The architecture must not overfit around local studio behavior.

Decision

Represent studio-local as a policy overlay on desktop in v1.

Use:

studio-local-policy.service
audio-priority.service

Do not create a first-class studio-local.target in v1.

Consequences

The host-local state model remains coherent if studio capability moves away.
Studio policy can be applied and removed without a separate top-level target.
The observer still reports studio-local as a mode when overlay predicates are satisfied.
Any direct studio-local -> compute path should be routed through desktop policy unless a future transition contract explicitly permits it.

ADR-0003: vLLM Is Compute-Only in V1

Status: accepted

Context

Desktop-mode AI is possible in the target architecture, especially when a second GPU is installed. For the first reliable control-loop milestone, desktop AI adds resource contention and observer complexity.

Decision

Keep vLLM compute-only in v1.

Use one vllm.service attached to compute behavior. Shape options and controller actions so vllm@compute.service and a future bounded desktop profile can be added later.

Consequences

desktop and studio-local should leave vLLM inactive.
compute owns vLLM activation.
The first milestone can focus on mode transitions and observation.
Bounded desktop AI is deferred until desktop <-> compute switching is reliable on real hardware.

ADR-0004: Boot Defaults to Desktop

Status: accepted

Context

The control-plane specification asks whether the system should replay the last desired mode after reboot or normalize to a safe default. Replaying compute after reboot could surprise the operator and re-enter a throughput posture without current evidence.

Decision

In v1, boot normalizes to desktop.

Do not replay the last desired mode across reboot.

Consequences

First boot behavior is predictable and operator-friendly.
/run/mode-controller can remain ephemeral for live state.
Persistent desired replay can be revisited after transition behavior and audit history are proven.

ADR-0005: k3s Stays Stable Across Modes in V1

Status: accepted

Context

k3s provides platform/control-node duties. Starting and stopping it during every mode transition would add operational churn before there is evidence that it is needed.

Decision

Keep k3s.service stable across desktop, studio-local, and compute in v1.

Express mode differences through platform.slice budgets first. Defer labels, taints, workload intensity policies, and service lifecycle changes until a real platform workload requires them.

Consequences

Mode switching has fewer moving parts.
k3s remains available during desktop and compute operation.
Platform pressure must be bounded through slice policy until richer k3s mode behavior is justified.

ADR-0006: Tailscale Platform Connectivity

Status: accepted

Context

Dubnium needs stable remote reachability for the workstation without moving user-level shell, editor, or agent configuration into the system repository. Tailscale is machine and network identity, so it belongs with Dubnium’s platform policy rather than dotfiles.

Tailscale can also provide subnet routing, exit-node behavior, automatic enrollment, and Tailscale SSH. Those features change routing, firewalling, access control, and trust boundaries, so they should not be enabled as an incidental side effect of installing the client daemon.

Decision

Enable Tailscale as workstation-only platform connectivity in v1.

Dubnium will enable tailscaled and the tailscale CLI on the workstation, but node enrollment remains manual with sudo tailscale up.

Do not enable auth-key or OAuth enrollment, subnet routing, exit-node behavior, or Tailscale SSH in v1. Document those as future options that require explicit routing, ACL, firewall, and secrets-policy review.

Consequences

The workstation can join the tailnet with a small, reviewable system change.
Dotfiles remains responsible for user-level tooling only.
First enrollment is an operator action instead of a rebuild side effect.
Future subnet router, exit-node, and Tailscale SSH support has a documented path without widening v1 network exposure.

ADR-0007: WSL Is a Headless Validation Target

Status: accepted

Context

Dubnium needs a fast way to validate shared flake composition, module wiring, and mode-controller behavior before every change has to run on the bare-metal workstation target.

WSL is useful for that loop, but it is not equivalent to the real workstation. It does not validate EFI, bootloader behavior, workstation hardware generation, Hyprland, audio/studio behavior, NVIDIA runtime details, or final GPU topology.

The upstream nix-community/NixOS-WSL project already owns the WSL-specific boot and integration layer. Reimplementing that locally would create another platform surface for Dubnium to maintain before there is evidence that it is needed.

Decision

Keep wsl as a first-class flake host target for headless validation, built on top of nix-community/NixOS-WSL.

Use .#wsl to validate shared Dubnium composition and headless services inside an existing NixOS-WSL distro. Set its default Dubnium mode to compute, enable the shared mode controller, and keep resource-heavy services such as vllm and k3s disabled by default. Enable those services intentionally when the task is specifically to validate their WSL runtime behavior.

Do not treat .#wsl as a replacement for .#workstation, the bare-metal install path, or workstation hardware validation.

Consequences

Shared module wiring and activation can be exercised from a faster Windows/WSL loop.
WSL-specific platform support stays delegated to the upstream NixOS-WSL module.
The WSL target remains intentionally headless, compute-biased, and lightweight.
Passing WSL validation does not prove workstation graphics, audio, bootloader, EFI, NVIDIA, or final GPU behavior.
WSL runbooks and checks must stay separate from bare-metal first-bring-up and fresh-install procedures.

Escalation Criteria

Reconsider the WSL target shape if:

upstream NixOS-WSL no longer supports the required system integration points
WSL behavior diverges enough from Dubnium’s shared module graph to make the target misleading
bare-metal validation becomes cheap and reliable enough that a separate WSL target no longer reduces risk or cycle time

ADR-0008: Seed Local vLLM Model Bundles

Status: accepted

Context

Dubnium’s first compute workload uses vLLM with a locally served model bundle. The exact model is host configuration, not part of the USB seed format.

Model weights are large mutable runtime artifacts. Keeping them in Git would inflate the repository and blur source policy with runtime state. Keeping them in the Nix store would make first install, rebuild, and recovery depend on large model fetches during system activation and would couple model bytes to immutable system generations.

Fresh install and recovery should work even when the machine does not yet have reliable network access. The seed format should not depend on Hugging Face hub cache internals such as refs, blobs, snapshots, or symlinks.

Decision

Keep model weights out of Git and out of the Nix store.

Treat /var/lib/dubnium/models as the Dubnium-owned runtime model store. Seed normal local model bundle directories from removable media as the preferred v1 provisioning path.

Use a materialized bundle directory for the selected compute model. The workstation vLLM service serves a path under:

/var/lib/dubnium/models

If a Hugging Face cache is used as the source of the seed, materialize the snapshot once before putting it on the USB. The runtime seed and installed model store should be ordinary directories with model files and SHA256SUMS.
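
Because each bundle carries a SHA256SUMS file, the seed and the installed copy can be checked the same way. The sketch below assumes the manifest uses the format written by sha256sum with paths relative to the bundle directory. verify_bundle is a hypothetical name, and the documented operator flow lives in the model seeding runbook.

```shell
# Verify a materialized model bundle against its SHA256SUMS manifest.
# Returns non-zero if the manifest is missing or any file fails its checksum.
# Assumes manifest paths are relative to the bundle directory.
verify_bundle() {
  local dir
  dir="$1"
  [ -f "$dir/SHA256SUMS" ] || return 1
  ( cd "$dir" && sha256sum -c SHA256SUMS >/dev/null 2>&1 )
}

# Example with a hypothetical bundle directory name:
# verify_bundle /var/lib/dubnium/models/example-bundle || echo "bundle failed verification" >&2
```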

Consequences

The Dubnium repository stays small and source-only.
Nix continues to own service policy and runtime configuration, not model artifact storage.
Fresh install and recovery can avoid depending on a large network download.
Runtime no longer depends on Hugging Face cache layout or symlink behavior.
Operators must manage the seed media and verify the local bundle before entering compute mode.
Reproducibility of model bytes depends on the seed contents until a specific model revision is selected and recorded.
vLLM startup failures may indicate an absent, incomplete, misplaced, or revision-mismatched local model bundle.

Escalation Criteria

Reconsider this policy if:

model revision pinning becomes mandatory for reproducible evaluation
a dedicated artifact mirror or cache service becomes available
install-time network access becomes reliable enough to remove the USB seed path
model storage needs to support multiple served models, quantized variants, or per-mode model selection

ADR-0009: Manage Runtime Secrets Outside Nix Source

Status: accepted

Context

Dubnium needs private material for several different lifetimes:

local source payloads for installing this private repository
host-local identities for services such as Tailscale
runtime tokens for workloads such as vLLM model downloads
user-runtime tokens for tools such as Codex and GitHub CLIs after install
large private or mutable artifacts such as model weights

These are not the same class of data. Treating all of them as Nix source would either leak secrets into Git, copy secret bytes into the Nix store, or make activation depend on external state that belongs to the operator.

Existing Dubnium policy already keeps the repository source-only and keeps vLLM model weights in mutable runtime state under /var/lib/dubnium/models. Installer bootstrap should use local source payloads, such as a git archive tarball or copied working tree, rather than GitHub credentials in the live installer.

Decision

Use sops-nix with age recipients as the preferred provider for runtime service secrets.

Commit only encrypted SOPS documents and non-secret policy. Decrypt secrets at activation into runtime paths under /run/secrets or into sops-nix generated environment files. Services consume those paths; Nix modules declare the consumer contract, not the secret value.

Keep install source bootstrap separate from runtime secrets. Install media should use a local source payload prepared before booting the target machine. Do not require GitHub credentials during install.

Allow user-runtime secrets after install. Tools such as Codex may need an OPENAI_API_KEY, and user workflows may later need a GITHUB_TOKEN. Those tokens belong to the user runtime, not installer bootstrap, and should be decrypted by Home Manager or another user-scoped secret mechanism at session or process launch time.

Keep host enrollment identities separate from ordinary workload tokens. Tailscale remains manually enrolled for v1. If unattended enrollment is added later, it must use a short-lived auth key passed once during enrollment rather than a long-lived key committed to source.

Keep model weights out of Git, out of SOPS, and out of the Nix store. The Dubnium model store under /var/lib/dubnium/models remains mutable runtime state, not secret state.

Consequences

The repository can contain secret wiring without containing secret values.
Host rebuilds can declare which services need secrets without exposing those secrets in derivations or module options.
Operators must manage age identities and encrypted SOPS files during bring-up.
Secret rotation is done by updating encrypted SOPS data and rebuilding or restarting affected services.
Source bootstrap, enrollment, runtime tokens, and model artifacts keep separate handling rules instead of sharing one overloaded mechanism.

Escalation Criteria

Reconsider this policy if:

Dubnium gains a dedicated external secret manager
unattended installation needs to handle many machines at once
secret rotation needs central audit or approval workflows
Kubernetes-hosted workloads become the primary secret consumers

ADR-0010: Keep Persistent Memory Separate From vLLM Runtime

Status: accepted

Context

Dubnium is evolving from a local vLLM compute node toward longer-lived conversational and agentic workflows. Those workflows need durable recall, replayability, externally observable metadata, lifecycle hooks, and scoped retrieval.

vLLM is already the inference runtime for Dubnium’s compute mode. It is built to serve tokens with batching, prefix caching, streaming, model lifecycle control, and GPU-aware scheduling. It is not the right owner for durable user memory, agent task state, retention policy, or governance metadata.

The target hardware is constrained. Dual 12GB RTX 3060 GPUs leave limited room for oversized context windows, high concurrency, and unnecessary KV-cache pressure. Treating persistent memory as “keep all context in the model” would make latency, reliability, and recovery worse.

Persistent memory also changes the security posture. Model output, retrieved documents, tool results, artifacts, and prior conversation summaries are all untrusted inputs when they cross a new session boundary. Without structured metadata and lifecycle events, a future governance layer cannot inspect, constrain, attest, or replay memory behavior.

Decision

Keep vLLM as the inference runtime only.

Build persistent memory as a separate subsystem owned by orchestration, retrieval, storage, summarization, and compaction layers. Orchestrators assemble prompts from working context, retrieved memories, task state, and artifact references before calling vLLM.

Keep the future governance layer external to the memory/runtime architecture. Dubnium memory/runtime should expose structured records, metadata, and lifecycle hooks for governance to inspect later, but vLLM, vector stores, artifact stores, and MemGPT-style runtimes should not depend directly on that future substrate.

Do not persist transformer KV state as the durable memory mechanism. KV cache state can remain an inference optimization inside vLLM, but durable memory must be replayable from stored events, summaries, artifacts, metadata, and retrieval records.

Use separate memory classes:

working context for current session continuity
episodic memory for meaningful historical interactions
semantic memory for normalized stable facts and conventions
task state for active workflows, checkpoints, and execution graphs
artifacts for external files, logs, generated outputs, and large payloads
metadata for provenance, trust hints, retention hints, sensitivity hints, and scope

The first implementation milestone should use a conservative local stack:

Postgres for structured memory, sessions, tasks, artifacts, and provenance
pgvector for local vector search
Redis for transient working context and queues where useful
a small embedding model such as bge-small or nomic-embed
rolling summaries instead of transcript replay
scoped retrieval before prompt assembly

Treat MemGPT-style self-editing memory as a later orchestration upgrade path, not the first storage substrate. The current maintained framework from that lineage is Letta; evaluate it after Dubnium has stable local memory storage, retrieval filters, redaction, provenance, and replay checks. If adopted, it should sit above the persistent memory subsystem and vLLM runtime instead of replacing Dubnium’s metadata, lifecycle hooks, or runtime-secret boundaries.

Boundaries

The inference layer owns token generation, batching, streaming, prefix caching, model startup, GPU assignment, and service health.

The memory subsystem owns storage, retrieval, summarization, embedding, compaction, artifact references, provenance records, and replay inputs.

The orchestration layer owns prompt assembly, scoped retrieval requests, tool coordination, and task workflow progression.

The future governance layer is adjacent. It may later evaluate policy, provenance, trust, retention, audit, and replay concerns by inspecting the structured records emitted by the memory and runtime layers, but it is not embedded in the vLLM runtime, memory database, vector store, artifact store, or MemGPT-style runtime.

Security Model

Assume all inputs are untrusted, including model output and retrieved memories.

Trust boundaries include:

user and agent prompts entering the orchestrator
model output entering summarization or memory extraction
tool output entering task state or memory storage
external documents entering retrieval indexes
retrieved memory entering prompt assembly
retrieval metadata controlling visibility and retention

Durable memory objects must carry enough metadata to support later policy and audit decisions:

source identity
provenance
validation status or validation hints
trust score
sensitivity classification
retention hint or TTL
namespace or project scope
agent boundary
replay lineage

The first milestone must emit enough structure to support mitigation of:

memory poisoning through confidence and validation metadata
persistent prompt injection through instruction classification metadata
cross-agent leakage through scoped namespaces and retrieval events
sensitive data retention through redaction markers and TTL metadata

Do not store credentials, raw secret payloads, or private tokens as memories. Secret values remain governed by the runtime-secret policy in ADR-0009.
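
The metadata requirement above can be illustrated with a small pre-storage check. This is only a sketch under stated assumptions: the flat key=value record format, the field names, and the missing_memory_fields helper are all hypothetical stand-ins for whatever schema the memory subsystem eventually adopts.

```shell
# Sketch: report metadata fields missing from a flat key=value memory record.
# Field names are illustrative assumptions mapped from the metadata list above.
REQUIRED_MEMORY_FIELDS="source_identity provenance validation_status trust_score sensitivity retention_hint namespace agent_boundary replay_lineage"

missing_memory_fields() {
    record_file=$1
    missing=
    for field in $REQUIRED_MEMORY_FIELDS; do
        # A field counts as present only when it has a non-empty value.
        if ! grep -q "^${field}=." "$record_file"; then
            missing="${missing:+$missing }$field"
        fi
    done
    # Print the missing field names (empty line when complete).
    printf '%s\n' "$missing"
    [ -z "$missing" ]
}
```

The function returns non-zero when any field is absent, so an orchestrator could refuse to persist the record or mark it as unvalidated instead of storing it silently.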

Consequences

vLLM workers can stay mostly stateless and focused on low-latency inference.
Memory behavior can be tested, replayed, audited, and evolved without changing the inference service contract.
Prompt size stays bounded by retrieval and compression rather than by naive transcript replay.
Future governance remains possible because memory, retrieval, artifact, and runtime events are structured and externally observable.
Governance does not become an embedded runtime dependency.
More infrastructure is required before memory-backed agents are production ready.
Retrieval quality, memory drift, stale facts, and hallucinated recall become explicit validation targets.
Binary artifacts remain externalized and are referenced through metadata or on-demand multimodal inference rather than injected into prompts by default.

Escalation Criteria

Reconsider this policy if:

vLLM gains a production-grade durable memory interface with replayable external metadata
local hardware changes enough that long-context replay is cheaper than external memory retrieval
a dedicated Anthesis-aligned memory service becomes the primary Dubnium memory provider
Letta or another MemGPT-style agent framework can integrate with Dubnium’s storage, metadata, and replay contracts without becoming the source of truth
compliance requirements demand a concrete external governance authority, attestation system, or retention architecture

References

Persistent Context Memory Architecture

ADR-0011: External Ownership Boundaries

Status: accepted

Context

Dubnium is evolving into a machine orchestration and runtime policy layer for a hybrid workstation and AI-node environment.

The repository already integrates an external dotfiles source for Home Manager and user-scoped configuration. At the same time, the local k3s integration in Dubnium remains intentionally thin and partially placeholder while broader cluster automation work evolves separately.

Without an explicit ownership boundary, there is a risk that:

machine policy drifts into user-home concerns
cluster bootstrap logic becomes duplicated across repositories
recovery boundaries become unclear
operational responsibilities overlap
host rebuilds become fragile or non-reproducible

Decision

Dubnium adopts a layered ownership model.

Dubnium

Dubnium is the authoritative repository for:

machine identity
NixOS host composition
runtime mode control
systemd orchestration
hardware policy
GPU placement policy
runtime reconciliation
machine-scoped secrets and service contracts

Dubnium orchestrates external systems but should avoid duplicating their source of truth.

Dotfiles

ryjen/dotfiles is the authoritative repository for:

Home Manager configuration
user shell configuration
editor configuration
CLI tooling
user-scoped agent tooling
workstation UX preferences
user-scoped secrets materialization

Dubnium may consume dotfiles directly through flake inputs and local checkout paths.

Laboratory

hackelia-micrantha/laboratory is the intended authoritative repository for:

local cluster bootstrap
k3s deployment orchestration
Flux bootstrap and reconciliation
GitOps substrate configuration
cluster overlays and platform services
environment lifecycle workflows

Dubnium may invoke Laboratory entrypoints but should avoid embedding full cluster orchestration logic internally.

Consequences

Positive

cleaner recovery boundaries
reduced duplication
improved source-of-truth clarity
safer rebuild semantics
clearer operational ownership
easier future migration of cluster workflows

Negative

additional repository coordination
version pinning discipline becomes important
submodule or external checkout management complexity
bootstrap sequencing becomes more explicit

Current Implementation State

Current repository state:

dotfiles integration exists today
local k3s wiring remains host-local and intentionally thin
Laboratory integration is planned but not yet fully wired into runtime flows

The current v1 implementation keeps k3s operationally local while explicitly preparing for externalized cluster bootstrap ownership.

Operational Rules

machine boot must not depend on successful Laboratory reconciliation
machine boot must not depend on user-home customization success
dotfiles failure degrades user experience, not machine orchestration
Laboratory failure degrades cluster capabilities, not machine orchestration
Dubnium remains the root machine control plane

Follow-Up Work

add stable Laboratory bootstrap entrypoints
add optional external/laboratory checkout integration
add bootstrap and validation scripts
tighten version pinning and provenance validation
reduce placeholder local cluster assumptions over time

Dubctl Flake Input Manager

dubctl is Dubnium’s small helper for managing top-level flake inputs. It is intended for quick add, remove, search, list, and update operations without hand-editing the common inputs = { ... }; block every time.

dubctl manages only flake inputs. It does not wire new inputs into outputs, NixOS modules, package sets, overlays, or Home Manager arguments. Make those call-site changes explicitly after adding an input.

Install and Run

From this repository:

nix run .#dubctl -- list

Install into a profile:

nix profile install .#dubctl
dubctl list --flake /path/to/dubnium

For local development without Nix packaging:

scripts/dubctl list

Commands

List current inputs:

dubctl list

Search input names and definitions:

dubctl search nix

Show one input definition:

dubctl info nixpkgs

Add an input:

dubctl install foo github:owner/repo

Add an input that follows nixpkgs:

dubctl install foo github:owner/repo --follows nixpkgs

Remove an input:

dubctl remove foo

Update all lock entries:

dubctl update

Update one lock entry:

dubctl update nixpkgs

Use a specific flake directory or file:

dubctl --flake /path/to/repo list
dubctl --flake /path/to/repo/flake.nix info nixpkgs

Lockfile Behavior

install and remove run nix flake lock after editing flake.nix. Use --no-lock when staging or testing a source-only change:

dubctl install foo github:owner/repo --no-lock
dubctl remove foo --no-lock

update runs nix flake update, with an optional input name.

Safety Model

dubctl treats command arguments as untrusted input.

Controls:

input names must be Nix attr-safe names
URLs cannot be empty and cannot contain quotes or newlines
edits are limited to the top-level inputs = { ... }; block
mutations write flake.nix.bak before changing flake.nix
Nix commands are invoked with argv arrays, not shell string concatenation

The backup is local operator safety only. Review the diff before committing.
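
The first two controls can be sketched in plain shell. This is not dubctl's implementation; the function names and the exact attr-safe rule (a letter or underscore followed by letters, digits, underscores, or hyphens) are assumptions chosen for illustration.

```shell
# Sketch of input validation; function names are hypothetical.
# Assumed attr-safe rule: start with a letter or underscore, then only
# letters, digits, underscores, or hyphens.
is_attr_safe_name() {
    case $1 in
        ''|[!A-Za-z_]*) return 1 ;;
        *[!A-Za-z0-9_-]*) return 1 ;;
    esac
    return 0
}

# Rejects empty URLs and URLs containing quotes or newlines.
is_safe_flake_url() {
    nl='
'
    case $1 in
        '') return 1 ;;
        *\"*|*\'*|*"$nl"*) return 1 ;;
    esac
    return 0
}
```

Rejecting quotes and newlines before the value is written matters because the URL ends up inside a quoted Nix string in flake.nix; a stray quote would let an argument escape that string.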

When Not To Use Dubctl

Do not use dubctl for:

changing outputs arguments
adding module imports
adding overlays
changing Home Manager extra arguments
editing nested flakes such as external/dotfiles unless you pass that flake path explicitly

Those changes are architectural wiring, not package-manager operations.

Runbook: Post-Install Source Reconciliation

Status: living

Use this after a fresh install when the installer source snapshot has produced local changes that should become normal Dubnium repo history.

The custom installer payload is an export-style source snapshot on the USB live system. It is suitable for running nixos-install, but it does not automatically become a durable checkout inside the installed OS. Even when the snapshot is copied into the target filesystem, it is not the long-term working copy because it does not include .git history.

Desired Shape

installed system has a normal Git checkout for Dubnium
install-time changes are reviewed as a Git diff
host-specific files are committed only when they belong in repo policy
secrets, tokens, model weights, local caches, and temporary installer state stay out of Git

1. Locate Or Recreate The Install Snapshot

After first boot, start by checking whether the installer source was copied into the installed filesystem:

test -e ~/local/src/dubnium/flake.nix

If it was not copied, boot the custom installer USB or mount the prepared source media again and import the same source snapshot into a temporary location, such as ~/local/src/dubnium-install-snapshot. The goal is to recover any install-time edits, especially the generated hardware config.

If the installed system already has the copied installer source at ~/local/src/dubnium, check whether it is a Git checkout:

cd ~/local/src/dubnium
git rev-parse --is-inside-work-tree

If that fails, keep the snapshot as evidence and make room for a real checkout:

cd ~/local/src
mv dubnium dubnium-install-snapshot

If the source was copied elsewhere, use that path as the snapshot path in the commands below. If there were no install-time source edits to preserve, skip the snapshot and create the canonical checkout directly.
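
The decision in this step can be condensed into one helper. A minimal sketch, assuming the source root contains flake.nix as in the test command above; classify_source_dir is a hypothetical name.

```shell
# Sketch: classify a candidate source directory before reconciliation.
# classify_source_dir is a hypothetical helper, not part of Dubnium.
classify_source_dir() {
    dir=$1
    if [ ! -e "$dir/flake.nix" ]; then
        echo missing          # installer source was not copied here
    elif git -C "$dir" rev-parse --is-inside-work-tree >/dev/null 2>&1; then
        echo checkout         # already a normal Git working tree
    else
        echo snapshot         # export-style copy without .git history
    fi
}
```

Running classify_source_dir ~/local/src/dubnium prints one of missing, checkout, or snapshot. A snapshot nested inside some other Git working tree would be reported as checkout, so keep the source root outside unrelated repositories.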

2. Create The Canonical Checkout

Clone the private Dubnium repo using the installed system’s normal operator credential path. Prefer SSH keys or an intentional short-lived HTTPS token; do not reuse live-installer credentials as a persistent access mechanism.

mkdir -p ~/local/src
git clone <dubnium-private-repo-url> ~/local/src/dubnium
cd ~/local/src/dubnium
git submodule update --init --recursive

If the installed machine should use a different source root, keep the same pattern: one normal Git checkout, and one preserved installer snapshot until the diff has been reconciled.

3. Bring Across Intentional Install Changes

Copy only the changes that should become repo state. The most common first-install candidate is the generated hardware config:

cp ~/local/src/dubnium-install-snapshot/hosts/workstation/hardware-configuration.nix \
  hosts/workstation/hardware-configuration.nix

Review any optional host-local file before copying it. For example, hosts/workstation/user.nix may be useful on the installed machine, but it should be committed only if the repo is meant to carry that exact user policy.

For a broader comparison between the preserved snapshot and the canonical checkout:

diff -ruN \
  ~/local/src/dubnium-install-snapshot/hosts/workstation \
  ~/local/src/dubnium/hosts/workstation

Prefer copying specific files over bulk-syncing the snapshot into the checkout.

4. Review, Test, Commit, Push

From the canonical checkout:

git status --short
git diff -- hosts/workstation modules docs

nix --extra-experimental-features 'nix-command flakes' \
  eval .#nixosConfigurations.workstation.config.networking.hostName

git add hosts/workstation/hardware-configuration.nix
git commit -m "Record workstation hardware configuration"
git push

Use a broader validation command when the reconciled change touches modules, services, or shared policy. If evaluation or rebuild fails, keep the snapshot and the Git checkout separate until the failure is understood.

5. Rebuild From The Canonical Checkout

After the change is committed or intentionally kept as local-only state, rebuild from the normal checkout rather than the installer snapshot:

sudo nixos-rebuild switch --flake ~/local/src/dubnium#workstation

Once the canonical checkout has the needed changes and the system rebuilds from it, the preserved install snapshot can be archived or deleted.

Runbook: Laboratory Bootstrap

Status: living

This runbook describes the current intended integration boundary between:

Dubnium
ryjen/dotfiles
hackelia-micrantha/laboratory

Dubnium owns machine orchestration and runtime policy.

Laboratory is the intended source of truth for:

k3s bootstrap
Flux bootstrap
GitOps reconciliation
local cluster lifecycle operations

Current State

The current Dubnium repository still contains a thin local k3s integration for v1 bring-up.

The long-term intended direction is:

Dubnium owns host orchestration
Laboratory owns cluster orchestration

This runbook defines the current bootstrap contract without pretending the full migration is already complete.

Expected Repository Shape

Typical local source layout:

~/local/src/
├── dubnium/
│   ├── external/dotfiles/
│   └── external/laboratory/

The external/laboratory checkout may be:

a Git submodule
a manually managed checkout
another intentionally pinned local source path

The preferred integration ref today is:

feature/fresh

Bootstrap Flow

After the machine is operational:

validate Dubnium host state
validate user environment
bootstrap Laboratory
validate cluster state
fetch kubeconfig
validate Flux reconciliation

Prerequisites

Laboratory expects tooling such as:

tofu or terraform
ansible
kubectl
flux
jq

See the Laboratory repository for current authoritative prerequisites.

Bootstrap Command

Dubnium exposes a thin wrapper entrypoint:

scripts/bootstrap-lab

The wrapper intentionally:

validates the checkout exists
validates the repository shape looks correct
warns when the checkout ref differs from the preferred ref
delegates execution into Laboratory

The wrapper intentionally does not duplicate Laboratory internals.

Environment Overrides

Optional overrides:

export DUBNIUM_LAB_PATH=~/local/src/laboratory
export DUBNIUM_LAB_REF=feature/fresh

Override the delegated bootstrap command:

export DUBNIUM_LAB_BOOTSTRAP_CMD='make deploy ENV=local'

Default Delegated Flow

Current default delegated flow:

make deploy ENV=local && \
make local-kubeconfig ENV=local && \
make validate ENV=local

This is intentionally conservative while the integration boundary evolves.
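
The wrapper behaviour and overrides described above might be sketched as follows. This is not the contents of scripts/bootstrap-lab: the helper names, the relative default path, and the Makefile-based shape check are assumptions made for illustration.

```shell
# Sketch of the wrapper's checks; helper names and the Makefile shape test
# are assumptions, not the contents of scripts/bootstrap-lab.
lab_path() {
    # DUBNIUM_LAB_PATH overrides the assumed in-repo checkout location.
    echo "${DUBNIUM_LAB_PATH:-external/laboratory}"
}

lab_bootstrap_cmd() {
    echo "${DUBNIUM_LAB_BOOTSTRAP_CMD:-make deploy ENV=local && make local-kubeconfig ENV=local && make validate ENV=local}"
}

check_lab_checkout() {
    dir=$1
    [ -d "$dir" ] || { echo "laboratory checkout not found: $dir" >&2; return 1; }
    # Assumed shape check: the delegated flow calls make, so expect a Makefile.
    [ -f "$dir/Makefile" ] || { echo "no Makefile in $dir" >&2; return 1; }
}

warn_on_ref_mismatch() {
    dir=$1
    want=${DUBNIUM_LAB_REF:-feature/fresh}
    have=$(git -C "$dir" rev-parse --abbrev-ref HEAD 2>/dev/null) || have=unknown
    if [ "$have" != "$want" ]; then
        # A mismatch only warns; it does not block the bootstrap.
        echo "warning: laboratory ref is $have, preferred ref is $want" >&2
    fi
    return 0
}

bootstrap_lab() {
    dir=$(lab_path)
    check_lab_checkout "$dir" || return 1
    warn_on_ref_mismatch "$dir"
    # Delegate into Laboratory; sh -c lets the override carry && chains.
    ( cd "$dir" && sh -c "$(lab_bootstrap_cmd)" )
}
```

Keeping the delegated command as a single overridable string means the wrapper never needs to know Laboratory's internal targets, which matches the intent that it not duplicate Laboratory internals.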

Failure Boundaries

If Laboratory bootstrap fails:

Dubnium machine orchestration should still function
mode transitions should still function
user environment should still function
only cluster capabilities should be degraded

Machine boot must not require successful Laboratory reconciliation.

Recovery

To retry the bootstrap:

scripts/bootstrap-lab

To validate current cluster state directly through Laboratory:

cd external/laboratory
make validate ENV=local

Runtime Secrets

Dubnium uses sops-nix with age for runtime service secrets. Nix declares which services consume secrets; secret values stay out of Git, module options, and the Nix store.

Secret Classes

Use separate handling for each class:

Source bootstrap: prepare a local repo archive or copied working tree before install; do not require GitHub credentials in the installer.
Runtime service tokens: encrypt with SOPS and expose to services through /run/secrets or generated environment files.
User-runtime tokens: decrypt through the user profile after install for tools such as Codex, GitHub CLIs, or agent workflows.
Host enrollment identities: enroll interactively for v1 unless a future ADR accepts unattended enrollment.
Model weights: seed local model bundles into /var/lib/dubnium/models; do not store them in Git, SOPS, or the Nix store.

Host Age Identity

Create one age identity per host and keep it on that host:

sudo mkdir -p /var/lib/sops-nix
sudo age-keygen -o /var/lib/sops-nix/key.txt
sudo chmod 0600 /var/lib/sops-nix/key.txt
sudo cat /var/lib/sops-nix/key.txt | age-keygen -y

Add the printed public recipient to .sops.yaml when the first encrypted secrets file is introduced.
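
A first .sops.yaml could be generated from the printed recipient roughly like this. It is a sketch under assumptions: render_sops_yaml is a hypothetical helper, the rule targets the secrets/hosts/<host>.yaml path convention used for host secret files, and a real repository may need more recipients or rules.

```shell
# Sketch: render a minimal .sops.yaml creation rule for one host recipient.
# render_sops_yaml is a hypothetical helper; the path pattern assumes the
# secrets/hosts/<host>.yaml convention.
render_sops_yaml() {
    host=$1
    recipient=$2
    # Public age recipients start with "age1"; refuse anything else so a
    # private AGE-SECRET-KEY value is never written into the policy file.
    case $recipient in
        age1*) ;;
        *) echo "not an age public recipient: $recipient" >&2; return 1 ;;
    esac
    cat <<EOF
creation_rules:
  - path_regex: secrets/hosts/${host}\\.yaml\$
    age: ${recipient}
EOF
}
```

Redirect the output to .sops.yaml in the repository root and review it before committing. The file holds only the public recipient and is safe to track in Git.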

Host Secret File

Keep encrypted host secret files under an ignored or carefully reviewed path such as secrets/hosts/<host>.yaml. Commit encrypted files only after checking that the cleartext values are not present in the diff.

Example SOPS data shape:

service_name:
  token: example

vLLM Model Downloads

The default Dubnium install should not need a Hugging Face token. Dubnium points vLLM at local model bundle paths under /var/lib/dubnium/models, and the fresh install path seeds those bundles from USB.

Only add a model-provider token if you intentionally choose an online download workflow for a future host. In that case, prefer an environment file generated by sops-nix:

{ config, ... }:
{
  dubnium.secrets.defaultSopsFile = ../../secrets/hosts/workstation.yaml;

  sops.secrets.model-provider-token = {
    key = "model_provider/token";
  };
  sops.templates."vllm-model-provider.env".content = ''
    HF_TOKEN=${config.sops.placeholder.model-provider-token}
    HUGGING_FACE_HUB_TOKEN=${config.sops.placeholder.model-provider-token}
  '';

  dubnium.vllm.environmentFiles = [
    config.sops.templates."vllm-model-provider.env".path
  ];
}

Do not add provider tokens to the custom installer ISO or USB seed partition.

User Runtime Tokens

User tools are owned by the dotfiles Home Manager profile, not by Dubnium system services. Keep tokens such as these in the user SOPS file:

github_token: ghp_example
openai_api_key: sk-example

The dotfiles profile exposes secret file paths, for example GITHUB_TOKEN_PATH and OPENAI_API_KEY_PATH. It can also source a sops-generated shell fragment for interactive user sessions, so tools installed by the profile inherit variables such as OPENAI_API_KEY without per-tool wrappers and without putting plaintext values in Nix options.

Codex should get OPENAI_API_KEY this way. A later user workflow can use GITHUB_TOKEN the same way without changing the installer policy.
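
Loading a token from one of those paths at process launch might look like the sketch below. The helper name export_secret_from_path is hypothetical, and the actual dotfiles profile may do this differently, for example through a sops-generated shell fragment.

```shell
# Sketch: load a token from a secret file path at process launch.
# export_secret_from_path is a hypothetical helper name.
export_secret_from_path() {
    var_name=$1
    secret_path=$2
    if [ ! -r "$secret_path" ]; then
        echo "secret file not readable: $secret_path" >&2
        return 1
    fi
    # Command substitution strips the trailing newline from the file.
    secret_value=$(cat "$secret_path")
    export "$var_name=$secret_value"
    # Drop the temporary copy; only the exported variable remains.
    unset secret_value
}
```

For example, export_secret_from_path OPENAI_API_KEY "$OPENAI_API_KEY_PATH" before starting a tool keeps the plaintext out of Nix options and shell history, since only the path appears on the command line.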

Rotation

Edit the encrypted SOPS file with sops.
Rebuild the target host.
Restart any service that consumes the rotated secret if activation did not already restart it.
Revoke the old token at the provider.

Checks

Before committing, inspect staged changes:

git diff --cached
git diff --check

Do not commit plaintext tokens, private keys, generated age identities, model weights, or local decrypted files.
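
The manual inspection can be backed by a coarse pattern scan. This is a minimal sketch, not a complete secret scanner: the function name is hypothetical and the prefixes (ghp_, sk-, AGE-SECRET-KEY-1) are only examples of well-known plaintext shapes.

```shell
# Sketch: flag added diff lines that look like plaintext credentials.
# The prefixes are examples only; this is not a complete secret scanner.
scan_for_plaintext_secrets() {
    # Reads a unified diff on stdin. Prints matching added lines and
    # returns 1 when any are found, 0 otherwise.
    if grep -E '^\+(.*[^A-Za-z0-9])?(ghp_[A-Za-z0-9]+|sk-[A-Za-z0-9]+|AGE-SECRET-KEY-1)'; then
        return 1
    fi
    return 0
}
```

Usage would be git diff --cached | scan_for_plaintext_secrets. SOPS-encrypted values of the form ENC[...] do not match, so a clean result on an encrypted file is expected. A clean result does not prove that no secret is present.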

Tailscale

Tailscale is workstation-only platform connectivity in v1. Dubnium enables the daemon and CLI, but enrollment is manual until secrets and OAuth policy are settled.

First Activation

Build and switch the workstation configuration:

sudo nixos-rebuild switch --flake .#workstation

Enroll the node manually:

sudo tailscale up

Follow the browser/device login flow. Do not pass --ssh, --advertise-routes, or --advertise-exit-node for v1.

Verification

Check the daemon:

systemctl status tailscaled

Check tailnet state:

tailscale status
tailscale ip -4

Regular OpenSSH can be used over the assigned tailnet IP if SSH is allowed by the host firewall and OpenSSH configuration.

vLLM Over Tailnet

Dubnium exposes vllm.service on port 8000 over the Tailscale interface only. From another tailnet machine, use the node’s Tailscale IP or MagicDNS name:

curl http://<dubnium-tailnet-name>:8000/v1/models

The local alias ai.dubnium is a host-local convenience entry on Dubnium. To use that same name from other machines, add a tailnet DNS/hosts alias that points ai.dubnium at the Dubnium node’s Tailscale IP.

Deferred Automation

Automatic enrollment should use services.tailscale.authKeyFile only after a future ADR accepts unattended enrollment under the runtime-secret policy in ADR-0009. The intended future shape is:

services.tailscale.authKeyFile = "/run/secrets/tailscale-auth-key";

OAuth or auth-key enrollment should be paired with explicit key scope, expiration, tagging, and rotation decisions.

Deferred Routing Options

Subnet router support would require:

services.tailscale.useRoutingFeatures = "server" or "both"
sudo tailscale up --advertise-routes=...
Tailscale admin approval for the advertised routes
firewall, forwarding, and reverse-path-filtering review

Exit-node support would require:

services.tailscale.useRoutingFeatures = "server" or "both"
sudo tailscale up --advertise-exit-node
Tailscale admin approval
stronger trust and privacy review, because the node can carry client traffic

Deferred Tailscale SSH

Tailscale SSH is not enabled in v1. If enabled later, it should be tied to a written Tailscale ACL policy and explicit operator intent.

Future manual enrollment would use:

sudo tailscale up --ssh

Future declarative enrollment could add:

services.tailscale.extraUpFlags = [ "--ssh" ];

Until that policy exists, use regular OpenSSH over the tailnet IP.

Runbook: Transition Testing

Status: living

Use this after the machine can boot the flake-managed desktop baseline.

Preflight

mode status
mode current
mode desired
systemctl status desktop.target
systemctl status compute.target
systemctl status vllm.service

The expected baseline is:

observed state is desktop
vLLM is inactive
no transition lock is held
latest transition is not failed
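
Part of this baseline can be checked mechanically. The sketch below covers only the first two items and takes the observations as arguments, because the exact output of the mode CLI is not specified here; it assumes mode current prints the bare mode name and relies on systemctl is-active printing inactive for a stopped unit. baseline_ok is a hypothetical helper.

```shell
# Sketch: evaluate the baseline from already-collected observations.
# The "desktop" string is an assumption about what `mode current` prints.
baseline_ok() {
    observed_mode=$1
    vllm_state=$2
    ok=0
    if [ "$observed_mode" != desktop ]; then
        echo "observed mode is '$observed_mode', expected desktop" >&2
        ok=1
    fi
    case $vllm_state in
        inactive) ;;
        *) echo "vllm.service is '$vllm_state', expected inactive" >&2; ok=1 ;;
    esac
    # Transition lock and latest-transition status still need a manual check.
    return "$ok"
}
```

A call could look like baseline_ok "$(mode current)" "$(systemctl is-active vllm.service)". The transition lock and the latest transition result still need to be read from mode status.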

Test Studio Overlay

sudo mode request studio-local
mode status
systemctl status studio-local-policy.service
systemctl status audio-priority.service
sudo mode request desktop
mode status

Expected result:

studio-local is observed only while both overlay services are active
returning to desktop stops both overlay services
vLLM remains inactive

Test Compute Promotion

Before testing:

close REAPER and active low-latency audio work
avoid foreground long-running user jobs
expect the graphical session to terminate

sudo mode request compute
mode status
systemctl status compute.target
systemctl status vllm.service

Expected result:

observer reports compute or an explicit degraded/failed state
graphical session is absent or non-authoritative
vLLM is active if enabled
guard and transition records explain any block or failure

Test Desktop Return

sudo mode request desktop
mode status
systemctl status vllm.service

Expected result:

observer reports desktop
vLLM is inactive
graphical/session path is usable

If rollback only partially restores desktop, classify it as degraded rather than successful.

Runbook: Failed Transition Recovery

Status: living

Use this when mode status reports failed-transition, a degraded state, or a post-action observation mismatch.

Inspect State

mode status
mode current --refresh
sudo cat /run/mode-controller/last-transition.json
sudo cat /run/mode-controller/last-guards.json
journalctl -u 'mode-controller@*' -b

Classify the Failure

Common buckets:

guard policy block, such as active audio or unsafe user jobs
guard execution error, such as missing nvidia-smi
graphical session did not terminate
GPU release predicate did not pass
vLLM failed to start or stop
target isolation stopped required services
post-action observation remained conflicted

Recover to Desktop

If the system is not in the middle of an active transition:

sudo mode request desktop
mode status

Success requires observer confirmation, not just successful systemd commands.

If desktop recovery fails:

inspect journalctl -b
inspect display-manager/session logs
stop compute-only services manually only if their ownership is clear
consider rebooting to the v1 boot default, desktop

Record Evidence

For every failure worth keeping:

final mode status
last transition JSON
last guards JSON
relevant systemd unit status
whether rollback restored desktop
whether the failure suggests runtime switching is insufficient

Repeated GPU release or desktop restoration failures should trigger specialisation/reboot-mediated compute evaluation.
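
Capturing the JSON evidence could be scripted along these lines. This is a sketch: collect_transition_evidence and its arguments are hypothetical, and the file names are taken from the Inspect State commands in this runbook.

```shell
# Sketch: copy transition evidence into a timestamped directory.
# collect_transition_evidence and its arguments are hypothetical.
collect_transition_evidence() {
    state_dir=${1:-/run/mode-controller}
    dest_root=${2:-.}
    dest="$dest_root/transition-evidence-$(date -u +%Y%m%dT%H%M%SZ)"
    mkdir -p "$dest" || return 1
    for name in last-transition.json last-guards.json; do
        if [ -r "$state_dir/$name" ]; then
            cp "$state_dir/$name" "$dest/$name"
        else
            # Record the gap instead of failing, so partial evidence survives.
            echo "unreadable: $state_dir/$name" >> "$dest/MISSING"
        fi
    done
    # Print the evidence directory so the caller can add more files to it.
    echo "$dest"
}
```

Files under /run do not survive a reboot, so run this before rebooting to the desktop default. Root privileges are likely needed to read the state files, as the sudo cat commands in Inspect State suggest. The remaining items, such as final mode status and unit status, can be written into the printed directory by hand.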

WSL Documentation Boundary

Dubnium uses WSL in two different ways, and the docs should keep those roles separate.

WSL As Build Environment

Use WSL as a convenient Linux build environment. This includes:

building the custom installer ISO
preparing the local seed-model bundle
running Nix commands that do not need bare-metal hardware

This role does not imply the wsl host target is being installed or validated. For the installer flow, WSL prepares artifacts and the platform writer prepares the USB unless the USB disk is deliberately exposed to WSL.

Primary docs:

Build Installer Artifacts From WSL
Custom Installer USB
Fresh Install Checklist
Model Seeding

WSL As Validation Target

Use the .#wsl host target only inside an existing nix-community/NixOS-WSL distro. This target validates shared Dubnium module composition and activation before touching the real workstation. It keeps resource-heavy services such as vllm and k3s disabled by default.

This role does not prove bare-metal behavior. Passing WSL validation does not prove EFI, bootloader, Hyprland, audio, or final GPU behavior for .#workstation.

Primary docs:

WSL Bring-Up
ADR-0007: WSL Is a Headless Validation Target

Boundary Rules

Keep bare-metal install steps in fresh-install and custom-installer docs.
Keep .#wsl activation and validation steps in the WSL bring-up runbook.
Do not use the fresh-install checklist for WSL bring-up.
Do not use WSL results as proof that workstation hardware configuration is correct.
When a command is meant to run inside WSL, label it as WSL or Bash.

Build Installer Artifacts From WSL

Status: living

Use this when the Dubnium installer ISO and seed-model bundle should be prepared from an existing WSL distro.

This is only a build workflow. It is not .#wsl host activation and does not validate the WSL target.

Boundary

build the ISO and prepare the seed model here
write the USB with the platform’s guarded writer unless the USB disk is deliberately exposed to the WSL distro

Build

Enter the Nix-capable WSL distro:

wsl -d NixOS

Inside the distro:

cd /path/to/dubnium

git status --short
git -C external/dotfiles status --short

scripts/build-installer-iso.sh \
  --iso ./dubnium-installer.iso

The script prepares the current Dubnium default seed bundle when no existing materialized bundle is detected. Use --seed-model to point at a different bundle, --no-seed-download to require an existing bundle, or --no-seed-model to build installer-only media.

Write The USB

After the ISO exists in the shared checkout, use the platform writer from the custom installer runbook. For Windows PowerShell:

.\scripts\write-installer-usb.ps1 `
  -IsoPath .\dubnium-installer.iso `
  -DiskNumber 7 `
  -ExpectedFriendlyName "USB SanDisk 3.2Gen1" `
  -SeedModelPath ..\models\selected-model-bundle

Each writer still checks the USB disk identity and requires the typed erase confirmation.

Runbook: WSL Bring-Up

Status: living

Use this when the target environment is the wsl host, running inside an existing nix-community/NixOS-WSL distro.

This is separate from the bare-metal install and first-bring-up flow because the commands, platform assumptions, and validation steps are materially different.

This runbook assumes you are already using the community WSL base:

nix-community/NixOS-WSL
setup docs: https://nix-community.github.io/NixOS-WSL/

The dubnium .#wsl target layers on top of that base. It is not a replacement for the initial NixOS-WSL installation process.

When To Use This

Use this runbook when:

you are already inside the NixOS WSL distro
you want to switch that distro to dubnium’s .#wsl target
you want to validate shared Dubnium wiring in WSL before touching the bare-metal workstation target

Do not use this runbook for:

bare-metal install
hosts/workstation/hardware-configuration.nix generation
EFI or bootloader validation
Hyprland or audio/studio validation

Preconditions

WSL is installed on Windows
a NixOS WSL distro based on nix-community/NixOS-WSL already exists and boots successfully
this repo is available inside the distro. Examples:

/mnt/c/Users/<user>/Projects/dubnium
~/src/dubnium

flakes are available, either through system config or explicit flags

Success Criteria

nixos-rebuild switch --flake .#wsl succeeds inside the WSL distro
git is available from the switched system generation
mode status, mode current, and mode desired work
dubnium.k3s.enable and dubnium.vllm.enable evaluate to false
compute.target evaluates without pulling in k3s or vllm.service
the runtime state directory exists at /run/mode-controller

1. Enter The NixOS WSL Distro

If you do not already have a working nix-community/NixOS-WSL distro, stop here and install that first. This runbook starts after that base is already in place.

Enter the distro:

wsl -d NixOS

Inside the distro, go to the repo:

cd /path/to/dubnium
pwd
git status --short

Use the actual checkout path for the machine. Avoid hardcoding personal paths in reusable docs or scripts.

2. Evaluate The WSL Target

If your shell does not already have flakes enabled, use explicit flags:

nix --extra-experimental-features "nix-command flakes" flake show .

Confirm the wsl target appears in the output:

nixosConfigurations.wsl

Optional targeted checks:

nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.wsl.config.wsl.enable
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.wsl.config.dubnium.boot.defaultMode
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.wsl.config.dubnium.k3s.enable
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.wsl.config.dubnium.vllm.enable
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.wsl.config.systemd.targets.compute.wants

Expected:

wsl.enable = true
default mode is compute
dubnium.k3s.enable = false
dubnium.vllm.enable = false
compute.target has no vllm.service dependency

wsl.enable = true confirms that the wsl host uses the upstream community WSL module, not an ad hoc local WSL implementation.
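The expected values above can be compared mechanically. The following is a minimal sketch, not part of the repo: expect_value and check_wsl_option are hypothetical helper names, and the sketch assumes that nix eval --json prints booleans as true or false.

```shell
# Hypothetical helper: compare an observed value with the expected one.
# Prints one result line; returns 0 on match and 1 on mismatch.
expect_value() {
  local label="$1" expected="$2" actual="$3"
  if [ "$actual" = "$expected" ]; then
    printf 'ok   %s = %s\n' "$label" "$actual"
  else
    printf 'FAIL %s: expected %s, got %s\n' "$label" "$expected" "$actual"
    return 1
  fi
}

# Hypothetical helper: evaluate one option of the wsl host and compare it.
# Returns 2 when the evaluation itself fails.
check_wsl_option() {
  local attr="$1" expected="$2" actual
  actual=$(nix --extra-experimental-features "nix-command flakes" \
    eval --json ".#nixosConfigurations.wsl.config.$attr") || return 2
  expect_value "$attr" "$expected" "$actual"
}

# Example calls, run from the repo root inside the WSL distro:
#   check_wsl_option wsl.enable true
#   check_wsl_option dubnium.k3s.enable false
#   check_wsl_option dubnium.vllm.enable false
```

A non-zero result here only says something about the flake expression. It says nothing about the running distro.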

3. Switch The Running Distro To `.#wsl`

Use:

sudo nixos-rebuild switch --flake .#wsl

If flakes are not enabled globally in the current shell:

sudo nixos-rebuild switch --extra-experimental-features "nix-command flakes" --flake .#wsl

This is the main WSL install/activation command.

Keep config truth and live runtime truth separate. A targeted nix eval proves the flake expression, while nixos-rebuild switch and systemctl prove the running distro. If a full switch fails with an environmental WSL error, record that separately from whether the flake evaluated correctly.

4. Verify Dubnium Runtime Basics

Check mode/runtime state:

mode status
mode current
mode desired
sudo ls -la /run/mode-controller

Check that the heavy services are not part of the lightweight WSL profile:

systemctl status k3s --no-pager
systemctl status vllm --no-pager

Both units should be absent or inactive in the default WSL target. Enable them intentionally in a local override only when the task is specifically to validate their WSL runtime behavior.

The WSL target is currently interpreted as follows:

wsl is headless
compute is the default desired mode
k3s and vllm are disabled by default to keep WSL activation lightweight
workstation-only graphics/audio expectations do not apply here

5. Known Differences From `workstation`

Important differences:

.#wsl assumes the distro itself was originally created with nix-community/NixOS-WSL
do not run nixos-generate-config --dir ./hosts/workstation for WSL testing
do not expect .#workstation to build cleanly until a real bare-metal hardware config has replaced the placeholder
do not use the fresh-install checklist for WSL bring-up
do not treat WSL results as proof of EFI, bootloader, Hyprland, or audio correctness

The wsl target is for:

flake composition
lightweight activation
shared Dubnium control-plane behavior

6. Common Failure Buckets

flakes not enabled in the current shell
repo is present but the running system has not been switched to .#wsl
Windows PATH injection adds noisy warnings during WSL startup
if repo tooling looks missing after switch, check git --version
Home Manager activation can fail if the active WSL login user does not match the configured Home Manager home; check whoami, getent passwd, and /etc/wsl.conf before changing modules
mode desired/current state is seeded but not yet reconciled automatically at boot

If .#workstation fails during WSL development, first check whether the failure comes from the placeholder hosts/workstation/hardware-configuration.nix rather than from the wsl target.

WSL Documentation Boundary
First Bring-Up
Fresh Install
Custom Installer ISO
First Bring-Up Checklist

Dual-Mode NixOS Workstation / AI Node

Unified Planning + Mode State Machine Document (v0.3 — Living)

1. Purpose

Design a single NixOS system that operates as a policy-driven multi-mode host with support for future workload externalization:

Desktop / Dev workstation
Optional local Studio / Audio profile
Compute / Headless AI node

The broader workstation environment may also externalize selected capabilities, especially Studio/Audio, to a separate machine such as a Mac mini.

The system must support:

low-latency audio workloads (DAW / live)
GUI desktop usage via Hyprland
GPU inference via vLLM
k3s control-plane duties for Micrantha Laboratory / Hyperion
explicit, auditable, reproducible transitions between modes

This document defines:

planning assumptions
architectural boundaries
host-local mode definitions
capability placement model
invariants
state machine
guards and guard functions
source-of-truth model
reconciliation model
implementation mapping to systemd
design alternatives and tradeoffs

2. Core Principles

2.1 Modes Are Operational Contracts

A mode is not just a set of enabled services. A mode defines:

resource ownership
permitted workloads
latency/throughput expectations
security posture
transition preconditions

2.2 Explicit Over Implicit

Mode transitions should be:

explicit when possible
observable
reversible
logged
idempotent

Automation may request a transition, but the controller must decide whether it is safe.

2.3 Latency and Throughput Are Competing Objectives

Desktop / Studio-Local optimize for responsiveness and bounded latency
Compute optimizes for throughput and hardware utilization

The design must not pretend both can be maximized simultaneously.

2.4 One Physical Host, Multiple Logical Planes

This system is treated as:

one shared substrate hosting multiple logical operating modes

2.5 Declarative First, Runtime Reconciliation Second

NixOS declares steady-state intent and system structure
a mode controller reconciles runtime state toward desired operational mode

2.6 Host-Local Modes Must Survive Capability Relocation

The host-local state model should remain coherent even if some capabilities, especially Studio/Audio, move to another machine.

3. System Overview

flowchart TD
    HW[Hardware]

    subgraph BaseOS[NixOS Base Layer]
        Kernel
        Drivers[NVIDIA / CUDA]
        Network
        Storage
        Nix
        systemd
    end

    subgraph Control[Mode Control Plane]
        Desired[Desired State]
        Current[Current State]
        Reconcile[Reconciler]
        Guards[Guard Checks]
    end

    subgraph LocalModes[Host-Local Modes]
        Desktop[Desktop / Dev]
        StudioLocal[Studio-Local / Audio-Priority]
        Compute[Compute / Headless]
    end

    subgraph Placement[Capability Placement]
        StudioCap[Studio Capability]
        AICap[AI Capability]
        PlatformCap[Platform Capability]
    end

    subgraph Workloads[Workloads]
        Hyprland
        PipeWire
        Reaper
        vLLM
        k3s
    end

    HW --> BaseOS
    BaseOS --> Control
    Control --> LocalModes
    LocalModes --> Workloads
    LocalModes --> Placement

4. Mode Definitions and Capability Placement

This document distinguishes between:

host-local operational modes for the NixOS machine
capability placement for functions that may later move to another machine

4.1 Host-Local Modes

Desktop / Dev Mode

Intent

Balanced interactive mode for programming, office work, light desktop use, and bounded AI.

Properties

GUI enabled
audio enabled for ordinary desktop use
GPU0 reserved for display/compositor
GPU1 may be used by AI workloads
vLLM constrained to single-GPU operation or disabled
k3s control plane may remain active
CPU/RAM contention must remain bounded

Studio-Local / Audio-Priority Profile

Intent

A stricter local operating profile for low-latency audio work when Studio remains on the NixOS host.

Properties

modeled as a protected interactive profile closely related to Desktop
GUI enabled
audio stack prioritized
display GPU reserved exclusively for desktop responsibilities
AI workloads disabled or reduced to near-zero
heavy I/O and background maintenance jobs disallowed
scheduler and system policy biased toward stable audio behavior

Design note

This profile is conditional and potentially temporary. It exists so the NixOS host can support local audio/studio workflows now, without assuming that Studio remains a first-class local mode permanently.

Implementation note

For the first implementation pass, studio-local should be modeled as a policy overlay on desktop, not as a first-class top-level systemd target. The operational state still exists in the controller/state model, but its enactment should initially be handled by marker/helper units layered onto the desktop path.

Compute / Headless Mode

Intent

Throughput-oriented headless mode for AI serving and platform duties.

Properties

GUI disabled
audio stack off or irrelevant
both GPUs available to AI workloads
vLLM may use both GPUs
k3s workloads may run more aggressively
CPU/RAM/storage can be utilized much more aggressively than in interactive modes

4.2 Capability Placement Model

Certain capabilities may be placed either:

locally on the NixOS host
externally on another machine

Capability: Studio / Audio

Possible placements:

local
external-mac-mini

Capability: AI / Inference

Expected placement:

primarily local-nixos-host

Capability: Platform / k3s Control

Expected placement:

primarily local-nixos-host

4.3 Design Implication

The host-local state machine should remain valid even if Studio/Audio is moved to a Mac mini. That means Studio-specific policy should be represented as a local profile or conditional mode, not as the permanent center of the entire host architecture.

5. Resource Ownership Model

5.0 Implementation Note — Hardware-Tolerant Bring-Up

The architecture should continue to plan for the intended dual-GPU topology, but the NixOS implementation should remain tolerant of transitional hardware states while the second GPU is not yet installed or configured.

That means:

the policy model may still describe the intended two-GPU end state
module options should encode planned GPU ownership explicitly
active service profiles must only reference GPUs that are currently present
missing future hardware must not cause ordinary evaluation or steady-state services to fail unnecessarily

5.1 GPU Ownership

Mode	GPU0	GPU1
Desktop	Display / compositor	AI optional
Studio-Local	Display / compositor (protected)	AI off or minimal
Compute	AI	AI
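The ownership table, combined with the hardware-tolerance rule in 5.0, can be sketched as a lookup that intersects the planned AI GPUs with the GPUs that are actually present. This is an illustrative sketch only: planned_ai_gpus and ai_gpus_for_mode are hypothetical names, indices 0 and 1 stand for GPU0 and GPU1, and Studio-Local is treated as "AI off" rather than "minimal".

```shell
# Planned AI GPU indices per mode, following the ownership table.
planned_ai_gpus() {
  case "$1" in
    desktop)      echo "1" ;;    # GPU0 stays with the display/compositor
    studio-local) echo "" ;;     # assumption: AI off in this sketch
    compute)      echo "0 1" ;;  # both GPUs available to AI
    *)            return 1 ;;
  esac
}

# Intersect the planned GPUs with the space-separated list of present GPUs.
# Prints a comma-separated list, which is empty when nothing usable is present.
ai_gpus_for_mode() {
  local mode="$1" present="$2" planned gpu out=""
  planned=$(planned_ai_gpus "$mode") || return 1
  for gpu in $planned; do
    case " $present " in
      *" $gpu "*) out="${out:+$out,}$gpu" ;;
    esac
  done
  printf '%s\n' "$out"
}

# Example: only GPU0 installed so far.
#   ai_gpus_for_mode compute "0"   prints 0
#   ai_gpus_for_mode desktop "0"   prints an empty line (no AI GPU yet)
```

The result has the same shape as a CUDA_VISIBLE_DEVICES value, so a service profile could consume it directly. The list of present GPUs could come from a query such as nvidia-smi --query-gpu=index --format=csv,noheader. How the real modules detect hardware is not specified here.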

5.2 CPU Ownership

Shared via cgroups/systemd slices
interactive slices retain priority/headroom in Desktop and Studio-Local
compute slices may saturate cores in Compute

5.3 Memory Ownership

bounded AI memory usage in Desktop
stricter constraints in Studio-Local
relaxed/high utilization in Compute

5.4 Storage Ownership

heavy background I/O restricted in Studio-Local
permitted but bounded in Desktop
broadly permitted in Compute

5.5 Audio Ownership

effectively exclusive in Studio-Local
protected in Desktop
not guaranteed in Compute

6. Invariants

These are system-level properties that must remain true regardless of transition path or future Studio placement.

6.1 Safety Invariants

At most one host-local operational mode is authoritative at a time.
A transition must either complete to a stable target state or abort back to a known-safe prior state.
Mode transitions must be idempotent. Re-running a transition toward an already-satisfied state must not cause harm.
When Studio-Local is active, heavyweight compute workloads must not materially jeopardize audio latency.
Compute mode must not require a running graphical session.
GPU0 must not be simultaneously treated as both protected display GPU and unrestricted compute GPU.
The controller must not promote the system into Compute if guard failures indicate active user/audio risk.
The system must always expose a way to determine current mode, desired mode, and last transition result.
The host-local mode model must remain coherent if Studio/Audio capability is externalized to another machine.

6.2 State Invariants

Desired state is authoritative intent.
Current state is observed runtime fact.
Reconciliation moves current state toward desired state; it never rewrites observed state to match wishful intent.
A guard failure blocks transition, but does not silently change desired state unless policy explicitly says so.

6.3 Operational Invariants

Models and mutable runtime data must live outside the Nix store.
Dotfiles may influence user experience, not machine-critical mode policy.
Mode policy must remain expressible and inspectable via systemd and Nix configuration.
Capability placement decisions must not silently invalidate host-local invariants.

7. Desired State vs Current State

7.1 Desired State

The host-local mode the user or automation wants the system to be in.

Examples:

desktop
studio-local
compute

7.2 Current State

The host-local mode the system is actually in, as determined by observation.

Examples:

graphical target active, PipeWire active, vLLM limited → likely desktop
graphical target inactive, compute services active, both GPUs exposed to AI → likely compute
GUI active, audio priority raised, compute services reduced → likely studio-local
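The observation examples above can be sketched as a small classifier over observed facts. This is a simplified illustration, not the controller's actual logic: classify_mode and unit_active are hypothetical names, and reducing the observation to three yes/no facts is an assumption.

```shell
# Classify the host-local mode from three observed facts.
# Arguments are "yes" or "no": GUI active, audio priority raised,
# AI workloads holding both GPUs.
classify_mode() {
  local gui="$1" audio_priority="$2" dual_gpu_ai="$3"
  if [ "$gui" = yes ] && [ "$audio_priority" = yes ]; then
    echo studio-local
  elif [ "$gui" = yes ]; then
    echo desktop
  elif [ "$dual_gpu_ai" = yes ]; then
    echo compute
  else
    echo unknown
  fi
}

# Thin wrapper around systemctl: prints "yes" if the unit is active, else "no".
unit_active() {
  if systemctl is-active --quiet "$1"; then echo yes; else echo no; fi
}

# Example wiring (the third fact needs a GPU-level check and is left as "no"):
#   classify_mode "$(unit_active desktop.target)" \
#                 "$(unit_active audio-priority.service)" no
```

Returning unknown instead of guessing keeps the observation honest when the facts match no mode.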

7.3 Why This Split Matters

Without this split, the system can lie to itself:

a command says “switch to compute”
but GPU is still held by compositor
vLLM failed to scale up
audio services are still active

In that case:

desired state = compute
current state = transitioning or desktop (degraded)

The control plane must detect and reconcile this rather than assuming success.

8. Source of Truth for Mode

The system needs one authoritative representation of requested host-local mode.

8.1 Options Considered

Option A — File-Based Source of Truth

Example:

/run/mode-controller/desired
/var/lib/mode-controller/desired

Pros

simple
easy to inspect
works outside active user session
easy for scripts and systemd units

Cons

can drift from actual runtime state
needs permissions and lifecycle handling

Option B — Environment Variable Source of Truth

Example:

MODE=compute

Pros

simple for one-shot commands
easy in shell contexts

Cons

poor system-wide authority
ephemeral
fragile across sessions/reboots
bad fit for authoritative machine state

Option C — systemd State as Source of Truth

Example:

compute.target active implies desired mode is compute

Pros

tightly aligned with implementation
introspectable
avoids duplicate state stores

Cons

desired state and current state can become conflated
harder to represent “requested but not yet achieved”
recovery/abort semantics become more awkward

8.2 Recommended Model

Use a hybrid model:

Desired state source of truth: the file /run/mode-controller/desired
Current state source of truth: observed systemd/runtime facts
Transition machinery: systemd targets + controller service

This cleanly separates:

intent
observation
enforcement

8.3 Proposed Files

/run/mode-controller/desired
/run/mode-controller/current
/run/mode-controller/last-transition.json

current may be a cached observation, but observation should always be derivable from system state.
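A minimal sketch of how the desired file could be written and read safely. The helper names write_desired and read_desired and the MODE_STATE_DIR override are hypothetical. The sketch assumes the file holds a single mode name on one line.

```shell
# State directory; overridable so the helpers can be exercised outside /run.
MODE_STATE_DIR="${MODE_STATE_DIR:-/run/mode-controller}"

# Validate the mode, then write it atomically: write a temp file in the same
# directory and rename it, so readers never see a partially written value.
write_desired() {
  local mode="$1" tmp
  case "$mode" in
    desktop|studio-local|compute) ;;
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  tmp=$(mktemp "$MODE_STATE_DIR/desired.XXXXXX") || return 1
  printf '%s\n' "$mode" > "$tmp" && mv "$tmp" "$MODE_STATE_DIR/desired"
}

# Read the desired mode; fails if the file does not exist yet.
read_desired() {
  cat "$MODE_STATE_DIR/desired"
}
```

Rejecting unknown names at write time keeps invalid intent out of the source of truth. The reconciler then never has to interpret it.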

9. State Machine

9.1 States

S0: Boot

Initial state before default operating mode is established.

S1: Desktop

Interactive general-purpose mode.

S2: StudioLocal

Strict interactive low-latency local audio profile.

S3: Compute

Headless throughput-oriented mode.

S4: Transitioning

Ephemeral reconciliation state while moving toward desired mode.

S5: FailedTransition

A recoverable error state indicating that desired state was not achieved.

9.2 State Diagram

stateDiagram-v2
    [*] --> Boot

    Boot --> Desktop : default boot

    Desktop --> StudioLocal : request(studio-local)
    StudioLocal --> Desktop : request(desktop)

    Desktop --> Transitioning : request(compute)
    StudioLocal --> Transitioning : request(compute)
    Compute --> Transitioning : request(desktop)
    Desktop --> Transitioning : request(desktop) / reconcile
    StudioLocal --> Transitioning : request(studio-local) / reconcile
    Compute --> Transitioning : request(compute) / reconcile

    Transitioning --> Desktop : reached(desktop)
    Transitioning --> StudioLocal : reached(studio-local)
    Transitioning --> Compute : reached(compute)
    Transitioning --> FailedTransition : guard_fail / action_fail / timeout

    FailedTransition --> Desktop : recover(previous=desktop)
    FailedTransition --> StudioLocal : recover(previous=studio-local)
    FailedTransition --> Compute : recover(previous=compute)

9.3 Notes

Direct StudioLocal -> Compute may be allowed only through guarded reconciliation, not blind immediate promotion.
Reconciliation should be able to handle “already in desired mode” as a no-op success.
Externalized Studio capability must not require redesign of the host-local state machine; it should only disable or deprecate studio-local usage.
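The request edges of the state diagram can be sketched as a single lookup. next_state is a hypothetical name. The sketch encodes only the edges drawn in 9.2, so it rejects a request from compute to studio-local: the guard policy table in 10.3 has a row for that transition, but the diagram has no such edge.

```shell
# Next state for a mode request, following the edges of the state diagram.
# Prints the next state, or returns 1 when the diagram has no matching edge.
next_state() {
  local current="$1" requested="$2"
  case "$current:$requested" in
    desktop:studio-local) echo studio-local ;;
    studio-local:desktop) echo desktop ;;
    desktop:compute|studio-local:compute|compute:desktop)
      echo transitioning ;;
    desktop:desktop|studio-local:studio-local|compute:compute)
      echo transitioning ;;   # reconcile toward the mode already requested
    *) return 1 ;;
  esac
}
```

A controller built this way would treat a non-zero return as an unsupported request and leave the desired state untouched.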

10. Guards

Guards are explicit check functions. They return exit codes and optionally structured diagnostics.

10.1 Guard Interface

Each guard function should follow a predictable interface:

check_<name>
exit 0     = pass
exit 10-19 = policy failure / guard blocked
exit 20-29 = check execution error / indeterminate

Guards should ideally emit structured diagnostics, as JSON or key=value pairs, to stdout/stderr so they can be logged.
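The interface can be sketched as a classifier for exit codes plus a small runner. classify_guard_exit and run_guard are hypothetical names. Reading the two failure classes as the ranges 10 to 19 and 20 to 29 is an assumption based on the codes listed for the individual guards.

```shell
# Map a guard exit code onto the outcome classes of the guard interface.
classify_guard_exit() {
  local code="$1"
  if [ "$code" -eq 0 ]; then
    echo pass
  elif [ "$code" -ge 10 ] && [ "$code" -le 19 ]; then
    echo blocked      # policy failure
  elif [ "$code" -ge 20 ] && [ "$code" -le 29 ]; then
    echo error        # check could not be executed / indeterminate
  else
    echo unknown
  fi
}

# Run one guard function by name, print a key=value diagnostic line,
# and pass the guard's exit code through to the caller.
run_guard() {
  local name="$1" code=0
  "$name" || code=$?
  printf 'guard=%s exit=%s result=%s\n' \
    "$name" "$code" "$(classify_guard_exit "$code")"
  return "$code"
}
```

Keeping blocked and error apart lets the controller distinguish "policy says no" from "could not tell". The two cases usually call for different operator responses.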

10.2 Guard Set

G1: check_audio_idle

Purpose:

verify that no active low-latency local audio session would make the compute transition unsafe

Possible checks:

no active REAPER process
no active PipeWire/JACK graph beyond baseline

Exit codes:

0 pass
10 audio active
20 unable to inspect audio graph
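A minimal sketch of this guard, covering only the process check. It assumes the DAW runs under the process name reaper (overridable through the hypothetical AUDIO_PROCESS_NAME variable) and that pgrep is available. The PipeWire/JACK graph check is deliberately left out.

```shell
# Guard sketch: pass (0) when no DAW process is running, block (10) when one
# is, and report an inspection error (20) when the check cannot be performed.
check_audio_idle() {
  local rc=0
  if ! command -v pgrep >/dev/null 2>&1; then
    echo "guard=check_audio_idle reason=pgrep_missing"
    return 20
  fi
  # pgrep exits 0 on a match, 1 on no match, and higher on errors.
  pgrep -x "${AUDIO_PROCESS_NAME:-reaper}" >/dev/null 2>&1 || rc=$?
  case "$rc" in
    0) echo "guard=check_audio_idle reason=audio_process_running"; return 10 ;;
    1) return 0 ;;
    *) echo "guard=check_audio_idle reason=pgrep_failed rc=$rc"; return 20 ;;
  esac
}
```

A complete guard would also inspect the audio graph. A DAW that is open but idle still blocks in this sketch, which errs on the side of protecting the audio session.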

G2: check_gpu_display_released

Purpose:

verify display/compositor has released GPU before compute promotion

Possible checks:

no active Hyprland session
no relevant graphical GPU consumers

Exit codes:

0 pass
11 display GPU still owned by GUI
21 GPU inspection failure

G3: check_cpu_load_safe

Purpose:

ensure the transition does not start during obviously heavy local activity when policy requires the host to quiesce first

Exit codes:

0 pass
12 CPU load too high
22 unable to inspect load

G4: check_user_jobs_safe

Purpose:

detect known long-running interactive/user jobs that should block auto-transition

Possible checks:

selected process patterns
optional allowlist/denylist

Exit codes:

0 pass
13 user jobs active
23 inspection failure

G5: check_memory_headroom

Purpose:

ensure sufficient memory exists to perform transition or launch target services

Exit codes:

0 pass
14 insufficient headroom
24 inspection failure

G6: check_vllm_drainable

Purpose:

ensure compute workloads can be safely reduced when returning to Desktop/Studio-Local

Exit codes:

0 pass
15 compute workload not drainable
25 inspection failure

G7: check_studio_capability_local

Purpose:

verify that local Studio capability is still available on the NixOS host before allowing studio-local

Possible checks:

local policy flag indicates studio capability still hosted locally
local audio stack and workflow prerequisites are not intentionally disabled due to externalization

Exit codes:

0 pass
19 requested local studio capability not available
29 inspection failure

10.3 Guard Policy by Transition

Transition	Required Guards
Desktop -> StudioLocal	check_target_reachable, check_studio_capability_local, check_user_jobs_safe (optional policy), compute downscale checks
StudioLocal -> Desktop	check_target_reachable
Desktop -> Compute	check_target_reachable, check_audio_idle, check_gpu_display_released, check_cpu_load_safe, check_user_jobs_safe, check_memory_headroom
StudioLocal -> Compute	check_target_reachable, check_audio_idle, check_gpu_display_released, check_cpu_load_safe, check_user_jobs_safe, check_memory_headroom
Compute -> Desktop	check_target_reachable, check_vllm_drainable, check_memory_headroom
Compute -> StudioLocal	check_target_reachable, check_studio_capability_local, check_vllm_drainable, check_memory_headroom
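The table can be sketched as a lookup from transition to an ordered guard list, plus a runner that stops at the first failing guard. guards_for_transition and run_guards are hypothetical names. For Desktop -> StudioLocal the sketch omits the optional check_user_jobs_safe and the unnamed compute downscale checks. Returning 20 for an unknown transition is an assumption.

```shell
# Ordered guard list per transition, following the guard policy table.
guards_for_transition() {
  case "$1" in
    "desktop->studio-local")
      echo "check_target_reachable check_studio_capability_local" ;;
    "studio-local->desktop")
      echo "check_target_reachable" ;;
    "desktop->compute"|"studio-local->compute")
      echo "check_target_reachable check_audio_idle check_gpu_display_released check_cpu_load_safe check_user_jobs_safe check_memory_headroom" ;;
    "compute->desktop")
      echo "check_target_reachable check_vllm_drainable check_memory_headroom" ;;
    "compute->studio-local")
      echo "check_target_reachable check_studio_capability_local check_vllm_drainable check_memory_headroom" ;;
    *) return 1 ;;
  esac
}

# Run the guards in order; stop at the first non-zero exit and return it.
run_guards() {
  local transition="$1" guards guard code
  guards=$(guards_for_transition "$transition") || return 20
  for guard in $guards; do
    code=0
    "$guard" || code=$?
    if [ "$code" -ne 0 ]; then
      echo "transition=$transition failed_guard=$guard exit=$code"
      return "$code"
    fi
  done
}
```

Stopping at the first failure keeps the logged reason unambiguous. A variant that runs every guard and reports all failures would suit diagnostics better.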

11. Actions and Transition Semantics

Actions are the concrete operations used to move from one state to another.

11.1 Action Vocabulary

stop/terminate GUI session
isolate a target
stop/start units
wait for quiescence
update desired/current state files
restart services with different environment/policies

11.2 Action Interface

Each action should return:

0 success
non-zero failure with logged reason

12. Exact Transition Mapping to systemd Operations

This is the implementation-oriented mapping.

12.1 Assumptions

Systemd targets:

desktop.target
compute.target

studio-local is intentionally not a first-class target in v1. It is represented as a desktop overlay through studio-local-policy.service and audio-priority.service.

Supporting services:

mode-controller.service
vllm.service
k3s.service
pipewire.service / user session services
graphical session manager or direct Hyprland session

Helper oneshot services/scripts:

mode-prepare-compute.service
mode-prepare-desktop.service
mode-prepare-studio-local.service
mode-observe.service

12.2 Desktop -> StudioLocal

Desired change

desired mode file = studio-local

systemd operations

systemctl start mode-controller@studio-local.service (controller instance for the studio-local target)
controller runs guard set for Desktop -> StudioLocal
controller verifies local Studio capability still exists
controller stops or constrains AI workloads as needed
- v1 policy: systemctl stop vllm.service
controller isolates or verifies desktop.target
controller starts studio-local-policy.service
controller starts audio-priority.service
controller updates current state observation

Example exact operations

write /run/mode-controller/desired = studio-local
systemctl start mode-controller@studio-local.service
systemctl stop vllm.service
systemctl isolate desktop.target
systemctl start studio-local-policy.service
systemctl start audio-priority.service

12.3 StudioLocal -> Desktop

Desired change

desired mode file = desktop

systemd operations

write desired state
start controller
restore normal interactive policies
optionally allow bounded AI services
stop audio-priority.service
stop studio-local-policy.service
systemctl isolate desktop.target
update current observation

Example exact operations

write /run/mode-controller/desired = desktop
systemctl start mode-controller@desktop.service
systemctl stop audio-priority.service
systemctl stop studio-local-policy.service
systemctl isolate desktop.target

12.4 Desktop -> Compute

Desired change

desired mode file = compute

systemd operations

write desired state
start controller for compute
run guards:
- check_target_reachable
- check_audio_idle
- check_gpu_display_released (or prepare to release)
- check_cpu_load_safe
- check_user_jobs_safe
- check_memory_headroom
if interactive session exists, controller requests/forces session termination
- loginctl terminate-session <id>
wait until compositor releases GPU
stop or de-prioritize audio services if needed
stop desktop-specific services not wanted in compute
set service environment/profile for dual-GPU vLLM
systemctl isolate compute.target
start/restart vllm.service
verify current state

Example exact operations

write /run/mode-controller/desired = compute
systemctl start mode-controller@compute.service
loginctl terminate-session <desktop-session>
systemctl stop graphical-session.target   # only if the design uses it; in stock systemd this is a user-level unit (systemctl --user)
systemctl isolate compute.target
systemctl restart vllm.service
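The "wait until compositor releases GPU" step needs a bounded wait so that a stuck session becomes a failed transition and not a hang. A minimal sketch, where wait_until is a hypothetical helper and the one-second polling interval is an arbitrary choice:

```shell
# Poll a condition command until it succeeds or the timeout (in seconds)
# expires. Returns 0 once the condition holds and 1 on timeout.
wait_until() {
  local timeout="$1" waited=0
  shift
  until "$@"; do
    [ "$waited" -ge "$timeout" ] && return 1
    sleep 1
    waited=$((waited + 1))
  done
}

# Example: give the compositor up to 30 seconds to release the GPU before
# isolating compute.target.
#   wait_until 30 check_gpu_display_released || echo "gpu still held by GUI"
```

On timeout the controller would record the failing step and enter the failed transition path. It would not proceed to systemctl isolate compute.target.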

12.5 Compute -> Desktop

Desired change

desired mode file = desktop

systemd operations

write desired state
start controller for desktop
run guards:
- check_target_reachable
- check_vllm_drainable
- check_memory_headroom
drain/stop or downscale vLLM
constrain compute workloads
systemctl isolate desktop.target
start GUI path
ensure GPU0 reserved for display
start/restore audio path
verify current state

Example exact operations

write /run/mode-controller/desired = desktop
systemctl start mode-controller@desktop.service
systemctl stop vllm.service              # or restart single-GPU profile
systemctl isolate desktop.target

12.6 StudioLocal -> Compute

Two possible policies:

Policy A — direct guarded transition

Allowed if all compute guards pass and Studio-Local resources are cleanly relinquished.

Policy B — normalize through Desktop first

Transition path:

studio-local -> desktop -> compute

Recommendation: Use Policy A in implementation, but conceptually treat it as the same reconciliation pipeline with stricter guards.

13. Reconciliation Model

13.1 Motivation

A single request such as mode request compute must not be assumed to have succeeded. The system should:

record desired mode
observe current state
compare desired vs current
compute required transition plan
execute actions
re-observe
either declare success or enter failed transition state

13.2 Reconciliation Loop

flowchart TD
    Req[Request mode] --> Write[Write desired state]
    Write --> Observe[Observe current state]
    Observe --> Compare{Desired == Current?}
    Compare -->|Yes| Done[No-op success]
    Compare -->|No| Plan[Select transition plan]
    Plan --> Guards[Run guards]
    Guards -->|Fail| Fail[Record failure]
    Guards -->|Pass| Act[Execute actions]
    Act --> Reobserve[Observe current state again]
    Reobserve --> Verify{Reached desired?}
    Verify -->|Yes| Success[Record success]
    Verify -->|No| RetryOrFail[Retry boundedly or fail]

13.3 Reconciliation Semantics

bounded retries only
no infinite loops
every failure is logged with:
- desired state
- prior state
- failing guard or action
- timestamp
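The loop and its bounded-retry rule can be sketched as follows. reconcile is a hypothetical function. The observe and apply steps are injected as function names so the sketch stays independent of systemd, and the default of three attempts is an assumption. The failure line does not name the failing guard or action, which a real controller would add.

```shell
# Reconcile toward a desired mode with bounded retries.
#   $1 desired mode
#   $2 observe function: prints the current mode
#   $3 apply function: called with the desired mode, attempts the transition
#   $4 maximum number of attempts (default 3)
reconcile() {
  local desired="$1" observe_fn="$2" apply_fn="$3" max_attempts="${4:-3}"
  local attempt=0 prior current
  prior=$("$observe_fn")
  current="$prior"
  while [ "$current" != "$desired" ]; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      printf 'result=failed desired=%s prior=%s current=%s attempts=%s timestamp=%s\n' \
        "$desired" "$prior" "$current" "$attempt" \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
      return 1
    fi
    attempt=$((attempt + 1))
    # An apply failure is not trusted either way; only re-observation decides.
    "$apply_fn" "$desired" || true
    current=$("$observe_fn")
  done
  printf 'result=success desired=%s prior=%s attempts=%s\n' \
    "$desired" "$prior" "$attempt"
}
```

When desired already equals current, the loop body never runs and the function reports success with attempts=0, which is the no-op success described above.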

13.4 Why This Matters

This lets you support:

manual requests
idle-triggered auto-switching
boot-time default mode
recovery after partial failures

all through one mechanism.

14. Specialisations vs Runtime Switching

This is the main architectural fork.

14.1 Option A — Runtime Switching Only

Use one host definition with multiple systemd targets and runtime policies.

Pros

fast transitions
no reboot required
best UX for switching between Desktop and Studio-Local
simpler for day-to-day operation

Cons

weaker isolation
harder to fully guarantee all services/resources are cleanly re-bound
risk of state leakage between modes
some kernel/driver tuning differences are awkward live

Best fit

Desktop <-> Studio-Local
Desktop <-> Compute where flexibility matters more than hard isolation

14.2 Option B — NixOS Specialisations Only

Use separate NixOS specialisations for Desktop and Compute (and possibly Studio-Local).

Pros

stronger isolation between role profiles
easier to vary deeper system settings, kernel params, service sets
clearer recovery story
closer to “logical separate machines”

Cons

slower transitions, often reboot-oriented in practice
poorer UX for frequent switching
more configuration duplication risk if not structured well

Best fit

Desktop vs Compute if you want very strong separation
not ideal for rapid Studio-Local toggling

14.3 Option C — Hybrid Model

Use:

runtime switching for Desktop <-> Studio-Local
specialisation boundary between Interactive and Compute families

Example:

default specialisation = interactive
- runtime modes inside it: desktop, studio-local
compute specialisation = headless compute

Pros

strongest overall architecture
preserves good UX for Studio-Local transitions
lets Compute differ more deeply if needed
handles future externalization of Studio more cleanly than treating Studio as a permanent top-level host identity

Cons

more design complexity
transition from interactive to compute may become reboot-oriented or at least heavier
more machinery to maintain

14.4 Recommendation

For your current goal, use runtime switching first, with the design shaped so it can later evolve into a hybrid model.

Reasoning

you need to learn actual contention boundaries first
Desktop <-> Studio-Local benefits heavily from live switching
Desktop <-> Compute can start as runtime-switched
if the system proves too “sticky” or leaky, you can later promote Compute into a specialisation without redesigning the higher-level state machine
if Studio moves to a Mac mini, the host-local model remains intact

Practical recommendation

Phase the design like this:

Phase 1: one host, runtime switching only
Phase 2: strong slices/targets/guards
Phase 3: evaluate whether Compute should become a specialisation
Phase 4: if Studio is externalized, deprecate or disable studio-local without changing the operator-facing control model

This preserves velocity while keeping the abstraction clean.

15. Service Placement

15.1 Host-Level Services

Hyprland
PipeWire
Reaper
NVIDIA drivers/runtime
mode controller
possibly vLLM initially
SSH / system services

15.2 k3s-Level Services

Hyperion services
platform/orchestration services
dashboards and supporting workloads
possibly model-serving abstractions later

First-pass implementation note

In v1, prefer keeping k3s.service continuously available while varying:

platform.slice resource budgets
which workloads are allowed to run aggressively
how much local compute capacity cluster workloads may consume

This is preferable to stopping and starting the cluster runtime during ordinary mode transitions.

15.3 Externalized Services (Possible Future)

Studio/Audio workflows on Mac mini
DAW/plugin-heavy sessions
live audio interfaces and controllers

15.4 Recommendation

Keep hardware-near, latency-sensitive, and GPU-debug-sensitive components on the host first. Move services into k3s only after the host-level mode model is stable. Treat Mac mini externalization as a placement decision, not as a redesign trigger for the host-local state machine.

16. Idle Detection Policy

16.1 Role of Idle Detection

Idle detection is an input signal to the reconciler, not an authority in its own right.

16.2 Signals

input inactivity
audio activity
GPU utilization / ownership
CPU load
selected user-job checks

16.3 Policy

Idle-triggered promotion to Compute should:

update desired state to compute
run the normal reconciliation pipeline
abort safely if guards fail

It must never bypass guards.

16.4 Studio-Local Policy

Auto-promotion from studio-local to compute should generally be disabled unless explicitly requested. This remains true even if Studio capability later moves off-box.

17. Security Boundaries

Zones

user desktop zone
system service zone
AI workload zone
cluster service zone
optional external Studio zone

Controls

bind services to appropriate interfaces
keep secrets outside dotfiles, e.g. SOPS/agenix
keep mode control operations privileged and auditable
do not let externalized capability assumptions silently weaken host-local controls

18. Risks and Failure Modes

18.1 Audio Degradation

Cause:

background contention

Mitigation:

Studio-Local invariants
strict guard/action policy

18.2 GPU Contention

Cause:

compositor and AI workloads racing for ownership

Mitigation:

explicit GPU ownership model
guard checks before Compute promotion

18.3 Partial Transition

Cause:

GUI exits but vLLM fails to restart
desired state written but current state never converges

Mitigation:

reconciliation loop
bounded retries
failed-transition state
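
The "bounded retries" mitigation could be sketched as a small shell helper. This is an illustration only: the name retry_bounded and its arguments are hypothetical, and the source does not fix an attempt count or delay.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command up to max_attempts times, pausing
# delay_seconds between tries. Returns 0 on the first success, otherwise
# the exit status of the last failed attempt.
retry_bounded() {
  local max_attempts=$1 delay_seconds=$2
  shift 2
  local attempt=1 status=0
  while :; do
    "$@" && return 0
    status=$?
    if [ "$attempt" -ge "$max_attempts" ]; then
      return "$status"
    fi
    attempt=$((attempt + 1))
    sleep "$delay_seconds"
  done
}

# Example (unit name taken from this document):
#   retry_bounded 3 2 systemctl start vllm@compute.service
```

Once the attempts are exhausted, the caller would record the failure so that observation can report failed-transition instead of retrying indefinitely.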

18.4 Configuration Drift

Cause:

policy split across ad hoc scripts and dotfiles

Mitigation:

keep mode policy in Nix + systemd-controlled scripts

18.5 Capability Drift

Cause:

Studio capability moved to Mac mini, but local state machine or guards still assume it is local

Mitigation:

explicit capability placement model
check_studio_capability_local
ADR-backed deprecation path for studio-local

19. Open Questions

Should vLLM be host-managed or profile-switched through separate unit templates?
When should Compute graduate into a NixOS specialisation?
How strict should auto-transition be about user jobs and unsaved work heuristics?
Should current state be derived on demand only, or also cached to /run/mode-controller/current?
At what point should local Studio capability be considered officially externalized to a Mac mini?
What data/project sync model is required if Studio is split across machines?

19.1 Resolved Near-Term Decision

For v1:

studio-local is not a first-class target
studio-local is represented as a protected interactive policy overlay on desktop
desktop and compute are the only first-class top-level target families

This keeps the first implementation smaller while preserving the higher-level operational model and leaving room to strengthen Studio semantics later if needed.

19.2 Future Alternatives

Alternative A — Keep `studio-local` as an overlay permanently

Pros:

less target duplication
easier future deprecation if Studio moves to a Mac mini
simpler runtime switching model

Cons:

weaker systemd-level separability
more policy encoded in helper units and controller logic

Alternative B — Promote `studio-local` into a first-class target later

Pros:

stronger explicitness in systemd
easier inspection of Studio-specific dependencies
potentially clearer resource-policy boundaries

Cons:

higher maintenance cost
more duplication with desktop
less aligned with the likely future externalization path

Recommendation

Start with the overlay model. Revisit only if empirical evidence shows that audio-protection policy is too hard to express or validate without a dedicated target.

19.3 Resolved Near-Term Decision — vLLM Service Shape

Target architecture:

vllm@desktop.service
vllm@compute.service

However, for the first implementation pass, a single vllm.service is acceptable if:

desktop and compute profiles are still modeled explicitly in configuration
controller actions remain profile-aware
observation logic can still determine which profile is active

This allows the first bootable milestone to stay small without locking the architecture into a monolithic service model.

19.4 Resolved Near-Term Decision — k3s Service Shape

For v1:

k3s.service should remain stable across host-local modes
mode differences should be expressed through:
- slice/resource budgets
- workload-placement or workload-intensity policy
- optional node labels/taints later

This keeps the control plane smaller and avoids coupling every host-mode transition to cluster-runtime teardown and recovery.

Future alternative

If empirical operation shows that stable-across-modes k3s still creates unacceptable interference or ambiguity, stronger k3s mode switching can be introduced later. That should be treated as a deliberate escalation, not the default starting point.

19.5 Resolved Near-Term Decision — Desktop AI Policy

For v1:

keep vLLM off in desktop for the first convergence milestone
prove desktop ↔ compute transitions before enabling bounded desktop-mode AI

Future alternative

After the control plane is reliable, bounded desktop-mode AI may be introduced as an explicit profile with clear GPU1 ownership and resource limits.

19.6 Resolved Near-Term Decision — `studio-local` Overlay Shape

For v1, represent studio-local with:

studio-local-policy.service
audio-priority.service

This gives the controller and observation logic a clear marker plus an explicit enforcement unit without promoting Studio into a first-class top-level target.

Future alternative

If this proves too implicit, studio-local can later be promoted into a stronger grouped target or target-like overlay.

19.7 Resolved Near-Term Decision — Capability Placement Source

For v1, capability-placement.json should be generated from Nix configuration rather than edited ad hoc at runtime.

Rationale

keeps placement policy reproducible
avoids silent runtime drift
matches the design goal that machine-critical policy remain inspectable in Nix and systemd-managed artifacts

Future alternative

If operational experimentation later requires it, an explicit runtime override layer may be added with well-defined precedence and auditability.

19.8 Resolved Near-Term Decision — `mode force`

For v1, defer mode force.

Rationale

keeps attention on making the ordinary reconciliation path correct
avoids masking immature guard or transition logic
reduces the chance of bypassing safety boundaries during initial bring-up

Future alternative

Add mode force later only after hard-vs-soft guard semantics are stable and well tested.

19.9 Resolved Near-Term Decision — GUI Teardown Semantics

For v1, compute promotion should require:

graphical session absence
explicit GPU-release verification

It should not initially depend on forcibly stopping every greeter or display-manager path unless empirical testing shows those components interfere with reliable GPU handoff.

19.10 Resolved Near-Term Decision — Desktop Target Ownership

For v1, desktop.target should not directly own the greeter/login path.

Rationale

keeps mode ownership focused on operational policy rather than full session-manager orchestration
reduces coupling to whichever login/session stack is chosen
lets session presence remain an observed fact rather than an aggressively managed requirement

Future alternative

If desktop recovery proves unreliable without tighter control, greeter or display-manager paths can later be pulled under stronger mode ownership.

19.11 Resolved Near-Term Decision — `studio-local-policy.service` Scope

For v1, studio-local-policy.service should be:

a reliable marker for observation/classification
a light policy-application unit
explicitly limited in scope

It should not become a giant all-in-one Studio behavior controller.

Rationale

preserves clear observability
avoids burying controller logic inside a catch-all helper unit
keeps Studio overlay behavior inspectable and decomposable

19.12 Resolved Near-Term Decision — `observe-current` Implementation Language

For v1, implement observe-current in shell.

Constraints

keep the output contract stable:
- plain mode name for shell use
- structured JSON for diagnostics
structure the implementation so it can later be replaced by a typed helper without changing callers

Future alternative

If classifier complexity or JSON handling becomes unwieldy, replace only the classifier implementation with a small typed helper while keeping the same external contract.

19.13 Resolved Near-Term Decision — `mode` CLI Packaging

For v1:

keep the script sources in the repository
package them in pkgs/
install them through the NixOS module

Rationale

keeps the tool packaging clean and testable
avoids scattering ad hoc scripts directly into module definitions
preserves a clean path to reuse across hosts later

19.14 Resolved Near-Term Decision — Reconciler Trigger Model

For v1:

use parameterized oneshot reconciliation only
do not enable timer-driven or path-triggered background reconciliation yet

Rationale

keeps failure behavior easier to understand during bring-up
avoids masking transition bugs behind background retries
lets manual transitions prove the model first

Future alternative

After manual transitions are reliable, add periodic or path-triggered reconciliation for self-healing behavior.

19.15 Resolved Near-Term Decision — Boot Policy

For v1:

normalize to desktop on boot
do not replay persistent desired mode across reboot

Rationale

gives the system a predictable safe recovery posture
avoids booting directly back into a problematic compute path while the controller is still maturing
keeps early operational behavior easier to reason about

Future alternative

Once transitions are reliable, desired-state persistence across reboot can be introduced as an explicit policy feature.

19A. Architectural Decision Record — Potential Studio Externalization

Context

There is a realistic possibility that low-latency Studio/Audio workloads will migrate from the NixOS machine to a Mac mini.

Decision

The NixOS host architecture should treat Studio as a conditional local profile (studio-local) rather than a permanently central host mode.

Consequences

the host-local state machine remains stable if Studio moves off-box
Compute and Desktop remain the durable primary host-local modes
Studio capability can be represented separately through workload placement decisions
local audio support can still exist now without overcommitting the architecture to a permanent local Studio role

Follow-on Design Implications

add check_studio_capability_local guard for any studio-local transition
keep local audio policy isolated from core Compute/Desktop mechanics where practical
document future sync, control, and workflow boundaries if Studio becomes externalized

20. Control Interface and Implementation Contract

20.1 `mode` CLI Contract

The system should expose a single operator-facing interface:

mode status
mode request <desktop|studio-local|compute>
mode reconcile
mode current
mode desired
mode explain <desktop|studio-local|compute>
mode dry-run <desktop|studio-local|compute>
mode force <desktop|studio-local|compute>

Command Semantics

`mode status`

Returns:

desired mode
observed current mode
whether reconciliation is needed
last transition result
blocking guard failures, if any

`mode request <mode>`

Behavior:

write desired state
invoke reconciliation
return success only if reconciliation converged

`mode reconcile`

Behavior:

observe current state
compare to desired
select transition plan
run guards
execute actions
record results

`mode current`

Returns only the observed current mode.

`mode desired`

Returns only the desired mode file contents.

`mode explain <mode>`

Prints:

target state properties
expected services
resource ownership rules
guards required for entering that mode
capability placement assumptions, where relevant

`mode dry-run <mode>`

Simulates the full reconciliation plan without mutating state.

`mode force <mode>`

Privileged path that bypasses selected soft guards. It must never bypass hard safety guards, such as GPU/display ownership or active audio protection, unless an override for that specific guard is deliberately designed and documented.

Implementation note:

defer this command in v1
keep it in the long-term interface contract so the design remains forward-compatible
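
A minimal sketch of how the dispatcher for this contract might look in shell. MODE_STATE_DIR, mode_main, and reconcile_cmd are illustrative names, not part of the documented contract, and only a few subcommands are filled in. Reusing exit code 18 for an invalid mode is also an assumption, borrowed from the guard exit-code convention.

```shell
#!/usr/bin/env bash
# Sketch only. MODE_STATE_DIR and reconcile_cmd are hypothetical names.
MODE_STATE_DIR="${MODE_STATE_DIR:-/run/mode-controller}"

is_valid_mode() {
  case "$1" in
    desktop|studio-local|compute) return 0 ;;
    *) return 1 ;;
  esac
}

# Placeholder for the real reconciler (observe, guard, act, record).
reconcile_cmd() { return 0; }

mode_main() {
  local cmd="${1:-}"
  case "$cmd" in
    desired)
      cat "$MODE_STATE_DIR/desired"
      ;;
    request)
      is_valid_mode "${2:-}" || { echo "invalid mode: ${2:-}" >&2; return 18; }
      printf '%s\n' "$2" > "$MODE_STATE_DIR/desired"
      reconcile_cmd   # the exit status reports whether reconciliation converged
      ;;
    reconcile)
      reconcile_cmd
      ;;
    force)
      echo "mode force is deferred in v1" >&2
      return 1
      ;;
    status|current|explain|dry-run)
      echo "$cmd: not implemented in this sketch" >&2
      return 1
      ;;
    *)
      echo "usage: mode {status|request|reconcile|current|desired|explain|dry-run|force}" >&2
      return 2
      ;;
  esac
}
```

The request branch rejects an invalid mode before writing anything, so a bad request never changes the desired state.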

21. State Storage Layout

21.1 Runtime State Paths

/run/mode-controller/
  desired
  current
  lock
  last-transition.json
  last-guards.json
  reconcile.pid
  capability-placement.json
  hardware-topology.json

21.2 File Semantics

`desired`

Contains the requested mode:

desktop
studio-local
compute

`current`

Cached observation of current state. This is convenience state only; it must be derivable from system facts.

`lock`

Used to serialize reconciliation so only one transition runs at a time.

`last-transition.json`

Stores:

requested mode
prior observed mode
final observed mode
success/failure
guard results
action results
timestamps

`last-guards.json`

Stores latest guard results for diagnostics.

`capability-placement.json`

Stores environment-level placement facts, for example:

studio: local
studio: external-mac-mini

This file is not the host-local mode source of truth. It is an environment metadata input used by guards and planning logic.

`hardware-topology.json`

Stores the currently configured hardware view, for example:

planned GPU count
currently present GPU indexes
display GPU assignment
desktop-mode AI GPU set
compute-mode AI GPU set

This allows the implementation to preserve the intended dual-GPU architecture while remaining tolerant of temporary single-GPU bring-up phases.

22. systemd Unit and Target Layout

22.1 Targets

`desktop.target`

Wants:

graphical-session target path
bounded interactive services
optional constrained AI services

First-pass implementation note:

do not make desktop.target directly own greeter/login-manager startup in v1
treat graphical session presence as an observed runtime fact
strengthen ownership later only if empirical recovery behavior requires it

`compute.target`

Wants:

headless service profile
vLLM compute profile
k3s compute-allowed policy/profile

22.2 Core Services

`mode-controller@.service`

Parameterized oneshot service.

Instance values:

mode-controller@desktop.service
mode-controller@studio-local.service
mode-controller@compute.service

Responsibilities:

load desired mode
observe current mode
run reconciliation
update state files and logs

First-pass implementation note:

use this parameterized oneshot service as the sole reconciler trigger in v1
defer timer/path-triggered background reconciliation until manual operation is proven reliable

`mode-observe.service`

Optional oneshot helper to compute observed current mode and refresh /run/mode-controller/current.

`vllm@.service`

Optional templated service for profile-specific operation:

vllm@desktop.service
vllm@studio-local.service
vllm@compute.service

Alternative:

single vllm.service with environment file switching

First-pass implementation guidance:

prefer separate desktop and compute profiles conceptually
studio-local should not require its own dedicated vLLM unit in v1 if Studio is implemented as a desktop overlay
a single vllm.service is acceptable initially if it preserves a clean migration path to templated units later
keep desktop-mode vLLM disabled for the first transition-proof milestone

`mode-guard@.service`

Optional wrapper pattern for reusable guard execution, though plain scripts may be simpler initially.

`studio-local` overlay units

Recommended first-pass representation:

audio-priority.service
studio-local-policy.service
optional environment/policy file consumed by observation and guard logic

These units should layer on top of desktop.target rather than replacing it with a distinct top-level target in v1.

Recommended scope for studio-local-policy.service:

expose a clear mode marker
apply only light, explicit Studio-specific policy
delegate heavyweight orchestration to the controller or dedicated helper units

22.3 Suggested Slice Layout

system.slice
├── interactive.slice
│   ├── graphical-session scope/services
│   ├── audio-related helpers
│   └── bounded desktop workloads
├── ai.slice
│   ├── vllm service
│   └── AI helpers
└── platform.slice
    ├── k3s service
    └── supporting infra services

Slice Intent

interactive.slice gets priority and headroom in Desktop/Studio-Local
ai.slice is heavily constrained in Studio-Local, moderately constrained in Desktop, relaxed in Compute
platform.slice remains comparatively stable but may have tighter resource budgets in interactive modes and relaxed budgets in Compute

23. Current State Observation Logic

Current state must be observed, not assumed.

23.1 Observation Inputs

GUI Indicators

graphical.target or session-specific equivalent active
active user session via loginctl
Hyprland process/session present

Audio Indicators

PipeWire user service active
active audio clients or REAPER process
optional JACK graph activity

AI Indicators

vllm*.service active
environment/profile indicates single-GPU or dual-GPU mode
optional nvidia-smi-based observation of active GPU usage

Platform Indicators

k3s.service active
optional workload-class indicators

23.2 Observation Heuristic

Observed mode should be derived using a deterministic classifier.

Proposed classifier logic

Observe `compute`

If all of the following are true:

no active graphical session
compute target active or compute service profile active
vLLM compute profile active or both GPUs assigned to AI policy

Then observed current mode = compute

Observe `studio-local`

If all of the following are true:

graphical session active
audio stack active
studio-local policy marker active
AI profile disabled or highly constrained

Then observed current mode = studio-local

Observe `desktop`

If all of the following are true:

graphical session active
desktop policy marker active
no studio-local policy marker

Then observed current mode = desktop

Observe `transitioning`

If all of the following are true:

desired != inferred stable mode
controller is running or the reconciliation lock is currently held

Then observed current mode = transitioning

Observe `failed-transition`

If all of the following are true:

last transition failed
current does not match desired
no controller currently reconciling

Then observed current mode = failed-transition
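
The classifier rules above can be sketched as two pure shell functions that take already-collected facts as arguments. The fact names (gui, compute_profile, and so on), the function names, and the "unknown" fallback are assumptions made for this illustration. Collecting the facts from systemctl, loginctl, or nvidia-smi is left out.

```shell
#!/usr/bin/env bash
# Classify the stable mode from key=value facts (1 = true, 0 = false).
# Fact names are hypothetical; unset facts default to 0.
classify_stable() {
  local gui=0 compute_profile=0 vllm_compute=0 audio=0
  local studio_marker=0 desktop_marker=0 ai_constrained=0 arg
  for arg in "$@"; do
    case "$arg" in
      gui=*|compute_profile=*|vllm_compute=*|audio=*|studio_marker=*|desktop_marker=*|ai_constrained=*)
        local "$arg" ;;
      *) echo "unknown fact: $arg" >&2; return 30 ;;
    esac
  done
  if [ "$gui" = 0 ] && [ "$compute_profile" = 1 ] && [ "$vllm_compute" = 1 ]; then
    echo compute
  elif [ "$gui" = 1 ] && [ "$audio" = 1 ] && [ "$studio_marker" = 1 ] && [ "$ai_constrained" = 1 ]; then
    echo studio-local
  elif [ "$gui" = 1 ] && [ "$desktop_marker" = 1 ] && [ "$studio_marker" = 0 ]; then
    echo desktop
  else
    echo unknown   # assumption: the source does not name a fallback
  fi
}

# Layer transitioning / failed-transition on top of the stable result.
# Args: stable desired controller_active(0|1) last_failed(0|1)
classify_observed() {
  local stable=$1 desired=$2 controller_active=$3 last_failed=$4
  if [ "$stable" != "$desired" ] && [ "$controller_active" = 1 ]; then
    echo transitioning
  elif [ "$stable" != "$desired" ] && [ "$last_failed" = 1 ]; then
    echo failed-transition
  else
    echo "$stable"
  fi
}
```

Keeping the classification separate from fact collection makes the rules testable without a running session.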

23.3 Recommendation

Use a small classifier script:

/usr/local/libexec/mode-controller/observe-current

Outputs:

plain mode name for shell use
optional JSON with evidence for debugging

First-pass implementation note:

implement this in shell first
preserve a stable output contract so the implementation language can change later without changing the control plane
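
One way to keep the two output forms behind a single entry point is sketched below. The function name emit_observation, the format argument, and the JSON field names are assumptions. The sketch does no JSON escaping, so evidence values must not contain quotes or backslashes.

```shell
#!/usr/bin/env bash
# Emit either the plain mode name or a JSON object with evidence.
# Args: format (plain|json), mode, then zero or more key=value evidence pairs.
emit_observation() {
  local format=$1 mode=$2
  shift 2
  if [ "$format" != json ]; then
    printf '%s\n' "$mode"
    return 0
  fi
  local json sep="" pair
  json="{\"mode\":\"$mode\",\"evidence\":{"
  for pair in "$@"; do
    # Key is the text before the first "=", value is everything after it.
    json="${json}${sep}\"${pair%%=*}\":\"${pair#*=}\""
    sep=","
  done
  printf '%s}}\n' "$json"
}

# emit_observation plain desktop gui=1
# emit_observation json compute gui=0 vllm=compute
```

Callers that only need the mode name never have to parse JSON, which keeps the contract stable if the classifier is later replaced by a typed helper.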

24. Guard Function Contract

24.1 Guard Naming

check_audio_idle
check_gpu_display_released
check_cpu_load_safe
check_user_jobs_safe
check_memory_headroom
check_vllm_drainable
check_graphical_session_absent
check_graphical_session_present
check_target_reachable
check_studio_capability_local

24.2 Exit Code Convention

0   pass
10  policy block: audio active
11  policy block: display GPU still owned
12  policy block: CPU load too high
13  policy block: user jobs active
14  policy block: insufficient memory headroom
15  policy block: vLLM not drainable
16  policy block: graphical session absent when required
17  policy block: graphical session present when forbidden
18  policy block: target unreachable / invalid request
19  policy block: requested local studio capability not available
20-29  execution/inspection errors
30+    internal controller misuse
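
A controller can collapse these codes into classes before deciding what to do. The sketch below assumes the execution/inspection band ends at 29, where the controller-misuse band begins. The class labels and the function name are illustrative.

```shell
#!/usr/bin/env bash
# Map a guard exit code to a class under the convention above.
# Class labels are hypothetical, not part of the contract.
guard_code_class() {
  local code=$1
  if [ "$code" -eq 0 ]; then
    echo pass
  elif [ "$code" -ge 10 ] && [ "$code" -le 19 ]; then
    echo policy-block
  elif [ "$code" -ge 20 ] && [ "$code" -le 29 ]; then
    echo execution-error
  elif [ "$code" -ge 30 ]; then
    echo controller-misuse
  else
    echo unassigned   # codes 1-9 are not defined by the convention
  fi
}
```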

24.3 Guard Output Contract

Each guard should emit a concise structured line or JSON object such as:

{"guard":"check_audio_idle","ok":false,"code":10,"reason":"reaper process active"}
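
As an illustration of this output contract, a check_cpu_load_safe guard might look like the sketch below. The default threshold of 4.0, the optional second argument that injects a load value for testing, and the use of code 20 for a read failure are assumptions.

```shell
#!/usr/bin/env bash
# Sketch of a guard that follows the exit-code and JSON-line contract.
# Args: threshold (default 4.0), optional load value (default: /proc/loadavg).
check_cpu_load_safe() {
  local threshold=${1:-4.0} load=${2:-}
  if [ -z "$load" ]; then
    read -r load _ < /proc/loadavg || {
      echo '{"guard":"check_cpu_load_safe","ok":false,"code":20,"reason":"cannot read /proc/loadavg"}'
      return 20
    }
  fi
  # awk handles the floating-point comparison; exit 0 means load < threshold.
  if awk -v l="$load" -v t="$threshold" 'BEGIN { exit !(l + 0 < t + 0) }'; then
    echo '{"guard":"check_cpu_load_safe","ok":true,"code":0,"reason":"load below threshold"}'
    return 0
  fi
  printf '{"guard":"check_cpu_load_safe","ok":false,"code":12,"reason":"1-minute load %s >= %s"}\n' \
    "$load" "$threshold"
  return 12
}
```

The exit status and the "code" field carry the same value, so shell callers and JSON consumers see one consistent result.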

24.4 Hard vs Soft Guards

Hard guards

Must never be bypassed by ordinary automation:

active audio protection for Studio-Local -> Compute or Desktop -> Compute
GPU/display ownership guard
target validity checks
local Studio capability checks for studio-local

Soft guards

May be bypassed by privileged operator action or policy:

generic CPU load threshold
selected user-job heuristics
non-critical memory thresholds
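
The hard/soft split can be expressed as a small decision function. The mapping from the hard-guard descriptions above to guard names is an inference from section 24.1, and the privileged flag is a hypothetical stand-in for an explicit operator override.

```shell
#!/usr/bin/env bash
# Assumed mapping of the hard-guard descriptions to guard names.
is_hard_guard() {
  case "$1" in
    check_audio_idle|check_gpu_display_released|check_target_reachable|check_studio_capability_local)
      return 0 ;;
    *) return 1 ;;
  esac
}

# Return 0 if this guard result blocks the transition, 1 if it does not.
# Args: guard name, guard exit code, privileged (0|1, default 0).
guard_blocks() {
  local guard=$1 exit_code=$2 privileged=${3:-0}
  if [ "$exit_code" -eq 0 ]; then
    return 1                      # guard passed: never blocks
  fi
  if is_hard_guard "$guard"; then
    return 0                      # hard guard failed: always blocks
  fi
  if [ "$privileged" -eq 1 ]; then
    return 1                      # soft guard failed, operator override
  fi
  return 0                        # soft guard failed, no override
}
```

Ordinary automation would always call guard_blocks with privileged set to 0, so it cannot bypass either kind of guard.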

25. Transition Plans with Exact Operations

This section normalizes each transition into explicit steps.

25.1 Common Transition Framework

All transitions should follow:

acquire lock
observe current state
validate requested mode
if current == desired, exit success
select transition plan
run transition guards
execute pre-actions
isolate or start target
execute post-actions
re-observe current state
record success/failure
release lock
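
The framework above might be sketched as a single shell function that holds the lock with flock(1) for the duration of one reconciliation. The hooks observe_current, run_guards, and apply_plan are hypothetical and must be supplied by the caller. The 20-23 exit codes are illustrative choices inside the execution-error band. Result recording is omitted to keep the sketch short.

```shell
#!/usr/bin/env bash
# Skeleton only. observe_current, run_guards, and apply_plan are
# hypothetical hooks that a real controller would define.
# Args: state directory, desired mode.
reconcile_once() {
  local state_dir=$1 desired=$2 current
  case "$desired" in
    desktop|studio-local|compute) ;;
    *) return 18 ;;                               # invalid request
  esac
  (
    # Exclusive lock on fd 9; fail fast if another run holds it.
    flock -n 9 || exit 20
    current="$(observe_current)" || exit 21
    [ "$current" = "$desired" ] && exit 0         # already converged
    run_guards "$current" "$desired" || exit $?   # propagate the guard code
    apply_plan "$current" "$desired" || exit 22
    current="$(observe_current)" || exit 21
    [ "$current" = "$desired" ] || exit 23        # did not converge
  ) 9>"$state_dir/lock"
}
```

Because the lock is tied to file descriptor 9 of the subshell, it is released automatically when the subshell exits, including on failure paths.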

25.2 Plan: Desktop -> Studio-Local

Preconditions

desktop currently observed
request = studio-local
local Studio capability is still hosted on the NixOS machine

Guards

check_target_reachable
check_studio_capability_local
optional check_user_jobs_safe

Exact operations

write desired=studio-local
flock /run/mode-controller/lock
observe current
run guards
systemctl start audio-priority.service      # if modeled separately
systemctl start studio-local-policy.service
observe current
record result

Notes

GUI remains up
audio policy is strengthened
AI capacity is reduced or removed
if Studio capability has been externalized, this transition must fail cleanly with an explanatory reason

25.3 Plan: Studio-Local -> Desktop

Guards

check_target_reachable

Exact operations

write desired=desktop
flock /run/mode-controller/lock
observe current
run guards
systemctl stop audio-priority.service       # if separate helper exists
systemctl stop studio-local-policy.service
systemctl isolate desktop.target
observe current
record result

25.4 Plan: Desktop -> Compute

Guards

check_target_reachable
check_audio_idle
check_cpu_load_safe
check_user_jobs_safe
check_memory_headroom

Pre-actions

terminate graphical session
wait for GUI disappearance
verify GPU/display release

Exact operations

write desired=compute
flock /run/mode-controller/lock
observe current
run initial guards
loginctl terminate-session <session-id>
wait until observe-current no longer sees graphical session
run check_gpu_display_released
systemctl isolate compute.target
systemctl start vllm@compute.service
observe current
record result

Additional notes

compute.target should conflict with the interactive/graphical targets in the target design, so that systemctl isolate compute.target stops them
GPU release must be verified after GUI shutdown, not merely assumed
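
The "wait until" steps in this plan need a bounded wait so that a stuck session leads to an abort, not a hang. A minimal polling helper is sketched below. The name wait_until and its arguments are hypothetical, and the source does not specify timeout values.

```shell
#!/usr/bin/env bash
# Poll a predicate command until it succeeds or the timeout elapses.
# Args: timeout in seconds, poll interval in seconds, then the predicate.
# Returns 0 when the predicate succeeds, 1 on timeout.
wait_until() {
  local timeout_seconds=$1 interval_seconds=$2
  shift 2
  local deadline=$(( $(date +%s) + timeout_seconds ))
  until "$@"; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep "$interval_seconds"
  done
  return 0
}

# Example, using guard names from this document:
#   wait_until 30 1 check_graphical_session_absent \
#     && check_gpu_display_released
```

A timeout here would be treated as a failed pre-action, and the transition would abort before compute.target is isolated.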

25.5 Plan: Compute -> Desktop

Guards

check_target_reachable
check_vllm_drainable
check_memory_headroom

Exact operations

write desired=desktop
flock /run/mode-controller/lock
observe current
run guards
systemctl stop vllm@compute.service         # or downscale path
systemctl isolate desktop.target
systemctl start vllm@desktop.service        # optional bounded single-GPU profile; skipped in v1, where desktop-mode vLLM stays off
observe current
record result

Notes

graphical session may be started by display manager or login path depending on design
GPU0 becomes protected for display once Desktop converges

25.6 Plan: Studio-Local -> Compute

Preferred behavior

Treat as a direct guarded transition using the same compute-entry pipeline.

Guards

check_target_reachable
check_audio_idle
check_cpu_load_safe
check_user_jobs_safe
check_memory_headroom

Exact operations

write desired=compute
flock /run/mode-controller/lock
observe current
run guards
loginctl terminate-session <session-id>
wait until graphical session absent
run check_gpu_display_released
systemctl isolate compute.target
systemctl start vllm@compute.service
observe current
record result

Policy note

Because Studio-Local is the most protected interactive mode, auto-promotion from Studio-Local to Compute should generally be disabled unless explicitly requested.

26. NixOS Specialisations vs Runtime Switching — Decision Guidance

26.1 Decision Matrix

Criterion	Runtime Switching	Specialisations	Hybrid
Desktop <-> Studio-Local speed	Excellent	Poor	Excellent
Desktop <-> Compute isolation	Moderate	Strong	Stronger
Complexity	Lower	Moderate	Highest
Early experimentation	Best	Slower	Moderate
Deep kernel/boot divergence	Weak	Strong	Strong
Operational convenience	High	Lower	Moderate
Future externalization of Studio	Good	Good	Best

26.2 Recommended Decision Rule

Adopt runtime switching now. Revisit that choice if the following conditions begin to hold:

compute mode needs materially different kernel parameters or boot-time config
graphical/interactive teardown proves unreliable in practice
GPU role handoff remains too leaky under runtime-only switching
you want Compute to be operationally closer to a dedicated server persona than a temporary mode

If any two of the above become persistent problems, promote Compute into a specialisation.

26.3 Recommended Architecture Path

Phase 1

single NixOS host definition
runtime switching only
targets + slices + controller + guards

Phase 2

strengthen target separation
gather empirical failure/latency data

Phase 3

if needed, introduce specialisation.compute
preserve same desired/current/reconcile interface so operator UX does not change

Phase 4

if Studio is externalized, deprecate or disable studio-local
retain the same operator-facing control model for the host-local system

That means mode request compute could later choose:

runtime reconcile, or
request/reboot into compute specialisation

without changing the higher-level model.

27. Recommended Next Implementation Steps

define exact systemd target dependencies/conflicts in Nix
implement mode CLI wrapper script
implement observe-current
implement guard scripts with fixed exit-code contract
choose between:
- vllm@desktop.service / vllm@compute.service
- one service with profile env file
define slice resource policies for interactive vs AI
wire idle detector to mode request compute
validate transition behavior manually before enabling automation
add a capability-placement flag/model for future Studio externalization

28. Summary

This system should behave like a reconciled state machine for host-local operational modes.

The core model is:

desired mode is explicit runtime intent
current mode is observed reality
reconciliation closes the gap
guards prevent unsafe transitions
systemd targets/services perform the actual mode enactment

The implementation should start with runtime switching, but preserve a clean path to hybrid specialisation if operational evidence justifies stronger separation later.

Studio/Audio should be treated as a conditional local profile plus a capability-placement decision, so that a future move to a Mac mini does not invalidate the host-local architecture.

Mode State Machine Design (v0.1 — Living)

Purpose

Define an explicit, enforceable state machine governing operational modes for a dual-use NixOS system (desktop + AI compute), including states, transitions, guards, and actions.

1. State Definitions

S0: Boot

Initial system state
Minimal services active
Transitions automatically to default mode

S1: Desktop (Dev)

Interactive workstation mode
Balanced resource usage
GUI + audio enabled
Limited AI workloads allowed

S2: Studio (Audio)

Strict low-latency mode
Audio prioritized
AI workloads disabled or near-zero

S3: Compute (Headless)

Throughput-oriented mode
No GUI
Full AI utilization (multi-GPU)

S4: Transitioning

Temporary state
Ensures safe handoff between modes

2. State Diagram

stateDiagram-v2
    [*] --> Boot

    Boot --> Desktop : default

    Desktop --> Studio : enter_studio
    Studio --> Desktop : exit_studio

    Desktop --> Transitioning : to_compute
    Transitioning --> Compute : success
    Transitioning --> Desktop : abort

    Compute --> Transitioning : to_desktop
    Transitioning --> Desktop : success

    Studio --> Desktop : enforced_exit

3. State Properties

Desktop

GUI: ON
Audio: ON
GPU0: Display
GPU1: AI (optional)
vLLM: constrained (1 GPU)
k3s: control plane only

Studio

GUI: ON
Audio: RT priority
GPU0: Display (exclusive)
GPU1: disabled or minimal
vLLM: OFF
k3s: minimal

Compute

GUI: OFF
Audio: OFF/minimal
GPU0 + GPU1: AI
vLLM: multi-GPU
k3s: full workloads

4. Transitions

T1: Desktop → Studio

Trigger: user command

Guards:

No active compute jobs above threshold

Actions:

Reduce/stop vLLM
Raise audio priority
Restrict background jobs

T2: Studio → Desktop

Trigger: user command

Guards: none

Actions:

Restore normal scheduling
Allow background workloads

T3: Desktop → Compute

Trigger:

manual command
idle-triggered event

Guards:

No active audio sessions (PipeWire graph empty)
No REAPER process OR project inactive
GPU not held by compositor
CPU load below threshold
No long-running user jobs

Actions:

Notify user (if interactive)
Terminate GUI session
Wait for GPU release
Stop audio services
Expand vLLM to multi-GPU
Enable compute services (k3s workloads)

T4: Compute → Desktop

Trigger: user command

Guards:

vLLM can scale down OR be stopped
GPU memory can be reclaimed

Actions:

Drain or stop AI workloads
Reduce vLLM to single GPU or stop
Start graphical target
Reassign GPU0 to display
Start audio stack

T5: Studio → Compute

Trigger: (not allowed)

Policy:

Must transition via Desktop
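
The transition rules T1 through T5 of this state machine can be captured in a lookup plus a simple router that sends disallowed direct requests through Desktop. The function names and the lowercase state names are illustrative.

```shell
#!/usr/bin/env bash
# Return 0 if this state machine defines a direct transition (T1-T4).
transition_allowed() {
  case "$1:$2" in
    desktop:studio|studio:desktop|desktop:compute|compute:desktop) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the hops needed to get from one state to another, one per line.
# Routes via desktop when no direct transition exists (T5 policy).
plan_route() {
  local from=$1 to=$2
  if [ "$from" = "$to" ]; then
    return 0
  elif transition_allowed "$from" "$to"; then
    echo "$from->$to"
  elif transition_allowed "$from" desktop && transition_allowed desktop "$to"; then
    echo "$from->desktop"
    echo "desktop->$to"
  else
    return 1
  fi
}
```

Each hop would still run its own guards, so routing Studio to Compute through Desktop does not skip the Desktop -> Compute audio checks.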

5. Guards (Detailed)

G1: Audio Idle

PipeWire graph contains no active nodes
No JACK clients

G2: GPU Availability

No compositor process using GPU
Low GPU utilization

G3: CPU Load

Load average below threshold (configurable)

G4: User Workload Safety

No known long-running dev tasks
Optional: no foreground terminals

G5: Memory Headroom

Sufficient free RAM for mode switch

6. Actions (Atomic Steps)

A1: Stop GUI

loginctl terminate-session <session-id>

A2: Release GPU

Wait until no graphical processes hold GPU

A3: Adjust Services

systemctl isolate <target>

A4: Adjust Resource Limits

Modify cgroups/slices

A5: Scale AI Services

Adjust CUDA_VISIBLE_DEVICES
Restart vLLM
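
A sketch of how the per-mode GPU set might be turned into an environment line for the vLLM unit. The index assignment follows the State Properties section (GPU0 for display, GPU1 for optional desktop AI) and is an assumption until the real topology is confirmed on hardware. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Map a mode to the GPU indexes that AI workloads may see.
# An empty value hides all GPUs from CUDA applications.
cuda_devices_for_mode() {
  case "$1" in
    desktop) echo "1" ;;      # GPU0 stays reserved for the display
    studio)  echo "" ;;       # vLLM is off; expose no GPUs
    compute) echo "0,1" ;;    # both GPUs available to AI
    *) return 1 ;;
  esac
}

# Render one line suitable for a systemd EnvironmentFile.
render_vllm_env() {
  local devices
  devices="$(cuda_devices_for_mode "$1")" || return 1
  printf 'CUDA_VISIBLE_DEVICES=%s\n' "$devices"
}

# render_vllm_env compute   # prints CUDA_VISIBLE_DEVICES=0,1
```

CUDA's default device ordering can differ from the order nvidia-smi reports, so setting CUDA_DEVICE_ORDER=PCI_BUS_ID alongside this value may be needed for the indexes to line up.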

7. Failure Handling

Abort Conditions

Guard failure
Timeout waiting for GPU release
Service failure

Behavior

Log reason
Return to previous stable state

8. Observability

Required Signals

Current mode
Last transition
Guard evaluation results
Resource usage snapshot

Interfaces

CLI: mode status
Logs: journald

9. Extensibility

Future states may include:

Maintenance mode
Remote-only desktop mode
GPU-partitioned mode

10. Notes

This state machine should be implemented via systemd targets + controller script
Transitions must be idempotent
Guards should be configurable
Prefer dry-run capability before execution

Summary

This system treats operational modes as a formal state machine with:

explicit states
guarded transitions
deterministic actions

This enables safe coexistence of:

low-latency desktop workloads
high-throughput AI services

Dual-Mode NixOS Workstation AI Node — Unified Planning and Mode State Machine

Implementation Checklist Plan

This plan is structured to get from design document to bootable system with minimal thrash.

Phase 0 — Ground Truth (before touching Nix)

Hardware + constraints

Confirm GPU topology (which is GPU0 vs GPU1)
Confirm display wiring (which GPU drives monitor)
Confirm audio interface + latency requirements
Validate NVIDIA driver compatibility with NixOS + Wayland/Hyprland

Decisions to lock

Use runtime switching (no specialisations yet)
Studio = studio-local (conditional policy overlay on desktop, not a first-class target in v1)
Source of truth = /run/mode-controller/desired
mode request is synchronous: return success only after convergence
Choose vLLM unit model for v1:
- v1 fast path: single compute-only vllm.service
- target architecture: vllm@desktop.service and vllm@compute.service
k3s policy for v1:
- keep k3s.service running across modes
- change slice budgets and allowed workload intensity by mode
- defer full k3s mode switching unless operational evidence justifies it
desktop-mode AI policy for v1:
- keep vLLM off in desktop for the first convergence milestone
- only add bounded desktop-mode AI after desktop ↔ compute switching is reliable
studio-local overlay representation for v1:
- studio-local-policy.service
- audio-priority.service
capability-placement.json source for v1:
- generated from Nix configuration
- no runtime override unless a real need emerges
defer mode force in v1
GUI teardown policy for compute transitions:
- require graphical session absence
- require explicit GPU-release verification
- only add display-manager/greeter stop logic if testing proves it necessary
desktop.target should not directly own greeter/login in v1
studio-local-policy.service should be:
- a reliable marker for observation
- a light policy-application unit
- not a giant all-in-one Studio controller
observe-current implementation for v1:
- shell first
- stable plain-text + JSON output contract
- replace with typed helper later only if complexity justifies it
package mode tools in pkgs/ and install them through the module
controller trigger model for v1:
- parameterized oneshot only
- no timer/path-triggered reconcile until manual transitions are proven
boot policy for v1:
- normalize to desktop on boot
- defer persistent desired-state replay across reboot
Define hard vs soft guards before automation

Phase 0.5 — Control Contract (before full workload integration)

Runtime state contract

Define /run/mode-controller/
- desired
- current
- lock
- last-transition.json
- last-guards.json
- capability-placement.json
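
Because status readers may look at these files while a transition is writing them, updates are safer as write-then-rename. A minimal sketch, assuming plain-text desired and current files; MODE_STATE_DIR is a hypothetical override so the helpers can be exercised outside /run:

```shell
# Hypothetical override; the contract above uses /run/mode-controller.
MODE_STATE_DIR=${MODE_STATE_DIR:-/run/mode-controller}

# write_state NAME VALUE
# Writes VALUE to a temp file in the same directory, then renames it into
# place so readers never see a partially written file.
write_state() {
  name=$1
  value=$2
  mkdir -p "$MODE_STATE_DIR" || return 1
  tmp=$(mktemp "$MODE_STATE_DIR/.$name.XXXXXX") || return 1
  printf '%s\n' "$value" > "$tmp" && mv -f "$tmp" "$MODE_STATE_DIR/$name"
}

# read_state NAME
# Prints the stored value, or nothing if the file does not exist.
read_state() {
  cat "$MODE_STATE_DIR/$1" 2>/dev/null
}
```

Note that mktemp creates files readable only by their owner, so files written this way are root-only when the controller runs as root.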

CLI contract

Implement or stub:
- mode request
- mode status
- mode reconcile
- mode current
- mode desired
- mode dry-run
- mode explain
defer mode force until guard policy is battle-tested
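
A stub that accepts the full subcommand surface makes it possible to wire callers before the controller exists. A sketch only: mode_cli, its output strings, and the exit code 64 for usage errors are assumptions, and the set of requestable modes is inferred from the mode names used in this plan.

```shell
# mode_cli SUBCOMMAND [ARGS...]
# Stub dispatcher for the subcommands listed above. Hypothetical.
mode_cli() {
  cmd=${1:-status}
  case "$cmd" in
    request)
      case "${2:-}" in
        desktop|studio-local|compute)
          printf 'request %s\n' "$2"
          ;;
        *)
          printf 'usage: mode request desktop|studio-local|compute\n' >&2
          return 64
          ;;
      esac
      ;;
    status|reconcile|current|desired|dry-run|explain)
      printf 'stub: %s\n' "$cmd"
      ;;
    force)
      # Deliberately unimplemented in v1.
      printf 'mode force is deferred in v1\n' >&2
      return 64
      ;;
    *)
      printf 'unknown subcommand: %s\n' "$cmd" >&2
      return 64
      ;;
  esac
}
```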

Observation contract

Classifier can return:
- desktop
- studio-local
- compute
- transitioning
- failed-transition

Guard contract

Add check_target_reachable
Standardize exit codes
Standardize structured output
Mark guards as hard vs soft
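
One possible shape for the standardized guard contract is a shared reporter that every guard calls. The exit-code convention (0 pass, 1 soft failure, 2 hard failure), the JSON field names, the 8 GiB default, and treating memory headroom as a hard guard are all assumptions made for this sketch; the plan does not define them yet.

```shell
# guard_result NAME SEVERITY PASSED DETAIL
# Prints one JSON object and returns 0 (pass), 1 (soft fail), or 2 (hard fail).
# DETAIL is not JSON-escaped here; keep it to plain words and numbers.
guard_result() {
  name=$1
  severity=$2
  passed=$3
  detail=$4
  printf '{"guard":"%s","severity":"%s","passed":%s,"detail":"%s"}\n' \
    "$name" "$severity" "$passed" "$detail"
  if [ "$passed" = "true" ]; then return 0; fi
  if [ "$severity" = "hard" ]; then return 2; fi
  return 1
}

# check_memory_headroom [MIN_KB]
# Example guard: compares MemAvailable from /proc/meminfo with a threshold.
check_memory_headroom() {
  min_kb=${1:-8388608}
  avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo 2>/dev/null)
  if [ "${avail_kb:-0}" -ge "$min_kb" ]; then
    guard_result check_memory_headroom hard true "MemAvailable=${avail_kb}kB"
  else
    guard_result check_memory_headroom hard false \
      "MemAvailable=${avail_kb}kB below ${min_kb}kB"
  fi
}
```

Keeping the exit code and the JSON in one function means a guard cannot report "passed" in its output while returning a failing status.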

Phase 1 — Base NixOS System

Core system

Create flake repo (if not already)
Install NixOS (minimal)
Enable flakes + nix-command
Add SSH + basic hardening

GPU + CUDA

Install NVIDIA drivers (matching kernel)
Validate nvidia-smi
Validate CUDA runtime

Desktop

Install Hyprland
Configure login/session (greetd or similar)
Validate Wayland stability with NVIDIA

Audio

Install PipeWire + WirePlumber
Validate low-latency config
Test REAPER baseline

Phase 2 — systemd Mode Skeleton

Targets / policy markers

Define first-class targets:
- desktop.target
- compute.target
Define studio-local as a policy overlay on desktop
Add explicit policy marker/service for studio-local
Decide whether studio-local is represented by:
- audio-priority.service
- studio-local-policy.service layered over desktop
- another lightweight marker unit

Relationships

Add Conflicts= between:
- compute ↔ graphical targets
Add Wants= / After= dependencies

Slices

Define:
- interactive.slice
- ai.slice
- platform.slice
Assign services to slices

Phase 3 — Mode Controller (Core)

Core controller

mode-controller@.service
observe-current
reconcile
lock handling
state-file updates
dry-run path

Failure model

Record failed-transition
Record prior mode
Record guard/action failures
Verify abort-to-safe-state behavior

Phase 4 — Workload Layer

AI / vLLM

Package or install vLLM
Create profile-specific config/env for:
- desktop profile
- compute profile
Implement either:
- v1 fast path: single vllm.service
- target path: vllm@desktop.service + vllm@compute.service
keep vLLM disabled in desktop for the first bootable transition milestone
Validate single-GPU mode
Validate dual-GPU mode
Keep controller actions profile-aware so later split is mechanical

Platform / k3s

Install k3s
Configure control node
Validate cluster health
Deploy minimal workload
Keep k3s.service stable across desktop and compute in v1
Express mode differences via:
- platform.slice budgets
- workload policy / allowed intensity
- optional node labels / taints later

Phase 5 — State Observation

Implement classifier

observe-current script

Detect:

graphical session (loginctl / process)
PipeWire / audio activity
vLLM service state
GPU usage (optional: nvidia-smi)

Output

plain mode
optional JSON (debug)
classify transitioning
classify failed-transition
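
Separating fact collection from classification keeps the classifier testable without a live system. The sketch below takes already-observed facts as arguments; the rule order and the decision to report inconsistent observations as failed-transition are assumptions, not the documented observe-current behavior.

```shell
# classify_mode LOCK_HELD ACTIVE_TARGET GRAPHICAL STUDIO_MARKER
#   LOCK_HELD      yes|no  (transition lock present)
#   ACTIVE_TARGET  desktop|compute|none
#   GRAPHICAL      yes|no  (graphical session present)
#   STUDIO_MARKER  yes|no  (studio-local policy marker active)
# Hypothetical rule set for illustration.
classify_mode() {
  lock_held=$1
  active_target=$2
  graphical=$3
  studio_marker=$4
  if [ "$lock_held" = "yes" ]; then
    echo transitioning
    return
  fi
  case "$active_target" in
    compute)
      # A graphical session under compute contradicts the compute invariants.
      if [ "$graphical" = "yes" ]; then
        echo failed-transition
      else
        echo compute
      fi
      ;;
    desktop)
      if [ "$studio_marker" = "yes" ]; then
        echo studio-local
      else
        echo desktop
      fi
      ;;
    *)
      echo failed-transition
      ;;
  esac
}
```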

Phase 6 — Guards

Implement guards (scripts)

check_target_reachable
check_audio_idle
check_gpu_display_released
check_cpu_load_safe
check_user_jobs_safe
check_memory_headroom
check_vllm_drainable
check_studio_capability_local

Standardize

exit codes
JSON output
logging
hard vs soft guard policy

Phase 7 — Transition Execution

Implement transition flows

Desktop → StudioLocal
StudioLocal → Desktop
Desktop → Compute
Compute → Desktop
StudioLocal → Compute

Verify explicitly

graphical session absence before compute promotion
GPU release after GUI shutdown
vLLM profile switching
audio protection works
transitions are idempotent
failed guard returns to prior safe state
failed action records failed-transition

Phase 8 — Idle + Automation

Idle detection

implement idle signal (input + audio + load)
threshold tuning

Policy

idle → mode request compute
guard failures → no transition

Safety

never auto-promote from studio-local
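
The policy and safety rules above reduce to a small predicate. A minimal sketch; should_auto_promote and its arguments are hypothetical, and guards still run afterwards as part of mode request compute:

```shell
# should_auto_promote CURRENT_MODE IDLE_SECONDS THRESHOLD_SECONDS
# Returns 0 only when idle automation may request compute.
should_auto_promote() {
  current=$1
  idle_s=$2
  threshold_s=$3
  # Only desktop is eligible: studio-local is never auto-promoted,
  # and compute needs no promotion.
  [ "$current" = "desktop" ] || return 1
  [ "$idle_s" -ge "$threshold_s" ]
}
```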

Phase 9 — Observability

Logging

structured logs for:
- transitions
- guards
- failures

Status

mode status shows:
- desired
- current
- last transition
- blocking guards
- capability placement

Phase 10 — Hardening

Failure handling

retry logic (bounded)
failed-transition state handling
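
Bounded retry can be sketched as a wrapper with a hard attempt limit, so a persistently failing action ends in failed-transition rather than looping. retry_bounded is a hypothetical helper name:

```shell
# retry_bounded MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND up to MAX_ATTEMPTS times; returns 0 on the first success,
# 1 once the attempts are exhausted.
retry_bounded() {
  max=$1
  delay_s=$2
  shift 2
  attempt=1
  while :; do
    "$@" && return 0
    [ "$attempt" -ge "$max" ] && return 1
    attempt=$((attempt + 1))
    sleep "$delay_s"
  done
}
```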

Resource tuning

CPU quotas per slice
memory limits
I/O priority
tune platform.slice conservatively for desktop / studio-local, relaxed for compute

Security

restrict mode controller to root
audit transitions
isolate AI services

Phase 11 — Optional Evolution

If runtime switching is insufficient

introduce specialisation.compute
keep same mode interface
optionally promote studio-local overlay into a stronger first-class target only if operational evidence justifies the added complexity
consider stronger k3s mode-switching only if slice-governed steady-state behavior is inadequate

If Studio moves to Mac mini

set capability-placement.json
disable studio-local
keep controller intact

Critical Path (short version)

If you want the fastest path to something real:

Base NixOS + GPU + Hyprland
vLLM working (single GPU)
Define targets (desktop, compute)
Simple mode CLI + desired file
Hardcoded transitions (no guards yet)
Add guards + observation
Add idle automation
Add studio-local last

Where this can go wrong (worth calling out)

GPU release is the hardest boundary → don’t assume, always verify
Audio is fragile → treat StudioLocal invariants as strict
systemd isolate can surprise you → test with minimal configs first
too much cleverness early → get a dumb working version first, then refine

First Bring-Up Checklist

This is the shortest practical path to getting the first live build onto a real NixOS machine.

It assumes:

this repo is available on the target machine
the target machine is the intended workstation host
the current v1 policy remains:
- boot default = desktop
- studio-local is an overlay on desktop
- vLLM is compute-only when explicitly enabled

1. Put the Repo on the Target Machine

git clone <repo-url> /path/to/dubnium
cd /path/to/dubnium

If the repo is already local:

cd /path/to/dubnium

2. Generate Real Hardware Configuration

The scaffold currently contains a placeholder hardware file.

On the target NixOS machine:

sudo nixos-generate-config --dir ./hosts/workstation

This should populate:

hosts/workstation/hardware-configuration.nix

Review that file and make sure:

it matches the actual boot disk/filesystem layout
it does not remove the existing import structure in hosts/workstation/default.nix

3. Review Host-Specific Settings Before First Build

Check hosts/workstation/default.nix.

Important values to confirm:

networking.hostName
dubnium.hardware.presentGpus
dubnium.hardware.displayGpu
dubnium.hardware.computeGpus
dubnium.vllm.enable
dubnium.vllm.model

Current intended first live model:

Qwen/Qwen2.5-Coder-14B-Instruct

Current intended first hardware phase:

planned architecture: 2 GPUs
currently present: GPU 0
compute GPU set: [ 0 ]

4. Build Without Switching First

Build the configuration without activating it first:

sudo nixos-rebuild build --flake .#workstation

If this fails:

fix Nix evaluation issues first
do not jump into switch

Common first-failure areas:

hardware configuration mismatch
NVIDIA options
package evaluation problems
typos in host-local settings

5. Switch to the New Configuration

If the build succeeds:

sudo nixos-rebuild switch --flake .#workstation

6. Verify Core Pieces After Switch

Check the mode CLI:

mode status
mode current
mode desired

Check runtime state files:

sudo ls -la /run/mode-controller
sudo cat /run/mode-controller/desired
sudo cat /run/mode-controller/current
sudo cat /run/mode-controller/capability-placement.json
sudo cat /run/mode-controller/hardware-topology.json

Check systemd units:

systemctl status desktop.target
systemctl status compute.target
systemctl status studio-local-policy.service
systemctl status audio-priority.service
systemctl status vllm.service

Notes:

vllm.service should not be active in desktop
with default workstation settings, vllm.service should not exist until dubnium.vllm.enable = true
studio-local-policy.service and audio-priority.service should not be active unless studio-local is requested

7. Test `desktop -> studio-local`

sudo mode request studio-local
mode status
systemctl status studio-local-policy.service
systemctl status audio-priority.service

Expected result:

current mode becomes studio-local
studio-local-policy.service is active
audio-priority.service is active

Then return:

sudo mode request desktop
mode status

8. Test `desktop -> compute`

Before testing:

close REAPER
avoid active audio work
avoid long-running foreground development jobs
seed the local model bundle from USB
explicitly enable dubnium.vllm.enable = true if this test should exercise the vLLM service

Then:

sudo mode request compute
mode status
systemctl status compute.target
systemctl status vllm.service

Expected result:

graphical session is terminated
system converges to compute
if vLLM is enabled, vllm.service is started by compute.target

Important caveat:

Seed the local model bundle from USB before the first compute transition. If the bundle is absent, vLLM should fail clearly rather than relying on a first-run network download.

9. Test `compute -> desktop`

sudo mode request desktop
mode status
systemctl status vllm.service

Expected result:

vllm.service is stopped
system converges back to desktop

10. If Something Fails

Check:

mode status
sudo cat /run/mode-controller/last-transition.json
sudo cat /run/mode-controller/last-guards.json
journalctl -u 'mode-controller@*' -b
journalctl -u vllm.service -b

Most useful first diagnosis buckets:

guard blocked transition
graphical session did not terminate cleanly
GPU did not look released
vLLM service failed to start
model/runtime/CUDA issue

11. First Successful Milestone

You should consider first bring-up successful when all of the following are true:

nixos-rebuild switch --flake .#workstation succeeds
mode status works
desktop -> studio-local -> desktop works
desktop -> compute -> desktop works
last-transition.json and last-guards.json are useful for failures

At that point, the next iteration is:

tighten NVIDIA/vLLM runtime behavior
improve observe-current
tune audio-priority.service
refine slice policy
add second GPU when ready

Fresh Install Checklist

This checklist is for installing dubnium onto a machine from scratch using a NixOS live USB.

Use this when:

the target machine does not already run NixOS
you are replacing the current OS
you want the flake to be the source of truth from first boot

If the machine already runs NixOS, use docs/first-bring-up-checklist.md instead.

Each top-level step has:

Start when: what must already be true before starting the step
Outcomes: what should be true when the step is complete

1. Prepare a NixOS Installer USB

Current preferred path: use the Dubnium custom installer USB, not a stock ISO. The custom installer bakes a source export of this private repo plus external/dotfiles into the live image. Write it to USB as a raw disk image, matching Rufus “DD image mode”. Use separate writable media for a local model seed bundle.

Build the ISO and prepare the seed model:

scripts/build-installer-iso.sh \
  --iso ./dubnium-installer.iso

This writes ./dubnium-installer.iso into the checkout. By default the helper uses the current Dubnium default model bundle, but the USB layout only requires a materialized model directory with config.json and SHA256SUMS. Pass --seed-model when using a different local bundle.

Then prepare the USB with the guarded writer for the current platform.

Windows PowerShell:

.\scripts\write-installer-usb.ps1 `
  -IsoPath .\dubnium-installer.iso `
  -DiskNumber 7 `
  -ExpectedFriendlyName "USB SanDisk 3.2Gen1"

The writer requires the disk identity check and final y/N confirmation. It overwrites the whole USB disk with the ISO image.

Optional one-shot Windows path:

.\scripts\build-installer-usb.ps1 `
  -DiskNumber 7 `
  -ExpectedFriendlyName "USB SanDisk 3.2Gen1"

Optional one-shot Linux or macOS path:

bash scripts/build-installer-usb.sh \
  --disk /dev/sdX \
  --expected SanDisk

On macOS, use a whole disk such as /dev/diskN. On Linux, use a whole USB disk such as /dev/sdX, not a partition.

Manual Linux USB write path:

scripts/write-installer-usb.sh \
  --iso ./dubnium-installer.iso \
  --disk /dev/sdX \
  --expected SanDisk

Expected USB layout:

dubnium-installer.iso -> whole USB disk

Verify the installer media from whichever drive letter Windows assigns:

Test-Path I:\EFI\BOOT\BOOTX64.EFI
Test-Path I:\nix-store.squashfs
Get-Volume -DriveLetter I

Separate seed media should contain:

models/selected-model-bundle/

See docs/runbooks/custom-installer-iso.md for the full USB process and docs/runbooks/model-seeding.md for the seed bundle commands.

Start when

existing Nix-capable build machine with the Dubnium repo checkout
USB stick that can be erased
materialized model bundle is available locally, if seeding the model now
permission to run the guarded USB writer for the platform

Outcomes

custom Dubnium ISO is built from the intended flake source
USB device identity was checked by the platform helper before erase
USB has a bootable raw-written Dubnium installer image
separate seed media has the local model bundle, if seeding now
the install path requires no GitHub token, private SSH key, or Hugging Face download in the live installer

1.1 USB Security And Drift Check

Before leaving the build machine, confirm:

git status --short
git -C external/dotfiles status --short

The ISO bakes tracked flake source, including external/dotfiles, into the installer. Stage or commit intentional changes before building, and do not bake decrypted secrets, long-lived tokens, SSH private keys, local caches, or model weights into the repo.

The USB is private media. The installer payload contains private source, and separate seed media contains unencrypted model files.

1.2 Seamless USB Acceptance Check

Before booting the target, verify the prepared stick:

EFI/BOOT/BOOTX64.EFI
nix-store.squashfs
seed-media/models/selected-model-bundle/config.json
seed-media/models/selected-model-bundle/SHA256SUMS

If the model bundle is not on the USB yet, use docs/runbooks/model-seeding.md before booting the target.
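
The seed bundle part of this check can be scripted. The sketch below assumes SHA256SUMS is in the format written by sha256sum, with paths relative to the bundle directory; this document only states that config.json and SHA256SUMS must exist. verify_model_bundle is a hypothetical helper name.

```shell
# verify_model_bundle DIR
# Checks that the required files exist, then verifies the recorded checksums.
verify_model_bundle() {
  dir=$1
  for required in config.json SHA256SUMS; do
    if [ ! -f "$dir/$required" ]; then
      printf 'missing: %s/%s\n' "$dir" "$required" >&2
      return 1
    fi
  done
  # Subshell keeps the caller's working directory unchanged.
  ( cd "$dir" && sha256sum -c SHA256SUMS >/dev/null 2>&1 )
}
```

Point it at the mounted seed media, for example verify_model_bundle /path/to/seed-media/models/selected-model-bundle, and treat a non-zero exit as a reason to re-seed before booting the target.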

1.3 Stock ISO Fallback

A stock NixOS ISO remains useful for rescue, but it is not the preferred fresh Dubnium install path. If using stock media, you must bring the Dubnium source and initialized dotfiles submodule on separate private media, then install from that local checkout. Do not depend on live-session GitHub credentials for a private-repo install.

2. Boot the Target Machine From USB

Start when

prepared NixOS installer USB
physical access to the target machine
firmware access or boot-menu access

Outcomes

target machine is booted into the NixOS live environment
firmware boot mode and target disk visibility are confirmed
keyboard, display, disk visibility, and network are usable
source import tools are available before repo setup steps begin
private repo source is reachable from the live environment
optional SSH access to the live environment is available if needed

2.1 Confirm Firmware Settings

Before booting the installer, review firmware settings:

boot mode should be UEFI, not legacy/CSM
Secure Boot should be disabled unless you intentionally handle it
internal install disk should be visible
primary display GPU should be the one you expect
Above 4G decoding should be enabled if the firmware exposes it and you plan to use multiple GPUs
virtualization/IOMMU can be enabled if you expect to use it later

Do not proceed if the firmware cannot see the target install disk.

2.2 Boot From USB

Insert the USB stick into the target machine, power it on, and enter the firmware boot menu.

Common boot-menu keys:

F8
F11
F12
Esc
Del

Choose the USB entry. Prefer the UEFI entry if the firmware shows both legacy and UEFI options.

2.3 Confirm Live Environment Basics

After the NixOS live environment boots, open a terminal.

Check that the machine sees CPU, memory, disks, and network devices:

lscpu | head
free -h
lsblk -o NAME,SIZE,MODEL,TYPE,MOUNTPOINTS
ip link

Check network connectivity:

ip addr
ping -c 3 1.1.1.1
ping -c 3 github.com

If networking is not up:

connect Ethernet if available
use the graphical network manager in the GNOME ISO
on the minimal ISO, use nmtui if available:

sudo nmtui

Exit criteria:

keyboard works
display works
target install disk is visible
internet access works

2.4 Ensure Source Import Tools Are Available In the Live Environment

On the custom Dubnium installer USB, confirm the baked source helper exists:

command -v unpack-dubnium
tar --version

If unpack-dubnium is available, section 3 can use the baked source snapshot directly and does not need GitHub credentials.

Before importing the repo source from the live USB session, confirm git and basic archive tools are available:

git --version
tar --version

If git is missing and you need it for validation, run it from a temporary nix-shell in the live environment:

nix-shell -p git --run 'git --version'

If you need git for more than one command, enter a shell with it available:

nix-shell -p git
git --version

Exit criteria:

git --version succeeds in the current shell or in the shell you will use to inspect the repo
tar --version succeeds if you are extracting an archive

2.5 Optional: Enable SSH Into the Live Environment

Use this if the target machine is easier to drive from another computer. Note: If using the Custom Installer, your SSH keys may already be authorized. Otherwise:

Set a temporary password: passwd
Or add your key: mkdir -p ~/.ssh && echo "ssh-ed25519 ..." >> ~/.ssh/authorized_keys

Start SSH:

sudo systemctl start sshd
ip addr

Then connect from another machine using the live environment IP address:

ssh nixos@<target-ip>

This access is temporary and only applies to the live USB environment.

3. Make the Repo Available in the Live Environment

Start when

live NixOS environment is running
the custom installer source snapshot is available, or a separate private source export is attached to the machine

Outcomes

Dubnium repo exists in the live environment
repo contains flake.nix
repo contains hosts/workstation/default.nix
repo contains external/dotfiles/flake.nix
commands are being run from the repo root

3.0 Preferred: Unpack From Custom Installer Media

For the current one-shot install path, run the guarded installer helper:

install-dubnium-from-usb

This replaces the manual section 3 through section 9 flow for the simple unencrypted layout. The helper prints lsblk, prompts for the target whole disk, and asks for final y/N confirmation before erasing anything. Defaults are btrfs, dubnium home profile, passwd password mode, and copying the install snapshot into the installed system. Use --password-mode hash to write a host-local initial password hash before install, or --password-mode skip when another login path already exists. Use --dry-run first if disk identity is not yet obvious.

If booted from the Dubnium custom installer USB, use the baked source snapshot:

unpack-dubnium
cd ~/local/src/dubnium

This is the token-free private repo path. It does not clone from GitHub during install.

To choose the installed normal user, create hosts/workstation/user.nix before install:

{
  dubnium.user.name = "alice";
  dubnium.user.description = "Example User";
}

3.1 Alternate: Copy Source From Local Media

If you brought the repo on separate media, attach it now and identify it:

lsblk -o NAME,SIZE,MODEL,TRAN,TYPE,MOUNTPOINTS

Mount the removable media read-only if practical, then extract or copy the exported source into your working directory.

Example for a separate git archive-style export:

mkdir -p ~/installer-src
cd ~/installer-src
tar -xzf /path/to/dubnium-installer-src.tgz
cd dubnium

Example for a plain copied export tree:

mkdir -p ~/Projects
cp -a /path/to/dubnium ~/Projects/dubnium
cd ~/Projects/dubnium

This path avoids depending on live-session GitHub credentials. Prefer the custom installer payload when available, because it keeps the source path and helper behavior consistent.

3.2 Alternate: Extract A Separate Source Archive

If you are not using the current custom installer payload, bring a separate source archive and extract it to the same live-session path:

mkdir -p ~/local/src
tar -xzf /path/to/dubnium-installer-src.tgz -C ~/local/src
cd ~/local/src/dubnium

3.3 Verify Repo Contents

pwd
ls
git status --short
test -f flake.nix
test -f hosts/workstation/default.nix
test -f external/dotfiles/flake.nix

Exit criteria:

repo is present locally, whether copied, extracted, or imported
flake.nix exists
hosts/workstation/default.nix exists
external/dotfiles/flake.nix exists

4. Partition the Target Disk

Start when

target install disk is visible in lsblk
disk encryption decision is made
swap/hibernation decision is made
target disk has been positively identified and is safe to erase

Outcomes

target disk has a new GPT partition table
EFI system partition exists
root partition exists
EFI_PART and ROOT_PART point to real block devices
no partitioning commands have touched the USB installer

This repo does not yet prescribe a disk layout.

The example below uses a simple UEFI layout:

EFI system partition: 1 GiB, FAT32, mounted at /boot
root partition: rest of disk, ext4, mounted at /

This example does not create a separate /home partition and does not create a swap partition. Add those only if you deliberately want them.

This example does not enable disk encryption. If you want LUKS or a separate encrypted data layout, stop here and use a different partition/filesystem plan.

This layout has no swap, so it cannot support hibernation as written. If hibernation is required, stop here and design swap explicitly. If hibernation is not required, zram can be handled later in NixOS configuration.

4.1 Identify the Install Disk

List disks:

lsblk -o NAME,SIZE,MODEL,TRAN,TYPE,MOUNTPOINTS

Example NVMe disk:

nvme0n1  1.8T Samsung_SSD disk

Example SATA/SAS/USB-style disk:

sda  1.8T Samsung_SSD disk

Set the target disk variable:

DISK=/dev/nvme0n1

or:

DISK=/dev/sda

Important:

this must be the internal install disk
this must not be the USB installer
all data on this disk will be destroyed once partitioning begins
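
A quick heuristic can catch the most common mistake, pointing DISK at the USB installer. The sketch below reads the TRAN and RM columns from lsblk and refuses disks that look removable. It is a heuristic only and does not replace checking the model and size by eye; the function names are hypothetical.

```shell
# looks_like_removable TRAN RM
# Returns 0 when the transport is usb or the removable flag is 1.
looks_like_removable() {
  tran=$1
  rm_flag=$2
  [ "$tran" = "usb" ] || [ "$rm_flag" = "1" ]
}

# check_install_disk DISK
# Returns 0 if DISK is a block device that does not look removable,
# 1 if it looks removable, 2 if it is not a block device.
check_install_disk() {
  disk=$1
  [ -b "$disk" ] || { printf 'not a block device: %s\n' "$disk" >&2; return 2; }
  tran=$(lsblk -dno TRAN "$disk" | tr -d '[:space:]')
  rm_flag=$(lsblk -dno RM "$disk" | tr -d '[:space:]')
  if looks_like_removable "$tran" "$rm_flag"; then
    printf 'refusing: %s looks removable (tran=%s rm=%s)\n' \
      "$disk" "$tran" "$rm_flag" >&2
    return 1
  fi
}
```

Run check_install_disk "$DISK" before any destructive step and stop if it returns non-zero.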

4.2 Confirm Existing Layout

Before touching the disk:

echo "$DISK"
lsblk -o NAME,SIZE,MODEL,TYPE,FSTYPE,MOUNTPOINTS "$DISK"
sudo fdisk -l "$DISK"

Also confirm these decisions before partitioning:

disk device name
EFI size
root filesystem choice
whether you want swap or zram only
whether you want a separate /home

Minimum sane layout:

EFI system partition
root partition

Example tools:

lsblk
blkid
fdisk
parted
gdisk

Do not proceed until you are sure which disk you are installing to.

4.3 Preview and Clear Existing Signatures

Preview existing filesystem and partition signatures:

sudo wipefs -n "$DISK"

If the disk is definitely the install target, clear old signatures:

sudo wipefs -a "$DISK"

This is destructive. Do not run it against the USB installer or any disk you intend to preserve.

4.4 Create a GPT Partition Table

This is destructive. Only run it after confirming DISK.

echo "About to partition: $DISK"
lsblk -o NAME,SIZE,MODEL,TYPE,MOUNTPOINTS "$DISK"

Create the partition table and partitions:

sudo parted "$DISK" -- mklabel gpt
sudo parted "$DISK" -- mkpart ESP fat32 1MiB 1025MiB
sudo parted "$DISK" -- set 1 esp on
sudo parted "$DISK" -- mkpart primary ext4 1025MiB 100%

Ask the kernel to re-read the partition table:

sudo partprobe "$DISK"
sleep 2
lsblk -o NAME,SIZE,MODEL,TYPE,FSTYPE,MOUNTPOINTS "$DISK"

4.5 Set Partition Variables

For NVMe disks, partitions are usually named with p1 / p2:

EFI_PART="${DISK}p1"
ROOT_PART="${DISK}p2"

For SATA/SAS-style disks, partitions are usually named 1 / 2:

EFI_PART="${DISK}1"
ROOT_PART="${DISK}2"

Verify:

echo "EFI_PART=$EFI_PART"
echo "ROOT_PART=$ROOT_PART"
test -b "$EFI_PART"
test -b "$ROOT_PART"
lsblk -o NAME,SIZE,MODEL,TYPE,FSTYPE,MOUNTPOINTS "$DISK"
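
The two naming cases follow one rule: Linux inserts a p before the partition number when the disk name ends in a digit. A small helper can derive both variables from DISK without choosing a branch by hand; part_path is a hypothetical name:

```shell
# part_path DISK NUMBER
# Prints the partition device path for DISK, adding the "p" separator
# when the disk name ends in a digit (nvme0n1, mmcblk0, and similar).
part_path() {
  disk=$1
  num=$2
  case "$disk" in
    *[0-9]) printf '%sp%s\n' "$disk" "$num" ;;
    *)      printf '%s%s\n' "$disk" "$num" ;;
  esac
}
```

Usage: EFI_PART=$(part_path "$DISK" 1) and ROOT_PART=$(part_path "$DISK" 2). Still run the test -b checks above, since this only builds the name and does not confirm the device exists.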

5. Create Filesystems and Mount Them

Start when

EFI_PART points to the EFI partition
ROOT_PART points to the root partition
both partition variables have been verified with test -b

Outcomes

EFI partition is formatted FAT32
root partition is formatted ext4
root partition is mounted at /mnt
EFI partition is mounted at /mnt/boot
mount layout matches the future NixOS filesystem config

5.1 Format the Partitions

This is destructive to the selected partitions.

sudo mkfs.fat -F 32 -n NIXBOOT "$EFI_PART"
sudo mkfs.ext4 -L nixos "$ROOT_PART"

5.2 Mount the Root Filesystem

sudo mount "$ROOT_PART" /mnt

5.3 Mount the EFI Filesystem

The current host config expects systemd-boot, so mount the EFI filesystem at /mnt/boot:

sudo mkdir -p /mnt/boot
sudo mount "$EFI_PART" /mnt/boot

5.4 Verify Mount Layout

Once mounted, verify:

findmnt /mnt
findmnt /mnt/boot
lsblk -o NAME,SIZE,FSTYPE,LABEL,MOUNTPOINTS "$DISK"

Expected:

root partition mounted at /mnt
EFI partition mounted at /mnt/boot

6. Generate Hardware Configuration Into the Repo

Start when

repo is available and current shell is at repo root
target root filesystem is mounted at /mnt
target EFI filesystem is mounted at /mnt/boot

Outcomes

hosts/workstation/hardware-configuration.nix reflects the target hardware
generated filesystem entries match /mnt and /mnt/boot
placeholder hardware config has been replaced
git diff shows the hardware config change

6.1 Generate Config

From the repo root:

sudo nixos-generate-config --root /mnt --dir ./hosts/workstation

This should populate:

hosts/workstation/hardware-configuration.nix

Important:

this file must reflect the real disk layout you just mounted
this replaces the scaffold placeholder currently in the repo

6.2 Review Generated Hardware Config

sed -n '1,220p' hosts/workstation/hardware-configuration.nix

Confirm:

root filesystem points at the root partition or its filesystem label/UUID
/boot points at the EFI partition
generated imports look normal
no obvious reference to the USB installer disk exists

6.3 Confirm Git Diff

git diff -- hosts/workstation/hardware-configuration.nix

Exit criteria:

hardware config changed from placeholder to real host config
filesystem entries match the mounted target disk

7. Review Host Config Before Install

Start when

generated hardware config exists
host config exists at hosts/workstation/default.nix
hardware facts are known well enough to set GPU options accurately
login/access strategy is known

Outcomes

hostname, bootloader, SSH, GPU, vLLM, and k3s settings are reviewed
GPU settings reference only installed/visible GPUs
vLLM first-install stance is explicit
k3s first-install stance is explicit
at least one installed-system login path is known

7.1 Inspect Host Config

Check hosts/workstation/default.nix.

sed -n '1,240p' hosts/workstation/default.nix

At minimum confirm:

hostname
current GPU assumptions
vLLM model choice
any network or SSH expectations
bootloader settings
k3s enablement

Current scaffold assumptions:

boot default is desktop
studio-local is a desktop overlay
vLLM is compute-only
planned topology is 2 GPUs
currently present GPU set defaults to [ 0 ]

7.2 Confirm GPU Settings

If the target currently has only one NVIDIA GPU:

dubnium.hardware.presentGpus = [ 0 ];
dubnium.hardware.displayGpu = 0;
dubnium.hardware.computeGpus = [ 0 ];

If the target has two NVIDIA GPUs and you are ready to expose both to compute, update only after confirming nvidia-smi ordering:

dubnium.hardware.presentGpus = [ 0 1 ];
dubnium.hardware.displayGpu = 0;
dubnium.hardware.computeGpus = [ 0 1 ];

For first bring-up, prefer the most conservative accurate setting. Do not list a GPU that is not installed and visible.
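
Before editing the host config, the intended values can be sanity-checked against the rule above: the display GPU and every compute GPU must be in the present set. A minimal sketch with hypothetical helper names; the Dubnium module may already assert this at evaluation time, so treat it as a pre-install convenience only.

```shell
# gpu_list_contains "LIST" ITEM
# LIST is a space-separated set of GPU indices.
gpu_list_contains() {
  for g in $1; do
    [ "$g" = "$2" ] && return 0
  done
  return 1
}

# validate_gpu_settings "PRESENT" DISPLAY "COMPUTE"
# Returns 0 when DISPLAY and every COMPUTE index appear in PRESENT.
validate_gpu_settings() {
  present=$1
  display=$2
  compute=$3
  gpu_list_contains "$present" "$display" || {
    printf 'displayGpu %s is not in presentGpus\n' "$display" >&2
    return 1
  }
  for c in $compute; do
    gpu_list_contains "$present" "$c" || {
      printf 'computeGpus entry %s is not in presentGpus\n' "$c" >&2
      return 1
    }
  done
}
```

On a machine with working drivers, the present set can be taken from nvidia-smi --query-gpu=index --format=csv,noheader and passed as the first argument.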

7.3 Confirm vLLM Settings

Current host config disables vLLM by default so the workstation can prove the base desktop system before model/runtime work:

dubnium.vllm.enable = false;

If opting into vLLM for compute testing, set dubnium.vllm.enable = true and consider explicit first-run guardrails:

dubnium.vllm.extraArgs = [
  "--max-model-len" "8192"
  "--gpu-memory-utilization" "0.70"
  "--enforce-eager"
];

7.4 Confirm k3s Settings

Current host config has:

dubnium.k3s.enable = false;

Keep k3s disabled for the first install unless you specifically want to validate k3s during the first boot.

7.5 Confirm User and Access Settings

Before installing, confirm how you will log into the installed system:

rg -n "users\\.users|openssh|authorizedKeys|initialPassword|hashedPassword" hosts modules

The current host config enables SSH, but this checklist should not assume a normal user account exists unless the NixOS config declares it.

Choose one access strategy before install:

root password set by nixos-install
declared normal user with password or SSH key
SSH key access configured in NixOS

For the default workstation user, keep the password hash local by adding hosts/workstation/user.nix before install:

{
  users.users.ryjen.initialHashedPassword = "$y$j9T$...";
}

Generate the hash in the live environment with:

mkpasswd -m yescrypt

Do not reboot into the installed system without knowing at least one login path.

8. Optional Dry Evaluation Before Install

Start when

repo is at install-ready state
generated hardware config exists
network access is working in the live environment
Nix can evaluate flakes in the live environment

Outcomes

flake evaluation has been attempted
mode-tools package build has been attempted
any evaluation/build failure is understood before install
no unknown evaluation error is carried into nixos-install

8.1 Build the Target System

If the live environment has working Nix daemon support and networking, try:

sudo nixos-rebuild build --flake .#workstation

This is optional but useful.

If it fails:

fix evaluation problems before running the installer

8.2 Build the Mode Tools Package

nix build .#packages.x86_64-linux.mode-tools

8.3 Inspect Common Evaluation Failures

Common buckets:

hardware configuration references the wrong disk
NVIDIA package/options fail to evaluate
vLLM package is unavailable or expensive to build in the live environment
unfree packages are blocked
host option assertions fail

Exit criteria:

the flake evaluates
the system build either succeeds or fails for a known reason you have decided to accept before nixos-install

9. Install From the Flake

Start when

/mnt and /mnt/boot are mounted correctly
hardware config and host config are reviewed
dirty repo state is intentional
installed-system login path is known
repo persistence plan is explicit

Outcomes

NixOS is installed from .#workstation
bootloader installation result is known
root password or equivalent access path is established
repo is copied into the installed filesystem or a post-boot source import plan is explicit

9.1 Final Preinstall Check

Before installing:

findmnt /mnt
findmnt /mnt/boot
lsblk -o NAME,SIZE,FSTYPE,LABEL,MOUNTPOINTS "$DISK"
git status --short

Confirm:

/mnt is the target root filesystem
/mnt/boot is the target EFI filesystem
generated hardware config is present
host config is reviewed
any dirty repo state is intentional
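
The first two confirmations can be checked mechanically. The sketch below is illustrative only: check_target_mounts is a hypothetical helper name, and treating vfat as the expected EFI filesystem is an assumption based on the usual ESP format, not something this checklist states.

```shell
# Sketch: fail early if the target mounts are not what the installer expects.
# check_target_mounts is a hypothetical helper; vfat for the EFI partition
# is an assumption.
check_target_mounts() (
  root=${1:-/mnt}
  # findmnt exits non-zero when the path is not a mount point.
  root_fs=$(findmnt -n -o FSTYPE "$root") || { echo "$root is not a mount point"; exit 1; }
  boot_fs=$(findmnt -n -o FSTYPE "$root/boot") || { echo "$root/boot is not a mount point"; exit 1; }
  if [ "$boot_fs" != vfat ]; then
    echo "$root/boot is $boot_fs, expected vfat"
    exit 1
  fi
  echo "ok: $root ($root_fs), $root/boot ($boot_fs)"
)

# Usage:
#   check_target_mounts /mnt
```

The hardware config, host config, and dirty-state confirmations still need a human review.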

9.2 Confirm Repo Persistence Plan

The live USB environment is temporary. The install itself uses the live checkout at ~/local/src/dubnium, but that path does not automatically become an installed-system checkout.

If you want the flake source to be available immediately after first boot, copy the current repo into the target filesystem before installing.

Example target location:

sudo mkdir -p /mnt/home/<user>/Projects
sudo cp -a "$(pwd)" /mnt/home/<user>/Projects/dubnium

If the installed system will have a different user or home path, adjust the destination.
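
Running the copy a second time with cp -a nests the checkout inside the existing one instead of replacing it. A guarded variant could look like the sketch below. copy_repo_to_target is a hypothetical helper name, and the example destination assumes the default workstation user is ryjen with a home under /mnt/home.

```shell
# Sketch: copy the live checkout into the target filesystem exactly once.
# copy_repo_to_target is a hypothetical helper. Run it as root when the
# destination is under /mnt.
copy_repo_to_target() (
  src=$1        # live checkout, for example "$(pwd)"
  projects=$2   # for example /mnt/home/ryjen/Projects (assumed user name)
  dest="$projects/dubnium"
  # cp -a into an existing directory would nest the copy inside it,
  # so refuse instead of guessing.
  if [ -e "$dest" ]; then
    echo "refusing to overwrite existing $dest" >&2
    exit 1
  fi
  mkdir -p "$projects"
  cp -a "$src" "$dest"
  # Without .git the copy is a plain snapshot, not a checkout.
  [ -d "$dest/.git" ] || echo "warning: $dest has no .git directory" >&2
)

# Usage, from the repo root:
#   copy_repo_to_target "$(pwd)" /mnt/home/ryjen/Projects
```

Ownership of the copied files follows the live environment's user IDs, so check it after first boot.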

If you prefer not to copy from the live environment, plan how you will import the repo source again after first boot. Do not assume the live-environment checkout survives reboot.

The custom installer source payload belongs to the USB live system. It is enough to install from, but it does not automatically become a checkout on the installed system. If install-time changes need to go back to the private Dubnium repo, reconcile them after first boot using Post-Install Source Reconciliation.

9.3 Run Installer

From the repo root:

sudo nixos-install --flake .#workstation

If the installer asks for a root password, set one unless you have already configured another access path.

9.4 Capture Install Result

If install succeeds, note:

whether bootloader installation succeeded
whether any warnings appeared
whether a root password was set

If install fails, do not reboot yet. Inspect the error while still in the live environment.
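
Capturing the result is easier when the installer output is saved as it runs. The following is a sketch, not a required step: run_logged is a hypothetical helper name and the log path in the usage line is an arbitrary choice.

```shell
# Sketch: keep a copy of a command's output and preserve its exit status.
# run_logged is a hypothetical helper name.
run_logged() (
  log=$1
  shift
  # The exit status is written to a side file because the status of a
  # pipeline is that of tee, not of the command being logged.
  { "$@" 2>&1; echo "$?" > "$log.status"; } | tee "$log"
  exit "$(cat "$log.status")"
)

# Usage, from the repo root:
#   run_logged /tmp/nixos-install.log sudo nixos-install --flake .#workstation
```

A log written under /tmp lives in the live environment only. Copy it somewhere persistent, for example into the target filesystem, if you want it after reboot.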

10. Reboot Into the Installed System

Start when

nixos-install --flake .#workstation completed successfully
bootloader result is known
root password or other access path exists
no unresolved install error remains

Outcomes

machine boots from the internal disk
USB installer is removed or not selected
installed NixOS system reaches a login/session path
if boot fails, rescue path is known and documented

10.1 Unmount and Reboot

If install succeeded:

sync
sudo umount -R /mnt
sudo reboot

Remove the USB stick after the live system has shut down and before the firmware selects a boot device, so the machine boots from the internal disk.

10.2 Select Installed Disk

If the machine boots back into the USB installer:

remove the USB stick
enter firmware boot menu
select the internal disk or Linux Boot Manager

10.3 Recovery If Boot Fails

If the installed system does not boot:

boot the USB installer again
mount root and EFI partitions back under /mnt
inspect /mnt/etc/nixos and the generated hardware config
check firmware boot entries with bootctl from a chroot if needed

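Concrete rescue mount (set ROOT_PART and EFI_PART again first; shell variables from the earlier live session do not survive a reboot):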

sudo mount "$ROOT_PART" /mnt
sudo mount "$EFI_PART" /mnt/boot
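
In a rescue session it is easy to pass an empty or stale device path to mount. A guarded version of the two mounts might look like this sketch. rescue_mount is a hypothetical helper name, and the device paths in the usage line are placeholders to be replaced with what lsblk -f reports on your machine.

```shell
# Sketch: refuse to mount unless both arguments are real block devices.
# rescue_mount is a hypothetical helper. Run it as root.
rescue_mount() (
  root_part=$1
  efi_part=$2
  target=${3:-/mnt}
  for part in "$root_part" "$efi_part"; do
    if [ ! -b "$part" ]; then
      echo "not a block device: ${part:-<empty>}" >&2
      exit 1
    fi
  done
  mount "$root_part" "$target" || exit 1
  # Mount the EFI partition only after root, so the boot directory
  # resolves inside the root filesystem.
  mount "$efi_part" "$target/boot"
)

# Usage (placeholder device paths):
#   rescue_mount /dev/nvme0n1p2 /dev/nvme0n1p1 /mnt
```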

Enter the installed system:

sudo nixos-enter --root /mnt

Inside the chroot:

bootctl status
nixos-rebuild boot --flake /home/<user>/Projects/dubnium#workstation
exit

If the repo was not copied into the installed filesystem, use the path where it actually exists or import it again from your prepared source media.

11. First Boot Verification

Start when

installed system has booted from internal disk
operator can log in locally or over SSH
repo exists on the installed system or can be imported immediately

Outcomes

installed system identity is verified
repo source is available on the installed system
mode CLI works
runtime state files exist
first observed mode is desktop
vLLM and studio overlay services are inactive in desktop
NVIDIA basics are verified before any compute testing

11.1 Verify Basic System Identity

After booting the installed system:

hostname
uname -a
ip addr

11.2 Verify Repo Location

If you copied the repo before install:

test -d ~/Projects/dubnium
cd ~/Projects/dubnium
git status --short

If the repo is missing, import it now before treating the system as fully owned by the flake source.

11.3 Verify Mode CLI

mode status
mode current
mode desired

11.4 Verify systemd Units

systemctl status desktop.target
systemctl status compute.target
systemctl status vllm.service
sudo ls -la /run/mode-controller

11.5 Verify Runtime State Files

sudo cat /run/mode-controller/desired
sudo cat /run/mode-controller/current
sudo cat /run/mode-controller/capability-placement.json
sudo cat /run/mode-controller/hardware-topology.json

Expected first-boot posture:

current mode should be desktop
vLLM should not be active in desktop
studio-local-policy.service should not be active
audio-priority.service should not be active
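
The service part of the expected posture can be checked in one pass. This is a minimal sketch: check_desktop_posture is a hypothetical helper name, and the unit names are taken from the lists in this section. It does not check the mode itself, because the format of the mode-controller state files is not described here.

```shell
# Sketch: report any unit that is active when the desktop posture says it
# should not be. check_desktop_posture is a hypothetical helper name.
check_desktop_posture() (
  bad=0
  for unit in vllm.service studio-local-policy.service audio-priority.service; do
    # is-active --quiet exits 0 only when the unit is active.
    if systemctl is-active --quiet "$unit"; then
      echo "UNEXPECTED: $unit is active"
      bad=1
    else
      echo "ok: $unit is not active"
    fi
  done
  exit "$bad"
)

# Usage:
#   check_desktop_posture
```

A non-zero exit status means at least one unit was active and needs a closer look with systemctl status.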

11.6 Verify NVIDIA Before Compute Testing

Before testing compute, verify NVIDIA basics:

nvidia-smi
lsmod | grep nvidia
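
lsmod | grep nvidia matches any module whose name contains nvidia. For a yes/no answer on the core module, a check against /proc/modules is one option. This is a sketch: nvidia_module_loaded is a hypothetical helper name, and its optional argument exists only so the check can be pointed at a sample file.

```shell
# Sketch: confirm the core nvidia kernel module is loaded.
# nvidia_module_loaded is a hypothetical helper name.
nvidia_module_loaded() {
  # /proc/modules lists one module per line, module name first.
  grep -q '^nvidia ' "${1:-/proc/modules}"
}

# Usage:
#   nvidia_module_loaded && nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
```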

Do not run mode request compute from the fresh-install checklist. Compute transition testing belongs in the bring-up and transition-testing runbooks after the desktop baseline, observer, and NVIDIA runtime all look correct.