Dubnium Documentation
Use this site as the operator entrypoint for installing, bringing up, and understanding the Dubnium workstation.
Primary target: the NixOS host named workstation. WSL is a headless validation
target for shared modules and docs, not the deployed workstation.
Choose Your Path
- Installing a workstation: start with Fresh Install and Custom Installer USB.
- Bringing up an existing workstation: use First Bring-Up before transition testing.
- Enabling local inference: seed the model with Model Seeding, then follow vLLM Runtime.
- Working on persistent context: start with Persistent Context Memory Architecture, Memory Service, and the Memory Governance Contract.
- Validating in WSL: use WSL Documentation Boundary and WSL Bring-Up.
- Managing flake inputs: use Dubctl Flake Input Manager.
Current Defaults
- dubnium.vllm.enable = false: vLLM is opt-in for explicit compute testing.
- dubnium.plano.enable = false: Plano routing is opt-in until its runtime is installed and validated.
- Runtime and user secrets stay outside Nix source. See Runtime Secrets.
- User-level Home Manager configuration comes from external/dotfiles.
- Generated documentation is committed under web/docs/.
- Flake input operations use dubctl, exposed as nix run .#dubctl and installed by default on the workstation.
Install Source Contract
- Installer media labels are DUB-ISO and DUB-SEED.
- Install bootstrap uses local source from media or checkout state, not an install-time GitHub token.
- Post-install source reconciliation is explicit. See Post-Install Source Reconciliation.
Ownership Boundaries
| Owner | Responsibility |
|---|---|
| Dubnium | NixOS system config, workstation services, install media, runtime units |
| Dotfiles | Home Manager user config, user shell, user-level tool configuration |
| Runtime secret provider | Host and user secrets outside the Git-tracked Nix source |
| Model/router repos | Client policy, routing schemas, and model-router behavior |
Sanity Checks
nix flake check
sudo nixos-rebuild build --flake .#workstation
mdbook build
When building docs from Windows, run mdbook build inside the NixOS WSL distro
with mdbook and mdbook-mermaid in the shell.
Known Warnings
- mdbook-mermaid may warn about a minor mdBook version mismatch; that warning is non-fatal when the HTML backend finishes successfully.
- vLLM runtime setup should avoid broad PyTorch, audio, JAX, or TPU extras unless they are explicitly required.
Start Here
Local Inference
Memory System
- Persistent Context Memory Architecture
- vLLM Persistent Memory Prototype
- Memory Service
- Memory Data Model Specification
- Memory Governance Contract
- Anthesis Memory Envelope Examples
- vLLM Memory Phase 1 Plan
- Memory Phase 2: Governed Structured Memory
Architecture
External Sources
- ryjen/dotfiles feat/nix-migration is checked out at external/dotfiles and owns user-level Home Manager configuration.
Local Docs Viewer
This repository includes mdBook config for local browsing only. mdBook is not a Dubnium OS dependency and does not need to be installed in the target system configuration.
nix shell nixpkgs#mdbook nixpkgs#mdbook-mermaid
mdbook serve --open
Generated output goes to web/docs/.
Decisions
- ADR-0001: Runtime Switching First
- ADR-0002: Studio-Local Is a Desktop Overlay
- ADR-0003: vLLM Is Compute-Only in V1
- ADR-0004: Boot Defaults to Desktop
- ADR-0005: k3s Stays Stable Across Modes in V1
- ADR-0006: Tailscale Platform Connectivity
- ADR-0007: WSL Is a Headless Validation Target
- ADR-0008: Seed Local vLLM Model Bundles
- ADR-0009: Manage Runtime Secrets Outside Nix Source
- ADR-0010: Keep Persistent Memory Separate From vLLM Runtime
- ADR-0011: External Ownership Boundaries
Runbooks
- Custom Installer USB
- First Bring-Up
- Fresh Install
- Post-Install Source Reconciliation
- Laboratory Bootstrap
- Model Seeding
- Runtime Secrets
- vLLM Runtime
- vLLM Persistent Memory Prototype
- Memory Service
- Tailscale
- Transition Testing
- Failed Transition Recovery
- Dubctl Flake Input Manager
WSL
- WSL Documentation Boundary
- Build Installer Artifacts From WSL
- WSL Bring-Up
- ADR-0007: WSL Is a Headless Validation Target
Runbook: Fresh Install
Status: living
Use this when installing Dubnium from a NixOS live USB onto a fresh machine.
Primary checklist:
Key Rules
- Decide disk layout before writing partitions.
- After booting from the USB installer, verify the tools needed to inspect or extract the prepared repo source are available.
- Because dubnium is private, use the custom installer USB as the preferred source path instead of assuming live GitHub access will work.
- The custom installer USB bakes a source export into the live image; use unpack-dubnium to extract it to ~/local/src/dubnium.
- The same physical USB should carry the materialized model bundle on DUB-SEED, so first boot does not depend on a model-provider download.
- Generate hosts/workstation/hardware-configuration.nix from the real target mount layout.
- After first boot, reconcile install-time source changes into a normal Git checkout before treating them as repo history.
- Review host options before install.
- Boot into a desktop-default system first.
- Validate mode status before testing transitions.
First Boot Expectations
- current mode should classify as desktop
- the selected user’s Home Manager configuration should be present from the Dubnium dotfiles profile
- vLLM should not be active
- studio-local overlay services should not be active unless requested
- /run/mode-controller should exist
Do not start compute testing until the desktop baseline is observable and repeatable.
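Several of the expectations above can be checked together. The following is a minimal sketch, not part of the Dubnium tooling: the function name check_desktop_baseline is hypothetical, and it assumes that mode current prints the bare mode name, which this runbook does not confirm. It covers only the mode, vLLM, and /run/mode-controller expectations.

```shell
# Sketch: check part of the first-boot desktop baseline.
# Assumption: `mode current` prints the bare mode name (e.g. "desktop").
# The controller directory is a parameter so the check can be rehearsed elsewhere.
check_desktop_baseline() {
  local controller_dir="${1:-/run/mode-controller}"
  local failures=0

  # Current mode should classify as desktop.
  if [ "$(mode current 2>/dev/null)" != "desktop" ]; then
    echo "FAIL: current mode is not desktop" >&2
    failures=$((failures + 1))
  fi

  # vLLM should not be active in the desktop baseline.
  if systemctl is-active --quiet vllm.service; then
    echo "FAIL: vllm.service is active" >&2
    failures=$((failures + 1))
  fi

  # The mode controller runtime directory should exist.
  if [ ! -d "$controller_dir" ]; then
    echo "FAIL: $controller_dir is missing" >&2
    failures=$((failures + 1))
  fi

  # Succeed only when every check passed.
  [ "$failures" -eq 0 ]
}
```

Calling check_desktop_baseline with no arguments on the installed system returns a non-zero status while the desktop baseline is not yet observable.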
Custom Installer Quick Path
If booted from the Dubnium custom installer USB:
install-dubnium-from-usb
The one-shot command partitions and formats the selected disk, unpacks the
baked source snapshot, generates the workstation hardware config, and runs
nixos-install. By default it then sets the normal user’s password inside the
installed system with passwd; use --password-mode hash for the older
host-local hash flow or --password-mode skip when another login path already
exists. With no arguments it prints lsblk output, prompts for the target whole disk,
defaults to btrfs, copies the install snapshot into the installed system, and
requires a final y/N confirmation unless --yes is passed.
Manual path:
unpack-dubnium
cd ~/local/src/dubnium
Then follow the fresh-install checklist from the partitioning step onward and install with:
sudo nixos-install --flake .#workstation
After first boot, restore the selected model seed from the USB model bundle as described in Model Seeding.
If the install used the custom source snapshot or another export without .git
history, follow Post-Install Source Reconciliation
before committing or pushing install-time changes.
Runbook: First Bring-Up
Status: living
Use this when the target machine already runs NixOS or can build/switch from the repo.
Primary checklist:
Success Criteria
- nixos-rebuild build --flake .#workstation succeeds.
- nixos-rebuild switch --flake .#workstation succeeds.
- configctl doctor succeeds.
- mode status, mode current, and mode desired work.
- /run/mode-controller exists and contains live state files.
- desktop.target and compute.target exist.
- vLLM is inactive in desktop.
- studio-local can be requested and removed as a desktop overlay.
Immediate Failure Buckets
- generated hardware configuration does not match the host
- NVIDIA/CUDA evaluation or runtime issue
- graphical target/session mismatch
- mode controller tools not installed
- observer reports false success or conflicting state
If mode state looks wrong, prefer fixing observation before adding transition logic.
Runbook: Custom Installer USB
Status: living
Use this when installing Dubnium from private installer media without relying on GitHub credentials during the live install.
The current Dubnium installer flow writes the custom ISO to one physical USB stick as a raw disk image, matching Rufus “DD image mode” behavior:
dubnium-installer.iso -> whole USB disk
The installer image bakes an exported source snapshot of this repo and the
external/dotfiles submodule into the live system. The snapshot excludes .git
directories, so it is source content rather than a Git working copy with
history. Treat the USB as private media because it contains the private Dubnium
source.
Model seed bundles are separate from raw USB writing. Put the materialized bundle on separate media, or build it into a future image format explicitly.
What This Provides
- no GitHub token during install
- no install-time private GitHub clone
- git, jq, rsync, vim, and install helpers in the live environment
- unpack-dubnium, which unpacks the baked source snapshot to:
~/local/src/dubnium
- raw whole-disk USB writing for the custom installer ISO
Build The Installer ISO
Before baking, make sure the repo and submodule state are intentionally clean or intentionally staged. The flake source snapshot only sees tracked files.
git status --short
git -C external/dotfiles status --short
scripts/build-installer-iso.sh \
--iso ./dubnium-installer.iso
By default the script idempotently ensures that the current Dubnium default seed
bundle is available for separate seed media. The seed contract is model-agnostic: the
seed must be a materialized model directory with config.json and SHA256SUMS.
Detection first checks DUBNIUM_SEED_MODEL, then common paths beside the repo
for the current default bundle.
Use --seed-model to override detection, --no-seed-download to require a
pre-existing bundle, or --no-seed-model to build installer-only media.
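The seed contract and detection order described above can be sketched as two small shell functions. This is an illustration under stated assumptions, not the logic of scripts/build-installer-iso.sh: the function names are hypothetical, the candidate paths are passed in by the caller because the "common paths" are not listed here, and falling through to the candidates when DUBNIUM_SEED_MODEL is set but invalid is a guess.

```shell
# Returns 0 when the directory satisfies the seed contract:
# a directory containing config.json and SHA256SUMS.
is_seed_bundle() {
  [ -d "$1" ] && [ -f "$1/config.json" ] && [ -f "$1/SHA256SUMS" ]
}

# Prints the first valid bundle path.
# Order: DUBNIUM_SEED_MODEL first, then the caller-supplied candidates.
detect_seed_bundle() {
  local candidate
  for candidate in "${DUBNIUM_SEED_MODEL:-}" "$@"; do
    # Skip the empty slot left by an unset DUBNIUM_SEED_MODEL.
    [ -n "$candidate" ] || continue
    if is_seed_bundle "$candidate"; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}
```

For example, detect_seed_bundle ../models/selected-model-bundle prints the path only when the directory holds both required files.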
The script is a wrapper around this build:
nix --extra-experimental-features 'nix-command flakes' \
build .#nixosConfigurations.installer.config.system.build.isoImage
The ISO appears under:
result/iso/
The ISO build uses Nix’s flake source snapshot and bakes that source into the
installer image. This is an export-style payload: no .git directories and no
Git history.
Create A Standalone Git Export Payload
If you want a source artifact separate from the ISO, use the git-export helper:
scripts/export-installer-source.sh dubnium-installer-source.tar.gz
The helper requires the main repo and external/dotfiles submodule to be clean.
It uses git archive for both sources and writes a payload shaped like:
dubnium/
└── external/
    └── dotfiles/
This payload is useful for inspection, offline transfer, or alternate installer media. The custom ISO still bakes its own payload from the same flake source that Nix evaluates.
Verify The Baked Payload
The built payload should contain the workstation host, dotfiles submodule source, and USB helpers:
payload="$(find /nix/store -maxdepth 1 -name '*-dubnium-installer-source.tar.gz' | head -n 1)"
tar -tzf "$payload" | grep -E \
'^dubnium/(flake.nix|hosts/workstation/default.nix|external/dotfiles/flake.nix|scripts/build-installer-iso.sh|scripts/export-installer-source.sh|scripts/write-installer-usb.ps1|scripts/write-installer-usb.sh)$'
if tar -tzf "$payload" | grep -q '/\.git/'; then
  echo "unexpected .git directory in payload"
  exit 1
fi
Prepare The USB From Windows PowerShell
After building dubnium-installer.iso, use this helper only when preparing the
USB from Windows PowerShell. It writes the ISO bytes directly to the whole USB
disk, like Rufus DD image mode:
.\scripts\write-installer-usb.ps1 `
-IsoPath .\dubnium-installer.iso `
-DiskNumber 7 `
-ExpectedFriendlyName "USB SanDisk 3.2Gen1"
The script refuses to continue unless the selected disk is the expected USB
device. It overwrites the whole disk with the ISO image. -SeedModelPath is
intentionally rejected in raw mode because there is no separate writable seed
partition to copy into.
After writing, eject and reinsert the USB if Windows does not refresh the new ISO layout immediately. Verify the installer media from whichever drive letter Windows assigns:
Test-Path I:\EFI\BOOT\BOOTX64.EFI
Test-Path I:\nix-store.squashfs
Get-Volume -DriveLetter I
Prepare The USB From macOS Or Linux
The Bash helper performs the same raw whole-disk image write on macOS or Linux. Pass the whole USB disk, not a partition.
Linux example:
lsblk -o NAME,SIZE,MODEL,TRAN,TYPE,MOUNTPOINTS
scripts/write-installer-usb.sh \
--iso ./dubnium-installer.iso \
--disk /dev/sdX \
--expected SanDisk
macOS example:
diskutil list
diskutil info /dev/diskN
scripts/write-installer-usb.sh \
--iso ./dubnium-installer.iso \
--disk /dev/diskN \
--expected SanDisk
The script refuses to write non-removable media, requires the selected device
identity to contain --expected when provided, and asks for
y at the Proceed? [y/N]: prompt before erasing the disk unless --yes is passed.
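The two refusal rules above can be sketched as a pure guard function. This is a hypothetical illustration, not the code in scripts/write-installer-usb.sh: the function name and argument order are assumptions, and the caller is expected to supply the removable flag and device identity string itself.

```shell
# Sketch of the write guard: refuse non-removable media and, when an
# expected string is given, require the device identity to contain it.
# Arguments: REMOVABLE (1 or 0), IDENTITY (model/vendor text), EXPECTED (optional).
guard_usb_target() {
  local removable="$1"
  local identity="$2"
  local expected="${3:-}"

  if [ "$removable" != "1" ]; then
    echo "refusing: target is not removable media" >&2
    return 1
  fi

  # An empty EXPECTED matches any identity, so the check is skipped implicitly.
  case "$identity" in
    *"$expected"*) return 0 ;;
    *)
      echo "refusing: '$identity' does not contain '$expected'" >&2
      return 1
      ;;
  esac
}
```

On Linux the inputs could come from lsblk, for example lsblk -dno RM /dev/sdX for the removable flag and lsblk -dno MODEL /dev/sdX for the identity, with surrounding whitespace trimmed before the comparison.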
Optional One-Shot Wrappers
The older wrappers still exist for convenience, but they are not the preferred boundary:
.\scripts\build-installer-usb.ps1 `
-DiskNumber 7 `
-ExpectedFriendlyName "USB SanDisk 3.2Gen1"
bash scripts/build-installer-usb.sh \
--disk /dev/sdX \
--expected SanDisk
Use the Bash one-shot path only when the whole USB disk is visible inside the Linux environment.
Seamless USB Acceptance Check
Before leaving the build machine, verify the USB contains everything needed for a token-free install:
EFI/BOOT/BOOTX64.EFI
nix-store.squashfs
The install path should not require:
- a GitHub token
- a private SSH key
- a Hugging Face download during install when separate model seed media is used
- copying model weights into the Dubnium Git tree
Keep the USB physically private. It contains private source code in the installer payload.
Add The Model Seed Bundle
Do not copy the raw Hugging Face cache directory as the seed. The cache uses
refs, blobs, snapshots, and symlinks. Seed media should contain a normal
local model bundle.
Use separate writable media for the model seed bundle. Mount that media and copy a materialized model directory:
sudo mkdir -p /mnt/e
sudo mount -t drvfs E: /mnt/e
sudo mkdir -p /mnt/e/models
sudo rsync -a --info=progress2 \
/path/to/selected-model-bundle/ \
/mnt/e/models/selected-model-bundle/
Expected seed path:
models/selected-model-bundle/
Lightweight bundle check:
test -f /mnt/e/models/selected-model-bundle/config.json
test -f /mnt/e/models/selected-model-bundle/SHA256SUMS
See Model Seeding for creating the bundle and checksum manifest.
Install From The USB
Boot the target machine from the USB. Prefer the UEFI entry for the Dubnium installer USB.
For the guarded one-shot path, run the helper with no arguments:
install-dubnium-from-usb
The helper prints lsblk, prompts for the target whole disk, and then prompts
for install options. Defaults are btrfs for the root filesystem,
dubnium for the Home Manager machine profile, passwd for password setup,
and copying the install snapshot to /root/dubnium-install-snapshot in the
installed system.
This command erases the selected whole disk, unpacks the baked source snapshot,
generates hosts/workstation/hardware-configuration.nix, and runs:
sudo nixos-install --flake .#workstation
Use --dry-run to print the plan without touching disks. Use --user USER to
write hosts/workstation/user.nix before install. Use
--home-profile dubnium|technetium to select the Home Manager machine profile
that installs the matching ~/.config/hypr/adopted.d/machine.conf. Use
--password-mode hash to write a host-local initial password hash before
install, or --password-mode skip when another login path already exists; the
default passwd mode sets the password inside the installed system after
nixos-install. Use --no-copy-source if you do not want the install snapshot
preserved for post-install reconciliation.
The one-shot command still prints the plan and requires final confirmation:
Proceed? [y/N]:
Use --yes only for rehearsed installs where the disk identity was already
verified.
Manual path:
In the live installer terminal:
unpack-dubnium
cd ~/local/src/dubnium
Confirm the baked source exists:
test -f flake.nix
test -f hosts/workstation/default.nix
test -f external/dotfiles/flake.nix
Then continue the fresh-install flow from the local checkout:
sudo nixos-install --flake .#workstation
The workstation target imports the Dubnium Home Manager module from
external/dotfiles, so the Dubnium dotfiles profile is applied to the selected
normal user as part of the system install.
Install For Another User
To choose the installed normal user, create hosts/workstation/user.nix in the
unpacked source before running nixos-install:
{
  dubnium.user.name = "alice";
  dubnium.user.description = "Example User";
}
Then install normally:
sudo nixos-install --flake .#workstation
The same dotfiles Dubnium Home Manager profile is applied to the selected user.
The profile source lives in the dotfiles submodule, but the username and home
directory are supplied by dubnium.user.name.
unpack-dubnium --user USER only changes where the source is unpacked in the
live installer session. It does not change the installed NixOS user; use
dubnium.user.name for that.
After First Boot
After the installed system boots, seed the vLLM model store from the bundle on separate seed media and verify the checksum manifest before starting compute mode. See Model Seeding for the exact restore commands.
What Not To Put On The USB
Avoid storing:
- long-lived private SSH keys
- reusable GitHub credentials
- generated age identity files
- decrypted SOPS files
- model weights inside the Git repo or ISO payload
- raw Hugging Face cache directories as the seed shape
The source snapshot and a separate materialized model bundle are enough for this installer flow.
Model Seeding
Dubnium keeps model weights out of Git and out of the Nix store. Nix owns the
runtime policy and vLLM service definition; model bytes are runtime data under
/var/lib/dubnium/models.
The workstation configuration selects a vLLM model, but the USB seed format does
not depend on one specific model. Use the configured model’s local bundle name
where the examples say selected-model-bundle.
The installed workstation serves a local model bundle from:
/var/lib/dubnium/models/selected-model-bundle
This avoids depending on the Hugging Face hub cache layout at runtime. The USB seed carries a normal directory of model files plus a checksum manifest.
Runtime Model Store
Dubnium creates:
/var/lib/dubnium/models
The workstation vLLM service passes this local path to vllm serve:
/var/lib/dubnium/models/selected-model-bundle
Do not commit model weights to the Dubnium repo. Do not put model weights inside the Nix store or custom ISO payload.
USB Seed Layout
Use a stable USB layout so the same seed can be used during fresh install, recovery, or rebuild:
DUB-SEED/
└── models/
    └── selected-model-bundle/
        ├── config.json
        ├── generation_config.json
        ├── model-00001-of-000NN.safetensors
        ├── model-00002-of-000NN.safetensors
        ├── model.safetensors.index.json
        ├── tokenizer.json
        ├── tokenizer_config.json
        ├── vocab.json
        ├── merges.txt
        ├── LICENSE
        ├── README.md
        └── SHA256SUMS
The exact file set may vary by model revision, but the directory must be a
materialized model snapshot, not a Hugging Face refs / blobs / snapshots
cache tree.
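The difference between a materialized bundle and a Hugging Face cache tree can be checked mechanically. The sketch below is one possible heuristic, with hypothetical function names: it treats a directory holding refs, blobs, and snapshots subdirectories as a cache entry, and it rejects any bundle that still contains symlinks.

```shell
# Returns 0 when the directory looks like a Hugging Face hub cache entry.
looks_like_hf_cache() {
  [ -d "$1/refs" ] && [ -d "$1/blobs" ] && [ -d "$1/snapshots" ]
}

# Returns 0 when the directory looks like a materialized model bundle:
# config.json present, no cache layout, and no symlinks anywhere inside.
is_materialized_bundle() {
  local dir="$1"
  [ -f "$dir/config.json" ] || return 1
  if looks_like_hf_cache "$dir"; then
    return 1
  fi
  # Any symlink suggests files that still point into a cache blob store.
  [ -z "$(find "$dir" -type l -print)" ]
}
```

Running is_materialized_bundle against the seed directory before copying it to DUB-SEED catches a cache tree that was copied without following symlinks.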
Create A Local Bundle
If the source already exists as a normal model directory, copy it directly to the seed partition:
mkdir -p /run/media/$USER/DUB-SEED/models
rsync -a --info=progress2 \
/path/to/selected-model-bundle/ \
/run/media/$USER/DUB-SEED/models/selected-model-bundle/
Preferred source: a materialized model directory from a trusted local store or previously prepared artifact. Do not make the fresh install depend on Hugging Face availability.
Legacy fallback: if the only available source is an existing Hugging Face cache on the build machine, materialize the current snapshot once by following symlinks. This is a build-machine preparation step, not an install-time dependency:
MODEL_CACHE=/var/lib/vllm/.cache/huggingface/hub/models--OWNER--MODEL
REVISION="$(cat "$MODEL_CACHE/refs/main")"
mkdir -p /run/media/$USER/DUB-SEED/models/selected-model-bundle
rsync -aL --info=progress2 \
"$MODEL_CACHE/snapshots/$REVISION/" \
/run/media/$USER/DUB-SEED/models/selected-model-bundle/
Then create the checksum manifest:
cd /run/media/$USER/DUB-SEED/models/selected-model-bundle
find . -type f ! -name SHA256SUMS -print0 \
| sort -z \
| xargs -0 sha256sum \
> SHA256SUMS
Seed From USB
After the workstation has booted into NixOS and the USB is mounted, copy the bundle into the Dubnium model store:
sudo mkdir -p /var/lib/dubnium/models
sudo rsync -a --info=progress2 \
/run/media/$USER/DUB-SEED/models/selected-model-bundle/ \
/var/lib/dubnium/models/selected-model-bundle/
sudo chown -R root:root /var/lib/dubnium/models/selected-model-bundle
Adjust the mount path if the USB is mounted somewhere else.
Verify the checksum manifest:
cd /var/lib/dubnium/models/selected-model-bundle
sudo sha256sum -c SHA256SUMS
Then verify the local model path exists:
test -f /var/lib/dubnium/models/selected-model-bundle/config.json
test -f /var/lib/dubnium/models/selected-model-bundle/model.safetensors.index.json
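The file checks and the checksum verification above can be combined into one function that fails on the first problem. This is a minimal sketch with a hypothetical function name; it assumes GNU sha256sum, whose --quiet option prints only failing entries when used with -c.

```shell
# Sketch: verify a seeded bundle in one step.
# Fails when config.json or SHA256SUMS is missing, or when any checksum differs.
verify_seeded_bundle() {
  local dir="$1"
  [ -f "$dir/config.json" ] || { echo "missing config.json in $dir" >&2; return 1; }
  [ -f "$dir/SHA256SUMS" ] || { echo "missing SHA256SUMS in $dir" >&2; return 1; }
  # The manifest uses relative paths, so verify from inside the bundle directory.
  ( cd "$dir" && sha256sum --quiet -c SHA256SUMS )
}
```

For the installed model store, run it with sufficient privileges against /var/lib/dubnium/models/selected-model-bundle; a non-zero exit status means the bundle should be copied again before starting compute mode.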
Acceptance Check
After seeding, switch to compute only when normal bring-up preconditions are satisfied:
sudo mode request compute
systemctl status vllm.service
journalctl -u vllm.service -b
The first start should load the local model path. If vLLM tries to fetch model files from the network, the model argument or bundle location is wrong.
Runbook: vLLM Runtime
Status: living
Use this when Dubnium’s NixOS configuration manages vllm.service, but the
vLLM Python/CUDA runtime is installed outside the Nix store.
NixOS owns:
- vllm.service
- /var/lib/vllm
- /var/lib/dubnium/models
- CUDA_VISIBLE_DEVICES
- ai.dubnium
- Tailscale-only firewall exposure
The external runtime owns:
- /var/lib/vllm/venv
- Python, PyTorch, vLLM, and CUDA wheel packages inside that venv
This keeps rebuilds fast and avoids compiling PyTorch, CUDA, CuPy, MAGMA,
OpenCV CUDA, or vLLM during nixos-rebuild.
Scope
This runbook covers the current hybrid-Nix phase. NixOS is authoritative for
the service contract, host alias, firewall exposure, users, directories,
environment, and health checks. The Python/CUDA package runtime is mutable
operator-managed state under /var/lib/vllm/venv.
A pure-Nix vLLM runtime is a separate later phase. That phase should be treated as build-infrastructure work: it likely needs a dedicated CUDA builder, an Attic/Cachix/nix-serve cache, or an upstream Nixpkgs packaging path that avoids rebuilding the full CUDA/PyTorch/vLLM stack on every workstation.
Preconditions
- the host has been switched to a Dubnium generation with dubnium.vllm.runtime = "external"
- uv is available in the operator shell
- NVIDIA GPU access works on the host
- model weights are already seeded under /var/lib/dubnium/models
Check GPU visibility first:
nvidia-smi
1. Create The Runtime Directory
sudo install -d -m 0755 -o root -g root /var/lib/vllm
sudo install -d -m 0755 -o root -g root /var/lib/dubnium/models
The NixOS module also declares these directories. These commands are safe to
run before or after nixos-rebuild switch.
2. Install vLLM Into The Managed venv
Create a fresh venv:
sudo uv venv --python /run/current-system/sw/bin/python3.12 --python-preference only-system /var/lib/vllm/venv
Install vLLM with CUDA/PyTorch wheels selected by uv:
sudo env UV_TORCH_BACKEND=auto uv pip install --python /var/lib/vllm/venv/bin/python vllm
This is intentionally the only default install command. Do not install audio,
JAX, TPU, or broad framework extras during workstation bring-up. In particular,
avoid commands that reinstall torchvision, torchaudio, or jax unless a
specific workload requires them and the host has enough memory to resolve,
download, install, and import that dependency set. The default Dubnium vLLM
path is text inference against a local model bundle.
The upstream vLLM GPU install docs recommend uv pip install vllm --torch-backend=auto so uv can select the PyTorch backend from the installed
CUDA driver. If that flag is not supported by the installed uv, use the
environment variable form above or update uv.
If the installed uv supports newer PyTorch backends, use a specific CUDA
backend that matches the host driver. For CUDA 13.0:
sudo uv pip install --python /var/lib/vllm/venv/bin/python --torch-backend=cu130 vllm
Some packaged uv versions may not list cu130 yet. On those versions, keep
the default install command above, or upgrade uv to a version that supports
the host CUDA backend. Do not use a broad PyTorch-family reinstall as a
workstation bring-up workaround; it can pull optional packages such as
torchaudio and exceed available memory.
If PyTorch CUDA selection is wrong after the default install, recreate the venv
and rerun the vLLM install with a supported UV_TORCH_BACKEND or
--torch-backend value rather than layering more framework packages into the
same environment.
Host config adds the venv’s PyTorch and NVIDIA wheel library directories to
LD_LIBRARY_PATH. That is required because the external venv is outside the Nix
store and vLLM’s CUDA extension must be able to find libtorch, libcudart,
and the CUDA wheel libraries at runtime.
The service also sets CC to Nix’s C compiler wrapper. Triton may compile a
small runtime helper during vLLM startup even when vLLM itself is installed in
the external venv.
Keep dubnium.vllm.runtime = "package" available for the future pure-Nix
phase, but do not use it for this external-runtime path.
3. Verify The Runtime
Check the executable:
/var/lib/vllm/venv/bin/vllm --version
Check CUDA through PyTorch:
/var/lib/vllm/venv/bin/python -c "import torch; print(torch.cuda.is_available())"
Expected:
True
If this prints False, fix the venv/PyTorch/CUDA wheel selection before
debugging Dubnium’s systemd service.
4. Verify The Local Model Bundle
Dubnium keeps model weights out of Git and out of the Nix store. The vLLM service should point at a local model bundle.
MODEL_DIR=/var/lib/dubnium/models/qwen2.5-coder-14b-instruct
If the model bundle was seeded from removable media, verify that the local bundle exists:
test -f "$MODEL_DIR/config.json"
test -f "$MODEL_DIR/model.safetensors.index.json" || test -f "$MODEL_DIR/model.safetensors"
If SHA256SUMS exists, verify it:
cd "$MODEL_DIR"
sudo sha256sum -c SHA256SUMS
If vLLM tries to download model files on first start, the configured model path or local bundle is wrong.
5. Start The Service
Start compute mode or restart the service directly:
sudo systemctl start compute.target
sudo systemctl restart vllm.service
Inspect service state:
systemctl status vllm --no-pager
journalctl -u vllm -n 100 --no-pager
systemctl show vllm.service -p ExecStart --value
systemctl show vllm.service -p Environment --value
If /var/lib/vllm/venv/bin/vllm does not exist or is not executable,
vllm.service should fail before startup with an executable check error. That
means the NixOS service contract is present but the external runtime has not
been installed yet.
6. Verify The API
From the Dubnium host:
getent hosts ai.dubnium
curl http://ai.dubnium:8000/v1/models
From another tailnet machine:
curl http://<dubnium-tailnet-name>:8000/v1/models
ai.dubnium is host-local unless the tailnet DNS or client hosts file also
maps that name to the Dubnium node’s Tailscale IP.
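The API may not answer immediately after vllm.service starts, because the model has to load first, so a single curl can fail on a healthy service. The following sketch polls the models endpoint with a bounded number of attempts. The function name, the default attempt count, and the default delay are assumptions rather than Dubnium defaults.

```shell
# Sketch: poll the OpenAI-compatible models endpoint until it answers.
# Arguments: URL (optional), ATTEMPTS (optional), DELAY in seconds (optional).
wait_for_vllm_api() {
  local url="${1:-http://ai.dubnium:8000/v1/models}"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=1

  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; --max-time bounds each attempt.
    if curl -fsS --max-time 5 "$url" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

If wait_for_vllm_api still fails after the service has been running for a while, go back to the journalctl output from step 5 before debugging network or DNS.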
References
- vLLM GPU installation docs: https://docs.vllm.ai/en/latest/getting_started/installation/gpu/
- Model seeding policy: ADR-0008
- Tailscale exposure: Tailscale
Plano routing gateway
Dubnium owns the system/runtime side of Plano. User-level client configuration lives in ryjen/dotfiles through Home Manager modules.
Boundary
Dubnium
  systemd service lifecycle
  compute target integration
  vLLM/Ollama local model endpoint
  ai.slice placement
  runtime state under /var/lib and /var/cache
ryjen/dotfiles
  Home Manager user config
  ~/.config/planoai/dubnium.yaml
  ~/.config/model-router/profiles/local-first-dev.yaml
  shell environment and helper scripts
ryjen/model-router
  source policy schemas
  route-decision record semantics
  governance-oriented model-router design
Service model
The Plano workload module is defined at:
modules/workloads/plano.nix
It creates:
plano.service
When enabled, the service is attached to:
compute.target
ai.slice
It is intentionally disabled by default in hosts/workstation/default.nix.
Defaults
dubnium.plano = {
  enable = false;
  runtime = "external";
  externalExecutable = "/var/lib/plano/venv/bin/planoai";
  host = "127.0.0.1";
  port = 12000;
  localBaseUrl = "http://127.0.0.1:8000/v1";
  exposeOnTailscale = false;
};
The default local model endpoint assumes vllm.service is serving an OpenAI-compatible API on port 8000.
Enablement
Enable once the Plano executable exists:
dubnium.plano.enable = true;
For the current external runtime default, verify:
test -x /var/lib/plano/venv/bin/planoai
If Plano becomes available as a Nix package or overlay, switch to:
dubnium.plano = {
  enable = true;
  runtime = "package";
  package = pkgs.<plano-package>;
};
Validation
Dry-build the workstation target:
sudo nixos-rebuild build --flake .#workstation
Then inspect the generated unit:
systemctl cat plano.service
When enabled and in compute mode:
sudo mode request compute
systemctl status vllm.service
systemctl status plano.service
Check the gateway endpoint:
curl http://127.0.0.1:12000
The exact health endpoint may differ depending on Plano’s runtime API.
Security notes
- Keep exposeOnTailscale = false until the gateway behavior is validated
- Do not store cloud provider secrets in the generated config
- Prefer environment files managed by sops-nix or another host secret provider
- Treat Plano as routing infrastructure, not an authorization layer
- Privacy and route policy belong above the gateway in model-router/Anthesis semantics
Failure behavior
The service fails closed if the configured Plano executable is missing because ExecStartPre checks that the executable exists.
Fallback between models must not bypass privacy, budget, safety, or approval failures. Those are policy failures, not operational retry events.
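The distinction between policy failures and operational retry events can be sketched as a small classifier. The failure kind names below are hypothetical and are not taken from Plano or model-router; the point of the sketch is only that policy failures and unknown kinds never permit a fallback.

```shell
# Sketch: decide whether a failed request may fall back to another model.
# Returns 0 when fallback is allowed, 1 when it must not happen.
# The failure kind names are illustrative assumptions.
may_fall_back() {
  case "$1" in
    # Policy failures: never retried against another model.
    privacy|budget|safety|approval) return 1 ;;
    # Operational failures: fallback is an acceptable retry.
    timeout|connection_refused|overloaded) return 0 ;;
    # Unknown kinds fail closed.
    *) return 1 ;;
  esac
}
```

Treating unknown kinds as non-retryable keeps the classifier consistent with the fail-closed behavior of the service itself.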
Persistent Context Memory Architecture
Status: planning
This document describes the long-term persistent context memory architecture for Dubnium’s local vLLM runtime.
Goals
The architecture should:
- support long-lived conversational and agentic workflows
- preserve low-latency vLLM inference characteristics
- separate inference runtime concerns from memory persistence
- expose enough structure for replay, audit, and policy enforcement
- operate efficiently on constrained local GPU hardware
- leave room for Anthesis-style governed agent systems
Future Governance Boundary
A future governance layer remains external to this memory/runtime architecture.
The memory/runtime layer stores, retrieves, summarizes, compacts, and serves context. It records structured metadata and lifecycle events so another layer can inspect, constrain, attest, or replay behavior later.
The future governance layer evaluates policy, provenance, trust, retention, audit, and replay concerns. This document does not define that governance authority.
Dubnium memory/runtime layer
= stores, retrieves, summarizes, compacts, and serves context
Future governance layer
= evaluates policy, provenance, trust, retention, audit, and replay concerns
Design implication: memory records, artifacts, retrieval events, and runtime transitions must be structured and externally observable, but vLLM, vector stores, artifact stores, and MemGPT-style runtimes must not depend directly on a future governance substrate.
Core Principle
vLLM is the inference runtime.
Persistent memory is a separate subsystem.
Do not persist transformer KV state as durable memory. KV state can remain an inference optimization inside vLLM. Durable memory must be reconstructable from stored events, summaries, artifacts, metadata, and retrieval records.
flowchart TD
U[User or Agent] --> O[Orchestrator]
O --> W[Working Context Buffer]
O --> R[Retriever]
O --> T[Task State Store]
R --> V[(Vector Store)]
R --> M[(Structured Memory Store)]
O --> L[vLLM]
L --> S[Summarizer]
S --> E[Embedding Pipeline]
E --> V
S --> M
Layers
Inference
Responsibilities:
- token generation
- batching
- prefix caching
- streaming
- model lifecycle management
Recommended components:
| Component | Recommendation |
|---|---|
| Inference runtime | vLLM |
| Primary models | Qwen, DeepSeek, Llama-family |
| Embeddings | bge-small or nomic-embed |
| Quantization | AWQ or GPTQ initially |
Inference nodes should remain stateless where possible. Durable memory logic does not belong inside inference workers.
Working Context
Working context maintains immediate conversational and task continuity.
It contains recent messages, tool outputs, current objectives, active plans, and unresolved references.
Storage options:
| Option | Use |
|---|---|
| Redis | fast transient sessions |
| SQLite | single-user local setups |
| Postgres | unified durable stack |
Recommended strategy:
- keep the last N conversational turns verbatim
- keep a rolling summary for older turns
- keep external references outside the prompt
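The strategy above can be sketched as a small bounded buffer. This is a minimal illustration, not the Dubnium implementation: `WorkingContext`, `naive_summarize`, and the `max_turns` value are hypothetical names chosen for the example, and a real summarizer would call a model rather than clip text.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WorkingContext:
    """Bounded working context: recent turns verbatim, older turns summarized."""

    max_turns: int = 4  # hypothetical value for N
    turns: deque = field(default_factory=deque)
    rolling_summary: str = ""
    external_refs: list = field(default_factory=list)  # ids only, never payloads

    def add_turn(self, role: str, text: str, summarize) -> None:
        """Append a turn; fold the oldest turns into the summary once over N."""
        self.turns.append((role, text))
        while len(self.turns) > self.max_turns:
            old_role, old_text = self.turns.popleft()
            self.rolling_summary = summarize(self.rolling_summary, old_role, old_text)


def naive_summarize(summary: str, role: str, text: str) -> str:
    """Placeholder summarizer (assumption): clips text instead of calling a model."""
    clipped = text[:40]
    return (summary + " | " if summary else "") + f"{role}: {clipped}"
```

Passing the summarizer in as a callable keeps the buffer independent of any particular model, which matches the separation between inference and memory described in this document.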
Episodic Memory
Episodic memory stores meaningful historical interactions, such as debugging sessions, deployment history, design discussions, incidents, and user preferences.
Example shape:
{
"id": "uuid",
"timestamp": "ISO8601",
"session_id": "uuid",
"memory_type": "episodic",
"summary": "Condensed interaction summary",
"importance": 0.82,
"ttl": null,
"source": "conversation",
"provenance": {
"model": "qwen",
"extractor_version": "1"
}
}
Semantic Memory
Semantic memory stores normalized stable facts and reusable knowledge: infrastructure topology, user preferences, architecture decisions, project conventions, and coding standards.
Semantic memory is not raw transcript storage.
Instead of storing “user mentioned NixOS several times”, store:
{
"fact": "Primary workstation uses NixOS",
"confidence": 0.94,
"scope": "personal-preference"
}
Task State
Task state is active execution state, not conversational memory.
Examples:
- queued work
- workflow checkpoints
- active RFC generation
- agent plans
- unresolved actions
- execution graphs
Task state should be strongly structured. Do not embed executable workflow state inside vector stores.
| Component | Recommendation |
|---|---|
| Structured store | Postgres |
| Queueing | RabbitMQ or Redis Streams |
| Workflow engine | Temporal later |
Retrieval
Retrieval responsibilities:
- semantic search
- scoped retrieval
- ranking
- filtering
- relevance compression
flowchart LR
Q[Query] --> E[Embed Query]
E --> S[Vector Search]
S --> R[Re-ranker]
R --> C[Context Builder]
Retrieval constraints:
| Constraint | Example |
|---|---|
| Session scope | only current project |
| TTL | exclude expired memories |
| Agent boundary | isolate agents |
| Recency weighting | prioritize recent events |
The orchestrator constrains retrieval scope and memory assembly. Future governance can inspect the retrieval event stream and stored metadata, but the retriever must remain useful without embedding a governance engine.
Minimal Stack
| Concern | Technology |
|---|---|
| Inference | vLLM |
| Structured data | Postgres |
| Vector search | pgvector |
| Session cache | Redis |
| Object storage | local filesystem first, MinIO later |
| Queueing | Redis Streams first, RabbitMQ later |
Artifact And Binary Memory
Artifacts and memory are distinct concepts.
| Concept | Meaning |
|---|---|
| Memory | semantic or cognitive abstraction |
| Artifact | raw external object |
| Evidence | immutable referenced source |
| Context | transient prompt state |
| Knowledge | validated normalized facts |
Raw binaries should not be first-class prompt memory. Binaries remain externalized, semantic extraction feeds retrieval systems, agents retrieve references and derived context, and multimodal inference runs on demand.
Initial artifact types:
| Type | Examples |
|---|---|
| Images | screenshots, whiteboards, diagrams |
| Documents | PDFs, Office docs |
| Audio | recordings, meetings |
| Video | demos, walkthroughs |
| Source bundles | archives, repos |
| Logs | runtime and system logs |
| Structured data | CSV, JSON, YAML |
flowchart TD
A[Artifact Upload] --> B[Object Storage]
A --> C[Extraction Pipeline]
C --> D[OCR]
C --> E[Captioning]
C --> F[Metadata Extraction]
C --> G[Embedding Generation]
D --> H[Semantic Records]
E --> H
F --> H
G --> H
H --> I[(Vector Store)]
H --> J[(Structured Metadata Store)]
Artifact metadata should include content hashes, storage URIs, MIME type, derived captions or OCR, embedding references, provenance, trust hints, and sensitivity hints.
Binary artifacts create operational risk: screenshots can contain credentials, EXIF metadata can leak location, visual data can be sensitive, retrieved artifacts can amplify exposure, and malicious files can poison extraction pipelines. Those controls belong in the external governance/security layer, but the memory layer must expose enough metadata and hooks for them.
Multimodal Retrieval
For normal text prompts, retrieve captions, OCR text, semantic embeddings, metadata, and artifact references rather than injecting raw binaries.
When multimodal reasoning is required:
- Semantic retrieval locates relevant artifacts.
- Artifact references are resolved.
- Binaries are attached to VLM requests.
- Multimodal inference runs on demand.
Candidate model classes:
| Model | Purpose |
|---|---|
| Qwen-VL | local multimodal reasoning |
| CLIP or SigLIP | image-text embeddings |
| Whisper | audio transcription |
| OCR pipelines | document extraction |
OCI-Compatible Future
Dubnium should stay compatible with OCI-style cognition and artifact distribution.
OCI registries are a strong long-term fit for content addressing, distribution, deduplication, signing, provenance layering, immutable references, artifact versioning, and registry federation.
Candidate future artifact classes:
| Artifact class | Example |
|---|---|
| Model artifacts | GGUF, safetensors |
| Embedding indexes | vector snapshots |
| Prompt bundles | governed prompts and system policies |
| Memory bundles | exported episodic memory sets |
| Workflow definitions | agent workflows |
| Execution traces | replayable sessions |
| Multimodal artifacts | image, document, and audio evidence |
| Tool contracts | MCP capability manifests |
Long-term direction:
OCI artifact
= versioned governed cognition object
This allows Dubnium to evolve toward replayable cognition, portable agent state, attestable workflows, signed memory exports, reproducible multimodal sessions, and distributed cognition registries without coupling cognition storage to one database implementation.
MemGPT-Style Runtime Evolution
MemGPT-style runtimes remain an incremental upgrade path after the persistent memory substrate is stable. Current Letta documentation describes this lineage as agents with in-context core memory, recall memory, archival memory, and self-editing memory tools.
Do not couple Dubnium directly to Letta or MemGPT internals early. Define stable interfaces first:
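class MemoryRuntime:
    """Stable interface placeholder; argument lists are intentionally left open."""
    def retrieve(self, *args, **kwargs): ...
    def summarize(self, *args, **kwargs): ...
    def compact(self, *args, **kwargs): ...
    def promote(self, *args, **kwargs): ...
    def classify(self, *args, **kwargs): ...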
Evolution path:
| Phase | Capability |
|---|---|
| 1 | governed retrieval with explicit schemas |
| 2 | rolling summaries, compaction, and bounded working context |
| 3 | reflection, summarization loops, memory promotion, relevance scoring |
| 4 | adaptive retrieval, workflow-aware recall, retrieval planning |
| 5 | portable cognitive runtime artifacts and OCI-packaged memory overlays |
Preserve the distinction between runtime cognition and durable external state. MemGPT-style runtimes should remain replaceable, capability-scoped, inspectable, and externally configurable.
Phases
Phase 1: Minimal Viable Memory
Deliver durable conversation storage, semantic retrieval, basic summarization, Postgres plus pgvector, an embedding pipeline, retrieval API, and rolling conversation summaries.
Phase 2: Structured Memory
Deliver episodic and semantic separation, retrieval filtering, scoped namespaces, metadata tagging, and confidence scoring.
Phase 3: Multi-Agent Coordination
Deliver isolated agent memory, shared collaborative memory, workflow continuity, capability-scoped retrieval, memory federation, execution checkpoints, and task orchestration.
Non-Goals
Avoid initially:
- serialized GPU KV persistence
- distributed GPU cache coherence
- infinite-context simulation
- recurrent-memory transformer experimentation
- fully autonomous self-modifying memory
These add substantial complexity and operational instability.
First Milestone
Build a local prototype with:
- vLLM
- Qwen coder model
- Postgres
- pgvector
- Redis
- bge-small embeddings
- retrieval middleware
- rolling summaries
Then validate latency, retrieval quality, memory drift, and hallucinated recall before expanding into multi-agent memory systems.
Runbook: vLLM Persistent Memory Prototype
Status: planning
Use this when designing or validating a Dubnium memory subsystem around the local vLLM runtime.
vLLM owns inference. The memory subsystem owns persistence, retrieval, summarization, compaction, artifact references, and replay inputs. Do not make durable memory depend on serialized transformer KV state.
Scope
This runbook covers the first prototype milestone:
- durable conversation and event storage
- rolling summaries
- embeddings for retrieval
- scoped retrieval
- externally observable metadata on every stored memory
- bounded prompt assembly for vLLM
It does not cover multi-agent federation, distributed workflow engines, cryptographic memory attestation, or a pure-Nix packaging path for all services. It also does not adopt Letta or another MemGPT-style agent framework in the first milestone; those belong after the local storage, retrieval, and governance contracts are proven.
Future governance remains external to this runbook. The prototype records metadata and lifecycle events so a later governance substrate can inspect, constrain, attest, or replay behavior, but the prototype does not implement the governance authority itself.
Target Shape
flowchart TD
U[User or Agent] --> O[Orchestrator]
O --> W[Working Context]
O --> R[Retriever]
O --> T[Task State]
R --> V[(pgvector)]
R --> M[(Postgres Memory Tables)]
O --> L[vLLM]
L --> S[Summarizer]
S --> E[Embedding Worker]
E --> V
S --> M
Prototype Components
Use conservative local services first:
| Concern | Prototype choice |
|---|---|
| Inference | existing vllm.service |
| Structured store | Postgres |
| Vector search | pgvector |
| Working context | Redis or Postgres |
| Queueing | Redis Streams initially |
| Object storage | local filesystem first, MinIO later |
| Embeddings | bge-small or nomic-embed |
Keep large artifacts outside prompt assembly. Store references to files, logs, and generated outputs, then retrieve and compress only the relevant excerpts.
Data Classes
Working context is transient session state: recent messages, current objective, active plan, unresolved references, and recent tool outputs.
Episodic memory records meaningful historical interactions, such as debugging sessions, deployment history, design discussions, and operational incidents.
Semantic memory records normalized facts, preferences, project conventions, infrastructure topology, and architecture decisions. Do not treat raw transcripts as semantic memory.
Task state records active workflow state: queued work, checkpoints, execution graphs, pending validations, and unresolved actions.
Metadata records where a memory came from, how trusted it appears, how sensitive it appears, how long it should live, and which scopes may retrieve it. A later governance layer can evaluate that metadata, but the Phase 1 memory service only records and exposes it.
Minimum Schema Direction
The first schema should keep memory objects and embeddings separate so memory metadata can evolve without rewriting vector payloads.
Suggested tables:
- sessions
- memories
- memory_embeddings
- tasks
- artifacts
- provenance
Each memory row should include:
{
"id": "uuid",
"session_id": "uuid",
"memory_type": "episodic",
"summary": "Condensed interaction summary",
"scope": "project:dubnium",
"importance": 0.82,
"confidence": 0.76,
"sensitivity": "internal",
"validation_status": "unverified",
"ttl": null,
"source": "conversation",
"created_at": "ISO8601",
"provenance": {
"origin": "agent",
"model": "qwen",
"extractor_version": "1"
}
}
Retrieval Contract
The retriever should take a scoped request from the orchestrator and return scoped context candidates, not final prompts.
Required filters:
- project or session scope
- agent namespace
- TTL expiration
- recency
Recommended ranking inputs:
- vector similarity
- keyword match
- recency
- importance
- source authority
- validation status
The context builder should compress results before prompt assembly and preserve citations, artifact references, retrieval event ids, or memory ids so a response can be audited later.
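One way to combine the recommended ranking inputs is a weighted sum. The sketch below is an assumption-heavy illustration: the weights, the exponential recency decay, the `half_life_days` default, and the `Candidate` shape are all hypothetical, and source authority is left out because this document does not define how it is scored.

```python
from dataclasses import dataclass

# Hypothetical weights: the document names the inputs but not their weighting.
WEIGHTS = {
    "similarity": 0.5,
    "keyword": 0.1,
    "recency": 0.15,
    "importance": 0.15,
    "verified": 0.1,
}


@dataclass
class Candidate:
    memory_id: str
    similarity: float  # vector similarity, assumed normalized to [0, 1]
    keyword: float  # keyword match score, assumed normalized to [0, 1]
    age_days: float
    importance: float  # in [0, 1]
    validation_status: str  # "unverified" or "verified"


def score(c: Candidate, half_life_days: float = 30.0) -> float:
    """Blend ranking inputs into one score; recency halves every half_life_days."""
    recency = 0.5 ** (c.age_days / half_life_days)
    verified = 1.0 if c.validation_status == "verified" else 0.0
    return (
        WEIGHTS["similarity"] * c.similarity
        + WEIGHTS["keyword"] * c.keyword
        + WEIGHTS["recency"] * recency
        + WEIGHTS["importance"] * c.importance
        + WEIGHTS["verified"] * verified
    )


def rank(candidates, limit=8):
    """Return candidates best-first; memory ids stay attached for later audit."""
    return sorted(candidates, key=score, reverse=True)[:limit]
```

Because each candidate keeps its `memory_id`, the context builder can still cite which memories fed a response after ranking and compression.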
Storage Path
- Capture a conversation, tool event, task event, or artifact reference.
- Classify the event and reject data that should not become durable memory.
- Redact secrets and sensitive payloads.
- Summarize the event into a typed memory candidate.
- Attach provenance, sensitivity, scope, confidence, and retention metadata.
- Embed the memory summary.
- Store structured memory and vector data.
- Schedule expiration or revalidation when retention metadata requires it.
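The redaction and metadata steps of the storage path might look roughly like the following. This is a sketch under stated assumptions: `SECRET_PATTERN` is a deliberately narrow hypothetical rule, `build_memory_candidate` is an illustrative name, and summarization, embedding, and persistence are omitted.

```python
import re
import uuid
from datetime import datetime, timezone

# Hypothetical pattern for secret-like values; a real redactor needs broader rules.
SECRET_PATTERN = re.compile(r"(?i)\b(api[_-]?key|token|password)\s*[=:]\s*\S+")


def redact(text: str) -> str:
    """Replace secret-like assignments with a fixed marker before persistence."""
    return SECRET_PATTERN.sub(lambda m: f"{m.group(1)}=[REDACTED]", text)


def build_memory_candidate(event_text: str, scope: str, source: str) -> dict:
    """Turn a captured event into a typed memory candidate with metadata attached."""
    summary = redact(event_text).strip()
    if not summary:
        raise ValueError("empty events should not become durable memory")
    return {
        "id": str(uuid.uuid4()),
        "memory_type": "episodic",
        "summary": summary,
        "scope": scope,
        "sensitivity": "internal",
        "validation_status": "unverified",
        "ttl": None,
        "source": source,
        "created_at": datetime.now(timezone.utc).isoformat(),
        "provenance": {"origin": "agent", "extractor_version": "1"},
    }
```

Redaction runs before the candidate is built, so nothing downstream (embedding, storage, logging) ever sees the raw value.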
Retrieval Path
- Receive a query and current task scope from the orchestrator.
- Embed the query.
- Search the vector index and any structured filters.
- Apply scope, TTL, and sensitivity filters before re-ranking.
- Re-rank by relevance, recency, importance, and source hints.
- Compress selected context.
- Return context candidates with ids, scope, and provenance.
- Assemble the final vLLM prompt outside the retriever.
Validation Checks
Before treating the prototype as useful, test:
- latency impact on vLLM request path
- recall quality for prior sessions
- false recall and hallucinated-memory rate
- memory poisoning resistance
- prompt-injection persistence resistance
- cross-project and cross-agent isolation
- secret redaction before storage
- TTL expiration and revalidation behavior
- replay from stored events and memory ids
Acceptance Criteria
The first milestone is complete when:
- vLLM can answer with retrieved context without changing vllm.service
- memory storage survives service restart
- retrieval can be scoped to one project
- expired or sensitive memories are excluded from prompt assembly
- summaries can be traced back to source events or artifacts
- a replay can reconstruct which memories were available to a response
Artifact Handling
Artifacts are not memory. Store raw binaries outside prompts and retrieve derived context by default:
- captions
- OCR text
- extracted metadata
- embeddings
- content hashes
- artifact references
Use on-demand multimodal inference only when a task needs the binary itself. The retrieval result should carry an artifact reference rather than copying the artifact into ordinary text prompt memory.
Incremental Upgrade: MemGPT / Letta
After the Phase 1 substrate is stable, evaluate MemGPT-style self-editing memory as an orchestration-layer upgrade. Use current Letta documentation when testing concrete framework integration; reserve “MemGPT” for the research pattern unless a legacy component explicitly uses that name.
The evaluation should answer:
- whether Letta can use Dubnium’s Postgres/pgvector-backed memory stores without bypassing scope, sensitivity, TTL, validation, or provenance filters
- whether agent-managed memory edits can be audited and replayed
- whether archival and recall memory operations can preserve Dubnium memory ids and source lineage
- whether the framework can call local vLLM without requiring model-hosted memory persistence
- whether rejected, expired, or sensitive memories stay out of generated prompts
Do not adopt the framework if it requires storing ungoverned transcripts, credentials, or tool outputs in durable memory.
Runbook: Memory Service
Status: prototype
Use this after explicitly enabling dubnium.memory.enable = true for the workstation host. The memory service is intentionally opt-in during Phase 1 so first bring-up does not automatically start additional persistent services.
The memory service is the local persistent context substrate for Dubnium. It does not govern agent behavior by itself. Anthesis or another orchestrator should authorize retrieval, inspect provenance, and decide whether retrieved memory may be injected into an agent prompt.
Service Boundary
Anthesis / orchestrator
-> Dubnium memory API
-> Postgres + pgvector
-> Redis working context / queue substrate
-> vLLM prompt assembly outside the memory service
The API must remain bound to 127.0.0.1 for the Phase 1 prototype.
Service Impact
Enabling dubnium.memory starts additional local services:
- postgresql.service
- redis-dubnium-memory.service
- dubnium-memory-api.service
It also runs packaged memory-service migrations before the API starts. Validate the package and module evaluation before enabling this on the bare-metal workstation target.
Enable Locally
The default workstation target keeps the memory service disabled. Enable it through a host-local override such as hosts/workstation/user.nix:
{
dubnium.memory = {
enable = true;
api.host = "127.0.0.1";
api.port = 8090;
retention.defaultTtlDays = null;
};
}
Then build before switching:
nix --extra-experimental-features "nix-command flakes" build .#memory-service
sudo nixos-rebuild build --flake .#workstation
Verify Disabled Default
Without a host-local override, the workstation target should keep the prototype disabled:
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.dubnium.memory.enable
Expected:
false
Verify Enabled Configuration
After enabling through hosts/workstation/user.nix, verify:
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.dubnium.memory.enable
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.dubnium.memory.api.host
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.services.postgresql.enable
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.services.redis.servers.dubnium-memory.enable
Expected:
true
"127.0.0.1"
true
true
Verify Services
After switching an enabled configuration:
systemctl status postgresql
systemctl status redis-dubnium-memory
systemctl status dubnium-memory-api
ai-memory health
Expected health response:
{
"status": "ok"
}
Raw HTTP is also available for debugging:
curl http://127.0.0.1:8090/healthz
Scope Convention
Use explicit scope prefixes for new memory rows:
personal:
project:
session:
agent:
workflow:
Examples:
project:dubnium
session:11111111-1111-4111-8111-111111111111
agent:anthesis-reviewer
workflow:memory-phase-2
The current implementation provides advisory scope helpers. Full runtime enforcement is intentionally deferred until existing callers and examples are migrated.
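An advisory scope helper could be as small as the sketch below. The function names and error text are hypothetical and are not taken from the memory-service package; only the five prefixes come from the convention above.

```python
ALLOWED_PREFIXES = ("personal", "project", "session", "agent", "workflow")


def parse_scope(scope: str) -> tuple[str, str]:
    """Split 'prefix:value' and check the prefix against the convention."""
    prefix, sep, value = scope.partition(":")
    if not sep or prefix not in ALLOWED_PREFIXES or not value:
        raise ValueError(f"scope must look like '<prefix>:<value>': {scope!r}")
    return prefix, value


def is_conventional_scope(scope: str) -> bool:
    """Advisory check: report, rather than enforce, convention violations."""
    try:
        parse_scope(scope)
    except ValueError:
        return False
    return True
```

Callers that are not yet migrated can log the result of `is_conventional_scope` without rejecting the request, which fits the deferred-enforcement stance described here.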
CLI Smoke Test
Store one memory:
ai-memory store --file docs/examples/memory-store-request.json
Retrieve scoped memory:
ai-memory retrieve \
--query "What is Dubnium memory for?" \
--scope project:dubnium \
--require-verified \
--purpose review \
--actor-type agent \
--actor-id anthesis-reviewer \
--envelope-id env-manual-smoke-test
Inspect retrieval events:
ai-memory events
Expire old memories:
ai-memory expire --now 2026-05-28T00:00:00Z
Use a non-default API URL when needed:
ai-memory --url http://127.0.0.1:8090 health
API Smoke Test
The CLI is preferred for operator use. Raw HTTP examples are kept for debugging and automation parity.
Store one memory:
curl -sS http://127.0.0.1:8090/memory/store \
-H 'Content-Type: application/json' \
-d @docs/examples/memory-store-request.json
Retrieve scoped memory:
curl -sS http://127.0.0.1:8090/memory/retrieve \
-H 'Content-Type: application/json' \
-d '{
"query": "What is Dubnium memory for?",
"scope": "project:dubnium",
"allowed_sensitivity": ["internal"],
"require_verified": true,
"limit": 8
}'
Inspect retrieval events:
curl -sS http://127.0.0.1:8090/memory/retrieval-events
Retrieval Behavior
Normal retrieval excludes memory when:
- scope does not match the request
- sensitivity is not explicitly allowed
- require_verified is true and memory is not verified
- memory is expired by TTL
- memory has validation_status = rejected
Rejected memory is excluded even when require_verified = false. Audit retrieval of rejected memory is future work and should use a separate endpoint or explicit audit mode.
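The exclusion rules can be read as a single predicate over one memory row. The sketch below is illustrative only: it assumes exact scope equality and a timezone-aware `ttl` timestamp, neither of which this document specifies, and `is_retrievable` is a hypothetical name.

```python
from datetime import datetime, timezone


def is_retrievable(memory: dict, request: dict, now: datetime) -> bool:
    """Apply the normal-retrieval exclusion rules to one memory row."""
    if memory["scope"] != request["scope"]:  # assumption: exact match
        return False
    if memory["sensitivity"] not in request["allowed_sensitivity"]:
        return False
    if memory["validation_status"] == "rejected":
        return False  # excluded regardless of require_verified
    if request["require_verified"] and memory["validation_status"] != "verified":
        return False
    ttl = memory.get("ttl")
    if ttl is not None and ttl <= now:
        return False  # expired by TTL
    return True
```

The rejected check sits ahead of the `require_verified` check so that rejected rows are dropped even when verification is not required.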
Security Checks
- API binds to 127.0.0.1
- raw vLLM remains separate from durable memory
- memory rows include scope, sensitivity, validation status, source, and provenance
- expired memories are excluded from retrieval
- rejected memories are excluded from normal retrieval
- sensitive memories are excluded unless explicitly allowed
- retrieval events record returned memory ids and artifact ids
- logs must not contain raw token-like values
- prompt assembly must happen outside the memory service
Anthesis Governance Hook
Phase 1 does not implement Anthesis directly. The intended integration contract is:
- Anthesis classifies the task and authorizes retrieval scope
- Anthesis calls /memory/retrieve with explicit scope, allowed_sensitivity, and require_verified
- The memory service returns memories plus a retrieval event id
- Anthesis records the retrieval event, memory ids, provider decision, and prompt assembly in an execution envelope
- Anthesis decides whether retrieved memory may enter the model context
Memory may inform an agent, but governance decides whether it is allowed to do so.
Troubleshooting
journalctl -u dubnium-memory-api -b
journalctl -u postgresql -b
journalctl -u redis-dubnium-memory -b
Common failure buckets:
- database role or socket mismatch
- pgvector extension unavailable for the selected Postgres package
- migration failure
- API accidentally bound to a non-local address
- malformed JSON payload
- scope mismatch during retrieval
Validation Before Merge
git diff --check
nix --extra-experimental-features "nix-command flakes" build .#memory-service
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.dubnium.memory.enable
pytest pkgs/memory-service/tests
Default workstation expectation before opt-in:
false
If the full workstation build fails on host-specific hardware configuration, report that separately from the memory service package/module validation.
Memory Data Model Specification
Status: draft
This document is the canonical data model and requirements specification for the Dubnium memory service prototype. It reconciles the architecture direction, API/domain models, and current Postgres migration.
Implementation references:
- pkgs/memory-service/src/dubnium_memory/models.py
- pkgs/memory-service/src/dubnium_memory/embeddings.py
- pkgs/memory-service/src/dubnium_memory/migrations/001_initial.sql
- pkgs/memory-service/src/dubnium_memory/migrations/002_pgvector_embeddings.sql
Goals
The data model must support:
- durable episodic, semantic, and working memory records
- scoped retrieval for projects, sessions, and agents
- externalized artifacts and evidence references
- retrieval event capture for audit and replay
- metadata needed by a future external governance layer
- local Postgres and pgvector evolution without coupling to vLLM internals
Non-Goals
The data model does not define:
- transformer KV-cache persistence
- prompt assembly format
- future governance authority behavior
- autonomous memory mutation rules
- object storage implementation details
- a Letta or MemGPT internal schema
Trust Boundary
All stored content is untrusted when it enters the system and when it is retrieved later. This includes user input, agent output, model-generated summaries, tool output, artifact-derived text, and database rows.
Boundary requirements:
- validate API payloads before constructing domain objects
- redact secret-like values before persistence
- use parameterized SQL for all request-derived values
- keep secrets out of logs and durable memory summaries
- store enough provenance, validation, sensitivity, scope, and TTL metadata for external policy systems to inspect later
- return retrieval candidates and identifiers, not assembled prompts
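The parameterized-SQL requirement is illustrated below with the standard-library `sqlite3` module, which stands in for Postgres only so the sketch runs without a server. The table layout and function names are hypothetical, and a Postgres driver would use its own placeholder style.

```python
import sqlite3

# sqlite3 is a stand-in for Postgres here (assumption made for runnability).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (id TEXT PRIMARY KEY, scope TEXT, summary TEXT)")


def store_memory(memory_id: str, scope: str, summary: str) -> None:
    """Insert with placeholders so request-derived values are never spliced into SQL."""
    conn.execute(
        "INSERT INTO memories (id, scope, summary) VALUES (?, ?, ?)",
        (memory_id, scope, summary),
    )


def summaries_for_scope(scope: str) -> list[str]:
    """Select by scope using a bound parameter instead of string formatting."""
    rows = conn.execute(
        "SELECT summary FROM memories WHERE scope = ? ORDER BY id", (scope,)
    )
    return [row[0] for row in rows]
```

With bound parameters, a hostile scope string is compared as a literal value and cannot change the shape of the query.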
Domain Objects
Memory
Memory is a normalized semantic or episodic record. It is not raw transcript storage and should not contain binary artifact data.
Required fields:
| Field | Type | Requirement |
|---|---|---|
| id | UUID | Stable identifier generated before persistence |
| memory_type | enum | One of working, episodic, semantic |
| summary | string | Non-empty, max 8000 chars, redacted before persistence |
| scope | string | Non-empty, max 256 chars |
| source | string | Non-empty source label, max 128 chars |
| provenance | object | JSON object, empty object allowed |
Optional or defaulted fields:
| Field | Type | Default | Requirement |
|---|---|---|---|
| session_id | UUID or null | null | References sessions.id when durable |
| importance | float | 0.0 | Range 0.0 to 1.0 |
| confidence | float | 0.0 | Range 0.0 to 1.0 |
| sensitivity | string | internal | Non-empty, max 64 chars |
| validation_status | enum | unverified | One of unverified, verified, rejected |
| ttl | timestamp or null | null | Expired records excluded and removable |
| artifact_refs | list | empty | Each artifact scope must match memory scope |
Durable table: memories.
Current gap: artifact refs are represented in domain/API objects but are not yet persisted as a relationship table.
Retrieved Memory
Retrieved memory is a context candidate returned by retrieval. It must contain only the fields needed by callers to decide whether and how to assemble context.
Fields:
- id
- summary
- scope
- sensitivity
- validation_status
- provenance
- artifact_refs
Retrieval responses must not construct prompts. Prompt assembly remains outside the memory service.
Retrieve Request
Retrieve requests define caller intent and visibility constraints.
Fields:
| Field | Type | Default | Requirement |
|---|---|---|---|
| query | string | none | Non-empty, max 4000 chars |
| scope | string | none | Non-empty, max 256 chars |
| allowed_sensitivity | string list | ["internal"] | Must not be empty |
| require_verified | bool | false | Filters to verified memories when true |
| limit | int | 8 | Range 1 to 32 |
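The bounds in the field table above translate directly into a validated request object. This is a sketch, not the contents of models.py: the class layout and error messages are assumptions, and only the field names, defaults, and limits come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrieveRequest:
    """Retrieve request that enforces the documented bounds at construction time."""

    query: str
    scope: str
    allowed_sensitivity: tuple = ("internal",)
    require_verified: bool = False
    limit: int = 8

    def __post_init__(self) -> None:
        if not self.query or len(self.query) > 4000:
            raise ValueError("query must be non-empty and at most 4000 chars")
        if not self.scope or len(self.scope) > 256:
            raise ValueError("scope must be non-empty and at most 256 chars")
        if len(self.allowed_sensitivity) == 0:
            raise ValueError("allowed_sensitivity must not be empty")
        if not 1 <= self.limit <= 32:
            raise ValueError("limit must be in the range 1 to 32")
```

Validating in `__post_init__` means an invalid request can never exist as a domain object, which is the ordering the trust boundary asks for.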
Retrieval Event
Retrieval events record what was available to a caller at retrieval time.
Fields:
| Field | Type | Requirement |
|---|---|---|
| id | UUID | Generated for each retrieval |
| scope | string | Request scope |
| query | string | Request query |
| returned_memory_ids | UUID list | Ordered returned memory ids |
| returned_artifact_ids | UUID list | Artifact ids referenced by returned memories |
| created_at | timestamp | Durable database timestamp |
Durable table: retrieval_events.
Replay requirements:
- preserve returned memory ids
- preserve returned artifact ids
- preserve query and scope
- preserve timestamp
- later replay surfaces should reconstruct candidate availability from these identifiers and persisted records
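A later replay surface might reconstruct candidate availability along these lines. The sketch assumes an in-memory mapping of persisted memory records keyed by id; `replay_candidates` is a hypothetical helper, not an existing API.

```python
def replay_candidates(event: dict, memories_by_id: dict) -> tuple[list, list]:
    """Rebuild the ordered candidate list for an event; report ids no longer stored."""
    found, missing = [], []
    for memory_id in event["returned_memory_ids"]:
        record = memories_by_id.get(memory_id)
        if record is None:
            missing.append(memory_id)  # e.g. removed after TTL expiry
        else:
            found.append(record)
    return found, missing
```

Returning the missing ids separately lets a reviewer tell the difference between a memory that was never returned and one that was returned but has since been removed.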
Artifact Reference
Artifact refs are lightweight pointers from memory records to external evidence. They do not embed raw binary content.
Fields:
| Field | Type | Requirement |
|---|---|---|
| id | UUID | Artifact identifier |
| scope | string | Must match containing memory scope |
| sha256 | string | Content hash |
| storage_uri | string | External storage pointer |
| artifact_type | string | Type such as image, document, log |
Durable table: artifacts.
Current gap: memory-to-artifact relationship persistence is not implemented.
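As an illustration of the artifact reference shape, the sketch below builds a pointer from raw bytes and checks the scope rule. Both function names are hypothetical, and writing the bytes to object storage is out of scope for the example.

```python
import hashlib
import uuid


def make_artifact_ref(data: bytes, scope: str, storage_uri: str, artifact_type: str) -> dict:
    """Build a lightweight pointer; the bytes themselves are not stored in it."""
    return {
        "id": str(uuid.uuid4()),
        "scope": scope,
        "sha256": hashlib.sha256(data).hexdigest(),
        "storage_uri": storage_uri,
        "artifact_type": artifact_type,
    }


def check_artifact_scopes(memory_scope: str, artifact_refs: list) -> None:
    """Enforce the rule that each artifact scope matches the memory scope."""
    for ref in artifact_refs:
        if ref["scope"] != memory_scope:
            raise ValueError(f"artifact {ref['id']} scope differs from memory scope")
```

The content hash gives replay and audit tooling a way to confirm that the object at `storage_uri` is still the one the memory was derived from.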
Embedding
Embeddings are model-specific vector representations. They are separate from memory records so memory facts remain portable across embedding model changes.
Fields:
| Field | Type | Requirement |
|---|---|---|
| model | string | Non-empty, max 128 chars |
| dimensions | int | Positive |
| vector | float list | Length must match dimensions |
Current durable table: memory_embeddings.
Current durable fields:
- memory_id
- embedding_model
- embedding_ref
- embedding
- embedding_dimensions
- created_at
Current implementation can persist embedding references and pgvector values for a memory. The application service can embed stored summaries when configured with an embedder and an embedding-capable store. The Postgres store can query vectors behind the storage boundary.
Session
Sessions group conversational or agentic work under a scope.
Durable table: sessions.
Fields:
- id
- scope
- created_at
Current gap: session creation and lookup APIs are not implemented.
Task State
Task state is active execution state, not memory. It should remain structured and queryable instead of being embedded in vector stores.
Durable table: tasks.
Fields:
- id
- scope
- status
- state
- created_at
- updated_at
Current gap: task-state domain objects and APIs are not implemented.
Provenance
Provenance records attach lineage to one memory, artifact, or retrieval event.
Durable table: provenance.
Fields:
- id
- memory_id
- artifact_id
- retrieval_event_id
- source_identity
- source_event
- created_at
Constraint: exactly one of memory_id, artifact_id, or retrieval_event_id must be set.
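The exactly-one constraint can be mirrored in application code before a write reaches the database. This is a hedged sketch: `validate_provenance_target` is a hypothetical helper, and the actual migration may enforce the rule differently.

```python
def validate_provenance_target(memory_id=None, artifact_id=None, retrieval_event_id=None) -> str:
    """Return the name of the single populated target, or raise if not exactly one."""
    targets = {
        "memory_id": memory_id,
        "artifact_id": artifact_id,
        "retrieval_event_id": retrieval_event_id,
    }
    populated = [name for name, value in targets.items() if value is not None]
    if len(populated) != 1:
        raise ValueError(f"exactly one target must be set, got {len(populated)}")
    return populated[0]
```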
Current gap: provenance has initial schema support but no write path beyond memory-local JSON metadata.
Durable Tables
| Table | Purpose | Status |
|---|---|---|
| sessions | Session metadata | Schema only |
| memories | Normalized memory records | Implemented for store/retrieve/expire |
| memory_embeddings | Embedding references and vectors | Implemented for persistence |
| tasks | Active workflow state | Schema only |
| artifacts | Externalized artifact metadata | Schema only |
| retrieval_events | Retrieval audit/replay records | Implemented for retrieval event persistence |
| provenance | Lineage records | Schema only |
API Requirements
The API boundary must:
- reject non-JSON write requests
- reject oversized payloads
- validate UUIDs, timestamps, enum values, scores, and bounds
- redact secret-like values before storing memory summaries
- return JSON errors without stack traces
- expose retrieval events for local replay/audit inspection
- keep durable storage implementation behind the application service contract
Retrieval Requirements
Retrieval must filter by:
- scope
- allowed sensitivity
- validation status when require_verified is true
- TTL expiration
Retrieval should rank by:
- lexical or vector relevance
- importance
- confidence
- recency
Current implementation supports scope, sensitivity, verification, TTL, lexical matching, vector relevance in the Postgres store, importance, and confidence. Recency ranking is future work.
Evolution Requirements
Future changes should preserve:
- vLLM runtime statelessness
- memory/runtime separation from governance authority
- external artifact references instead of binary prompt memory
- replayable retrieval events
- replaceable embedding providers
- MemGPT/Letta integration above Dubnium memory APIs, not as source of truth
Before adding autonomous memory writes, durable storage, redaction, retrieval filters, provenance, expiration, and replay evidence must pass local validation.
Memory Governance Contract
Status: draft
This contract defines how orchestrators such as Anthesis may request memory from the Dubnium memory service without delegating governance authority to the memory service itself.
Boundary
Anthesis / orchestrator
- classifies task risk
- authorizes memory scope
- chooses sensitivity filters
- decides whether retrieved memory enters prompt context
- records execution envelope
Dubnium memory service
- stores memories
- filters by scope, sensitivity, verification, rejection, and TTL
- returns retrieval candidates
- records retrieval events and metadata
The memory service must not assemble final prompts or decide whether a memory is safe to inject into an agent context.
Scope Convention
Memory scopes should use one of these prefixes:
personal:
project:
session:
agent:
workflow:
Examples:
project:dubnium
session:11111111-1111-4111-8111-111111111111
agent:anthesis-reviewer
workflow:memory-phase-2
The current scope helper is advisory. Runtime enforcement may be added after existing callers and examples are fully migrated.
Retrieval Request
A retrieval request may include governance metadata in addition to the Phase 1 filters.
{
"query": "What changed in memory phase 2?",
"scope": "project:dubnium",
"allowed_sensitivity": ["internal"],
"require_verified": true,
"limit": 8,
"purpose": "review",
"requester": {
"actor_type": "agent",
"actor_id": "anthesis-reviewer"
},
"envelope_id": "env-20260528-001"
}
Required Fields
| Field | Meaning |
|---|---|
| query | Retrieval query text |
| scope | Retrieval boundary, such as project:dubnium |
Optional Fields
| Field | Meaning | Default |
|---|---|---|
| allowed_sensitivity | Sensitivity labels allowed in results | ["internal"] |
| require_verified | Whether only verified memory may return | false |
| limit | Maximum memory candidates | 8 |
| purpose | Orchestrator purpose: ask, plan, patch, review, test | omitted |
| requester | Actor requesting retrieval | omitted |
| envelope_id | Upstream Anthesis execution envelope id | omitted |
Retrieval Event
Every retrieval returns an event.
{
"id": "uuid",
"scope": "project:dubnium",
"query": "What changed in memory phase 2?",
"returned_memory_ids": ["uuid"],
"returned_artifact_ids": [],
"metadata": {
"allowed_sensitivity": ["internal"],
"require_verified": true,
"limit": 8,
"purpose": "review",
"requester": {
"actor_type": "agent",
"actor_id": "anthesis-reviewer"
},
"envelope_id": "env-20260528-001"
}
}
The event is an audit hook. It is not proof that the memory entered a prompt. Anthesis must separately record prompt assembly and provider execution in its own envelope.
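As a rough sketch of how such an event could be assembled from a request and its results: the function name `build_retrieval_event` and the choice to omit absent optional fields from `metadata` are assumptions, not confirmed service behavior.

```python
import uuid
from typing import Any

# Request fields copied into event metadata, following the example event above.
METADATA_FIELDS = (
    "allowed_sensitivity", "require_verified", "limit",
    "purpose", "requester", "envelope_id",
)


def build_retrieval_event(
    request: dict[str, Any],
    memory_ids: list[str],
    artifact_ids: list[str],
) -> dict[str, Any]:
    """Assemble an event dict; omitted optional request fields stay out of metadata."""
    return {
        "id": str(uuid.uuid4()),
        "scope": request["scope"],
        "query": request["query"],
        "returned_memory_ids": list(memory_ids),
        "returned_artifact_ids": list(artifact_ids),
        "metadata": {k: request[k] for k in METADATA_FIELDS if k in request},
    }
```

The event records what was returned, not what was used, which matches the audit-hook role described above.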
Normal Retrieval Rules
Normal retrieval excludes memory when:
- scope does not match the request
- sensitivity is not explicitly allowed
- require_verified is true and memory is not verified
- memory is expired by TTL
- memory has validation_status = rejected
Rejected memory is excluded even when require_verified = false.
Audit retrieval of rejected memory is future work and should use a separate endpoint or explicit audit mode.
Memory Promotion
Memory should move through explicit states:
working -> episodic -> semantic -> repo doc / ADR / runbook
Promotion rules:
- working memory may be generated inside a session
- episodic memory must summarize a meaningful event or task
- semantic memory must represent a stable fact, decision, convention, or invariant
- repo docs, ADRs, and runbooks remain higher-authority than memory rows
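The promotion path can be checked with a small transition guard. The text does not say whether a state may be skipped, so this sketch assumes it may not; `can_promote` is a hypothetical helper. Promotion into repo docs, ADRs, or runbooks happens outside the memory store and is not modelled here.

```python
# Ordered promotion path from the text; repo docs sit outside the memory store.
PROMOTION_ORDER = ["working", "episodic", "semantic"]


def can_promote(current: str, target: str) -> bool:
    """Allow only a single forward step along the promotion path."""
    if current not in PROMOTION_ORDER or target not in PROMOTION_ORDER:
        return False
    return PROMOTION_ORDER.index(target) == PROMOTION_ORDER.index(current) + 1
```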
Rejection Reasons
Memory candidates should be rejected or marked rejected when they contain:
- secret-like content that redaction could not confidently sanitize
- cross-scope contamination
- unsupported or hallucinated claims
- stale facts
- prompt-injection residue
- weak or missing provenance
Rejected memory must not appear in normal retrieval paths.
Anthesis Envelope Handoff
Anthesis should record:
- retrieval request
- retrieval event id
- returned memory ids
- returned artifact ids
- prompt assembly decision
- provider decision
- model/provider response
- validation result
The memory service only supplies retrieval candidates and metadata. Governance remains external.
Anthesis Memory Envelope Examples
Status: draft
This document shows how Dubnium memory retrieval evidence should appear inside an Anthesis execution envelope. It is intentionally contract-only: Dubnium does not implement Anthesis runtime orchestration here.
Boundary
Dubnium memory service
- stores memories
- filters retrieval candidates
- records retrieval events
- returns memory ids, artifact ids, and retrieval metadata
Anthesis
- authorizes retrieval
- assembles prompts
- decides whether retrieved memory may be used
- records provider decisions
- records validation results
The memory service retrieval event proves that memory was fetched. It does not prove that memory entered the model prompt. Anthesis must record the prompt assembly decision separately.
Envelope Fragment
A governed Anthesis execution envelope should include a memory section shaped like this:
{
"memory": {
"retrieval_request": {
"query": "What is the current Dubnium memory boundary?",
"scope": "project:dubnium",
"allowed_sensitivity": ["internal"],
"require_verified": true,
"limit": 8,
"purpose": "review",
"requester": {
"actor_type": "agent",
"actor_id": "anthesis-reviewer"
},
"envelope_id": "env-20260528-001"
},
"retrieval_event": {
"id": "22222222-2222-4222-8222-222222222222",
"scope": "project:dubnium",
"query": "What is the current Dubnium memory boundary?",
"returned_memory_ids": [
"11111111-1111-4111-8111-111111111111"
],
"returned_artifact_ids": [],
"metadata": {
"allowed_sensitivity": ["internal"],
"require_verified": true,
"limit": 8,
"purpose": "review",
"requester": {
"actor_type": "agent",
"actor_id": "anthesis-reviewer"
},
"envelope_id": "env-20260528-001"
}
},
"prompt_assembly_decision": {
"used_memory_ids": [
"11111111-1111-4111-8111-111111111111"
],
"excluded_memory_ids": [],
"decision": "used",
"reason": "Verified internal project memory matched the authorized scope and review purpose."
}
}
}
Provider Decision Fragment
Memory evidence should sit beside, not inside, the provider decision.
{
"provider_decision": {
"selected_provider": "vllm.local",
"selected_model": "qwen2.5-coder-14b-instruct",
"provider_class": "local",
"cloud_escalation_allowed": false,
"reason": "Review task used verified internal project memory and did not require external context."
}
}
Validation Fragment
Validation should explicitly tie output review to the memory/context decision.
{
"validation": {
"status": "passed",
"checks": [
{
"name": "memory_scope",
"status": "passed",
"details": "All retrieved memory was scoped to project:dubnium."
},
{
"name": "rejected_memory_exclusion",
"status": "passed",
"details": "No rejected memories were returned or used."
},
{
"name": "prompt_assembly_recorded",
"status": "passed",
"details": "Used and excluded memory ids were recorded."
}
]
}
}
Non-Use Case
If memory is retrieved but not used, Anthesis should record that explicitly:
{
"memory": {
"retrieval_event_id": "22222222-2222-4222-8222-222222222222",
"returned_memory_ids": [
"11111111-1111-4111-8111-111111111111"
],
"returned_artifact_ids": [],
"prompt_assembly_decision": {
"used_memory_ids": [],
"excluded_memory_ids": [
"11111111-1111-4111-8111-111111111111"
],
"decision": "excluded",
"reason": "Memory was relevant but unverified; task required verified memory."
}
}
}
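A prompt assembly decision like the ones above can be derived by partitioning the returned ids. This is a sketch from the orchestrator's side, not Dubnium code: `assembly_decision` is a hypothetical helper, and the `partial` label is an assumption because the examples only show `used` and `excluded`.

```python
def assembly_decision(
    returned_ids: list[str],
    used_ids: list[str],
    reason: str,
) -> dict:
    """Partition returned memory ids into used and excluded, preserving retrieval order."""
    unknown = [i for i in used_ids if i not in returned_ids]
    if unknown:
        raise ValueError(f"used ids were not returned by retrieval: {unknown}")
    used = [i for i in returned_ids if i in used_ids]
    excluded = [i for i in returned_ids if i not in used_ids]
    if used and excluded:
        decision = "partial"  # hypothetical label, not taken from the examples
    elif used:
        decision = "used"
    else:
        decision = "excluded"
    return {
        "used_memory_ids": used,
        "excluded_memory_ids": excluded,
        "decision": decision,
        "reason": reason,
    }
```

Rejecting ids that retrieval never returned keeps the envelope consistent with the retrieval event it references.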
Rejected Memory Case
Rejected memory should not appear in normal retrieval events. If a future audit mode retrieves rejected memory, the envelope must make the audit mode explicit:
{
"memory_audit": {
"mode": "audit_rejected_memory",
"normal_prompt_use_allowed": false,
"retrieved_rejected_memory_ids": [
"33333333-3333-4333-8333-333333333333"
],
"reason": "Operator audit of previously rejected cross-scope memory."
}
}
Audit-mode retrieval is future work. Normal prompt assembly must not use rejected memory.
Minimum Envelope Requirements
For any Anthesis-governed run that uses Dubnium memory, record:
- retrieval request
- retrieval event id
- returned memory ids
- returned artifact ids
- prompt assembly decision
- used memory ids
- excluded memory ids
- provider decision
- validation result
This creates a replayable boundary between retrieval, prompt assembly, provider execution, and validation.
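A completeness check over these requirements might look like the sketch below. It assumes the memory, provider decision, and validation fragments sit as top-level keys of one envelope, as in the earlier examples; that layout and the helper name `missing_envelope_fields` are assumptions.

```python
# Dotted paths derived from the minimum requirements list; the layout is assumed.
REQUIRED_PATHS = [
    ("memory", "retrieval_request"),
    ("memory", "retrieval_event", "id"),
    ("memory", "retrieval_event", "returned_memory_ids"),
    ("memory", "retrieval_event", "returned_artifact_ids"),
    ("memory", "prompt_assembly_decision", "decision"),
    ("memory", "prompt_assembly_decision", "used_memory_ids"),
    ("memory", "prompt_assembly_decision", "excluded_memory_ids"),
    ("provider_decision",),
    ("validation",),
]


def missing_envelope_fields(envelope: dict) -> list[str]:
    """Return dotted paths of required fields that are absent from the envelope."""
    missing = []
    for path in REQUIRED_PATHS:
        node = envelope
        for key in path:
            if not isinstance(node, dict) or key not in node:
                missing.append(".".join(path))
                break
            node = node[key]
    return missing
```

An empty result means the envelope carries enough evidence to replay retrieval, prompt assembly, provider choice, and validation.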
vLLM Memory Phase 1 Implementation Plan
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.
Goal: Build a minimal local persistent memory prototype around Dubnium’s existing vLLM service without coupling durable memory to transformer KV state.
Architecture: vLLM remains the inference runtime. A separate memory workload provides Postgres/pgvector storage, optional Redis working context, summarization and embedding workers, and a scoped retrieval API that an orchestrator can use before calling vLLM. A future governance layer remains external; Phase 1 records metadata and lifecycle events but does not implement the governance authority.
Tech Stack: NixOS modules, Postgres, pgvector, Redis, Python service code, pytest, systemd services.
Scope
This plan implements the Phase 1 prototype described in ADR-0010 and vLLM Persistent Memory Prototype.
Do not implement multi-agent federation, Temporal, MinIO, cryptographic attestation, a production policy DSL, or durable KV-cache persistence in this phase.
Do not implement Letta or another MemGPT-style framework in Phase 1. Keep it as an incremental upgrade candidate after storage, retrieval filters, redaction, provenance, and replay checks are stable.
Do not implement MinIO, OCI artifact publishing, VLM artifact resolution, or binary artifact extraction in Phase 1. Store artifact references and metadata only where needed; binary artifact pipelines are a later architecture phase.
Trust Boundaries
Risk: medium.
Attacker-controlled inputs include user prompts, agent messages, model output, tool output, retrieved artifacts, imported documents, and model-generated summaries. Treat all of them as untrusted before storage and before prompt assembly.
The Phase 1 implementation must enforce:
- validation at API boundaries
- scoped retrieval before prompt assembly
- redaction before durable storage
- TTL filtering
- sensitivity metadata and filters
- provenance on every memory row and artifact reference
- retrieval event logging for later replay
- no secret values in logs or memory payloads
Planned Files
Create:
- modules/workloads/memory.nix: NixOS workload module for Postgres, pgvector, Redis, memory API, and workers.
- pkgs/memory-service/default.nix: package the local Python memory service.
- pkgs/memory-service/pyproject.toml: Python package metadata.
- pkgs/memory-service/src/dubnium_memory/__init__.py: package marker.
- pkgs/memory-service/src/dubnium_memory/api.py: HTTP API boundary and input validation.
- pkgs/memory-service/src/dubnium_memory/config.py: environment parsing.
- pkgs/memory-service/src/dubnium_memory/db.py: database connection and migrations runner.
- pkgs/memory-service/src/dubnium_memory/models.py: typed request and memory models.
- pkgs/memory-service/src/dubnium_memory/filters.py: retrieval scope, TTL, and sensitivity filters.
- pkgs/memory-service/src/dubnium_memory/redaction.py: secret and sensitive payload redaction.
- pkgs/memory-service/src/dubnium_memory/retrieval.py: scoped query and ranking logic.
- pkgs/memory-service/src/dubnium_memory/storage.py: memory persistence.
- pkgs/memory-service/src/dubnium_memory/workers.py: summarization and embedding worker entrypoints.
- pkgs/memory-service/migrations/001_initial.sql: schema for sessions, memories, embeddings, tasks, artifacts, retrieval events, and provenance.
- pkgs/memory-service/tests/test_filters.py: retrieval filter tests.
- pkgs/memory-service/tests/test_redaction.py: redaction tests.
- pkgs/memory-service/tests/test_storage.py: storage contract tests.
- pkgs/memory-service/tests/test_retrieval.py: retrieval filter tests.
- docs/runbooks/memory-service.md: operator runbook for the prototype.
Modify:
- modules/dubnium/options.nix: add dubnium.memory options and assertions.
- hosts/workstation/default.nix: import and enable the memory workload for the workstation only after the module evaluates.
- flake.nix: expose the memory-service package.
- docs/README.md: link the memory service runbook.
- docs/SUMMARY.md: link the memory service runbook.
Implementation Tasks
Task 1: Add Memory Options
Files:
- Modify: modules/dubnium/options.nix
- Step 1: Add a disabled-by-default dubnium.memory option set
Add this next to the existing dubnium.vllm and dubnium.k3s options:
memory = {
enable = mkEnableOption "persistent memory services for local vLLM orchestration";
api = {
host = mkOption {
type = types.str;
default = "127.0.0.1";
description = "Host address bound by the Dubnium memory API.";
};
port = mkOption {
type = types.port;
default = 8090;
description = "Port bound by the Dubnium memory API.";
};
};
database = {
name = mkOption {
type = types.str;
default = "dubnium_memory";
description = "Postgres database used by the Dubnium memory subsystem.";
};
user = mkOption {
type = types.str;
default = "dubnium_memory";
description = "Postgres role used by the Dubnium memory service.";
};
};
redis = {
enable = mkOption {
type = types.bool;
default = true;
description = "Whether Redis is enabled for transient working context and worker queues.";
};
};
retention = {
defaultTtlDays = mkOption {
type = types.nullOr types.int;
default = null;
description = "Default TTL in days for memory objects without an explicit TTL.";
};
};
};
- Step 2: Add assertions for safe local defaults
Add these to the existing assertions list:
{
assertion = (!config.dubnium.memory.enable) || (config.dubnium.memory.api.host == "127.0.0.1");
message = "dubnium.memory.api.host must stay local-only for the Phase 1 prototype";
}
{
assertion =
(config.dubnium.memory.retention.defaultTtlDays == null)
|| (config.dubnium.memory.retention.defaultTtlDays > 0);
message = "dubnium.memory.retention.defaultTtlDays must be positive when set";
}
- Step 3: Verify option evaluation
Run:
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.dubnium.memory.enable
Expected:
false
Task 2: Package The Memory Service Skeleton
Files:
- Create: pkgs/memory-service/default.nix
- Create: pkgs/memory-service/pyproject.toml
- Create: pkgs/memory-service/src/dubnium_memory/__init__.py
- Create: pkgs/memory-service/src/dubnium_memory/config.py
- Create: pkgs/memory-service/src/dubnium_memory/api.py
- Modify: flake.nix
- Step 1: Create package metadata
Create pkgs/memory-service/pyproject.toml:
[project]
name = "dubnium-memory"
version = "0.1.0"
description = "Local persistent memory service for Dubnium vLLM orchestration"
requires-python = ">=3.12"
dependencies = [
"fastapi",
"pydantic",
"psycopg[binary]",
"uvicorn",
]
[project.scripts]
dubnium-memory-api = "dubnium_memory.api:main"
- Step 2: Create the Nix package
Create pkgs/memory-service/default.nix:
{ python312Packages }:
python312Packages.buildPythonApplication {
pname = "dubnium-memory";
version = "0.1.0";
pyproject = true;
src = ./.;
build-system = [
python312Packages.setuptools
python312Packages.wheel
];
dependencies = [
python312Packages.fastapi
python312Packages.pydantic
python312Packages.psycopg
python312Packages.uvicorn
];
}
- Step 3: Add minimal app entrypoint
Create pkgs/memory-service/src/dubnium_memory/__init__.py:
"""Dubnium persistent memory service."""
Create pkgs/memory-service/src/dubnium_memory/config.py:
from pydantic import BaseModel
class Settings(BaseModel):
    database_url: str
    host: str = "127.0.0.1"
    port: int = 8090
Create pkgs/memory-service/src/dubnium_memory/api.py:
import os
from fastapi import FastAPI
import uvicorn
from dubnium_memory.config import Settings
app = FastAPI(title="Dubnium Memory API")
@app.get("/healthz")
def healthz() -> dict[str, str]:
    return {"status": "ok"}
def settings_from_env() -> Settings:
    # DATABASE_URL is required; host and port fall back to local defaults.
    return Settings(
        database_url=os.environ["DATABASE_URL"],
        host=os.environ.get("DUBNIUM_MEMORY_HOST", "127.0.0.1"),
        port=int(os.environ.get("DUBNIUM_MEMORY_PORT", "8090")),
    )
def main() -> None:
    settings = settings_from_env()
    uvicorn.run(app, host=settings.host, port=settings.port)
- Step 4: Expose the package from the flake
Modify flake.nix under packages.${system}:
memory-service = pkgs.callPackage ./pkgs/memory-service { };
- Step 5: Verify package build
Run:
nix --extra-experimental-features "nix-command flakes" build .#memory-service
Expected:
result/bin/dubnium-memory-api exists
Task 3: Add Schema And Storage Contracts
Files:
- Create: pkgs/memory-service/migrations/001_initial.sql
- Create: pkgs/memory-service/src/dubnium_memory/models.py
- Create: pkgs/memory-service/src/dubnium_memory/storage.py
- Create: pkgs/memory-service/tests/test_storage.py
- Step 1: Create the first migration
Create pkgs/memory-service/migrations/001_initial.sql:
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS sessions (
id uuid PRIMARY KEY,
scope text NOT NULL,
created_at timestamptz NOT NULL DEFAULT now()
);
CREATE TABLE IF NOT EXISTS memories (
id uuid PRIMARY KEY,
session_id uuid REFERENCES sessions(id),
memory_type text NOT NULL CHECK (memory_type IN ('working', 'episodic', 'semantic')),
summary text NOT NULL,
scope text NOT NULL,
importance double precision NOT NULL DEFAULT 0.0,
confidence double precision NOT NULL DEFAULT 0.0,
sensitivity text NOT NULL DEFAULT 'internal',
validation_status text NOT NULL DEFAULT 'unverified',
ttl timestamptz,
source text NOT NULL,
provenance jsonb NOT NULL DEFAULT '{}'::jsonb,
created_at timestamptz NOT NULL DEFAULT now()
);
CREATE TABLE IF NOT EXISTS memory_embeddings (
memory_id uuid PRIMARY KEY REFERENCES memories(id) ON DELETE CASCADE,
embedding vector(384) NOT NULL,
model text NOT NULL,
created_at timestamptz NOT NULL DEFAULT now()
);
CREATE TABLE IF NOT EXISTS tasks (
id uuid PRIMARY KEY,
scope text NOT NULL,
status text NOT NULL,
state jsonb NOT NULL DEFAULT '{}'::jsonb,
created_at timestamptz NOT NULL DEFAULT now(),
updated_at timestamptz NOT NULL DEFAULT now()
);
CREATE TABLE IF NOT EXISTS artifacts (
id uuid PRIMARY KEY,
scope text NOT NULL,
uri text NOT NULL,
media_type text,
sensitivity text NOT NULL DEFAULT 'internal',
provenance jsonb NOT NULL DEFAULT '{}'::jsonb,
created_at timestamptz NOT NULL DEFAULT now()
);
CREATE TABLE IF NOT EXISTS provenance (
id uuid PRIMARY KEY,
memory_id uuid REFERENCES memories(id) ON DELETE CASCADE,
source_identity text NOT NULL,
source_event jsonb NOT NULL,
created_at timestamptz NOT NULL DEFAULT now()
);
CREATE TABLE IF NOT EXISTS retrieval_events (
id uuid PRIMARY KEY,
scope text NOT NULL,
query text NOT NULL,
returned_memory_ids uuid[] NOT NULL DEFAULT '{}',
returned_artifact_ids uuid[] NOT NULL DEFAULT '{}',
created_at timestamptz NOT NULL DEFAULT now()
);
CREATE INDEX IF NOT EXISTS memories_scope_created_at_idx
ON memories (scope, created_at DESC);
CREATE INDEX IF NOT EXISTS memories_ttl_idx
ON memories (ttl);
- Step 2: Define typed storage input
Create pkgs/memory-service/src/dubnium_memory/models.py:
from datetime import datetime
from typing import Literal
from uuid import UUID
from pydantic import BaseModel, Field
MemoryType = Literal["working", "episodic", "semantic"]
ValidationStatus = Literal["unverified", "verified", "rejected"]
class MemoryIn(BaseModel):
    id: UUID
    session_id: UUID | None = None
    memory_type: MemoryType
    summary: str = Field(min_length=1, max_length=8000)
    scope: str = Field(min_length=1, max_length=256)
    importance: float = Field(default=0.0, ge=0.0, le=1.0)
    confidence: float = Field(default=0.0, ge=0.0, le=1.0)
    sensitivity: str = Field(default="internal", max_length=64)
    validation_status: ValidationStatus = "unverified"
    ttl: datetime | None = None
    source: str = Field(min_length=1, max_length=128)
    provenance: dict
- Step 3: Implement storage with parameterized SQL
Create pkgs/memory-service/src/dubnium_memory/storage.py:
from psycopg import Connection
from psycopg.types.json import Jsonb
from dubnium_memory.models import MemoryIn
def store_memory(conn: Connection, memory: MemoryIn) -> None:
    """Insert one memory row using named, parameterized placeholders."""
    params = memory.model_dump()
    # psycopg 3 does not adapt a plain dict; wrap it so it is sent as jsonb.
    params["provenance"] = Jsonb(params["provenance"])
    conn.execute(
        """
        INSERT INTO memories (
            id, session_id, memory_type, summary, scope, importance, confidence,
            sensitivity, validation_status, ttl, source, provenance
        )
        VALUES (
            %(id)s, %(session_id)s, %(memory_type)s, %(summary)s, %(scope)s,
            %(importance)s, %(confidence)s, %(sensitivity)s, %(validation_status)s,
            %(ttl)s, %(source)s, %(provenance)s
        )
        """,
        params,
    )
- Step 4: Add a storage test
Create pkgs/memory-service/tests/test_storage.py:
from uuid import uuid4
import pytest
from pydantic import ValidationError
from dubnium_memory.models import MemoryIn
def test_memory_requires_summary() -> None:
    payload = {
        "id": uuid4(),
        "memory_type": "episodic",
        "summary": "",
        "scope": "project:dubnium",
        "source": "conversation",
        "provenance": {"origin": "test"},
    }
    # An empty summary violates min_length=1 and must fail validation.
    with pytest.raises(ValidationError, match="summary"):
        MemoryIn(**payload)
Task 4: Add Redaction And Retrieval Filters
Files:
- Create: pkgs/memory-service/src/dubnium_memory/redaction.py
- Create: pkgs/memory-service/src/dubnium_memory/filters.py
- Create: pkgs/memory-service/tests/test_redaction.py
- Create: pkgs/memory-service/tests/test_filters.py
- Step 1: Implement conservative redaction
Create pkgs/memory-service/src/dubnium_memory/redaction.py:
import re
SECRET_PATTERNS = [
    # Matches "<label> = <value>" or "<label>: <value>" for common secret labels.
    re.compile(r"(?i)(api[_-]?key|token|secret|password)\s*[:=]\s*([^\s]+)"),
]
def redact_text(value: str) -> str:
    redacted = value
    for pattern in SECRET_PATTERNS:
        redacted = pattern.sub(r"\1=[REDACTED]", redacted)
    return redacted
- Step 2: Test redaction
Create pkgs/memory-service/tests/test_redaction.py:
from dubnium_memory.redaction import redact_text
def test_redacts_api_key_like_values() -> None:
    text = "OPENAI_API_KEY=sk-test-value"
    assert redact_text(text) == "OPENAI_API_KEY=[REDACTED]"
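The pattern only fires when a known label is directly followed by `:` or `=`. The sketch below repeats the pattern from redaction.py so it runs standalone, and probes what it does and does not catch. The sample strings are made up for illustration.

```python
import re

# Same pattern as redaction.py, repeated so the example is self-contained.
SECRET_PATTERN = re.compile(r"(?i)(api[_-]?key|token|secret|password)\s*[:=]\s*([^\s]+)")


def redact_text(value: str) -> str:
    return SECRET_PATTERN.sub(r"\1=[REDACTED]", value)


# Labelled values are replaced, and the separator is normalized to "=".
labelled = redact_text("password: hunter2 and token = abc123")

# A bare secret with no label passes through unchanged.
unlabelled = redact_text("use sk-test-value for the call")

# JSON-style quoting puts a quote between the label and the colon, so it is missed.
json_style = redact_text('{"password": "hunter2"}')
```

Here `labelled` becomes `password=[REDACTED] and token=[REDACTED]`, while `unlabelled` and `json_style` come back unmodified. Those two gaps are worth covering with extra patterns or tests before autonomous memory writes are enabled.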
- Step 3: Implement retrieval filtering
Create pkgs/memory-service/src/dubnium_memory/filters.py:
from datetime import datetime, timezone
from typing import TypedDict
class MemoryCandidate(TypedDict):
    id: str
    scope: str
    sensitivity: str
    validation_status: str
    ttl: datetime | None
def is_retrievable(
    memory: MemoryCandidate,
    *,
    scope: str,
    allowed_sensitivity: set[str],
    require_verified: bool,
) -> bool:
    if memory["scope"] != scope:
        return False
    if memory["sensitivity"] not in allowed_sensitivity:
        return False
    if require_verified and memory["validation_status"] != "verified":
        return False
    # A TTL at or before the current UTC time counts as expired.
    if memory["ttl"] is not None and memory["ttl"] <= datetime.now(timezone.utc):
        return False
    return True
- Step 4: Test scope and TTL enforcement
Create pkgs/memory-service/tests/test_filters.py:
from datetime import datetime, timedelta, timezone
from dubnium_memory.filters import is_retrievable
def test_rejects_cross_scope_memory() -> None:
    memory = {
        "id": "m1",
        "scope": "project:other",
        "sensitivity": "internal",
        "validation_status": "verified",
        "ttl": None,
    }
    assert not is_retrievable(
        memory,
        scope="project:dubnium",
        allowed_sensitivity={"internal"},
        require_verified=True,
    )
def test_rejects_expired_memory() -> None:
    memory = {
        "id": "m1",
        "scope": "project:dubnium",
        "sensitivity": "internal",
        "validation_status": "verified",
        "ttl": datetime.now(timezone.utc) - timedelta(days=1),
    }
    assert not is_retrievable(
        memory,
        scope="project:dubnium",
        allowed_sensitivity={"internal"},
        require_verified=True,
    )
Task 5: Add Retrieval API Boundary
Files:
- Modify: pkgs/memory-service/src/dubnium_memory/api.py
- Create: pkgs/memory-service/src/dubnium_memory/retrieval.py
- Create: pkgs/memory-service/tests/test_retrieval.py
- Step 1: Add request and response models
Add to models.py:
class RetrieveRequest(BaseModel):
    query: str = Field(min_length=1, max_length=4000)
    scope: str = Field(min_length=1, max_length=256)
    allowed_sensitivity: list[str] = Field(default_factory=lambda: ["internal"])
    require_verified: bool = False
    limit: int = Field(default=8, ge=1, le=32)
class RetrievedMemory(BaseModel):
    id: UUID
    summary: str
    scope: str
    sensitivity: str
    validation_status: ValidationStatus
    provenance: dict
- Step 2: Implement retrieval query contract
Create pkgs/memory-service/src/dubnium_memory/retrieval.py:
from psycopg import Connection
from psycopg.rows import dict_row
from dubnium_memory.models import RetrieveRequest, RetrievedMemory
def retrieve_memories(conn: Connection, request: RetrieveRequest) -> list[RetrievedMemory]:
    # dict_row yields rows keyed by column name; the default row factory returns tuples.
    with conn.cursor(row_factory=dict_row) as cur:
        rows = cur.execute(
            """
            SELECT id, summary, scope, sensitivity, validation_status, provenance
            FROM memories
            WHERE scope = %(scope)s
              AND sensitivity = ANY(%(allowed_sensitivity)s)
              AND (%(require_verified)s = false OR validation_status = 'verified')
              AND (ttl IS NULL OR ttl > now())
            ORDER BY importance DESC, created_at DESC
            LIMIT %(limit)s
            """,
            request.model_dump(),
        ).fetchall()
    return [RetrievedMemory.model_validate(row) for row in rows]
- Step 3: Add API endpoint
Add to api.py:
from dubnium_memory.models import RetrieveRequest, RetrievedMemory
@app.post("/memory/retrieve")
def retrieve(request: RetrieveRequest) -> list[RetrievedMemory]:
    raise NotImplementedError("database connection wiring is added in the service module task")
Keep this endpoint local-only until the database dependency is wired. Do not expose it on the network in Phase 1.
Task 6: Add NixOS Workload Module
Files:
- Create: modules/workloads/memory.nix
- Modify: hosts/workstation/default.nix
- Step 1: Create the workload module
Create modules/workloads/memory.nix:
{ lib, config, pkgs, ... }:
let
cfg = config.dubnium.memory;
memoryPackage = pkgs.callPackage ../../pkgs/memory-service { };
in
{
config = lib.mkIf cfg.enable {
services.postgresql = {
enable = true;
extensions = ps: [ ps.pgvector ];
ensureDatabases = [ cfg.database.name ];
ensureUsers = [
{
name = cfg.database.user;
ensureDBOwnership = true;
}
];
};
services.redis.servers.dubnium-memory = lib.mkIf cfg.redis.enable {
enable = true;
bind = "127.0.0.1";
port = 6379;
};
systemd.services.dubnium-memory-api = {
description = "Dubnium persistent memory API";
wantedBy = [ "multi-user.target" ];
after = [ "postgresql.service" ];
requires = [ "postgresql.service" ];
environment = {
DUBNIUM_MEMORY_HOST = cfg.api.host;
DUBNIUM_MEMORY_PORT = toString cfg.api.port;
DATABASE_URL = "postgresql:///${cfg.database.name}?host=/run/postgresql";
};
serviceConfig = {
Type = "simple";
ExecStart = "${memoryPackage}/bin/dubnium-memory-api";
Restart = "always";
RestartSec = "5s";
NoNewPrivileges = true;
PrivateTmp = true;
ProtectHome = true;
Slice = "platform.slice";
};
};
};
}
- Step 2: Import the module without enabling it
Modify hosts/workstation/default.nix imports:
../../modules/workloads/memory.nix
Do not set dubnium.memory.enable = true until package build and module eval
pass.
- Step 3: Verify disabled module eval
Run:
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.systemd.services.dubnium-memory-api.enable
Expected: the attribute should be absent or evaluation should show the service
is not defined while dubnium.memory.enable = false.
Task 7: Enable Prototype Locally
Files:
- Modify: hosts/workstation/default.nix
- Step 1: Enable the memory workload
Add under dubnium:
memory = {
enable = true;
api.host = "127.0.0.1";
api.port = 8090;
};
- Step 2: Verify generated service contracts
Run:
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.services.postgresql.enable
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.services.redis.servers.dubnium-memory.enable
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.systemd.services.dubnium-memory-api.environment.DUBNIUM_MEMORY_HOST
Expected:
true
true
"127.0.0.1"
Task 8: Add Operator Runbook
Files:
- Create: docs/runbooks/memory-service.md
- Modify: docs/README.md
- Modify: docs/SUMMARY.md
- Step 1: Create the runbook
Create docs/runbooks/memory-service.md with:
# Runbook: Memory Service
Status: prototype
Use this after `dubnium.memory.enable = true`.
## Verify Services
```bash
systemctl status postgresql
systemctl status redis-dubnium-memory
systemctl status dubnium-memory-api
curl http://127.0.0.1:8090/healthz
```
Expected:
{"status":"ok"}
## Security Checks
- the API binds to `127.0.0.1`
- memories include scope, sensitivity, validation status, and provenance
- expired memories are not returned
- sensitive memories are not returned unless explicitly allowed
- retrieval events are logged with memory ids and artifact references
- logs do not contain raw token-like values
- Step 2: Link the runbook
Add Memory Service to the Runbooks lists in docs/README.md and docs/SUMMARY.md.
- Step 3: Build docs
Run:
mdbook build
Expected: docs build succeeds. Generated web/docs changes may be reverted if
the review scope is source docs only.
Final Verification
Before committing Phase 1 implementation:
git diff --check
nix --extra-experimental-features "nix-command flakes" build .#memory-service
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.workstation.config.dubnium.memory.enable
pytest pkgs/memory-service/tests
mdbook build
If a full workstation build still fails on the known placeholder hardware configuration, report that separately from targeted memory-module evaluation.
Follow-Up: MemGPT-Style Agent Upgrade
After Phase 1 is stable, create a separate ADR or spike plan for evaluating Letta as the maintained framework lineage from MemGPT. That spike should be read-only against existing memory rows at first, then test controlled agent-managed memory writes only after external governance hooks and replay evidence are in place.
Follow-Up: Artifact And OCI Architecture
After Phase 1 is stable, create a separate implementation plan for artifact handling. That work should start with filesystem content-addressed storage and metadata extraction, then evaluate MinIO and OCI-style exported cognition artifacts only after memory rows, retrieval events, and artifact references have stable ids.
Memory Phase 2: Governed Structured Memory
Status: planning
Phase 1 prepared the local memory substrate: package, API, Postgres/pgvector schema, Redis support, retrieval events, tests, and an opt-in workstation runbook.
Phase 2 makes memory useful for governed agent workflows without turning the memory service into the governance authority.
Goal
Build a structured, policy-aware memory layer that Anthesis or another orchestrator can govern explicitly.
The Phase 2 target is:
Anthesis decides what memory may be used.
Dubnium stores, retrieves, filters, and records memory events.
vLLM remains the inference runtime.
Non-Goals
Do not implement these in Phase 2:
- autonomous self-editing memory
- global always-on personal memory injection
- durable transformer KV-cache persistence
- multi-agent memory federation
- Temporal or complex workflow orchestration
- MinIO or OCI memory bundles
- raw artifact extraction pipelines
- public or Tailscale-exposed memory API
- Anthesis itself inside the Dubnium memory service
Boundary
flowchart TD
A[Anthesis / Orchestrator] --> B[Memory Policy Decision]
B --> C[Dubnium Memory API]
C --> D[(Postgres)]
C --> E[(pgvector)]
C --> F[(Redis)]
C --> G[Retrieval Event]
G --> A
A --> H[Execution Envelope]
A --> I[vLLM / Agent Prompt]
Dubnium must expose enough structure for Anthesis to audit and replay memory use, but Dubnium must not silently decide that retrieved memory belongs in a prompt.
Phase 2 Capabilities
1. Memory Namespaces
Add explicit namespace concepts on top of the existing scope field.
Suggested namespace shape:
personal:<name>
project:<repo-or-system>
session:<uuid>
agent:<agent-id>
workflow:<workflow-id>
The existing scope field can remain the primary filter, but Phase 2 should document and validate accepted scope patterns.
2. Memory Classes
Keep the current memory types:
- working
- episodic
- semantic
Add operational guidance:
| Type | Meaning | Default retention |
|---|---|---|
| working | transient task/session context | short TTL |
| episodic | event/session summaries | medium or explicit TTL |
| semantic | normalized stable facts/decisions | long-lived but reviewable |
Semantic memory should require stronger provenance and confidence than working memory.
3. Governance Metadata
Each memory row already carries sensitivity, validation_status, source, provenance, and ttl. Phase 2 should standardize expected provenance fields.
Recommended provenance shape:
{
"origin": "agent|operator|system|import",
"source_uri": "optional source reference",
"source_event_id": "optional event id",
"extractor": "manual|summary-worker|agent",
"extractor_version": "1",
"governance": "manual|anthesis|none",
"envelope_id": "optional Anthesis envelope id"
}
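A validator for this provenance shape could be sketched as below. The document does not say which keys are mandatory, so treating `origin`, `extractor`, and `governance` as required is an assumption, and `provenance_problems` is a hypothetical helper.

```python
# Enumerations taken from the recommended shape; treating them as required is an assumption.
ALLOWED = {
    "origin": {"agent", "operator", "system", "import"},
    "extractor": {"manual", "summary-worker", "agent"},
    "governance": {"manual", "anthesis", "none"},
}
OPTIONAL_STRINGS = ("source_uri", "source_event_id", "extractor_version", "envelope_id")


def provenance_problems(provenance: dict) -> list[str]:
    """Return human-readable problems; an empty list means the shape looks valid."""
    problems = []
    for key, allowed in ALLOWED.items():
        value = provenance.get(key)
        if value not in allowed:
            problems.append(f"{key}={value!r} is not one of {sorted(allowed)}")
    for key in OPTIONAL_STRINGS:
        if key in provenance and not isinstance(provenance[key], str):
            problems.append(f"{key} must be a string when present")
    return problems
```

Returning a list of problems fits the advisory posture used elsewhere in this plan: callers can log or reject, depending on how strict the current phase is.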
4. Retrieval Policy Contract
Add a policy-facing retrieval request contract:
{
"query": "string",
"scope": "project:dubnium",
"allowed_sensitivity": ["internal"],
"require_verified": false,
"limit": 8,
"purpose": "ask|plan|patch|review|test",
"requester": {
"actor_type": "human|agent|system",
"actor_id": "string"
},
"envelope_id": "optional Anthesis envelope id"
}
The existing API can continue accepting the Phase 1 shape, but Phase 2 should add optional fields and preserve backward compatibility.
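The backward-compatibility rule can be sketched as a request model whose Phase 2 fields are all optional. This is an illustration under assumptions: the class name, the defaults, and the choice to ignore unknown keys are hypothetical, and which fields Phase 1 actually requires is not stated here.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class RetrievalRequest:
    # Fields assumed to exist in the Phase 1 shape; defaults are illustrative.
    query: str
    scope: str
    allowed_sensitivity: list[str] = field(default_factory=lambda: ["internal"])
    require_verified: bool = False
    limit: int = 8
    # Phase 2 governance metadata: optional so Phase 1 clients keep working.
    purpose: Optional[str] = None
    requester: Optional[dict[str, str]] = None
    envelope_id: Optional[str] = None

    @classmethod
    def from_dict(cls, payload: dict[str, Any]) -> "RetrievalRequest":
        """Build a request, ignoring keys this sketch does not know about."""
        known = {k: v for k, v in payload.items() if k in cls.__dataclass_fields__}
        return cls(**known)

    def governance_metadata(self) -> dict[str, Any]:
        """Return only the Phase 2 fields the caller actually supplied."""
        candidates = {
            "purpose": self.purpose,
            "requester": self.requester,
            "envelope_id": self.envelope_id,
        }
        return {k: v for k, v in candidates.items() if v is not None}
```

A Phase 1 payload produces an empty metadata dict, so downstream code can store it unconditionally.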
5. Retrieval Event Completeness
Retrieval events should eventually record:
- query
- scope
- returned memory ids
- returned artifact ids
- allowed sensitivities
- require_verified
- requester
- purpose
- envelope id
- timestamp
This is the key replay hook for Anthesis.
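As a rough illustration of the fields listed above, a retrieval event could be assembled like this. The function name and the dictionary keys are assumptions chosen to mirror the list, not the service's actual schema.

```python
import json
from datetime import datetime, timezone


def build_retrieval_event(request: dict, memory_ids: list, artifact_ids: list) -> dict:
    """Assemble a replayable retrieval event from a request and its results."""
    return {
        "query": request["query"],
        "scope": request["scope"],
        "memory_ids": list(memory_ids),
        "artifact_ids": list(artifact_ids),
        "allowed_sensitivity": request.get("allowed_sensitivity", []),
        "require_verified": request.get("require_verified", False),
        # Optional governance fields stay None when a Phase 1 client omits them.
        "requester": request.get("requester"),
        "purpose": request.get("purpose"),
        "envelope_id": request.get("envelope_id"),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


event = build_retrieval_event(
    {"query": "gpu release predicate", "scope": "project:dubnium"},
    memory_ids=["m-1", "m-2"],
    artifact_ids=[],
)
serialized = json.dumps(event)  # the event must stay JSON-serializable for storage
```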
6. Memory Promotion
Add an explicit promotion workflow:
working -> episodic -> semantic -> repo doc / ADR / runbook
Rules:
- working memory can be generated freely inside a session
- episodic memory requires summarization and provenance
- semantic memory requires confidence, review status, and scope
- repo docs remain the highest-authority source for durable project truth
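One way to make the promotion rules testable is a small check that reports what a memory is missing before it can move up one class. This is a sketch only: the field names (`summary`, `provenance`, `confidence`, `validation_status`, `scope`) are assumed mappings of the rules above, with `validation_status` standing in for review status.

```python
PROMOTION_ORDER = ["working", "episodic", "semantic"]

# Assumed field requirements per target class, derived from the rules above.
REQUIRED_FIELDS = {
    "working": set(),
    "episodic": {"summary", "provenance"},
    "semantic": {"confidence", "validation_status", "scope"},
}


def missing_for_promotion(memory: dict, target: str) -> set:
    """Return the required fields that are absent or empty for a one-step promotion."""
    current = memory["type"]
    if PROMOTION_ORDER.index(target) != PROMOTION_ORDER.index(current) + 1:
        raise ValueError(f"cannot promote {current} directly to {target}")
    present = {key for key, value in memory.items() if value not in (None, "", {})}
    return REQUIRED_FIELDS[target] - present
```

An empty result means the candidate satisfies the documented minimum; it does not replace review.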
7. Memory Rejection
Add a clear rejection path:
candidate memory -> rejected -> never retrieved unless explicitly requested for audit
Rejection reasons should include:
- secret-like content
- cross-scope contamination
- hallucinated or unsupported claim
- stale fact
- prompt-injection residue
- unsupported provenance
8. Prompt Assembly Boundary
The memory service should never return a final prompt. It should return candidates and event metadata.
The orchestrator owns:
- prompt assembly
- context ordering
- final redaction
- policy enforcement
- provider selection
- execution envelope capture
Implementation Tasks
Task 1: Add Governance-Oriented Request Metadata
Files:
- pkgs/memory-service/src/dubnium_memory/models.py
- pkgs/memory-service/src/dubnium_memory/serialization.py
- pkgs/memory-service/tests/test_models.py
- pkgs/memory-service/tests/test_api.py
Add optional fields to retrieval requests:
- purpose
- requester
- envelope_id
Keep them optional so Phase 1 clients do not break.
Task 2: Extend Retrieval Events
Files:
- pkgs/memory-service/src/dubnium_memory/migrations/003_retrieval_event_metadata.sql
- pkgs/memory-service/src/dubnium_memory/postgres.py
- pkgs/memory-service/tests/test_migrations.py
- pkgs/memory-service/tests/test_postgres.py
Add nullable metadata columns or a metadata jsonb field to retrieval_events.
Recommended initial shape:
ALTER TABLE retrieval_events
ADD COLUMN IF NOT EXISTS metadata jsonb NOT NULL DEFAULT '{}'::jsonb;
This avoids premature schema churn while keeping replay metadata available.
Task 3: Add Scope Validation Helpers
Files:
- pkgs/memory-service/src/dubnium_memory/scopes.py
- pkgs/memory-service/tests/test_scopes.py
Add validation for scope prefixes:
- personal:
- project:
- session:
- agent:
- workflow:
Do not enforce globally until existing tests and callers are migrated.
Task 4: Add Promotion/Rejection Contract Docs
Files:
- docs/specs/memory-governance-contract.md
- docs/runbooks/memory-service.md
Document:
- memory promotion rules
- rejection reasons
- semantic memory expectations
- Anthesis envelope handoff
Task 5: Add Policy Examples
Files:
- docs/examples/memory-policy.project-dubnium.json
- docs/examples/memory-retrieval-request.json
- docs/examples/memory-retrieval-event.json
These examples should be data contracts, not active enforcement.
Acceptance Criteria
Phase 2 is complete when:
- retrieval requests can carry optional governance metadata
- retrieval events preserve that metadata for replay
- scope conventions are documented and testable
- memory promotion/rejection rules are documented
- Anthesis can use memory ids and retrieval event ids in execution envelopes
- no prompt assembly happens inside the memory service
- memory remains opt-in on the workstation host
Risks
| Risk | Mitigation |
|---|---|
| Memory poisoning | require scope, provenance, validation status, and retrieval event logging |
| Cross-project leakage | enforce scoped retrieval and explicit sensitivity filters |
| Silent context injection | keep prompt assembly outside memory service |
| Governance coupling | expose metadata; let Anthesis decide policy |
| Schema churn | prefer additive migrations and metadata JSON for early governance fields |
| Stale semantic facts | use confidence, validation status, TTL, and promotion workflow |
Recommended First PR
The first Phase 2 PR should be small:
- Add metadata jsonb to retrieval_events
- Add optional retrieval request metadata fields
- Preserve metadata in retrieval event responses
- Add tests
- Add governance contract docs
Do not add Anthesis runtime wiring yet.
Architecture Overview
Status: living
This is the arc42-lite entrypoint for Dubnium. It describes the system shape, constraints, building blocks, runtime behavior, deployment view, and current risks without replacing lower-level implementation docs.
Purpose
Dubnium is a policy-driven NixOS workstation and AI node. It supports multiple host-local operational contracts on one physical machine:
- desktop: interactive Hyprland workstation and development mode.
- studio-local: conditional low-latency audio overlay on desktop.
- compute: headless throughput-oriented AI/platform mode.
The architecture exists to make mode transitions explicit, observable, guard-driven, auditable, and reversible.
Constraints
- Desired state is not current state.
- Current state must be derived from runtime observation.
- Runtime reconciliation is mandatory for mode changes.
- systemd targets, services, and slices are the enforcement mechanism.
- Runtime switching comes before NixOS specialisations.
- studio-local is conditional and must not dominate the architecture.
- Host-local modes must remain separate from capability placement.
- Failure, degraded, and blocked states must be modeled explicitly.
System Context
Actors and adjacent systems:
- Local operator: requests mode changes, checks status, recovers failures.
- NixOS host: owns systemd enforcement, hardware, services, and runtime state.
- GPU/display/audio hardware: shared resources with conflicting latency and throughput requirements.
- vLLM: compute workload, active only in compute for v1.
- k3s: platform workload, stable across modes for v1.
- Possible external studio host: future placement for audio/studio capability.
Building Blocks
- Nix flake: declares the host configuration and packaged tools.
- modules/dubnium: mode policy, options, targets, slices, controller units, state files, and guard installation.
- modules/workloads: workload-specific service definitions such as Hyprland, audio, NVIDIA, vLLM, and k3s.
- mode CLI: operator surface for requests, status, desired/current state, and explanation.
- Reconciler: privileged transition executor.
- Observer: evidence-based classifier for current mode.
- Guards: small checks that return pass, policy block, or execution error.
- systemd: target, service, slice, and cgroup enforcement layer.
Runtime View
All mode changes follow the same control-loop shape:
- Authorize the request.
- Write desired state.
- Acquire the controller lock.
- Observe current state from runtime facts.
- Validate target and capability placement.
- Run transition guards.
- Execute bounded actions through systemd and helper scripts.
- Re-observe.
- Classify success, degraded state, blocked state, or failure.
- Write transition and guard records.
Success is never inferred from attempted actions. Success requires post-transition observation that satisfies the target mode predicates.
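The core of that control loop can be sketched with injected callables, which keeps the example runnable without systemd. This is a simplified illustration, not the reconciler itself: authorization, locking, rollback, and record writing are omitted, and the function signature is hypothetical.

```python
from typing import Callable, Dict, List


def reconcile(
    target: str,
    observe: Callable[[], str],
    guards: List[Callable[[], int]],
    act: Callable[[str], None],
) -> Dict[str, object]:
    """Run one guarded transition and classify it from observation only."""
    prior = observe()
    result = {"requested": target, "prior": prior, "final": prior, "success": False}

    for guard in guards:
        code = guard()
        if code == 0:
            continue
        # 10-19 is a policy block; any other non-zero code is an execution error.
        result["reason"] = "policy-block" if 10 <= code <= 19 else "execution-error"
        return result  # nothing was mutated, so final stays equal to prior

    act(target)
    final = observe()  # re-observe instead of trusting the attempted action
    result["final"] = final
    result["success"] = final == target
    result["reason"] = "ok" if final == target else "failed-transition"
    return result
```

Because success is computed from the second observation, an action that returns cleanly but changes nothing is still reported as a failed transition.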
Deployment View
Primary deployment target:
- one x86_64-linux NixOS workstation host named workstation
- Hyprland desktop
- NVIDIA/CUDA runtime
- planned dual-GPU topology, with hardware-tolerant transitional config
- vLLM model/cache state outside the Nix store
- k3s control-node duties
Runtime state:
- live state under /run/mode-controller
- future persistent audit history under /var/lib/mode-controller or /persist/var/lib/mode-controller when impermanence lands
Cross-Cutting Concerns
- Safety: guards block destructive transitions and distinguish policy blocks from execution errors.
- Observability: status must show desired state, observed state, conflicts, guard failures, and latest transition result.
- Auditability: every reconciliation attempt should produce structured records.
- Resource ownership: GPU, CPU, memory, I/O, audio, AI, and platform planes must not silently overlap in conflicting ways.
- Security: unprivileged users must not forge desired/current state or transition success.
Current Risks
- NVIDIA/Wayland GPU release may not be reliable enough for runtime-only compute promotion.
- Mixed runtime states may confuse a shell observer unless conflicts are handled conservatively.
- systemctl isolate can stop required services if target dependencies are not explicit enough.
- Rollback must prove restored desktop behavior through observation, not just successful systemd commands.
Control Plane
Status: living
The control plane reconciles requested mode intent with runtime facts. It is a local privileged authority, not a convenience shell script.
Authority Model
V1 decision:
- transition execution is privileged
- the initial operator path is sudo mode request <mode> or a root-owned mode-controller@.service
- unprivileged users must not mutate observed state or forge transition success
Future options:
- polkit-mediated request path
- local service endpoint
- richer automation integration
Those options should not be added until the root/sudo path proves the control loop on the target host.
State Model
Live state lives under /run/mode-controller:
desired
current
lock
last-transition.json
last-guards.json
capability-placement.json
hardware-topology.json
Persistent transition history lives under:
/var/lib/mode-controller/events.jsonl
Each line is an append-only JSON event emitted by the reconciler. Initial event types:
transition
Initial event fields:
- timestamp
- requested
- prior
- final
- success
- reason
This event stream is intended to become the basis for:
- audit history
- degraded transition diagnosis
- future reconciliation analytics
- operator replay/debug tooling
- higher-level memory/context systems
V1 accepts plain desired and current files for the first bootable milestone.
The hardening path is either:
- migrate to desired.json and current.json, or
- explicitly document the plain-text files as stable interface and keep structured metadata in transition records.
When impermanence is introduced, the persistent event path can be mapped to:
/persist/var/lib/mode-controller/events.jsonl
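A minimal sketch of the append-only event stream, assuming one JSON object per line with the initial fields listed above. The function name and the `type` key are hypothetical; the path is passed in so the same code works before and after impermanence.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_transition_event(
    path: Path, requested: str, prior: str, final: str, success: bool, reason: str
) -> dict:
    """Append one transition event as a single JSON line and return it."""
    event = {
        "type": "transition",  # hypothetical key for the event type
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "requested": requested,
        "prior": prior,
        "final": final,
        "success": success,
        "reason": reason,
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "a" only ever adds lines; existing history is never rewritten.
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(event, sort_keys=True) + "\n")
    return event
```

Readers can then process history line by line, for example `json.loads(line)` per line of the file.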
Reconciliation Sequence
Every requested transition follows this sequence:
- acquire lock
- observe current state
- validate requested target
- validate capability placement
- run guards
- execute bounded actions
- re-observe
- classify final state
- record guard, action, timing, and outcome data
- release lock
If target predicates fail after mutation, the controller must attempt rollback,
classify a degraded state, or report failed-transition.
Observer Contract
The observer must derive current state from evidence only. It must not trust desired state as proof of success.
Required output fields for JSON mode:
{
"observed_state": "desktop",
"confidence": "high",
"degraded": false,
"signals": {},
"conflicts": [],
"timestamp": "..."
}
Required signal families:
- graphical session presence
- compositor/display-manager state
- compute.target
- vllm.service
- studio-local-policy.service
- PipeWire/JACK/REAPER indicators
- GPU process/VRAM evidence when available
- controller lock/transition marker
- latest failed transition marker
Conservative rule: report unknown, transitioning, degraded-*, or
failed-transition instead of pretending a stable target has been reached.
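The conservative rule can be illustrated with a small classifier over boolean signals. This is a sketch under assumptions: the four signal names follow the example observer output elsewhere in these docs, the conflict rules are illustrative, and degraded-* classification is left out.

```python
from datetime import datetime, timezone


def classify(signals: dict, transition_in_progress: bool = False) -> dict:
    """Derive an observer record from runtime signals, never from desired state."""
    gui = bool(signals.get("graphical_session_active"))
    compute = bool(signals.get("compute_target_active"))
    vllm = bool(signals.get("vllm_active"))
    studio = bool(signals.get("studio_policy_active"))

    conflicts = []
    if gui and compute:
        conflicts.append("graphical session active while compute.target is active")
    if vllm and not compute:
        conflicts.append("vLLM active outside compute (compute-only in v1)")
    if studio and not gui:
        conflicts.append("studio policy active without a graphical session")

    if transition_in_progress:
        state = "transitioning"
    elif conflicts:
        state = "unknown"  # never guess a stable mode from conflicting evidence
    elif compute:
        state = "compute"
    elif gui:
        state = "studio-local" if studio else "desktop"
    else:
        state = "unknown"

    stable = state in ("desktop", "studio-local", "compute")
    return {
        "observed_state": state,
        "confidence": "high" if stable else "low",
        "degraded": False,  # degraded-* detection is out of scope for this sketch
        "signals": dict(signals),
        "conflicts": conflicts,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```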
Guard Contract
Guards are small deterministic checks with stable exit classes:
0 pass
10-19 policy block
20+ execution error
Initial guard set:
- check_target_reachable
- check_audio_idle
- check_graphical_session_terminable
- check_gpu_display_released
- check_vllm_drainable
- check_compute_capability_local
- check_studio_capability_local
- check_memory_headroom
- check_persistence_paths_ready
Each guard should emit a reason code and evidence payload suitable for
mode explain and transition logs.
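The exit-class contract maps directly to a small helper. The sketch below is illustrative: the function names are hypothetical, codes 1-9 are not assigned by the contract and are treated here as execution errors, and ranking an execution error above a policy block is an assumption.

```python
def classify_guard_exit(code: int) -> str:
    """Map a guard exit code to its contract class."""
    if code == 0:
        return "pass"
    if 10 <= code <= 19:
        return "policy-block"
    # 20+ is an execution error by contract. Codes 1-9 and negative values are
    # unassigned; this sketch conservatively treats them the same way.
    return "execution-error"


def worst_outcome(codes: dict) -> str:
    """Summarize several guard results keyed by guard name."""
    classes = {classify_guard_exit(code) for code in codes.values()}
    for outcome in ("execution-error", "policy-block"):
        if outcome in classes:
            return outcome
    return "pass"
```

For example, `worst_outcome({"check_audio_idle": 10, "check_memory_headroom": 0})` yields a policy block, so the transition is blocked without being marked unsafe.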
Failure Semantics
Blocked transition:
- a guard returns a policy block
- desired state may remain requested
- current state must not be rewritten to target
Execution error:
- a guard or action could not run reliably
- target should not be considered safe
Degraded state:
- system is usable but does not satisfy all target guarantees
- must be surfaced directly in status
Failed transition:
- no stable or acceptable degraded contract could be established
- rollback failed or final observation remained unsafe/conflicted
Runtime Behavior
Status: living
This document describes how Dubnium behaves while switching between host-local modes.
Modes
desktop
Intent:
- interactive workstation and development mode
Expected runtime facts:
- graphical session available
- ordinary audio available
- display GPU protected for UI
- vLLM inactive in v1
- k3s may remain active with bounded platform pressure
studio-local
Intent:
- low-latency local audio profile when studio capability remains on this host
V1 representation:
- overlay on desktop
- studio-local-policy.service
- audio-priority.service
- no first-class studio-local.target
Expected runtime facts:
- graphical session available
- audio-priority policy active
- AI suppressed or inactive
- heavy background pressure reduced
compute
Intent:
- headless throughput mode for AI/platform work
Expected runtime facts:
- graphical session absent or non-authoritative
- compute target active
- vLLM active when enabled
- AI resources assigned according to configured compute GPU profile
- k3s remains active with mode-appropriate platform budget
Supported V1 Transitions
desktop -> studio-local
studio-local -> desktop
desktop -> compute
compute -> desktop
studio-local -> compute should route through desktop policy unless a future
transition contract explicitly allows direct promotion.
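The supported-transition table and the routing rule can be expressed as data. This is a sketch with hypothetical names; only the studio-local -> compute case is routed through desktop, since that is the only indirect path described above.

```python
# Direct transitions supported in v1, as listed above.
SUPPORTED = {
    ("desktop", "studio-local"),
    ("studio-local", "desktop"),
    ("desktop", "compute"),
    ("compute", "desktop"),
}

# Indirect requests and the intermediate mode they must pass through.
ROUTED_VIA = {("studio-local", "compute"): "desktop"}


def plan_route(source: str, target: str) -> list:
    """Return the ordered list of direct transitions needed, or raise ValueError."""
    if source == target:
        return []
    if (source, target) in SUPPORTED:
        return [(source, target)]
    via = ROUTED_VIA.get((source, target))
    if via is not None:
        return [(source, via), (via, target)]
    raise ValueError(f"unsupported transition: {source} -> {target}")
```

Each step in the returned route would still run its own guards and post-action observation.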
desktop -> studio-local
Actions:
- validate studio capability is local
- stop vLLM if active
- verify or isolate desktop.target
- start studio-local-policy.service
- start audio-priority.service
- re-observe
Success predicates:
- observer reports studio-local
- graphical session is available
- studio policy marker is active
- audio-priority overlay is active
- vLLM is inactive
studio-local -> desktop
Actions:
- stop audio-priority.service
- stop studio-local-policy.service
- isolate or verify desktop.target
- re-observe
Success predicates:
- observer reports desktop
- studio policy marker is inactive
- audio-priority overlay is inactive
- graphical session remains available
desktop -> compute
Actions:
- observe source state
- validate local compute capability
- check audio idle
- check graphical session is terminable
- notify or terminate graphical session when configured
- wait for session exit
- check GPU display release predicate
- stop studio-local overlay services if active
- isolate compute.target
- start or verify vllm.service
- re-observe
Success predicates:
- observer reports compute
- graphical session is absent or non-authoritative
- compute target is active
- vLLM is active when enabled
- GPU ownership evidence satisfies compute profile
Acceptable degraded compute examples:
- vLLM active on a reduced GPU profile while meeting minimum compute policy
- non-critical desktop service remains but does not conflict with compute
- residual display allocation is below configured threshold
Failed transition examples:
- source cannot be classified
- audio guard blocks transition
- graphical session cannot terminate
- GPU release predicate returns execution error or unsafe conflict
- compute target starts but observer remains conflicted
compute -> desktop
Actions:
- observe source state
- check vLLM drainability
- stop vllm.service
- isolate desktop.target
- start or verify graphical/session path
- re-observe
Success predicates:
- observer reports desktop
- vLLM is inactive
- graphical session is available
- no compute-only conflict remains
Rollback must be validated through the same post-action observation rules.
ConfigCTL Home Layering Implementation Plan
Purpose
configctl is a generic home-configuration reconciliation CLI.
Dubnium may package and invoke it, but the CLI must not be Dubnium-specific. It should be usable on:
- Dubnium bare metal
- laptops
- WSL
- future NixOS machines
- CI dry-run environments
Dubnium remains responsible for machine policy, runtime modes, services, and local AI infrastructure. configctl owns layered home configuration reconciliation.
Core Model
Per-tool home configuration is organized into ownership layers:
~/.config/<tool>/
├── managed.* # generated by Home Manager/dotfiles; never edit directly
├── local.* # machine-specific; never automatically promoted
├── custom.d/ # user-authored promotion candidates
└── adopted.d/ # fragments already promoted or represented by managed config
Ownership rules:
managed.* -> governed source of truth
local.* -> machine-specific, ignored by promotion
custom.d/* -> promotion candidates
adopted.d/* -> archived/adopted fragments, ignored during normal load
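The ownership rules amount to a path classifier. The sketch below assumes paths are given relative to ~/.config/<tool>/; the function name and the returned labels are hypothetical.

```python
from pathlib import PurePosixPath


def classify_layer(relative_path: str) -> str:
    """Classify a path inside a tool's config directory by ownership layer."""
    parts = PurePosixPath(relative_path).parts
    if not parts:
        return "unknown"
    if len(parts) > 1 and parts[0] == "custom.d":
        return "promotion-candidate"
    if len(parts) > 1 and parts[0] == "adopted.d":
        return "adopted"  # ignored during normal load
    if len(parts) == 1 and parts[0].startswith("managed."):
        return "managed"  # governed source of truth; never edited directly
    if len(parts) == 1 and parts[0].startswith("local."):
        return "local"  # machine-specific; never promoted
    return "unknown"
```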
Initial CLI Surface
Implemented commands:
configctl status [tool]
configctl doctor
configctl init <tool>
configctl promote <tool> <fragment>
configctl reconcile [tool]
Phase 0 — Documentation and Skeleton
Status: complete.
Tasks:
- document the per-tool layering contract
- add configctl package scaffold
- add initial configctl script
- expose configctl from the Dubnium flake packages
- install configctl on the workstation target
Phase 1 — Local Layer Initialization
Goal: safe scaffolding of layer directories.
Status: complete.
Commands:
configctl init hypr
configctl init git
configctl init nvim
configctl init zsh
Behavior:
- create custom.d/
- create adopted.d/
- create the tool-appropriate local.* file
- do not overwrite existing files
- do not modify managed files
Phase 2 — Status and Doctor
Goal: inspect local layer state without mutating anything.
Status: complete.
configctl status [tool] reports:
- local layer presence
- custom fragment count
- adopted fragment count
- missing expected directories
- unpromoted files in custom.d/
configctl doctor reports:
- whether essential tools (git, find) are available
- whether the dotfiles repo is found
- whether XDG state/cache/data roots exist
Phase 3 — Promote
Goal: move local configuration fragments into the dotfiles repository.
Status: complete.
configctl promote <tool> <fragment>:
- identifies the fragment in custom.d/
- copies it to the equivalent path in external/dotfiles/files/home/
- stages the file in the dotfiles git repository
Promotion remains review-gated via Git (operator must commit and push).
Phase 4 — Reconcile
Goal: detect drift between local overlays and the dotfiles repository.
Status: initial version complete.
configctl reconcile [tool]:
- compares custom.d/ locally with the dotfiles repository
- reports files present in dotfiles but missing locally (suggesting a sync or adoption)
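Drift detection of this kind reduces to a set comparison of relative file paths. The sketch below is an illustration, not configctl's implementation: both directories are passed in as parameters, and the `unpromoted` key is an extra that mirrors what the status command reports.

```python
from pathlib import Path


def list_fragments(root: Path) -> set:
    """Return relative POSIX paths of all files under root (empty if root is missing)."""
    if not root.is_dir():
        return set()
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def detect_drift(local_custom: Path, dotfiles_custom: Path) -> dict:
    """Compare a local custom.d/ with its counterpart in the dotfiles repository."""
    local = list_fragments(local_custom)
    repo = list_fragments(dotfiles_custom)
    return {
        "missing_locally": sorted(repo - local),  # candidates for sync or adoption
        "unpromoted": sorted(local - repo),       # present locally, not yet in dotfiles
    }
```

The comparison is by path only; detecting content differences would need hashes, which the Adoption Manifest phase is meant to cover.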
Future Phases
- Adoption Manifest: track promoted fragments by hash across machines.
- Governance Integration: link promotion to review workflows.
- Cleanup: automated garbage collection of adopted fragments.
Non-Goals
configctl should not:
- replace Home Manager
- replace Git
- replace NixOS modules
- become Dubnium-specific
- silently promote local configuration
- automatically delete user-authored fragments without an adopted/archive path
- treat runtime state as governed configuration
Diagrams
Status: living
These diagrams use a C4-inspired structure plus state/runtime views.
System Context
flowchart LR
Operator[Local operator]
Host[Dubnium NixOS host]
GPUs[Display and compute GPUs]
Audio[Audio interface]
Studio[Optional external studio host]
Micrantha[Micrantha / k3s workloads]
Models[Local model bundles / runtime data]
Operator -->|mode request/status| Host
Host --> GPUs
Host --> Audio
Host -->|future placement| Studio
Host --> Micrantha
Host --> Models
Container View
flowchart TD
CLI[mode CLI]
Controller[mode-controller]
Observer[observe-current]
Guards[guard scripts]
Systemd["systemd targets/services/slices"]
Workloads["Hyprland, audio, vLLM, k3s"]
Runtime["/run/mode-controller"]
Audit["/var/lib/mode-controller"]
CLI --> Runtime
CLI --> Controller
Controller --> Observer
Controller --> Guards
Controller --> Systemd
Systemd --> Workloads
Observer --> Systemd
Observer --> Runtime
Controller --> Runtime
Controller --> Audit
Mode State View
stateDiagram-v2
[*] --> bootstrapping
bootstrapping --> desktop: boot default
desktop --> studioLocal: request studio-local
studioLocal --> desktop: request desktop
desktop --> transitioning: request compute
compute --> transitioning: request desktop
transitioning --> desktop: observed desktop
transitioning --> compute: observed compute
transitioning --> studioLocal: observed studio-local
transitioning --> degradedDesktop: partial desktop
transitioning --> degradedCompute: partial compute
transitioning --> failedTransition: unsafe/conflicted
degradedDesktop --> desktop: reconcile
degradedCompute --> compute: reconcile
failedTransition --> desktop: rollback succeeds
Reconciliation Sequence
sequenceDiagram
participant U as Operator
participant C as mode CLI
participant R as Reconciler
participant O as Observer
participant G as Guards
participant S as systemd
U->>C: mode request compute
C->>R: start mode-controller@compute
R->>R: acquire lock
R->>O: observe current
O-->>R: desktop with evidence
R->>G: run transition guards
G-->>R: pass/block/error results
R->>S: terminate session / isolate target / start services
R->>O: re-observe
O-->>R: compute or degraded/failed
R-->>C: transition result
C-->>U: status
Rolling Implementation Design
Status: living draft
This file captures the current implementation design for Dubnium as a rolling reference. It should be updated as hardware facts, control-plane contracts, and mode-transition behavior are validated on the real host.
Documentation framework:
- architecture docs live under docs/architecture/
- accepted decisions live under docs/decisions/
- operator procedures live under docs/runbooks/
- this file remains the rolling synthesis, gap register, and implementation backlog
Architecture Summary
Dubnium is a NixOS host that must behave as one physical machine with multiple operational contracts:
- desktop: normal Hyprland workstation/dev mode. GUI and ordinary audio are active. The display GPU is protected. AI is off or tightly bounded in v1.
- studio-local: conditional low-latency audio profile. It is a policy overlay on desktop, not the center of the architecture. If studio/audio moves to a Mac mini, the host-local state machine should still make sense.
- compute: headless throughput mode. GUI is absent or non-authoritative. vLLM and platform workloads may use more of the machine, including both GPUs when present.
The key design rule is that desired state and current state are different things:
- Desired state is operator or automation intent, written under /run/mode-controller.
- Current state is observation-derived from runtime facts, not copied from desired state.
- A reconciler moves the system toward desired state through guarded transitions.
- systemd targets, services, and slices are the enforcement layer.
- Transitions must be bounded, logged, idempotent, and able to report blocked, degraded, or failed outcomes explicitly.
The normative source is the Dubnium control-plane specification. Desired state
is authoritative intent, current state is observer output, no transition runs
without a lock, and success requires post-action re-observation. The local docs
and current repo scaffold already align with the main direction: runtime
switching first, no specialisations yet, desktop.target and compute.target
as first-class targets, studio-local as a desktop overlay, vLLM compute-only
in v1, and k3s stable across modes.
Gaps / Risks
The goal is to keep this section operational. Items should either be resolved for v1, converted into implementation work, or left as explicit open questions with an owner before the first live build.
Contradictions to Resolve
Resolved for v1:
| Topic | Decision | Follow-up |
|---|---|---|
| studio-local.target vs overlay | Do not create a first-class studio-local.target in v1. Use studio-local-policy.service and audio-priority.service as a desktop overlay. | Update older checklist wording when touching that file. |
| Root-on-RAM / impermanence | Defer Root-on-RAM, /persist, Home Manager, sops-nix, and impermanence until the base bootable control loop works. | Keep persistent path design compatible with adding /persist later. |
| modectl vs mode | Keep the local command name mode. | Treat modectl in upstream notes as an older name unless a rename is explicitly requested. |
| Desktop AI vs compute-only vLLM | Keep vLLM compute-only in v1. | Revisit bounded desktop AI only after reliable desktop <-> compute transitions. |
| Maintenance mode | Do not implement maintenance mode in the first milestone. | Reserve state names and avoid enum designs that make maintenance hard to add later. |
Open compatibility item:
- Desired/current state format remains plain text in the current scaffold. This
is acceptable for the first bootable milestone only if transition records
carry structured metadata. The next hardening pass should move toward
desired.json and current.json, or explicitly document why the plain-text files remain the stable interface.
Missing Decisions
Resolved for v1:
| Decision | V1 stance |
|---|---|
| Authority model | Require privileged transition execution. The initial operator path is sudo mode request <mode> or root-owned mode-controller@.service. Unprivileged users must not be able to forge desired/current state or transition success. |
| Reboot policy | Boot normalizes to desktop. Do not replay last desired mode across reboot in v1. |
| vLLM service shape | Use one vllm.service, compute-only. Keep the controller and options shaped so vllm@compute.service can replace it later. |
| k3s lifecycle | Keep k3s.service stable across modes in v1. Express mode pressure through platform.slice budgets before adding start/stop behavior. |
Still open before live compute testing:
| Open item | Concrete next step |
|---|---|
| GPU release predicate | Define a target-host predicate using loginctl, compositor absence, nvidia-smi process evidence, and an acceptable residual VRAM threshold. Record both pass and indeterminate outcomes. |
| Degraded thresholds | Define degraded-compute as safe but incomplete compute operation, such as vLLM active on a reduced GPU profile or residual non-critical display allocation below the configured threshold. Define failed-transition for unsafe, conflicting, or unclassified post-action states. |
| Persistent audit location | Choose /var/lib/mode-controller/events.jsonl now, with an option to move it under /persist/var/lib/mode-controller/events.jsonl when impermanence lands. |
| k3s compute policy | Decide whether v1 only changes platform.slice weights or also applies k3s labels/taints for workload intensity. Do not do both until there is a real workload that needs it. |
Risky Assumptions
| Risk | Failure mode | Mitigation |
|---|---|---|
| NVIDIA/Wayland GPU release is sticky | Compute promotion terminates the GUI but leaves display GPU allocations or ambiguous CUDA/display ownership. | Treat GPU release as an observation predicate, not an assumption. Add bounded timeout, residual threshold, and escalation criteria for specialization/reboot-mediated compute. |
| systemctl isolate compute.target stops too much | Important baseline services disappear because target dependencies are incomplete. | Keep compute.target minimal and explicitly list required base services. Test with systemctl list-dependencies compute.target before live switching. |
| Shell observer misclassifies mixed states | Status reports compute while GUI, audio, or conflicting services are still active. | Prefer unknown, transitioning, degraded-*, or failed-transition over false success. Add JSON evidence output and snapshot tests. |
| Rollback does not restore a usable desktop | desktop.target starts but graphical session/audio/display remain broken. | Make rollback success require post-rollback observation, not just successful systemctl commands. Record degraded desktop if partially restored. |
| /run loses state on reboot | Recent desired/current files disappear and audit history is lost. | Keep live lock/current/desired in /run; write transition history to /var/lib/mode-controller/events.jsonl before introducing impermanence. |
Gap Closure Backlog
These are the smallest useful implementation/doc tasks to close the current gaps without broadening scope:
- Update older checklist references so studio-local is consistently described as a desktop overlay, not a v1 target.
- Add a short docs/control-plane-decisions.md or extend this file with a dated decision log for authority model, reboot policy, vLLM shape, and audit location.
- Define the exact observe-current --json schema before adding more transition logic.
- Define the GPU release predicate in docs, then implement it in check_gpu_display_released.
- Add persistent audit output to /var/lib/mode-controller/events.jsonl.
- Add observer classifications for degraded-compute, degraded-desktop, and failed-transition before relying on rollback.
- Keep k3s mode behavior limited to platform.slice weights until a concrete platform workload proves that labels, taints, or service restarts are needed.
Proposed Repo Structure
Use the existing scaffold and keep it simple:
.
├── flake.nix
├── hosts/
│ └── workstation/
│ ├── default.nix
│ └── hardware-configuration.nix
├── modules/
│ ├── dubnium/
│ │ ├── default.nix
│ │ ├── options.nix
│ │ ├── state.nix
│ │ ├── targets.nix
│ │ ├── slices.nix
│ │ ├── services.nix
│ │ ├── controller.nix
│ │ └── guards.nix
│ └── workloads/
│ ├── hyprland.nix
│ ├── audio.nix
│ ├── nvidia.nix
│ ├── vllm.nix
│ └── k3s.nix
├── pkgs/
│ └── mode-tools.nix
├── scripts/
│ ├── mode
│ ├── reconcile
│ ├── observe-current
│ ├── lib.sh
│ └── guards/
│ ├── check_audio_idle
│ ├── check_gpu_display_released
│ ├── check_graphical_session_terminable
│ ├── check_vllm_drainable
│ ├── check_compute_capability_local
│ ├── check_studio_capability_local
│ ├── check_memory_headroom
│ └── check_persistence_paths_ready
└── docs/
Flake Design
- nixosConfigurations.workstation imports hosts/workstation/default.nix.
- nixosModules.default exposes the Dubnium module.
- packages.x86_64-linux.mode-tools packages the CLI, observer, reconciler, and guards.
- Add home-manager, sops-nix, and impermanence later only when the base transition loop is proven.
Module Layout
- options.nix: all host policy knobs: default mode, GPU topology, vLLM model/profile, studio placement, slice weights.
- state.nix: creates /run/mode-controller, writes generated topology and placement files, initializes boot default.
- targets.nix: defines desktop.target and compute.target; no v1 studio-local.target.
- slices.nix: defines interactive.slice, ai.slice, platform.slice.
- services.nix: marker/policy services like studio-local-policy.service, audio-priority.service, mode-observe.service.
- controller.nix: mode-controller@.service, boot normalization unit, permissions.
- guards.nix: installs guard scripts and documents exit-code contract.
- workloads/*.nix: workload-specific units, not mode policy.
systemd Targets and Dependencies
desktop.target
Wants=graphical.target
After=graphical.target
compute.target
Conflicts=graphical.target desktop.target
Wants=vllm.service
After=multi-user.target network-online.target
For studio-local, use:
studio-local-policy.service
Type=oneshot
RemainAfterExit=true
Slice=interactive.slice
audio-priority.service
Type=oneshot
RemainAfterExit=true
ExecStart=systemctl set-property --runtime ...
ExecStop=reset slice weights
Slice Structure
- interactive.slice: Hyprland/session-adjacent services, audio priority policy, desktop-critical work.
- ai.slice: vLLM and future AI workloads.
- platform.slice: k3s and platform/background services.
- Optional later: maintenance.slice if maintenance mode becomes real.
Service Layout
- vllm.service: compute-only in v1, Slice=ai.slice, WantedBy=compute.target, persistent model/cache path outside the Nix store.
- k3s.service: stable across modes in v1, Slice=platform.slice; mode differences are resource budgets/policy, not start/stop.
- Hyprland/display stack: owned by normal graphical/session machinery; desktop.target should depend on it but not become a giant desktop controller.
- Audio/PipeWire: normal desktop user services; studio-local only applies priority policy and blocks compute promotion when active audio is detected.
Control Plane Shape
Mode CLI
mode status
mode request <desktop|studio-local|compute>
mode reconcile [--target <mode>]
mode current [--refresh] [--json]
mode desired
mode dry-run <mode>
mode explain [<mode>]
Recommended additions after the first scaffold:
mode guards <target>
mode history
mode last-transition
mode doctor
mode request should be synchronous in v1: return success only after
post-transition observation satisfies the target. Otherwise it should return
non-zero and show the failed or blocking reason.
Observer / Classifier
The observer should be conservative and evidence-first. It should inspect:
- active graphical sessions via loginctl
- compositor/display-manager state
- compute.target and vllm.service
- studio-local-policy.service
- PipeWire/JACK/REAPER indicators
- NVIDIA process/VRAM evidence where available
- controller lock/transition marker
- last failed transition marker
Output should support plain mode for scripts and JSON for status/debug:
{
"observed_state": "desktop",
"confidence": "high",
"degraded": false,
"signals": {
"graphical_session_active": true,
"compute_target_active": false,
"vllm_active": false,
"studio_policy_active": false
},
"conflicts": [],
"timestamp": "..."
}
Classification rule: if signals conflict, report transitioning,
degraded-*, or failed-transition; do not pretend the desired target was
reached.
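The classification rule can be sketched as a small pure function over the signals shown in the JSON above. This is a minimal illustration, not the observer's actual implementation: the function name classify, the three conflict checks, the flags transitioning and last_transition_failed, and the fallback label unknown are all assumptions made for the example. Only the signal names and the state labels come from this document.

```python
def classify(signals, transitioning=False, last_transition_failed=False):
    """Map observer signals to (observed_state, conflicts).

    Hypothetical sketch: the conflict checks below are illustrative,
    not the real observer's predicate set.
    """
    gui = signals["graphical_session_active"]
    compute = signals["compute_target_active"]
    vllm = signals["vllm_active"]
    studio = signals["studio_policy_active"]

    conflicts = []
    if gui and compute:
        conflicts.append("graphical session active while compute.target is active")
    if vllm and not compute:
        conflicts.append("vllm.service active outside compute.target")
    if studio and not gui:
        conflicts.append("studio policy active without a graphical session")

    if conflicts:
        # Conflicting evidence never reports a clean target mode.
        if transitioning:
            return "transitioning", conflicts
        if last_transition_failed:
            return "failed-transition", conflicts
        return ("degraded-compute" if compute else "degraded-desktop"), conflicts

    if compute:
        return "compute", conflicts
    if gui:
        return ("studio-local" if studio else "desktop"), conflicts
    # Neither a graphical session nor compute.target: assumed label.
    return "unknown", conflicts
```

Keeping the function pure makes it easy to replay recorded signal sets against it when the observer output is questioned.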
Guard Layout
- Guards are standalone scripts or subcommands.
- Exit codes:
  - 0: pass
  - 10-19: policy block
  - 20+: execution/check error
- Each guard emits structured JSON or stable key/value output.
- Guards should check one thing each.
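A caller that runs guards needs to tell a policy block apart from a check that failed to run. The sketch below maps exit codes to the three documented classes. The function names and the unspecified label for codes 1-9 (which the contract above does not define) are assumptions for illustration.

```python
def classify_guard_exit(code):
    """Translate a guard exit code into the documented contract class."""
    if code == 0:
        return "pass"
    if 10 <= code <= 19:
        return "policy-block"
    if code >= 20:
        return "check-error"
    # Codes 1-9 and negative values are not covered by the contract (assumption).
    return "unspecified"


def summarize_guards(exit_codes):
    """Reduce {guard_name: exit_code} to one verdict.

    Anything that is not a pass or a policy block is treated as an error,
    so an unreadable guard never counts as success.
    """
    classes = {name: classify_guard_exit(code) for name, code in exit_codes.items()}
    if any(c in ("check-error", "unspecified") for c in classes.values()):
        return "error", classes
    if any(c == "policy-block" for c in classes.values()):
        return "blocked", classes
    return "pass", classes
```

For example, summarize_guards({"check_audio_idle": 10, "check_memory_headroom": 0}) yields the verdict blocked, while a single exit code of 20 or higher turns the verdict into error.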
Initial guard set:
- check_audio_idle: REAPER/PipeWire/JACK activity blocks compute.
- check_graphical_session_terminable: pre-action check before killing GUI.
- check_gpu_display_released: post-action validation after GUI teardown.
- check_vllm_drainable: compute -> desktop.
- check_compute_capability_local: placement check.
- check_studio_capability_local: blocks studio-local if externalized.
- check_memory_headroom: avoids launching compute under obvious pressure.
- check_persistence_paths_ready: model store/runtime paths exist and are writable.
First Milestone
The smallest bootable milestone should be narrower than “all modes implemented.”
Goal: boot the flake-managed workstation into desktop, expose the control
plane, and prove an observable/auditable desktop baseline before deep workload
switching.
- Generate real hardware config into hosts/workstation/hardware-configuration.nix.
- Confirm host options:
  - dubnium.boot.defaultMode = "desktop"
  - dubnium.hardware.presentGpus
  - dubnium.hardware.displayGpu
  - dubnium.hardware.computeGpus
  - vLLM disabled or compute-only
  - studio placement set to local only if local audio is still intended
- Build without switching:
  sudo nixos-rebuild build --flake .#workstation
- Switch only after evaluation succeeds:
  sudo nixos-rebuild switch --flake .#workstation
- Verify boot/control-plane files:
  mode status
  mode current
  mode desired
  sudo ls -la /run/mode-controller
- Verify systemd skeleton:
  systemctl status desktop.target
  systemctl status compute.target
  systemctl status studio-local-policy.service
  systemctl status audio-priority.service
  systemctl status vllm.service
- Prove observer honesty:
  - In desktop, mode current should say desktop.
  - vllm.service should be inactive.
  - studio-local-policy.service should be inactive unless requested.
  - If evidence conflicts, status should show conflict/degraded/failed rather than silently reporting success.
- Test the safe overlay first:
  sudo mode request studio-local
  mode status
  sudo mode request desktop
  mode status
- Only then test desktop -> compute with vLLM either disabled, stubbed, or known-good:
  sudo mode request compute
  mode status
  sudo mode request desktop
  mode status
- Milestone success criteria:
  - The machine boots from the flake.
  - mode status/current/desired work.
  - Desired/current separation is visible.
  - The controller lock prevents concurrent transitions.
  - Guard failures are reported distinctly from execution errors.
  - desktop -> studio-local -> desktop works as an overlay.
  - desktop -> compute -> desktop either works or fails with a clear guard/action/post-observation reason.
  - No failed transition is reported as a successful target mode.
The next milestone after that should be a real desktop <-> compute control
loop with vLLM active, structured audit records, rollback to desktop, and
explicit degraded-compute thresholds.
System Implementation Plan
Status: living plan
This plan is for implementing Dubnium on the actual workstation host. It expands the short bring-up checklist into a cautious, evidence-driven rollout. The goal is not to turn everything on at once. The goal is to prove one layer at a time: hardware facts, Nix evaluation, boot baseline, observer honesty, overlay mode, compute mode, rollback, then hardening.
Current V1 Assumptions
These assumptions come from the current repo configuration and should be confirmed before the first live switch:
| Area | Current assumption |
|---|---|
| Host flake target | .#workstation |
| Hostname | dubnium-workstation |
| Boot default | desktop |
| Studio placement | local |
| studio-local representation | desktop overlay using studio-local-policy.service and audio-priority.service |
| vLLM lifecycle | compute-only in v1 |
| vLLM model | Qwen/Qwen2.5-Coder-14B-Instruct |
| Current GPU phase | planned 2 GPUs, currently present [ 0 ] |
| Display GPU | 0 |
| Compute GPUs | [ 0 ] until second GPU is present |
| k3s | disabled in current host config |
| Bootloader | systemd-boot with EFI variable access |
| Runtime state | /run/mode-controller |
Do not proceed to live transition testing until the hardware facts are confirmed against the actual host.
Phase 0: Safety and Ground Truth
Objective: know enough about the machine to avoid destructive or confusing changes.
0.1 Confirm Installation Path
Decide which path applies:
- existing NixOS machine: use nixos-rebuild build then switch
- fresh install from live USB: use the fresh install runbook first
- non-NixOS current OS: do not use this plan directly until disk/install strategy is decided
Exit criteria:
- install path is explicit
- target disk and boot mode are known if fresh installing
- rollback access path is known
0.2 Confirm Remote/Recovery Access
Before switching system configuration:
ip addr
systemctl status sshd
Confirm:
- local keyboard/display access works
- SSH is enabled or a local console is available
- you know how to select an older NixOS generation at boot
- important local data is backed up
Failure mode to avoid:
- switching into a broken graphical/session state with no recovery path
0.3 Capture Hardware Facts
Run on the target host:
lspci -nn | grep -E 'VGA|3D|Audio|USB'
nvidia-smi
lsblk -f
findmnt
bootctl status
Record:
- actual GPU count
- which GPU drives display
- GPU PCI IDs
- NVIDIA driver visibility through nvidia-smi
- boot disk/filesystem layout
- EFI/systemd-boot status
- audio interface and whether REAPER/local studio is still needed on-host
Exit criteria:
- dubnium.hardware.presentGpus matches real visible GPUs
- dubnium.hardware.displayGpu matches the display path
- dubnium.hardware.computeGpus only references present GPUs
- bootloader assumptions match the host
0.4 Decide First Compute Profile
For first live validation, choose the least surprising compute profile:
- with one GPU: compute may terminate the desktop and use GPU 0
- with two GPUs: compute can target both GPUs, but only after single-GPU behavior is proven
- vLLM should stay compute-only
If VRAM is tight, add vLLM guardrails before compute testing:
dubnium.vllm.extraArgs = [
"--max-model-len" "8192"
"--gpu-memory-utilization" "0.70"
"--enforce-eager"
];
Do not add desktop AI in the first rollout.
0.5 Seed Local Model Bundle
Preferred path:
- copy the selected materialized model bundle from the Dubnium USB seed into /var/lib/dubnium/models
- keep model weights out of Git and out of the Nix store
See docs/runbooks/model-seeding.md for the exact operator flow.
Phase 1: Repo and Host Configuration Review
Objective: make the flake match the real system before any switch.
1.1 Generate Hardware Configuration
On the target NixOS machine:
sudo nixos-generate-config --dir ./hosts/workstation
Review:
- root filesystem and boot filesystem entries
- EFI mount point
- generated hardware imports
- NVIDIA-related hardware detection
Do not preserve the placeholder hardware file if it does not match the target.
1.2 Review Host Config
Inspect:
sed -n '1,220p' hosts/workstation/default.nix
Confirm or update:
- networking.hostName
- bootloader settings
- services.openssh.enable
- dubnium.capabilityPlacement.studio
- dubnium.vllm.enable
- dubnium.vllm.model
- dubnium.vllm.extraArgs
- dubnium.hardware.presentGpus
- dubnium.hardware.displayGpu
- dubnium.hardware.computeGpus
- dubnium.k3s.enable
Recommended first-system stance:
- keep boot.defaultMode = "desktop"
- keep enableDesktopProfile = false
- keep k3s.enable = false until mode control is proven
- keep computeGpus = [ 0 ] if only one GPU is currently installed
1.3 Confirm Module Assertions
The module already asserts:
- display GPU must be present
- desktop AI GPUs must be present
- compute GPUs must be present
- vLLM package and model must be set when vLLM is enabled
These assertions are useful. If they fail, fix the host facts rather than bypassing them.
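The GPU assertions amount to membership checks against the list of present GPUs. The real checks live in the Nix module. The Python mirror below is only a sketch for reasoning about host facts before editing the host config, and its function and parameter names are hypothetical.

```python
def topology_errors(present, display, compute, desktop_ai=()):
    """Return human-readable violations of the GPU topology rules.

    present:    GPU indices actually installed (dubnium.hardware.presentGpus)
    display:    index driving the display (dubnium.hardware.displayGpu)
    compute:    indices used in compute mode (dubnium.hardware.computeGpus)
    desktop_ai: indices reserved for desktop AI, empty in v1 (assumed name)
    """
    present_set = set(present)
    errors = []
    if display not in present_set:
        errors.append(f"display GPU {display} is not in presentGpus")
    for gpu in compute:
        if gpu not in present_set:
            errors.append(f"compute GPU {gpu} is not in presentGpus")
    for gpu in desktop_ai:
        if gpu not in present_set:
            errors.append(f"desktop AI GPU {gpu} is not in presentGpus")
    return errors
```

With one installed GPU, topology_errors([0], 0, [0]) returns an empty list, while listing a planned second GPU in compute before it is installed produces a violation.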
Exit criteria:
- host config expresses real hardware, not planned hardware
- planned hardware is represented only in plannedGpuCount
- actual services enabled match the first rollout scope
Phase 2: Build Without Switching
Objective: prove Nix evaluation and build before mutating the live system.
Run:
sudo nixos-rebuild build --flake .#workstation
If it fails, classify the failure:
- hardware config mismatch
- unfree/NVIDIA package issue
- vLLM package evaluation issue
- missing module import
- syntax or option error
Do not run switch until build succeeds.
Useful follow-up checks:
nix flake check
nix build .#packages.x86_64-linux.mode-tools
Exit criteria:
- flake builds successfully
- mode-tools package builds
- no host option assertion is failing
Phase 3: First Switch to Desktop Baseline
Objective: switch only into the safe desktop-default posture.
Run:
sudo nixos-rebuild switch --flake .#workstation
Immediately check:
hostname
mode status
mode current
mode desired
sudo ls -la /run/mode-controller
systemctl status desktop.target
systemctl status compute.target
systemctl status studio-local-policy.service
systemctl status audio-priority.service
systemctl status vllm.service
Expected:
- host boots or remains usable
- desired mode is desktop
- current mode is desktop, or a clearly explained non-desktop state
- vllm.service is inactive in desktop
- studio-local-policy.service is inactive
- audio-priority.service is inactive
- /run/mode-controller exists
If mode current reports compute or studio-local unexpectedly, stop and fix
observation before testing transitions.
Exit criteria:
- desktop baseline is usable
- mode CLI works
- observer output matches visible reality
Phase 4: Control-Plane Inspection Before Transitions
Objective: prove the controller can explain the system before it mutates the system.
Run:
mode status
mode current --refresh
mode current --json
mode explain desktop
mode explain studio-local
mode explain compute
sudo cat /run/mode-controller/capability-placement.json
sudo cat /run/mode-controller/hardware-topology.json
Check that the JSON/evidence shape is useful enough to diagnose:
- graphical session active or not
- studio policy active or not
- compute target active or not
- vLLM active or not
- last transition status
If mode current --json is too thin, harden observer output before running
compute transitions. The observer is the foundation of safe switching.
Exit criteria:
- status output distinguishes desired and current
- current state is derived from facts
- hardware and placement files match host configuration
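One way to decide whether mode current --json is "too thin" is to check it for the fields used in the observer JSON example earlier in this document. The sketch below does only that. The helper name missing_evidence is hypothetical, and because no field name for last transition status is documented here, that item is deliberately left out of the check.

```python
import json

# Field names taken from the observer JSON example in this document.
TOP_LEVEL = ("observed_state", "confidence", "degraded",
             "signals", "conflicts", "timestamp")
SIGNALS = ("graphical_session_active", "compute_target_active",
           "vllm_active", "studio_policy_active")


def missing_evidence(raw):
    """Return the documented fields absent from observer JSON text."""
    doc = json.loads(raw)
    missing = [key for key in TOP_LEVEL if key not in doc]
    signals = doc.get("signals")
    if not isinstance(signals, dict):
        signals = {}
    missing += [f"signals.{key}" for key in SIGNALS if key not in signals]
    return missing
```

An empty result means the output carries at least the documented evidence; it does not prove the values are correct.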
Phase 5: Test desktop -> studio-local -> desktop
Objective: prove the low-risk overlay path before terminating the GUI for compute.
Run:
sudo mode request studio-local
mode status
systemctl status studio-local-policy.service
systemctl status audio-priority.service
systemctl show interactive.slice -p CPUWeight -p IOWeight
systemctl show ai.slice -p CPUWeight -p IOWeight
systemctl show platform.slice -p CPUWeight -p IOWeight
Expected:
- observed mode becomes studio-local
- studio-local-policy.service is active
- audio-priority.service is active
- interactive slice weights are raised
- AI/platform slice weights are lowered
- vLLM remains inactive
Return to desktop:
sudo mode request desktop
mode status
systemctl status studio-local-policy.service
systemctl status audio-priority.service
systemctl show interactive.slice -p CPUWeight -p IOWeight
systemctl show ai.slice -p CPUWeight -p IOWeight
systemctl show platform.slice -p CPUWeight -p IOWeight
Expected:
- observed mode becomes desktop
- overlay services are inactive
- slice weights return to baseline
Exit criteria:
- overlay activation and cleanup are repeatable
- observer accurately distinguishes desktop and studio-local
- failure records are useful if a command fails
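Checking that slice weights were raised and then returned to baseline is easier with saved snapshots than by eye. The sketch below parses the key=value lines that systemctl show -p prints and reports which properties differ between two snapshots. The helper names are hypothetical and the weight values in the usage note are illustrative, not Dubnium's configured weights.

```python
def parse_show(text):
    """Parse `systemctl show -p ...` output (one Key=Value per line)."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def weight_changes(before, after):
    """Return {property: (before_value, after_value)} for differing properties."""
    old, new = parse_show(before), parse_show(after)
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    }
```

Capture the output for interactive.slice before the request, during studio-local, and after the return to desktop. Comparing the first and last snapshots should give an empty dict if cleanup restored the baseline.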
Phase 6: Precompute Guard Validation
Objective: test compute guards without trusting the full transition yet.
Before running a real compute transition:
mode status
systemctl status vllm.service
loginctl list-sessions
Manually confirm:
- no active REAPER project
- no live audio session you care about
- no long-running foreground job
- model store path has enough space
- vLLM model choice fits current GPU memory plan
Run or inspect guards if exposed through the CLI. If not yet exposed, use the
existing transition path cautiously and rely on last-guards.json.
Compute should be blocked when:
- audio is active
- graphical session is not terminable
- memory headroom is insufficient
- target is not reachable
- required persistence paths are missing
Exit criteria:
- you know which guards are hard blocks
- guard failures are visible in last-guards.json
- no guard silently assumes success
Phase 7: First desktop -> compute Transition
Objective: prove one real promotion into compute, accepting that the first attempt may reveal NVIDIA/session behavior.
Preconditions:
- desktop baseline has already been verified
- studio overlay path has already been verified
- no critical local work is running
- local console or SSH recovery is available
Run:
sudo mode request compute
Then inspect:
mode status
systemctl status compute.target
systemctl status vllm.service
loginctl list-sessions
nvidia-smi
sudo cat /run/mode-controller/last-transition.json
sudo cat /run/mode-controller/last-guards.json
journalctl -u 'mode-controller@*' -b
journalctl -u vllm.service -b
Expected success:
- observed mode is compute
- graphical session is absent or non-authoritative
- compute.target is active
- vllm.service is active if enabled
- GPU process evidence matches compute expectations
- transition record says success
Acceptable first degraded outcomes:
- vLLM starts but only on reduced GPU profile
- residual display allocation remains below a documented threshold
- non-critical desktop unit remains active without resource conflict
Hard failures:
- observer cannot classify final state
- audio or GUI conflict remains
- GPU release is indeterminate
- vLLM fails repeatedly and prevents compute contract
- rollback cannot restore desktop
If the transition fails, do not keep retrying blindly. Diagnose the first failed predicate.
Phase 8: First compute -> desktop Return
Objective: prove rollback/restoration before treating compute as usable.
Run:
sudo mode request desktop
Then inspect:
mode status
systemctl status desktop.target
systemctl status vllm.service
loginctl list-sessions
nvidia-smi
Expected:
- observed mode is desktop
- vllm.service is inactive
- graphical session path is usable
- audio returns to ordinary desktop behavior
- no compute-only state remains authoritative
If desktop is only partially restored, classify the result as degraded and fix the observer/controller before more compute testing.
Exit criteria:
- one complete desktop -> compute -> desktop loop works or fails with a clear documented reason
- rollback is evidence-backed
Phase 9: Repeatability and Soak
Objective: distinguish a one-time success from a reliable operating model.
Repeat:
sudo mode request studio-local
sudo mode request desktop
sudo mode request compute
sudo mode request desktop
For each run, record:
- final mode status
- transition duration
- guard output
- whether GPU release was clean
- whether desktop restoration was clean
- whether vLLM startup was reliable
Minimum repeatability bar before broader usage:
- 3 clean studio overlay round trips
- 3 clean compute round trips
- no false-success observer classifications
- no unexplained stale locks
- no manual cleanup needed between runs
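The repeatability bar can be evaluated mechanically if each round trip is written down as a small record. The sketch below assumes a hypothetical record shape (the keys path, clean, false_success, stale_lock, and manual_cleanup are invented for this example and are not a controller output format) and returns the criteria that are still unmet.

```python
def unmet_criteria(runs, required_clean=3):
    """Return the repeatability criteria that recorded round trips do not yet meet.

    Each run is a dict such as:
      {"path": "compute", "clean": True, "false_success": False,
       "stale_lock": False, "manual_cleanup": False}
    """
    unmet = []
    for path in ("studio-local", "compute"):
        clean = sum(1 for run in runs if run["path"] == path and run["clean"])
        if clean < required_clean:
            unmet.append(f"{path}: {clean}/{required_clean} clean round trips")
    disqualifiers = (
        ("false_success", "false-success observer classification seen"),
        ("stale_lock", "unexplained stale lock seen"),
        ("manual_cleanup", "manual cleanup was needed between runs"),
    )
    for key, label in disqualifiers:
        if any(run.get(key, False) for run in runs):
            unmet.append(label)
    return unmet
```

An empty list means the minimum bar is met for the recorded runs. A single false-success record keeps the bar unmet no matter how many clean round trips follow.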
Phase 10: Hardening Backlog
Only after the first transition loop is proven, prioritize hardening in this order:
- Richer observe-current --json evidence and conflicts.
- Persistent audit log at /var/lib/mode-controller/events.jsonl.
- Explicit GPU release predicate and thresholds.
- Degraded state classification for desktop and compute.
- Guard CLI surface such as mode guards <target>.
- vLLM runtime guardrails and model store persistence.
- k3s enablement and platform.slice policy.
- Optional impermanence and /persist mapping.
- Bounded desktop AI after second GPU and stable transitions.
- Specialisation evaluation only if runtime switching fails repeatedly.
Stop Conditions
Stop implementation and return to planning if any of these occur:
- the observer reports false success
- desktop cannot be restored through the controller
- GPU release is repeatedly indeterminate
- target isolation stops recovery-critical services
- vLLM causes repeated OOM or driver instability
- failures require undocumented manual cleanup
The correct response to any stop condition is not more automation. First improve observation, logs, predicates, and rollback.
Evidence to Keep
For each major milestone, keep the following:
mode status
mode current --json
sudo cat /run/mode-controller/last-transition.json
sudo cat /run/mode-controller/last-guards.json
systemctl status desktop.target compute.target vllm.service
nvidia-smi
journalctl -u 'mode-controller@*' -b
For repeated failures, copy the relevant evidence into an issue, planning note, or future runbook update before changing more code.
External Sources
Dotfiles
Dubnium uses the external/dotfiles checkout for user-level Home Manager
configuration.
- Repository: ryjen/dotfiles
- Branch: feat/nix-migration
- Local path: external/dotfiles
The submodule contract is declared in .gitmodules.
ADR-0001: Runtime Switching First
Status: accepted
Context
Dubnium needs to move between interactive desktop behavior and headless compute behavior. NixOS specialisations may eventually provide stronger separation, but they require reboot-mediated workflows and would slow down early validation.
Decision
Use runtime switching first. Implement mode changes through a local reconciliation loop using systemd targets, services, slices, guards, and post-action observation.
Do not introduce NixOS specialisations in v1.
Consequences
- Rebootless switching can be validated early.
- The observer and guard layer must be conservative.
- GPU release reliability becomes a live risk.
- Specialisations remain an escalation path if runtime switching proves too brittle.
Escalation Criteria
Reconsider specialisations or reboot-mediated compute if:
- display GPU release remains unreliable after bounded iteration
- compute promotion frequently lands in degraded or ambiguous states
- desktop restoration is unreliable
- kernel/module settings diverge materially between modes
ADR-0002: Studio-Local Is a Desktop Overlay
Status: accepted
Context
The host may support local low-latency audio work, but studio capability may move to an external Mac mini or another host. The architecture must not overfit around local studio behavior.
Decision
Represent studio-local as a policy overlay on desktop in v1.
Use:
- studio-local-policy.service
- audio-priority.service
Do not create a first-class studio-local.target in v1.
Consequences
- The host-local state model remains coherent if studio capability moves away.
- Studio policy can be applied and removed without a separate top-level target.
- The observer still reports studio-local as a mode when overlay predicates are satisfied.
- Any direct studio-local -> compute path should be routed through desktop policy unless a future transition contract explicitly permits it.
ADR-0003: vLLM Is Compute-Only in V1
Status: accepted
Context
Desktop-mode AI is possible in the target architecture, especially when a second GPU is installed. For the first reliable control-loop milestone, desktop AI adds resource contention and observer complexity.
Decision
Keep vLLM compute-only in v1.
Use one vllm.service attached to compute behavior. Shape options and
controller actions so vllm@compute.service and a future bounded desktop
profile can be added later.
Consequences
- desktop and studio-local should leave vLLM inactive.
- compute owns vLLM activation.
- The first milestone can focus on mode transitions and observation.
- Bounded desktop AI is deferred until desktop <-> compute switching is reliable on real hardware.
ADR-0004: Boot Defaults to Desktop
Status: accepted
Context
The control-plane specification asks whether the system should replay the last desired mode after reboot or normalize to a safe default. Replaying compute after reboot could surprise the operator and re-enter a throughput posture without current evidence.
Decision
In v1, boot normalizes to desktop.
Do not replay the last desired mode across reboot.
Consequences
- First boot behavior is predictable and operator-friendly.
- /run/mode-controller can remain ephemeral for live state.
- Persistent desired replay can be revisited after transition behavior and audit history are proven.
ADR-0005: k3s Stays Stable Across Modes in V1
Status: accepted
Context
k3s provides platform/control-node duties. Starting and stopping it during every mode transition would add operational churn before there is evidence that it is needed.
Decision
Keep k3s.service stable across desktop, studio-local, and compute in
v1.
Express mode differences through platform.slice budgets first. Defer labels,
taints, workload intensity policies, and service lifecycle changes until a real
platform workload requires them.
Consequences
- Mode switching has fewer moving parts.
- k3s remains available during desktop and compute operation.
- Platform pressure must be bounded through slice policy until richer k3s mode behavior is justified.
ADR-0006: Tailscale Platform Connectivity
Status: accepted
Context
Dubnium needs stable remote reachability for the workstation without moving user-level shell, editor, or agent configuration into the system repository. Tailscale is machine and network identity, so it belongs with Dubnium’s platform policy rather than dotfiles.
Tailscale can also provide subnet routing, exit-node behavior, automatic enrollment, and Tailscale SSH. Those features change routing, firewalling, access control, and trust boundaries, so they should not be enabled as an incidental side effect of installing the client daemon.
Decision
Enable Tailscale as workstation-only platform connectivity in v1.
Dubnium will enable tailscaled and the tailscale CLI on the workstation, but
node enrollment remains manual with sudo tailscale up.
Do not enable auth-key or OAuth enrollment, subnet routing, exit-node behavior, or Tailscale SSH in v1. Document those as future options that require explicit routing, ACL, firewall, and secrets-policy review.
Consequences
- The workstation can join the tailnet with a small, reviewable system change.
- Dotfiles remains responsible for user-level tooling only.
- First enrollment is an operator action instead of a rebuild side effect.
- Future subnet router, exit-node, and Tailscale SSH support has a documented path without widening v1 network exposure.
ADR-0007: WSL Is a Headless Validation Target
Status: accepted
Context
Dubnium needs a fast way to validate shared flake composition, module wiring, and mode-controller behavior before every change has to run on the bare-metal workstation target.
WSL is useful for that loop, but it is not equivalent to the real workstation. It does not validate EFI, bootloader behavior, workstation hardware generation, Hyprland, audio/studio behavior, NVIDIA runtime details, or final GPU topology.
The upstream nix-community/NixOS-WSL project already owns the WSL-specific
boot and integration layer. Reimplementing that locally would create another
platform surface for Dubnium to maintain before there is evidence that it is
needed.
Decision
Keep wsl as a first-class flake host target for headless validation, built on
top of nix-community/NixOS-WSL.
Use .#wsl to validate shared Dubnium composition and headless services inside
an existing NixOS-WSL distro. Set its default Dubnium mode to compute, enable
the shared mode controller, and keep resource-heavy services such as vllm and
k3s disabled by default. Enable those services intentionally when the task is
specifically to validate their WSL runtime behavior.
Do not treat .#wsl as a replacement for .#workstation, the bare-metal
install path, or workstation hardware validation.
Consequences
- Shared module wiring and activation can be exercised from a faster Windows/WSL loop.
- WSL-specific platform support stays delegated to the upstream NixOS-WSL module.
- The WSL target remains intentionally headless, compute-biased, and lightweight.
- Passing WSL validation does not prove workstation graphics, audio, bootloader, EFI, NVIDIA, or final GPU behavior.
- WSL runbooks and checks must stay separate from bare-metal first-bring-up and fresh-install procedures.
Escalation Criteria
Reconsider the WSL target shape if:
- upstream NixOS-WSL no longer supports the required system integration points
- WSL behavior diverges enough from Dubnium’s shared module graph to make the target misleading
- bare-metal validation becomes cheap and reliable enough that a separate WSL target no longer reduces risk or cycle time
ADR-0008: Seed Local vLLM Model Bundles
Status: accepted
Context
Dubnium’s first compute workload uses vLLM with a locally served model bundle. The exact model is host configuration, not part of the USB seed format.
Model weights are large mutable runtime artifacts. Keeping them in Git would inflate the repository and blur source policy with runtime state. Keeping them in the Nix store would make first install, rebuild, and recovery depend on large model fetches during system activation and would couple model bytes to immutable system generations.
Fresh install and recovery should work even when the machine does not yet have
reliable network access. The seed format should not depend on Hugging Face hub
cache internals such as refs, blobs, snapshots, or symlinks.
Decision
Keep model weights out of Git and out of the Nix store.
Treat /var/lib/dubnium/models as the Dubnium-owned runtime model store. Seed
normal local model bundle directories from removable media as the preferred v1
provisioning path.
Use a materialized bundle directory for the selected compute model. The workstation vLLM service serves a path under:
/var/lib/dubnium/models
If a Hugging Face cache is used as the source of the seed, materialize the
snapshot once before putting it on the USB. The runtime seed and installed model
store should be ordinary directories with model files and SHA256SUMS.
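Verifying a seeded bundle before entering compute mode can be a plain checksum pass over the directory. The sketch below assumes SHA256SUMS uses the common sha256sum line format (hex digest, whitespace, relative path); this ADR does not specify the file format, so treat that and the helper name verify_bundle as assumptions.

```python
import hashlib
from pathlib import Path


def verify_bundle(bundle_dir):
    """Check every file listed in <bundle_dir>/SHA256SUMS.

    Returns a list of problems; an empty list means all listed files
    exist and match their recorded digest.
    """
    bundle = Path(bundle_dir)
    problems = []
    for line in (bundle / "SHA256SUMS").read_text().splitlines():
        if not line.strip():
            continue
        expected, _, name = line.strip().partition(" ")
        # sha256sum separates fields with two spaces, or " *" in binary mode.
        name = name.lstrip(" *")
        path = bundle / name
        if not path.is_file():
            problems.append(f"missing: {name}")
            continue
        digest = hashlib.sha256()
        with path.open("rb") as handle:
            # Read in 1 MiB chunks so large weight files are not loaded whole.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        if digest.hexdigest() != expected.lower():
            problems.append(f"mismatch: {name}")
    return problems
```

The check only covers files listed in SHA256SUMS, so an extra unlisted file in the bundle goes unnoticed.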
Consequences
- The Dubnium repository stays small and source-only.
- Nix continues to own service policy and runtime configuration, not model artifact storage.
- Fresh install and recovery can avoid depending on a large network download.
- Runtime no longer depends on Hugging Face cache layout or symlink behavior.
- Operators must manage the seed media and verify the local bundle before entering compute mode.
- Reproducibility of model bytes depends on the seed contents until a specific model revision is selected and recorded.
- vLLM startup failures may indicate an absent, incomplete, misplaced, or revision-mismatched local model bundle.
Escalation Criteria
Reconsider this policy if:
- model revision pinning becomes mandatory for reproducible evaluation
- a dedicated artifact mirror or cache service becomes available
- install-time network access becomes reliable enough to remove the USB seed path
- model storage needs to support multiple served models, quantized variants, or per-mode model selection
ADR-0009: Manage Runtime Secrets Outside Nix Source
Status: accepted
Context
Dubnium needs private material for several different lifetimes:
- local source payloads for installing this private repository
- host-local identities for services such as Tailscale
- runtime tokens for workloads such as vLLM model downloads
- user-runtime tokens for tools such as Codex and GitHub CLIs after install
- large private or mutable artifacts such as model weights
These are not the same class of data. Treating all of them as Nix source would either leak secrets into Git, copy secret bytes into the Nix store, or make activation depend on external state that belongs to the operator.
Existing Dubnium policy already keeps the repository source-only and keeps vLLM
model weights in runtime cache state. Installer bootstrap should use local
source payloads, such as a git archive tarball or copied working tree, rather
than GitHub credentials in the live installer.
Decision
Use sops-nix with age recipients as the preferred provider for runtime service secrets.
Commit only encrypted SOPS documents and non-secret policy. Decrypt secrets at
activation into runtime paths under /run/secrets or into sops-nix generated
environment files. Services consume those paths; Nix modules declare the
consumer contract, not the secret value.
Keep install source bootstrap separate from runtime secrets. Install media should use a local source payload prepared before booting the target machine. Do not require GitHub credentials during install.
Allow user-runtime secrets after install. Tools such as Codex may need an
OPENAI_API_KEY, and user workflows may later need a GITHUB_TOKEN. Those
tokens belong to the user runtime, not installer bootstrap, and should be
decrypted by Home Manager or another user-scoped secret mechanism at session or
process launch time.
Keep host enrollment identities separate from ordinary workload tokens. Tailscale remains manually enrolled for v1. If unattended enrollment is added later, it must use a short-lived auth key passed once during enrollment rather than a long-lived key committed to source.
Keep model weights out of Git, out of SOPS, and out of the Nix store. The
Dubnium model store under /var/lib/dubnium/models remains mutable runtime
state, not secret state.
Consequences
- The repository can contain secret wiring without containing secret values.
- Host rebuilds can declare which services need secrets without exposing those secrets in derivations or module options.
- Operators must manage age identities and encrypted SOPS files during bring-up.
- Secret rotation is done by updating encrypted SOPS data and rebuilding or restarting affected services.
- Source bootstrap, enrollment, runtime tokens, and model artifacts keep separate handling rules instead of sharing one overloaded mechanism.
Escalation Criteria
Reconsider this policy if:
- Dubnium gains a dedicated external secret manager
- unattended installation needs to handle many machines at once
- secret rotation needs central audit or approval workflows
- Kubernetes-hosted workloads become the primary secret consumers
ADR-0010: Keep Persistent Memory Separate From vLLM Runtime
Status: accepted
Context
Dubnium is evolving from a local vLLM compute node toward longer-lived conversational and agentic workflows. Those workflows need durable recall, replayability, externally observable metadata, lifecycle hooks, and scoped retrieval.
vLLM is already the inference runtime for Dubnium’s compute mode. It is built to serve tokens with batching, prefix caching, streaming, model lifecycle control, and GPU-aware scheduling. It is not the right owner for durable user memory, agent task state, retention policy, or governance metadata.
The target hardware is constrained. Dual 12GB RTX 3060 GPUs leave limited room for oversized context windows, high concurrency, and unnecessary KV-cache pressure. Treating persistent memory as “keep all context in the model” would make latency, reliability, and recovery worse.
Persistent memory also changes the security posture. Model output, retrieved documents, tool results, artifacts, and prior conversation summaries are all untrusted inputs when they cross a new session boundary. Without structured metadata and lifecycle events, a future governance layer cannot inspect, constrain, attest, or replay memory behavior.
Decision
Keep vLLM as the inference runtime only.
Build persistent memory as a separate subsystem owned by orchestration, retrieval, storage, summarization, and compaction layers. Orchestrators assemble prompts from working context, retrieved memories, task state, and artifact references before calling vLLM.
Keep the future governance layer external to the memory/runtime architecture. Dubnium memory/runtime should expose structured records, metadata, and lifecycle hooks for governance to inspect later, but vLLM, vector stores, artifact stores, and MemGPT-style runtimes should not depend directly on that future substrate.
Do not persist transformer KV state as the durable memory mechanism. KV cache state can remain an inference optimization inside vLLM, but durable memory must be replayable from stored events, summaries, artifacts, metadata, and retrieval records.
Use separate memory classes:
- working context for current session continuity
- episodic memory for meaningful historical interactions
- semantic memory for normalized stable facts and conventions
- task state for active workflows, checkpoints, and execution graphs
- artifacts for external files, logs, generated outputs, and large payloads
- metadata for provenance, trust hints, retention hints, sensitivity hints, and scope
The first implementation milestone should use a conservative local stack:
- Postgres for structured memory, sessions, tasks, artifacts, and provenance
- pgvector for local vector search
- Redis for transient working context and queues where useful
- a small embedding model such as bge-small or nomic-embed
- rolling summaries instead of transcript replay
- scoped retrieval before prompt assembly
Treat MemGPT-style self-editing memory as a later orchestration upgrade path, not the first storage substrate. The current maintained framework from that lineage is Letta; evaluate it after Dubnium has stable local memory storage, retrieval filters, redaction, provenance, and replay checks. If adopted, it should sit above the persistent memory subsystem and vLLM runtime instead of replacing Dubnium’s metadata, lifecycle hooks, or runtime-secret boundaries.
Boundaries
The inference layer owns token generation, batching, streaming, prefix caching, model startup, GPU assignment, and service health.
The memory subsystem owns storage, retrieval, summarization, embedding, compaction, artifact references, provenance records, and replay inputs.
The orchestration layer owns prompt assembly, scoped retrieval requests, tool coordination, and task workflow progression.
The future governance layer is adjacent. It may later evaluate policy, provenance, trust, retention, audit, and replay concerns by inspecting the structured records emitted by this layer, but it is not embedded in the vLLM runtime, memory database, vector store, artifact store, or MemGPT-style runtime.
Security Model
Assume all inputs are untrusted, including model output and retrieved memories.
Trust boundaries include:
- user and agent prompts entering the orchestrator
- model output entering summarization or memory extraction
- tool output entering task state or memory storage
- external documents entering retrieval indexes
- retrieved memory entering prompt assembly
- retrieval metadata controlling visibility and retention
Durable memory objects must carry enough metadata to support later policy and audit decisions:
- source identity
- provenance
- validation status or validation hints
- trust score
- sensitivity classification
- retention hint or TTL
- namespace or project scope
- agent boundary
- replay lineage
The first milestone must emit enough structure to support mitigation of:
- memory poisoning through confidence and validation metadata
- persistent prompt injection through instruction classification metadata
- cross-agent leakage through scoped namespaces and retrieval events
- sensitive data retention through redaction markers and TTL metadata
Do not store credentials, raw secret payloads, or private tokens as memories. Secret values remain governed by the runtime-secret policy in ADR-0009.
Consequences
- vLLM workers can stay mostly stateless and focused on low-latency inference.
- Memory behavior can be tested, replayed, audited, and evolved without changing the inference service contract.
- Prompt size stays bounded by retrieval and compression rather than by naive transcript replay.
- Future governance remains possible because memory, retrieval, artifact, and runtime events are structured and externally observable.
- Governance does not become an embedded runtime dependency.
- More infrastructure is required before memory-backed agents are production ready.
- Retrieval quality, memory drift, stale facts, and hallucinated recall become explicit validation targets.
- Binary artifacts remain externalized and are referenced through metadata or on-demand multimodal inference rather than injected into prompts by default.
Escalation Criteria
Reconsider this policy if:
- vLLM gains a production-grade durable memory interface with replayable external metadata
- local hardware changes enough that long-context replay is cheaper than external memory retrieval
- a dedicated Anthesis-aligned memory service becomes the primary Dubnium memory provider
- Letta or another MemGPT-style agent framework can integrate with Dubnium’s storage, metadata, and replay contracts without becoming the source of truth
- compliance requirements demand a concrete external governance authority, attestation system, or retention architecture
ADR-0010: External Ownership Boundaries
Status: accepted
Context
Dubnium is evolving into a machine orchestration and runtime policy layer for a hybrid workstation and AI-node environment.
The repository already integrates an external dotfiles source for Home Manager
and user-scoped configuration. At the same time, the local k3s integration in
Dubnium remains intentionally thin and partially placeholder while broader
cluster automation work evolves separately.
Without an explicit ownership boundary, there is a risk that:
- machine policy drifts into user-home concerns
- cluster bootstrap logic becomes duplicated across repositories
- recovery boundaries become unclear
- operational responsibilities overlap
- host rebuilds become fragile or non-reproducible
Decision
Dubnium adopts a layered ownership model.
Dubnium
Dubnium is the authoritative repository for:
- machine identity
- NixOS host composition
- runtime mode control
- systemd orchestration
- hardware policy
- GPU placement policy
- runtime reconciliation
- machine-scoped secrets and service contracts
Dubnium orchestrates external systems but should avoid duplicating their source of truth.
Dotfiles
ryjen/dotfiles is the authoritative repository for:
- Home Manager configuration
- user shell configuration
- editor configuration
- CLI tooling
- user-scoped agent tooling
- workstation UX preferences
- user-scoped secrets materialization
Dubnium may consume dotfiles directly through flake inputs and local checkout paths.
Laboratory
hackelia-micrantha/laboratory is the intended authoritative repository for:
- local cluster bootstrap
- k3s deployment orchestration
- Flux bootstrap and reconciliation
- GitOps substrate configuration
- cluster overlays and platform services
- environment lifecycle workflows
Dubnium may invoke Laboratory entrypoints but should avoid embedding full cluster orchestration logic internally.
Consequences
Positive
- cleaner recovery boundaries
- reduced duplication
- improved source-of-truth clarity
- safer rebuild semantics
- clearer operational ownership
- easier future migration of cluster workflows
Negative
- additional repository coordination
- version pinning discipline becomes important
- submodule or external checkout management complexity
- bootstrap sequencing becomes more explicit
Current Implementation State
Current repository state:
- dotfiles integration exists today
- local k3s wiring remains host-local and intentionally thin
- Laboratory integration is planned but not yet fully wired into runtime flows
The current v1 implementation keeps k3s operationally local while explicitly
preparing for externalized cluster bootstrap ownership.
Operational Rules
- machine boot must not depend on successful Laboratory reconciliation
- machine boot must not depend on user-home customization success
- dotfiles failure degrades user experience, not machine orchestration
- Laboratory failure degrades cluster capabilities, not machine orchestration
- Dubnium remains the root machine control plane
Follow-Up Work
- add stable Laboratory bootstrap entrypoints
- add optional external/laboratory checkout integration
- add bootstrap and validation scripts
- tighten version pinning and provenance validation
- reduce placeholder local cluster assumptions over time
Dubctl Flake Input Manager
dubctl is Dubnium’s small helper for managing top-level flake
inputs. It is intended for quick add, remove, search, list, and update
operations without hand-editing the common inputs = { ... }; block every
time.
dubctl manages only flake inputs. It does not wire new inputs into outputs,
NixOS modules, package sets, overlays, or Home Manager arguments. Make those
call-site changes explicitly after adding an input.
Install and Run
From this repository:
nix run .#dubctl -- list
Install into a profile:
nix profile install .#dubctl
dubctl list --flake /path/to/dubnium
For local development without Nix packaging:
scripts/dubctl list
Commands
List current inputs:
dubctl list
Search input names and definitions:
dubctl search nix
Show one input definition:
dubctl info nixpkgs
Add an input:
dubctl install foo github:owner/repo
Add an input that follows nixpkgs:
dubctl install foo github:owner/repo --follows nixpkgs
Remove an input:
dubctl remove foo
Update all lock entries:
dubctl update
Update one lock entry:
dubctl update nixpkgs
Use a specific flake directory or file:
dubctl --flake /path/to/repo list
dubctl --flake /path/to/repo/flake.nix info nixpkgs
Lockfile Behavior
install and remove run nix flake lock after editing flake.nix. Use
--no-lock when staging or testing a source-only change:
dubctl install foo github:owner/repo --no-lock
dubctl remove foo --no-lock
update runs nix flake update, with an optional input name.
Safety Model
dubctl treats command arguments as untrusted input.
Controls:
- input names must be Nix attr-safe names
- URLs cannot be empty and cannot contain quotes or newlines
- edits are limited to the top-level inputs = { ... }; block
- mutations write flake.nix.bak before changing flake.nix
- Nix commands are invoked with argv arrays, not shell string concatenation
The backup is local operator safety only. Review the diff before committing.
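The first two controls can be sketched as shell predicates. This is a minimal illustration, not dubctl's actual implementation: the function names is_attr_safe_name and is_safe_flake_url are hypothetical, and the name rule is a conservative subset of what Nix accepts as an attribute name.

```shell
# Hypothetical helpers; dubctl's real validation rules may differ.
nl='
'

# Accept names that start with a letter or underscore and continue with
# letters, digits, underscores, or hyphens (assumed attr-safe subset).
is_attr_safe_name() {
  case "$1" in
    ''|[!A-Za-z_]*) return 1 ;;
    *[!A-Za-z0-9_-]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Reject empty URLs and URLs containing quotes or newlines.
is_safe_flake_url() {
  case "$1" in
    '') return 1 ;;
    *\"*|*\'*) return 1 ;;
    *"$nl"*) return 1 ;;
    *) return 0 ;;
  esac
}
```

Rejecting quotes and newlines matters because the URL ends up inside a quoted string in flake.nix; either character could close the string early and change the meaning of the edited inputs block.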
When Not To Use Dubctl
Do not use dubctl for:
- changing outputs arguments
- adding module imports
- adding overlays
- changing Home Manager extra arguments
- editing nested flakes such as external/dotfiles unless you pass that flake path explicitly
Those changes are architectural wiring, not package-manager operations.
Runbook: Post-Install Source Reconciliation
Status: living
Use this after a fresh install when the installer source snapshot has produced local changes that should become normal Dubnium repo history.
The custom installer payload is an export-style source snapshot on the USB live
system. It is suitable for running nixos-install, but it does not
automatically become a durable checkout inside the installed OS. Even when the
snapshot is copied into the target filesystem, it is not the long-term working
copy because it does not include .git history.
Desired Shape
- installed system has a normal Git checkout for Dubnium
- install-time changes are reviewed as a Git diff
- host-specific files are committed only when they belong in repo policy
- secrets, tokens, model weights, local caches, and temporary installer state stay out of Git
1. Locate Or Recreate The Install Snapshot
After first boot, start by checking whether the installer source was copied into the installed filesystem:
test -e ~/local/src/dubnium/flake.nix
If it was not copied, boot the custom installer USB or mount the prepared source
media again and import the same source snapshot into a temporary location, such
as ~/local/src/dubnium-install-snapshot. The goal is to recover any
install-time edits, especially the generated hardware config.
If the installed system already has the copied installer source at
~/local/src/dubnium, check whether it is a Git checkout:
cd ~/local/src/dubnium
git rev-parse --is-inside-work-tree
If that fails, keep the snapshot as evidence and make room for a real checkout:
cd ~/local/src
mv dubnium dubnium-install-snapshot
If the source was copied elsewhere, use that path as the snapshot path in the commands below. If there were no install-time source edits to preserve, skip the snapshot and create the canonical checkout directly.
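The check-and-move step above can be wrapped in one guarded function. This is a sketch under stated assumptions: preserve_snapshot is a hypothetical name, and it tests for a .git entry directly instead of calling git rev-parse, so a parent repository higher up the tree does not count as a checkout.

```shell
# Move a non-Git source tree aside so a real checkout can take its place.
# Returns 0 if the tree was moved, 1 if nothing was moved, 2 if the
# snapshot destination already exists.
preserve_snapshot() {
  local src="$1"
  local snapshot="$2"
  [ -d "$src" ] || return 1
  if [ -e "$src/.git" ]; then
    return 1   # already a Git checkout; nothing to preserve
  fi
  if [ -e "$snapshot" ]; then
    echo "refusing to overwrite existing snapshot: $snapshot" >&2
    return 2
  fi
  mv "$src" "$snapshot"
}

# Example (not executed here):
#   preserve_snapshot ~/local/src/dubnium ~/local/src/dubnium-install-snapshot
```

Refusing to overwrite an existing snapshot keeps earlier install-time evidence intact if the step is repeated.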
2. Create The Canonical Checkout
Clone the private Dubnium repo using the installed system’s normal operator credential path. Prefer SSH keys or an intentional short-lived HTTPS token; do not reuse live-installer credentials as a persistent access mechanism.
mkdir -p ~/local/src
git clone <dubnium-private-repo-url> ~/local/src/dubnium
cd ~/local/src/dubnium
git submodule update --init --recursive
If the installed machine should use a different source root, keep the same pattern: one normal Git checkout, and one preserved installer snapshot until the diff has been reconciled.
3. Bring Across Intentional Install Changes
Copy only the changes that should become repo state. The most common first install candidate is the generated hardware config:
cp ~/local/src/dubnium-install-snapshot/hosts/workstation/hardware-configuration.nix \
hosts/workstation/hardware-configuration.nix
Review any optional host-local file before copying it. For example,
hosts/workstation/user.nix may be useful on the installed machine, but it
should be committed only if the repo is meant to carry that exact user policy.
For a broader comparison between the preserved snapshot and the canonical checkout:
diff -ruN \
~/local/src/dubnium-install-snapshot/hosts/workstation \
~/local/src/dubnium/hosts/workstation
Prefer copying specific files over bulk-syncing the snapshot into the checkout.
4. Review, Test, Commit, Push
From the canonical checkout:
git status --short
git diff -- hosts/workstation modules docs
nix --extra-experimental-features 'nix-command flakes' \
eval .#nixosConfigurations.workstation.config.networking.hostName
git add hosts/workstation/hardware-configuration.nix
git commit -m "Record workstation hardware configuration"
git push
Use a broader validation command when the reconciled change touches modules, services, or shared policy. If evaluation or rebuild fails, keep the snapshot and the Git checkout separate until the failure is understood.
5. Rebuild From The Canonical Checkout
After the change is committed or intentionally kept as local-only state, rebuild from the normal checkout rather than the installer snapshot:
sudo nixos-rebuild switch --flake ~/local/src/dubnium#workstation
Once the canonical checkout has the needed changes and the system rebuilds from it, the preserved install snapshot can be archived or deleted.
Runbook: Laboratory Bootstrap
Status: living
This runbook describes the current intended integration boundary between:
- Dubnium
- ryjen/dotfiles
- hackelia-micrantha/laboratory
Dubnium owns machine orchestration and runtime policy.
Laboratory is the intended source of truth for:
- k3s bootstrap
- Flux bootstrap
- GitOps reconciliation
- local cluster lifecycle operations
Current State
The current Dubnium repository still contains a thin local k3s integration for
v1 bring-up.
The long-term intended direction is:
- Dubnium owns host orchestration
- Laboratory owns cluster orchestration
This runbook defines the current bootstrap contract without pretending the full migration is already complete.
Expected Repository Shape
Typical local source layout:
~/local/src/
├── dubnium/
│   ├── external/dotfiles/
│   └── external/laboratory/
The external/laboratory checkout may be:
- a Git submodule
- a manually managed checkout
- another intentionally pinned local source path
The preferred integration ref today is:
feature/fresh
Bootstrap Flow
After the machine is operational:
- validate Dubnium host state
- validate user environment
- bootstrap Laboratory
- validate cluster state
- fetch kubeconfig
- validate Flux reconciliation
Prerequisites
Laboratory expects tooling such as:
- tofu or terraform
- ansible
- kubectl
- flux
- jq
See the Laboratory repository for current authoritative prerequisites.
Bootstrap Command
Dubnium exposes a thin wrapper entrypoint:
scripts/bootstrap-lab
The wrapper intentionally:
- validates the checkout exists
- validates the repository shape looks correct
- warns when the checkout ref differs from the preferred ref
- delegates execution into Laboratory
The wrapper intentionally does not duplicate Laboratory internals.
Environment Overrides
Optional overrides:
export DUBNIUM_LAB_PATH=~/local/src/laboratory
export DUBNIUM_LAB_REF=feature/fresh
Override the delegated bootstrap command:
export DUBNIUM_LAB_BOOTSTRAP_CMD='make deploy ENV=local'
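The ref warning and the overrides could fit together roughly as follows. This is a sketch, not the contents of scripts/bootstrap-lab: warn_on_ref_mismatch is a hypothetical name, and the fallback values shown in the comments (external/laboratory and feature/fresh) are assumed defaults inferred from this runbook.

```shell
# Print a warning when the checkout ref differs from the preferred ref.
# Always returns 0: the wrapper is described as warning, not aborting.
warn_on_ref_mismatch() {
  local current="$1"
  local preferred="$2"
  if [ "$current" != "$preferred" ]; then
    printf 'warning: Laboratory checkout is on %s, preferred ref is %s\n' \
      "$current" "$preferred" >&2
  fi
  return 0
}

# Example wiring (not executed here; default values are assumptions):
#   lab_path=${DUBNIUM_LAB_PATH:-external/laboratory}
#   lab_ref=${DUBNIUM_LAB_REF:-feature/fresh}
#   current=$(git -C "$lab_path" rev-parse --abbrev-ref HEAD)
#   warn_on_ref_mismatch "$current" "$lab_ref"
```

Warning instead of failing keeps an intentionally pinned local checkout usable while still making the drift visible to the operator.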
Default Delegated Flow
Current default delegated flow:
make deploy ENV=local && \
make local-kubeconfig ENV=local && \
make validate ENV=local
This is intentionally conservative while the integration boundary evolves.
Failure Boundaries
If Laboratory bootstrap fails:
- Dubnium machine orchestration should still function
- mode transitions should still function
- user environment should still function
- only cluster capabilities should be degraded
Machine boot must not require successful Laboratory reconciliation.
Recovery
To retry the bootstrap:
scripts/bootstrap-lab
To validate current cluster state directly through Laboratory:
cd external/laboratory
make validate ENV=local
Runtime Secrets
Dubnium uses sops-nix with age for runtime service secrets. Nix declares which services consume secrets; secret values stay out of Git, module options, and the Nix store.
Secret Classes
Use separate handling for each class:
- Source bootstrap: prepare a local repo archive or copied working tree before install; do not require GitHub credentials in the installer.
- Runtime service tokens: encrypt with SOPS and expose to services through /run/secrets or generated environment files.
- User-runtime tokens: decrypt through the user profile after install for tools such as Codex, GitHub CLIs, or agent workflows.
- Host enrollment identities: enroll interactively for v1 unless a future ADR accepts unattended enrollment.
- Model weights: seed local model bundles into /var/lib/dubnium/models; do not store them in Git, SOPS, or the Nix store.
Host Age Identity
Create one age identity per host and keep it on that host:
sudo mkdir -p /var/lib/sops-nix
sudo age-keygen -o /var/lib/sops-nix/key.txt
sudo chmod 0600 /var/lib/sops-nix/key.txt
sudo cat /var/lib/sops-nix/key.txt | age-keygen -y
Add the printed public recipient to .sops.yaml when the first encrypted
secrets file is introduced.
Host Secret File
Keep encrypted host secret files under an ignored or carefully reviewed path
such as secrets/hosts/<host>.yaml. Commit encrypted files only after checking
that the cleartext values are not present in the diff.
Example SOPS data shape:
service_name:
token: example
vLLM Model Downloads
The default Dubnium install should not need a Hugging Face token. Dubnium points
vLLM at local model bundle paths under /var/lib/dubnium/models, and the fresh
install path seeds those bundles from USB.
Only add a model-provider token if you intentionally choose an online download workflow for a future host. In that case, prefer an environment file generated by sops-nix:
{ config, ... }:
{
  dubnium.secrets.defaultSopsFile = ../../secrets/hosts/workstation.yaml;
  sops.secrets.model-provider-token = {
    key = "model_provider/token";
  };
  sops.templates."vllm-model-provider.env".content = ''
    HF_TOKEN=${config.sops.placeholder.model-provider-token}
    HUGGINGFACE_HUB_TOKEN=${config.sops.placeholder.model-provider-token}
  '';
  dubnium.vllm.environmentFiles = [
    config.sops.templates."vllm-model-provider.env".path
  ];
}
Do not add provider tokens to the custom installer ISO or USB seed partition.
User Runtime Tokens
User tools are owned by the dotfiles Home Manager profile, not by Dubnium system services. Keep tokens such as these in the user SOPS file:
github_token: ghp_example
openai_api_key: sk-example
The dotfiles profile exposes secret file paths, for example
GITHUB_TOKEN_PATH and OPENAI_API_KEY_PATH. It can also source a
sops-generated shell fragment for interactive user sessions, so tools installed
by the profile inherit variables such as OPENAI_API_KEY without per-tool
wrappers and without putting plaintext values in Nix options.
Codex should get OPENAI_API_KEY this way. A later user workflow can use
GITHUB_TOKEN the same way without changing the installer policy.
Rotation
- Edit the encrypted SOPS file with sops.
- Rebuild the target host.
- Restart any service that consumes the rotated secret if activation did not already restart it.
- Revoke the old token at the provider.
Checks
Before committing, inspect staged changes:
git diff --cached
git diff --check
Do not commit plaintext tokens, private keys, generated age identities, model weights, or local decrypted files.
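Manual inspection can be backed by a coarse pattern scan over the staged diff. This is a sketch only: scan_added_lines_for_tokens is a hypothetical helper, the patterns cover just three well-known prefixes (GitHub ghp_, OpenAI sk-, and age AGE-SECRET-KEY-1), and a clean result does not prove the diff is free of secrets.

```shell
# Read a unified diff on stdin. Return 1 if an added line looks like it
# contains a plaintext credential, 0 otherwise.
# The prefixes are illustrative; extend them for other providers.
token_pattern='^\+(.*[^A-Za-z0-9])?(ghp_[A-Za-z0-9]{8,}|sk-[A-Za-z0-9_-]{8,}|AGE-SECRET-KEY-1)'

scan_added_lines_for_tokens() {
  if grep -Eq "$token_pattern"; then
    echo "possible plaintext credential in added lines" >&2
    return 1
  fi
  return 0
}

# Example (not executed here):
#   git diff --cached | scan_added_lines_for_tokens
```

Only added lines are matched, so removing a leaked value from a file does not trip the check. Short placeholders such as ghp_example fall below the length threshold and are ignored.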
Tailscale
Tailscale is workstation-only platform connectivity in v1. Dubnium enables the daemon and CLI, but enrollment is manual until secrets and OAuth policy are settled.
First Activation
Build and switch the workstation configuration:
sudo nixos-rebuild switch --flake .#workstation
Enroll the node manually:
sudo tailscale up
Follow the browser/device login flow. Do not pass --ssh,
--advertise-routes, or --advertise-exit-node for v1.
Verification
Check the daemon:
systemctl status tailscaled
Check tailnet state:
tailscale status
tailscale ip -4
Regular OpenSSH can be used over the assigned tailnet IP if SSH is allowed by the host firewall and OpenSSH configuration.
vLLM Over Tailnet
Dubnium exposes vllm.service on port 8000 over the Tailscale interface only.
From another tailnet machine, use the node’s Tailscale IP or MagicDNS name:
curl http://<dubnium-tailnet-name>:8000/v1/models
The local alias ai.dubnium is a host-local convenience entry on Dubnium. To
use that same name from other machines, add a tailnet DNS/hosts alias that
points ai.dubnium at the Dubnium node’s Tailscale IP.
Deferred Automation
Automatic enrollment should use services.tailscale.authKeyFile only after
Dubnium has a settled secrets policy. The intended future shape is:
services.tailscale.authKeyFile = "/run/secrets/tailscale-auth-key";
OAuth or auth-key enrollment should be paired with explicit key scope, expiration, tagging, and rotation decisions.
Deferred Routing Options
Subnet router support would require:
- services.tailscale.useRoutingFeatures = "server" or "both"
- sudo tailscale up --advertise-routes=...
- Tailscale admin approval for the advertised routes
- firewall, forwarding, and reverse-path-filtering review
Exit-node support would require:
- services.tailscale.useRoutingFeatures = "server" or "both"
- sudo tailscale up --advertise-exit-node
- Tailscale admin approval
- stronger trust and privacy review, because the node can carry client traffic
Deferred Tailscale SSH
Tailscale SSH is not enabled in v1. If enabled later, it should be tied to a written Tailscale ACL policy and explicit operator intent.
Future manual enrollment would use:
sudo tailscale up --ssh
Future declarative enrollment could add:
services.tailscale.extraUpFlags = [ "--ssh" ];
Until that policy exists, use regular OpenSSH over the tailnet IP.
Runbook: Transition Testing
Status: living
Use this after the machine can boot the flake-managed desktop baseline.
Preflight
mode status
mode current
mode desired
systemctl status desktop.target
systemctl status compute.target
systemctl status vllm.service
The expected baseline is:
- observed state is desktop
- vLLM is inactive
- no transition lock is held
- latest transition is not failed
Test Studio Overlay
sudo mode request studio-local
mode status
systemctl status studio-local-policy.service
systemctl status audio-priority.service
sudo mode request desktop
mode status
Expected result:
- studio-local is observed only while both overlay services are active
- returning to desktop stops both overlay services
- vLLM remains inactive
Test Compute Promotion
Before testing:
- close REAPER and active low-latency audio work
- avoid foreground long-running user jobs
- expect the graphical session to terminate
sudo mode request compute
mode status
systemctl status compute.target
systemctl status vllm.service
Expected result:
- observer reports compute or an explicit degraded/failed state
- graphical session is absent or non-authoritative
- vLLM is active if enabled
- guard and transition records explain any block or failure
Test Desktop Return
sudo mode request desktop
mode status
systemctl status vllm.service
Expected result:
- observer reports desktop
- vLLM is inactive
- graphical/session path is usable
If rollback only partially restores desktop, classify it as degraded rather than successful.
Runbook: Failed Transition Recovery
Status: living
Use this when mode status reports failed-transition, a degraded state, or a
post-action observation mismatch.
Inspect State
mode status
mode current --refresh
sudo cat /run/mode-controller/last-transition.json
sudo cat /run/mode-controller/last-guards.json
journalctl -u 'mode-controller@*' -b
Classify the Failure
Common buckets:
- guard policy block, such as active audio or unsafe user jobs
- guard execution error, such as missing nvidia-smi
- graphical session did not terminate
- GPU release predicate did not pass
- vLLM failed to start or stop
- target isolation stopped required services
- post-action observation remained conflicted
Recover to Desktop
If the system is not in the middle of an active transition:
sudo mode request desktop
mode status
Success requires observer confirmation, not just successful systemd commands.
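One way to express "observer confirmation" is to poll the observer after the request instead of trusting the exit status of the request. This is a sketch under assumptions: wait_for_observed_mode is a hypothetical helper, and the usage line assumes mode current prints the bare mode name on stdout, which this runbook does not specify.

```shell
# Poll an observer command until it prints the expected mode.
# usage: wait_for_observed_mode EXPECTED ATTEMPTS COMMAND [ARGS...]
# Returns 0 on a match, 1 if every attempt disagrees.
wait_for_observed_mode() {
  local expected="$1"
  local attempts="$2"
  local observed
  shift 2
  while [ "$attempts" -gt 0 ]; do
    # A failing observer counts as "no observation", not as a match.
    observed=$("$@" 2>/dev/null) || observed=
    if [ "$observed" = "$expected" ]; then
      return 0
    fi
    attempts=$((attempts - 1))
    if [ "$attempts" -gt 0 ]; then
      sleep 1
    fi
  done
  return 1
}

# Example (not executed here; output format of `mode current` is assumed):
#   sudo mode request desktop
#   wait_for_observed_mode desktop 30 mode current --refresh
```

Passing the observer as a command keeps the helper independent of the controller's output format, so it can be adapted if mode current prints structured output.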
If desktop recovery fails:
- inspect journalctl -b
- inspect display-manager/session logs
- stop compute-only services manually only if their ownership is clear
- consider rebooting to the v1 boot default, desktop
Record Evidence
For every failure worth keeping:
- final mode status
- last transition JSON
- last guards JSON
- last guards JSON
- relevant systemd unit status
- whether rollback restored desktop
- whether the failure suggests runtime switching is insufficient
Repeated GPU release or desktop restoration failures should trigger specialisation/reboot-mediated compute evaluation.
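The two JSON records can be captured with a small helper before the next transition changes them. This is a sketch: collect_transition_evidence and the MISSING marker file are hypothetical, and the remaining items (mode status output, unit status, rollback notes) still need to be recorded separately.

```shell
# Copy the last transition and guard records from RUN_DIR into OUT_DIR.
# Files that cannot be read are noted in OUT_DIR/MISSING instead of
# silently skipped.
collect_transition_evidence() {
  local run_dir="$1"
  local out_dir="$2"
  local f
  mkdir -p "$out_dir" || return 1
  for f in last-transition.json last-guards.json; do
    if [ -r "$run_dir/$f" ]; then
      cp "$run_dir/$f" "$out_dir/$f" || return 1
    else
      echo "missing: $run_dir/$f" >> "$out_dir/MISSING"
    fi
  done
}

# Example (not executed here; run with enough privilege to read the records):
#   collect_transition_evidence /run/mode-controller ./evidence
#   mode status > ./evidence/mode-status.txt
```

Recording a missing file explicitly is itself evidence: an absent guards record after a failed transition points to a different failure bucket than a present one.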
WSL Documentation Boundary
Dubnium uses WSL in two different ways, and the docs should keep those roles separate.
WSL As Build Environment
Use WSL as a convenient Linux build environment. This includes:
- building the custom installer ISO
- preparing the local seed-model bundle
- running Nix commands that do not need bare-metal hardware
This role does not imply the wsl host target is being installed or validated.
For the installer flow, WSL prepares artifacts and the platform writer prepares
the USB unless the USB disk is deliberately exposed to WSL.
WSL As Validation Target
Use the .#wsl host target only inside an existing
nix-community/NixOS-WSL distro. This target validates shared Dubnium module
composition and activation before touching the real workstation. It keeps
resource-heavy services such as vllm and k3s disabled by default.
This role does not prove bare-metal behavior. Passing WSL validation does not
prove EFI, bootloader, Hyprland, audio, or final GPU behavior for
.#workstation.
Boundary Rules
- Keep bare-metal install steps in fresh-install and custom-installer docs.
- Keep .#wsl activation and validation steps in the WSL bring-up runbook.
- Do not use the fresh-install checklist for WSL bring-up.
- Do not use WSL results as proof that workstation hardware configuration is correct.
- When a command is meant to run inside WSL, label it as WSL or Bash.
Build Installer Artifacts From WSL
Status: living
Use this when the Dubnium installer ISO and seed-model bundle should be prepared from an existing WSL distro.
This is only a build workflow. It is not .#wsl host activation and does not
validate the WSL target.
Boundary
- build the ISO and prepare the seed model here
- write the USB with the platform’s guarded writer unless the USB disk is deliberately exposed to the WSL distro
Build
Enter the Nix-capable WSL distro:
wsl -d NixOS
Inside the distro:
cd /path/to/dubnium
git status --short
git -C external/dotfiles status --short
scripts/build-installer-iso.sh \
--iso ./dubnium-installer.iso
The script prepares the current Dubnium default seed bundle when no existing
materialized bundle is detected. Use --seed-model to point at a different
bundle, --no-seed-download to require an existing bundle, or --no-seed-model
to build installer-only media.
Write The USB
After the ISO exists in the shared checkout, use the platform writer from the custom installer runbook. For Windows PowerShell:
.\scripts\write-installer-usb.ps1 `
-IsoPath .\dubnium-installer.iso `
-DiskNumber 7 `
-ExpectedFriendlyName "USB SanDisk 3.2Gen1" `
-SeedModelPath ..\models\selected-model-bundle
Each writer still checks the USB disk identity and requires the typed erase confirmation.
Runbook: WSL Bring-Up
Status: living
Use this when the target environment is the wsl host, running inside an
existing nix-community/NixOS-WSL distro.
This is separate from the bare-metal install and first-bring-up flow because the commands, platform assumptions, and validation steps are materially different.
This runbook assumes you are already using the community WSL base:
- nix-community/NixOS-WSL
- setup docs: https://nix-community.github.io/NixOS-WSL/
The dubnium .#wsl target layers on top of that base. It is not a
replacement for the initial NixOS-WSL installation process.
When To Use This
Use this runbook when:
- you are already inside the NixOS WSL distro
- you want to switch that distro to dubnium’s .#wsl target
- you want to validate shared Dubnium wiring in WSL before touching the bare-metal workstation target
Do not use this runbook for:
- bare-metal install
- hosts/workstation/hardware-configuration.nix generation
- EFI or bootloader validation
- Hyprland or audio/studio validation
Preconditions
- WSL is installed on Windows
- a NixOS WSL distro based on nix-community/NixOS-WSL already exists and boots successfully
- this repo is available inside the distro. Examples:
/mnt/c/Users/<user>/Projects/dubnium
~/src/dubnium
- flakes are available, either through system config or explicit flags
Success Criteria
- nixos-rebuild switch --flake .#wsl succeeds inside the WSL distro
- git is available from the switched system generation
- mode status, mode current, and mode desired work
- dubnium.k3s.enable and dubnium.vllm.enable evaluate to false
- compute.target evaluates without pulling in k3s or vllm.service
- the runtime state directory exists at /run/mode-controller
1. Enter The NixOS WSL Distro
If you do not already have a working nix-community/NixOS-WSL distro, stop
here and install that first. This runbook starts after that base is already in
place.
Enter the distro:
wsl -d NixOS
Inside the distro, go to the repo:
cd /path/to/dubnium
pwd
git status --short
Use the actual checkout path for the machine. Avoid hardcoding personal paths in reusable docs or scripts.
2. Evaluate The WSL Target
If your shell does not already have flakes enabled, use explicit flags:
nix --extra-experimental-features "nix-command flakes" flake show .
Confirm the new target exists:
nixosConfigurations.wsl
Optional targeted checks:
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.wsl.config.wsl.enable
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.wsl.config.dubnium.boot.defaultMode
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.wsl.config.dubnium.k3s.enable
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.wsl.config.dubnium.vllm.enable
nix --extra-experimental-features "nix-command flakes" eval .#nixosConfigurations.wsl.config.systemd.targets.compute.wants
Expected:
- wsl.enable = true
- default mode is compute
- dubnium.k3s.enable = false
- dubnium.vllm.enable = false
- compute.target has no vllm.service dependency
This confirms the dubnium host is using the upstream community WSL module,
not an ad hoc local WSL implementation.
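The targeted checks above can be scripted so that each evaluated value is compared with its expected value. The following is a minimal sketch, assuming `nix eval --json` is used so that booleans print as `true`/`false` and strings print quoted; the function names `check_wsl_attr` and `check_wsl_defaults` are illustrative and not part of the Dubnium repo.

```shell
# check_wsl_attr <attribute path under config> <expected JSON value>
# Evaluates one option of the .#wsl target and compares it to the expectation.
check_wsl_attr() {
  local attr="$1" expected="$2" actual
  if ! actual="$(nix --extra-experimental-features "nix-command flakes" \
      eval --json ".#nixosConfigurations.wsl.config.${attr}")"; then
    echo "ERROR ${attr}: evaluation failed" >&2
    return 2
  fi
  if [ "$actual" = "$expected" ]; then
    echo "OK    ${attr} = ${actual}"
  else
    echo "FAIL  ${attr}: expected ${expected}, got ${actual}" >&2
    return 1
  fi
}

# check_wsl_defaults: run every scalar check; return non-zero if any fails.
check_wsl_defaults() {
  local rc=0
  check_wsl_attr wsl.enable true || rc=1
  check_wsl_attr dubnium.boot.defaultMode '"compute"' || rc=1
  check_wsl_attr dubnium.k3s.enable false || rc=1
  check_wsl_attr dubnium.vllm.enable false || rc=1
  return "$rc"
}
```

Run `check_wsl_defaults` from the repo root. The `compute.target` wants list is left out of this sketch because it is a list rather than a scalar; inspect it by hand with the eval command shown earlier.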
3. Switch The Running Distro To .#wsl
Use:
sudo nixos-rebuild switch --flake .#wsl
If flakes are not enabled globally in the current shell:
sudo nixos-rebuild switch --extra-experimental-features "nix-command flakes" --flake .#wsl
This is the main WSL install/activation command.
Keep config truth and live runtime truth separate. A targeted nix eval proves
the flake expression, while nixos-rebuild switch and systemctl prove the
running distro. If a full switch fails with an environmental WSL error, record
that separately from whether the flake evaluated correctly.
4. Verify Dubnium Runtime Basics
Check mode/runtime state:
mode status
mode current
mode desired
sudo ls -la /run/mode-controller
Check that the heavy services are not part of the lightweight WSL profile:
systemctl status k3s --no-pager
systemctl status vllm --no-pager
Both units should be absent or inactive in the default WSL target. Enable them intentionally in a local override only when the task is specifically to validate their WSL runtime behavior.
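The "absent or inactive" expectation can be checked without reading the full status output. This is a sketch built on `systemctl is-active`, which prints a one-word state; the helper names are illustrative. It assumes an absent unit is reported as `inactive`, which is worth confirming on your systemd version.

```shell
# state_is_quiet <state>
# Returns 0 when the state string means "not running", 1 when the unit is
# running or in any other state, and 2 when no state could be read at all.
state_is_quiet() {
  case "$1" in
    inactive) return 0 ;;
    '') return 2 ;;
    *) return 1 ;;
  esac
}

# unit_is_quiet <unit>: query systemd and classify the answer.
unit_is_quiet() {
  local state
  state="$(systemctl is-active "$1" 2>/dev/null)"
  state_is_quiet "$state"
}
```

For example, `unit_is_quiet k3s.service && unit_is_quiet vllm.service` should succeed on the default WSL target.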
Check the WSL target’s current interpretation:
- wsl is headless
- compute is the default desired mode
- k3s and vllm are disabled by default to keep WSL activation lightweight
- workstation-only graphics/audio expectations do not apply here
5. Known Differences From workstation
Important differences:
- .#wsl assumes the distro itself was originally created with nix-community/NixOS-WSL
- do not run nixos-generate-config --dir ./hosts/workstation for WSL testing
- do not expect .#workstation to build cleanly until a real bare-metal hardware config has replaced the placeholder
- do not use the fresh-install checklist for WSL bring-up
- do not treat WSL results as proof of EFI, bootloader, Hyprland, or audio correctness
The wsl target is for:
- flake composition
- lightweight activation
- shared Dubnium control-plane behavior
6. Common Failure Buckets
- flakes not enabled in the current shell
- repo is present but the running system has not been switched to .#wsl
- Windows PATH injection adds noisy warnings during WSL startup
- if repo tooling looks missing after switch, check git --version
- Home Manager activation can fail if the active WSL login user does not match the configured Home Manager home; check whoami, getent passwd, and /etc/wsl.conf before changing modules
- mode desired/current state is seeded but not yet reconciled automatically at boot
If .#workstation fails during WSL development, first check whether the failure
comes from the placeholder hosts/workstation/hardware-configuration.nix
instead of the new wsl target.
Related Docs
- WSL Documentation Boundary
- First Bring-Up
- Fresh Install
- Custom Installer ISO
- First Bring-Up Checklist
Dual-Mode NixOS Workstation / AI Node
Unified Planning + Mode State Machine Document (v0.3 — Living)
1. Purpose
Design a single NixOS system that operates as a policy-driven multi-mode host with support for future workload externalization:
- Desktop / Dev workstation
- Optional local Studio / Audio profile
- Compute / Headless AI node
The broader workstation environment may also externalize selected capabilities, especially Studio/Audio, to a separate machine such as a Mac mini.
The system must support:
- low-latency audio workloads (DAW / live)
- GUI desktop usage via Hyprland
- GPU inference via vLLM
- k3s control-plane duties for Micrantha Laboratory / Hyperion
- explicit, auditable, reproducible transitions between modes
This document defines:
- planning assumptions
- architectural boundaries
- host-local mode definitions
- capability placement model
- invariants
- state machine
- guards and guard functions
- source-of-truth model
- reconciliation model
- implementation mapping to systemd
- design alternatives and tradeoffs
2. Core Principles
2.1 Modes Are Operational Contracts
A mode is not just a set of enabled services. A mode defines:
- resource ownership
- permitted workloads
- latency/throughput expectations
- security posture
- transition preconditions
2.2 Explicit Over Implicit
Mode transitions should be:
- explicit when possible
- observable
- reversible
- logged
- idempotent
Automation may request a transition, but the controller must decide whether it is safe.
2.3 Latency and Throughput Are Competing Objectives
- Desktop / Studio-Local optimize for responsiveness and bounded latency
- Compute optimizes for throughput and hardware utilization
The design must not pretend both can be maximized simultaneously.
2.4 One Physical Host, Multiple Logical Planes
This system is treated as:
one shared substrate hosting multiple logical operating modes
2.5 Declarative First, Runtime Reconciliation Second
- NixOS declares steady-state intent and system structure
- a mode controller reconciles runtime state toward desired operational mode
2.6 Host-Local Modes Must Survive Capability Relocation
The host-local state model should remain coherent even if some capabilities, especially Studio/Audio, move to another machine.
3. System Overview
flowchart TD
HW[Hardware]
subgraph BaseOS[NixOS Base Layer]
Kernel
Drivers[NVIDIA / CUDA]
Network
Storage
Nix
systemd
end
subgraph Control[Mode Control Plane]
Desired[Desired State]
Current[Current State]
Reconcile[Reconciler]
Guards[Guard Checks]
end
subgraph LocalModes[Host-Local Modes]
Desktop[Desktop / Dev]
StudioLocal[Studio-Local / Audio-Priority]
Compute[Compute / Headless]
end
subgraph Placement[Capability Placement]
StudioCap[Studio Capability]
AICap[AI Capability]
PlatformCap[Platform Capability]
end
subgraph Workloads[Workloads]
Hyprland
PipeWire
Reaper
vLLM
k3s
end
HW --> BaseOS
BaseOS --> Control
Control --> LocalModes
LocalModes --> Workloads
LocalModes --> Placement
4. Mode Definitions and Capability Placement
This document distinguishes between:
- host-local operational modes for the NixOS machine
- capability placement for functions that may later move to another machine
4.1 Host-Local Modes
Desktop / Dev Mode
Intent
Balanced interactive mode for programming, office work, light desktop use, and bounded AI.
Properties
- GUI enabled
- audio enabled for ordinary desktop use
- GPU0 reserved for display/compositor
- GPU1 may be used by AI workloads
- vLLM constrained to single-GPU operation or disabled
- k3s control plane may remain active
- CPU/RAM contention must remain bounded
Studio-Local / Audio-Priority Profile
Intent
A stricter local operating profile for low-latency audio work when Studio remains on the NixOS host.
Properties
- modeled as a protected interactive profile closely related to Desktop
- GUI enabled
- audio stack prioritized
- display GPU reserved exclusively for desktop responsibilities
- AI workloads disabled or reduced to near-zero
- heavy I/O and background maintenance jobs disallowed
- scheduler and system policy biased toward stable audio behavior
Design note
This profile is considered conditional and potentially temporary. It exists so the NixOS host can support local audio/studio workflows now, without assuming that Studio remains a permanent first-class local mode forever.
Implementation note
For the first implementation pass, studio-local should be modeled as a policy overlay on desktop, not as a first-class top-level systemd target. The operational state still exists in the controller/state model, but its enactment should initially be handled by marker/helper units layered onto the desktop path.
Compute / Headless Mode
Intent
Throughput-oriented headless mode for AI serving and platform duties.
Properties
- GUI disabled
- audio stack off or irrelevant
- both GPUs available to AI workloads
- vLLM may use both GPUs
- k3s workloads may run more aggressively
- CPU/RAM/storage can be utilized much more aggressively than in interactive modes
4.2 Capability Placement Model
Certain capabilities may be placed either:
- locally on the NixOS host
- externally on another machine
Capability: Studio / Audio
Possible placements:
- local
- external-mac-mini
Capability: AI / Inference
Expected placement:
- primarily local-nixos-host
Capability: Platform / k3s Control
Expected placement:
- primarily local-nixos-host
4.3 Design Implication
The host-local state machine should remain valid even if Studio/Audio is moved to a Mac mini. That means Studio-specific policy should be represented as a local profile or conditional mode, not as the permanent center of the entire host architecture.
5. Resource Ownership Model
5.0 Implementation Note — Hardware-Tolerant Bring-Up
The architecture should continue to plan for the intended dual-GPU topology, but the NixOS implementation should remain tolerant of transitional hardware states while the second GPU is not yet installed or configured.
That means:
- the policy model may still describe the intended two-GPU end state
- module options should encode planned GPU ownership explicitly
- active service profiles must only reference GPUs that are currently present
- missing future hardware must not cause ordinary evaluation or steady-state services to fail unnecessarily
5.1 GPU Ownership
| Mode | GPU0 | GPU1 |
|---|---|---|
| Desktop | Display / compositor | AI optional |
| Studio-Local | Display / compositor (protected) | AI off or minimal |
| Compute | AI | AI |
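The ownership table, combined with the hardware-tolerant rule from 5.0, can be reduced to a lookup from mode and installed GPU count to the GPUs an AI service may see. The sketch below is one possible encoding: the function name is hypothetical, "AI off or minimal" is treated as off, and the output is formatted for the `CUDA_VISIBLE_DEVICES` environment variable.

```shell
# ai_gpus_for_mode <mode> <present_gpu_count>
# Prints a comma-separated GPU index list, or an empty string when AI
# workloads get no GPU in that mode. Never references a GPU that is absent.
ai_gpus_for_mode() {
  local mode="$1" present="${2:-0}"
  case "$mode" in
    desktop)
      # GPU0 stays with the compositor; GPU1 is optional for AI if installed.
      if [ "$present" -ge 2 ]; then echo "1"; else echo ""; fi ;;
    studio-local)
      # This sketch treats "AI off or minimal" as off.
      echo "" ;;
    compute)
      case "$present" in
        0) echo "" ;;
        1) echo "0" ;;
        *) echo "0,1" ;;
      esac ;;
    *)
      echo "unknown mode: $mode" >&2
      return 1 ;;
  esac
}
```

With only one GPU installed, `ai_gpus_for_mode desktop 1` prints an empty list, so a desktop-profile AI service would stay off the display GPU instead of failing on a missing device.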
5.2 CPU Ownership
- Shared via cgroups/systemd slices
- interactive slices retain priority/headroom in Desktop and Studio-Local
- compute slices may saturate cores in Compute
5.3 Memory Ownership
- bounded AI memory usage in Desktop
- stricter constraints in Studio-Local
- relaxed/high utilization in Compute
5.4 Storage Ownership
- heavy background I/O restricted in Studio-Local
- permitted but bounded in Desktop
- broadly permitted in Compute
5.5 Audio Ownership
- effectively exclusive in Studio-Local
- protected in Desktop
- not guaranteed in Compute
6. Invariants
These are system-level properties that must remain true regardless of transition path or future Studio placement.
6.1 Safety Invariants
- At most one host-local operational mode is authoritative at a time.
- A transition must either complete to a stable target state or abort back to a known-safe prior state.
- Mode transitions must be idempotent. Re-running a transition toward an already-satisfied state must not cause harm.
- When Studio-Local is active, heavyweight compute workloads must not materially jeopardize audio latency.
- Compute mode must not require a running graphical session.
- GPU0 must not be simultaneously treated as both protected display GPU and unrestricted compute GPU.
- The controller must not promote the system into Compute if guard failures indicate active user/audio risk.
- The system must always expose a way to determine current mode, desired mode, and last transition result.
- The host-local mode model must remain coherent if Studio/Audio capability is externalized to another machine.
6.2 State Invariants
- Desired state is authoritative intent.
- Current state is observed runtime fact.
- Reconciliation moves current state toward desired state; it never rewrites observed state to match wishful intent.
- A guard failure blocks transition, but does not silently change desired state unless policy explicitly says so.
6.3 Operational Invariants
- Models and mutable runtime data must live outside the Nix store.
- Dotfiles may influence user experience, not machine-critical mode policy.
- Mode policy must remain expressible and inspectable via systemd and Nix configuration.
- Capability placement decisions must not silently invalidate host-local invariants.
7. Desired State vs Current State
7.1 Desired State
The host-local mode the user or automation wants the system to be in.
Examples:
- desktop
- studio-local
- compute
7.2 Current State
The host-local mode the system is actually in, as determined by observation.
Examples:
- graphical target active, PipeWire active, vLLM limited → likely desktop
- graphical target inactive, compute services active, both GPUs exposed to AI → likely compute
- GUI active, audio priority raised, compute services reduced → likely studio-local
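The three examples can be expressed as a small classifier over observed facts. This is a simplified sketch: it takes three yes/no inputs that a real observer would derive from systemd and GPU state, and the function name is an assumption, not the controller's actual interface.

```shell
# infer_current_mode <gui_active> <audio_priority_raised> <compute_active>
# Each argument is "yes" or "no". Prints the most likely host-local mode.
infer_current_mode() {
  local gui="$1" audio_prio="$2" compute="$3"
  if [ "$gui" = yes ] && [ "$audio_prio" = yes ]; then
    echo studio-local
  elif [ "$gui" = yes ]; then
    echo desktop
  elif [ "$compute" = yes ]; then
    echo compute
  else
    # No rule matched; report that rather than guessing.
    echo unknown
  fi
}
```

Returning `unknown` instead of defaulting to a mode keeps the observation honest when the facts do not match any expected pattern.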
7.3 Why This Split Matters
Without this split, the system can lie to itself:
- a command says “switch to compute”
- but GPU is still held by compositor
- vLLM failed to scale up
- audio services are still active
In that case:
- desired state = compute
- current state = transitioning or desktop (degraded)
The control plane must detect and reconcile this rather than assuming success.
8. Source of Truth for Mode
The system needs one authoritative representation of requested host-local mode.
8.1 Options Considered
Option A — File-Based Source of Truth
Example:
- /run/mode-controller/desired
- /var/lib/mode-controller/desired
Pros
- simple
- easy to inspect
- works outside active user session
- easy for scripts and systemd units
Cons
- can drift from actual runtime state
- needs permissions and lifecycle handling
Option B — Environment Variable Source of Truth
Example:
MODE=compute
Pros
- simple for one-shot commands
- easy in shell contexts
Cons
- poor system-wide authority
- ephemeral
- fragile across sessions/reboots
- bad fit for authoritative machine state
Option C — systemd State as Source of Truth
Example:
- compute.target active implies desired mode is compute
Pros
- tightly aligned with implementation
- introspectable
- avoids duplicate state stores
Cons
- desired state and current state can become conflated
- harder to represent “requested but not yet achieved”
- recovery/abort semantics become more awkward
8.2 Recommended Model
Use a hybrid model:
- Desired state source of truth: file in /run/mode-controller/desired
- Current state source of truth: observed systemd/runtime facts
- Transition machinery: systemd targets + controller service
This cleanly separates:
- intent
- observation
- enforcement
8.3 Proposed Files
- /run/mode-controller/desired
- /run/mode-controller/current
- /run/mode-controller/last-transition.json
current may be a cached observation, but observation should always be derivable from system state.
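A file-based desired state is easiest to trust when writes are validated and atomic. The sketch below assumes a `STATE_DIR` variable that defaults to the runtime directory named above; `write_desired` and `read_desired` are illustrative helper names.

```shell
# STATE_DIR defaults to the runtime directory from the proposed file layout.
STATE_DIR="${STATE_DIR:-/run/mode-controller}"

# write_desired <mode>: validate the mode, then replace the file atomically
# by writing a temporary file in the same directory and renaming it.
write_desired() {
  local mode="$1" tmp
  case "$mode" in
    desktop|studio-local|compute) ;;
    *) echo "invalid mode: $mode" >&2; return 1 ;;
  esac
  tmp="$(mktemp "$STATE_DIR/desired.XXXXXX")" || return 1
  printf '%s\n' "$mode" > "$tmp" \
    && chmod 0644 "$tmp" \
    && mv -f "$tmp" "$STATE_DIR/desired"
}

# read_desired: print the requested mode, or fail if none is recorded.
read_desired() {
  [ -r "$STATE_DIR/desired" ] && cat "$STATE_DIR/desired"
}
```

Because the rename happens within one directory, a reader sees either the old value or the new one, never a partially written file.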
9. State Machine
9.1 States
S0: Boot
Initial state before default operating mode is established.
S1: Desktop
Interactive general-purpose mode.
S2: StudioLocal
Strict interactive low-latency local audio profile.
S3: Compute
Headless throughput-oriented mode.
S4: Transitioning
Ephemeral reconciliation state while moving toward desired mode.
S5: FailedTransition
A recoverable error state indicating that desired state was not achieved.
9.2 State Diagram
stateDiagram-v2
[*] --> Boot
Boot --> Desktop : default boot
Desktop --> StudioLocal : request(studio-local)
StudioLocal --> Desktop : request(desktop)
Desktop --> Transitioning : request(compute)
StudioLocal --> Transitioning : request(compute)
Compute --> Transitioning : request(desktop)
Desktop --> Transitioning : request(desktop) / reconcile
StudioLocal --> Transitioning : request(studio-local) / reconcile
Compute --> Transitioning : request(compute) / reconcile
Transitioning --> Desktop : reached(desktop)
Transitioning --> StudioLocal : reached(studio-local)
Transitioning --> Compute : reached(compute)
Transitioning --> FailedTransition : guard_fail / action_fail / timeout
FailedTransition --> Desktop : recover(previous=desktop)
FailedTransition --> StudioLocal : recover(previous=studio-local)
FailedTransition --> Compute : recover(previous=compute)
9.3 Notes
- Direct StudioLocal -> Compute may be allowed only through guarded reconciliation, not blind immediate promotion.
- Reconciliation should be able to handle “already in desired mode” as a no-op success.
- Externalized Studio capability must not require redesign of the host-local state machine; it should only disable or deprecate studio-local usage.
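The edges of the state diagram can be written down as a transition function, which makes it easy to reject events the diagram does not allow. In this sketch the event encoding (`request:<mode>`, `reached:<mode>`, `recover:<mode>`, `guard_fail`, `action_fail`, `timeout`, `default_boot`) and the function name are assumptions made for illustration.

```shell
# next_state <state> <event>
# Prints the next state, or returns 1 when the diagram has no such edge.
next_state() {
  local state="$1" event="$2"
  case "$state/$event" in
    Boot/default_boot) echo Desktop ;;
    Desktop/request:studio-local) echo StudioLocal ;;
    StudioLocal/request:desktop) echo Desktop ;;
    Desktop/request:compute|StudioLocal/request:compute) echo Transitioning ;;
    Compute/request:desktop) echo Transitioning ;;
    # Re-requesting the current mode still goes through reconciliation.
    Desktop/request:desktop|StudioLocal/request:studio-local|Compute/request:compute)
      echo Transitioning ;;
    Transitioning/reached:desktop|FailedTransition/recover:desktop) echo Desktop ;;
    Transitioning/reached:studio-local|FailedTransition/recover:studio-local)
      echo StudioLocal ;;
    Transitioning/reached:compute|FailedTransition/recover:compute) echo Compute ;;
    Transitioning/guard_fail|Transitioning/action_fail|Transitioning/timeout)
      echo FailedTransition ;;
    *) return 1 ;;
  esac
}
```

Only edges drawn in the diagram are encoded, so `next_state Compute request:studio-local` fails here even though a Compute -> StudioLocal guard policy is listed elsewhere in this document.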
10. Guards
Guards are explicit check functions. They return exit codes and optionally structured diagnostics.
10.1 Guard Interface
Each guard function should follow a predictable interface:
check_<name>
exit 0 = pass
exit 10+ = policy failure / guard blocked
exit 20+ = check execution error / indeterminate
Structured output should ideally emit JSON or key=value diagnostics to stdout/stderr for logs.
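A caller only needs to distinguish three outcomes from the exit-code ranges above. The following sketch shows one way to classify them; the function names are hypothetical, and codes 1 to 9, which the interface leaves unspecified, are treated as errors here.

```shell
# classify_guard_exit <exit_code>: map a guard exit code to pass/blocked/error.
classify_guard_exit() {
  local code="$1"
  if [ "$code" -eq 0 ]; then
    echo pass
  elif [ "$code" -ge 10 ] && [ "$code" -lt 20 ]; then
    echo blocked
  else
    # 20+ per the interface; 1-9 are unspecified and treated as errors.
    echo error
  fi
}

# run_guard <guard command...>: run a guard, print "<name> <classification>",
# and pass the guard's own exit code back to the caller.
run_guard() {
  local rc=0
  "$@" || rc=$?
  echo "$1 $(classify_guard_exit "$rc")"
  return "$rc"
}
```

Keeping "blocked" separate from "error" lets the controller tell a policy decision apart from a guard that could not inspect the system.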
10.2 Guard Set
G1: check_audio_idle
Purpose:
- verify no active low-latency local audio session that would make compute transition unsafe
Possible checks:
- no active REAPER process
- no active PipeWire/JACK graph beyond baseline
Exit codes:
- 0 pass
- 10 audio active
- 20 unable to inspect audio graph
G2: check_gpu_display_released
Purpose:
- verify display/compositor has released GPU before compute promotion
Possible checks:
- no active Hyprland session
- no relevant graphical GPU consumers
Exit codes:
- 0 pass
- 11 display GPU still owned by GUI
- 21 GPU inspection failure
G3: check_cpu_load_safe
Purpose:
- ensure transition is not occurring during obviously unsafe heavy local activity when policy requires quieting first
Exit codes:
- 0 pass
- 12 CPU load too high
- 22 unable to inspect load
G4: check_user_jobs_safe
Purpose:
- detect known long-running interactive/user jobs that should block auto-transition
Possible checks:
- selected process patterns
- optional allowlist/denylist
Exit codes:
- 0 pass
- 13 user jobs active
- 23 inspection failure
G5: check_memory_headroom
Purpose:
- ensure sufficient memory exists to perform transition or launch target services
Exit codes:
- 0 pass
- 14 insufficient headroom
- 24 inspection failure
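As an illustration of the guard interface, G5 could be sketched against the `MemAvailable` field of `/proc/meminfo`. The two configuration variables and the 8 GiB default are placeholders chosen for this example; the document does not specify a threshold or a data source.

```shell
# Assumed inputs (not defined in the design document):
#   MEMINFO_FILE     path to a /proc/meminfo-style file
#   MIN_HEADROOM_KB  minimum MemAvailable, in kB, required to pass
MEMINFO_FILE="${MEMINFO_FILE:-/proc/meminfo}"
MIN_HEADROOM_KB="${MIN_HEADROOM_KB:-8388608}"  # 8 GiB, placeholder value

check_memory_headroom() {
  local avail
  avail="$(awk '/^MemAvailable:/ { print $2; exit }' "$MEMINFO_FILE" 2>/dev/null)"
  case "$avail" in
    ''|*[!0-9]*)
      # Missing or non-numeric value: the check itself could not run.
      echo "guard=check_memory_headroom result=error reason=cannot_read_meminfo"
      return 24 ;;
  esac
  if [ "$avail" -lt "$MIN_HEADROOM_KB" ]; then
    echo "guard=check_memory_headroom result=blocked available_kb=$avail required_kb=$MIN_HEADROOM_KB"
    return 14
  fi
  echo "guard=check_memory_headroom result=pass available_kb=$avail"
}
```

The key=value lines go to stdout so they can be captured in the transition log alongside the exit code.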
G6: check_vllm_drainable
Purpose:
- ensure compute workloads can be safely reduced when returning to Desktop/Studio-Local
Exit codes:
- 0 pass
- 15 compute workload not drainable
- 25 inspection failure
G7: check_studio_capability_local
Purpose:
- verify that local Studio capability is still available on the NixOS host before allowing studio-local
Possible checks:
- local policy flag indicates studio capability still hosted locally
- local audio stack and workflow prerequisites are not intentionally disabled due to externalization
Exit codes:
- 0 pass
- 19 requested local studio capability not available
- 29 inspection failure
10.3 Guard Policy by Transition
| Transition | Required Guards |
|---|---|
| Desktop -> StudioLocal | check_target_reachable, check_studio_capability_local, check_user_jobs_safe (optional policy), compute downscale checks |
| StudioLocal -> Desktop | check_target_reachable |
| Desktop -> Compute | check_target_reachable, check_audio_idle, check_gpu_display_released, check_cpu_load_safe, check_user_jobs_safe, check_memory_headroom |
| StudioLocal -> Compute | check_target_reachable, check_audio_idle, check_gpu_display_released, check_cpu_load_safe, check_user_jobs_safe, check_memory_headroom |
| Compute -> Desktop | check_target_reachable, check_vllm_drainable, check_memory_headroom |
| Compute -> StudioLocal | check_target_reachable, check_studio_capability_local, check_vllm_drainable, check_memory_headroom |
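The policy table maps directly onto a lookup that the controller can iterate over. This sketch uses lower-case mode names and hypothetical function names; it omits the optional `check_user_jobs_safe` and the unnamed "compute downscale checks" for Desktop -> StudioLocal, and it returns 20 for an unlisted transition as an assumed "indeterminate" result.

```shell
# guards_for_transition <from> <to>: print the required guards, in order.
guards_for_transition() {
  case "$1->$2" in
    "desktop->studio-local")
      echo "check_target_reachable check_studio_capability_local" ;;
    "studio-local->desktop")
      echo "check_target_reachable" ;;
    "desktop->compute"|"studio-local->compute")
      echo "check_target_reachable check_audio_idle check_gpu_display_released check_cpu_load_safe check_user_jobs_safe check_memory_headroom" ;;
    "compute->desktop")
      echo "check_target_reachable check_vllm_drainable check_memory_headroom" ;;
    "compute->studio-local")
      echo "check_target_reachable check_studio_capability_local check_vllm_drainable check_memory_headroom" ;;
    *) return 1 ;;
  esac
}

# run_guards_for <from> <to>: run each guard, stopping at the first failure
# and returning that guard's exit code.
run_guards_for() {
  local guard guards rc
  guards="$(guards_for_transition "$1" "$2")" || return 20
  for guard in $guards; do
    rc=0
    "$guard" || rc=$?
    [ "$rc" -eq 0 ] || { echo "guard=$guard exit=$rc" >&2; return "$rc"; }
  done
}
```

Each guard name is expected to resolve to a function or executable that follows the guard interface described in 10.1.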
11. Actions and Transition Semantics
Actions are the concrete operations used to move from one state to another.
11.1 Action Vocabulary
- stop/terminate GUI session
- isolate a target
- stop/start units
- wait for quiescence
- update desired/current state files
- restart services with different environment/policies
11.2 Action Interface
Each action should return:
- 0 success
- non-zero failure with logged reason
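One way to guarantee the "logged reason" half of this contract is to route every action through a thin wrapper. The sketch below is illustrative; `run_action` is not an existing Dubnium helper.

```shell
# run_action <name> <command...>
# Runs the command, logs a key=value failure line to stderr on non-zero exit,
# and returns the command's own exit code.
run_action() {
  local name="$1" rc=0
  shift
  "$@" || rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "action=$name result=failed exit=$rc command=$*" >&2
  fi
  return "$rc"
}
```

For example, `run_action stop-vllm systemctl stop vllm.service` would leave a log line naming the action if the stop fails.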
12. Exact Transition Mapping to systemd Operations
This is the implementation-oriented mapping.
12.1 Assumptions
Systemd targets:
- desktop.target
- compute.target
studio-local is intentionally not a first-class target in v1. It is represented
as a desktop overlay through studio-local-policy.service and
audio-priority.service.
Supporting services:
- mode-controller.service
- vllm.service
- k3s.service
- pipewire.service / user session services
- graphical session manager or direct Hyprland session
Helper oneshot services/scripts:
- mode-prepare-compute.service
- mode-prepare-desktop.service
- mode-prepare-studio-local.service
- mode-observe.service
12.2 Desktop -> StudioLocal
Desired change
- desired mode file = studio-local
systemd operations
- systemctl start mode-controller.service (with target=studio-local)
- controller runs guard set for Desktop -> StudioLocal
- controller verifies local Studio capability still exists
- controller stops or constrains AI workloads as needed
  - v1 policy: systemctl stop vllm.service
- controller isolates or verifies desktop.target
- controller starts studio-local-policy.service
- controller starts audio-priority.service
- controller updates current state observation
Example exact operations
printf 'studio-local\n' > /run/mode-controller/desired   # record desired mode (needs root)
systemctl start mode-controller@studio-local.service
systemctl stop vllm.service
systemctl isolate desktop.target
systemctl start studio-local-policy.service
systemctl start audio-priority.service
12.3 StudioLocal -> Desktop
Desired change
- desired mode file = desktop
systemd operations
- write desired state
- start controller
- restore normal interactive policies
- optionally allow bounded AI services
- stop audio-priority.service
- stop studio-local-policy.service
- systemctl isolate desktop.target
- update current observation
Example exact operations
printf 'desktop\n' > /run/mode-controller/desired   # record desired mode (needs root)
systemctl start mode-controller@desktop.service
systemctl stop audio-priority.service
systemctl stop studio-local-policy.service
systemctl isolate desktop.target
12.4 Desktop -> Compute
Desired change
- desired mode file = compute
systemd operations
- write desired state
- start controller for compute
- run guards:
- check_target_reachable
- check_audio_idle
- check_gpu_display_released (or prepare to release)
- check_cpu_load_safe
- check_user_jobs_safe
- check_memory_headroom
- if interactive session exists, controller requests/forces session termination
loginctl terminate-session <id>
- wait until compositor releases GPU
- stop or de-prioritize audio services if needed
- stop desktop-specific services not wanted in compute
- set service environment/profile for dual-GPU vLLM
- systemctl isolate compute.target
- start/restart vllm.service
- verify current state
Example exact operations
printf 'compute\n' > /run/mode-controller/desired   # record desired mode (needs root)
systemctl start mode-controller@compute.service
loginctl terminate-session <desktop-session>
systemctl stop graphical-session.target # if such target exists in design
systemctl isolate compute.target
systemctl restart vllm.service
12.5 Compute -> Desktop
Desired change
- desired mode file = desktop
systemd operations
- write desired state
- start controller for desktop
- run guards:
- check_target_reachable
- check_vllm_drainable
- check_memory_headroom
- drain/stop or downscale vLLM
- constrain compute workloads
- systemctl isolate desktop.target
- start GUI path
- ensure GPU0 reserved for display
- start/restore audio path
- verify current state
Example exact operations
printf 'desktop\n' > /run/mode-controller/desired   # record desired mode (needs root)
systemctl start mode-controller@desktop.service
systemctl stop vllm.service # or restart single-GPU profile
systemctl isolate desktop.target
12.6 StudioLocal -> Compute
Two possible policies:
Policy A — direct guarded transition
Allowed if all compute guards pass and Studio-Local resources are cleanly relinquished.
Policy B — normalize through Desktop first
Transition path:
studio-local -> desktop -> compute
Recommendation: Use Policy A in implementation, but conceptually treat it as the same reconciliation pipeline with stricter guards.
13. Reconciliation Model
13.1 Motivation
A single mode request compute command should not blindly assume success. The system should:
- record desired mode
- observe current state
- compare desired vs current
- compute required transition plan
- execute actions
- re-observe
- either declare success or enter failed transition state
13.2 Reconciliation Loop
flowchart TD
Req[Request mode] --> Write[Write desired state]
Write --> Observe[Observe current state]
Observe --> Compare{Desired == Current?}
Compare -->|Yes| Done[No-op success]
Compare -->|No| Plan[Select transition plan]
Plan --> Guards[Run guards]
Guards -->|Fail| Fail[Record failure]
Guards -->|Pass| Act[Execute actions]
Act --> Reobserve[Observe current state again]
Reobserve --> Verify{Reached desired?}
Verify -->|Yes| Success[Record success]
Verify -->|No| RetryOrFail[Retry boundedly or fail]
13.3 Reconciliation Semantics
- bounded retries only
- no infinite loops
- every failure is logged with:
- desired state
- prior state
- failing guard or action
- timestamp
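The loop and its semantics can be sketched as a single function with a fixed retry bound and a failure record. This is an outline under stated assumptions: `observe_mode`, `run_guards`, and `apply_mode` are hypothetical caller-supplied functions, `MAX_ATTEMPTS` is a placeholder bound, and the JSON is assembled with `printf`, which is only safe because the fields hold fixed mode names.

```shell
STATE_DIR="${STATE_DIR:-/run/mode-controller}"
MAX_ATTEMPTS="${MAX_ATTEMPTS:-3}"   # placeholder retry bound

# record_transition <result> <desired> <prior> <failed step or "">
record_transition() {
  printf '{"result":"%s","desired":"%s","prior":"%s","failed_step":"%s","timestamp":"%s"}\n' \
    "$1" "$2" "$3" "$4" "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    > "$STATE_DIR/last-transition.json"
}

# reconcile <desired>
# Expects the caller to define:
#   observe_mode            prints the observed current mode
#   run_guards <from> <to>  returns non-zero when a guard blocks
#   apply_mode <to>         performs the transition actions
reconcile() {
  local desired="$1" prior current attempt
  prior="$(observe_mode)"
  if [ "$prior" = "$desired" ]; then
    record_transition success "$desired" "$prior" ""   # no-op success
    return 0
  fi
  if ! run_guards "$prior" "$desired"; then
    record_transition failed "$desired" "$prior" guards
    return 1
  fi
  for attempt in $(seq 1 "$MAX_ATTEMPTS"); do
    apply_mode "$desired" || true      # judge success by re-observation
    current="$(observe_mode)"
    if [ "$current" = "$desired" ]; then
      record_transition success "$desired" "$prior" ""
      return 0
    fi
  done
  record_transition failed "$desired" "$prior" actions
  return 1
}
```

Success is decided only by re-observing the system, never by the exit code of the actions, which matches the rule that observed state is not rewritten to match intent.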
13.4 Why This Matters
This lets you support:
- manual requests
- idle-triggered auto-switching
- boot-time default mode
- recovery after partial failures
all through one mechanism.
14. Specialisations vs Runtime Switching
This is the main architectural fork.
14.1 Option A — Runtime Switching Only
Use one host definition with multiple systemd targets and runtime policies.
Pros
- fast transitions
- no reboot required
- best UX for switching between Desktop and Studio-Local
- simpler for day-to-day operation
Cons
- weaker isolation
- harder to fully guarantee all services/resources are cleanly re-bound
- risk of state leakage between modes
- some kernel/driver tuning differences are awkward live
Best fit
- Desktop <-> Studio-Local
- Desktop <-> Compute where flexibility matters more than hard isolation
14.2 Option B — NixOS Specialisations Only
Use separate NixOS specialisations for Desktop and Compute (and possibly Studio-Local).
Pros
- stronger isolation between role profiles
- easier to vary deeper system settings, kernel params, service sets
- clearer recovery story
- closer to “logical separate machines”
Cons
- slower transitions, often reboot-oriented in practice
- poorer UX for frequent switching
- more configuration duplication risk if not structured well
Best fit
- Desktop vs Compute if you want very strong separation
- not ideal for rapid Studio-Local toggling
14.3 Option C — Hybrid Model
Use:
- runtime switching for Desktop <-> Studio-Local
- specialisation boundary between Interactive and Compute families
Example:
- default specialisation = interactive
- runtime modes inside it: desktop, studio-local
- compute specialisation = headless compute
Pros
- strongest overall architecture
- preserves good UX for Studio-Local transitions
- lets Compute differ more deeply if needed
- handles future externalization of Studio more cleanly than treating Studio as a permanent top-level host identity
Cons
- more design complexity
- transition from interactive to compute may become reboot-oriented or at least heavier
- more machinery to maintain
14.4 Recommendation
For your current goal, use runtime switching first, with the design shaped so it can later evolve into a hybrid model.
Reasoning
- you need to learn actual contention boundaries first
- Desktop <-> Studio-Local benefits heavily from live switching
- Desktop <-> Compute can start as runtime-switched
- if the system proves too “sticky” or leaky, you can later promote Compute into a specialisation without redesigning the higher-level state machine
- if Studio moves to a Mac mini, the host-local model remains intact
Practical recommendation
Phase the design like this:
- Phase 1: one host, runtime switching only
- Phase 2: strong slices/targets/guards
- Phase 3: evaluate whether Compute should become a specialisation
- Phase 4: if Studio is externalized, deprecate or disable studio-local without changing the operator-facing control model
This preserves velocity while keeping the abstraction clean.
15. Service Placement
15.1 Host-Level Services
- Hyprland
- PipeWire
- Reaper
- NVIDIA drivers/runtime
- mode controller
- possibly vLLM initially
- SSH / system services
15.2 k3s-Level Services
- Hyperion services
- platform/orchestration services
- dashboards and supporting workloads
- possibly model-serving abstractions later
First-pass implementation note
In v1, prefer keeping k3s.service continuously available while varying:
- platform.slice resource budgets
- which workloads are allowed to run aggressively
- how much local compute capacity cluster workloads may consume
This is preferable to stopping and starting the cluster runtime during ordinary mode transitions.
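Varying the slice budget while leaving `k3s.service` running could look like the sketch below. The weights are invented placeholders, not values from the Dubnium configuration, and the function names are hypothetical; `CPUWeight` and `systemctl set-property --runtime` are standard systemd features.

```shell
# platform_cpu_weight <mode>: print an illustrative CPUWeight for platform.slice.
platform_cpu_weight() {
  case "$1" in
    studio-local) echo 20 ;;   # keep cluster work well out of the way
    desktop) echo 50 ;;        # bounded share next to interactive use
    compute) echo 200 ;;       # let cluster workloads run aggressively
    *) return 1 ;;
  esac
}

# apply_platform_budget <mode>: adjust the slice without restarting k3s.service.
# --runtime keeps the change out of persistent configuration.
apply_platform_budget() {
  local weight
  weight="$(platform_cpu_weight "$1")" || return 1
  systemctl set-property --runtime platform.slice "CPUWeight=$weight"
}
```

Steady-state budgets would more naturally live in the Nix configuration; a runtime override like this only suits the per-mode adjustment step.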
15.3 Externalized Services (Possible Future)
- Studio/Audio workflows on Mac mini
- DAW/plugin-heavy sessions
- live audio interfaces and controllers
15.4 Recommendation
Keep hardware-near, latency-sensitive, and GPU-debug-sensitive components on the host first. Move services into k3s only after the host-level mode model is stable. Treat Mac mini externalization as a placement decision, not as a redesign trigger for the host-local state machine.
16. Idle Detection Policy
16.1 Role of Idle Detection
Idle detection is an input signal to the reconciler, not authority on its own.
16.2 Signals
- input inactivity
- audio activity
- GPU utilization / ownership
- CPU load
- selected user-job checks
16.3 Policy
Idle-triggered promotion to Compute should:
- update desired state to compute
- run the normal reconciliation pipeline
- abort safely if guards fail
It must never bypass guards.
16.4 Studio-Local Policy
Auto-promotion from studio-local to compute should generally be disabled unless explicitly requested. This remains true even if Studio capability later moves off-box.
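The idle policy amounts to a small gate in front of the normal request path. In this sketch the function name and the second argument, an operator opt-in for promotion out of studio-local, are assumptions made for illustration.

```shell
# idle_may_promote <current_mode> <allow_from_studio: yes|no>
# Succeeds when an idle signal is allowed to request compute.
idle_may_promote() {
  case "$1" in
    desktop) return 0 ;;
    studio-local) [ "${2:-no}" = yes ] ;;
    *) return 1 ;;  # already compute, or state unknown: nothing to promote
  esac
}
```

When the gate succeeds, the caller would write compute as the desired state and run the normal reconciliation pipeline, guards included. The gate decides only whether a request may be made, never whether the transition is safe.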
17. Security Boundaries
Zones
- user desktop zone
- system service zone
- AI workload zone
- cluster service zone
- optional external Studio zone
Controls
- bind services to appropriate interfaces
- keep secrets outside dotfiles, e.g. SOPS/agenix
- keep mode control operations privileged and auditable
- do not let externalized capability assumptions silently weaken host-local controls
18. Risks and Failure Modes
18.1 Audio Degradation
Cause:
- background contention
Mitigation:
- Studio-Local invariants
- strict guard/action policy
18.2 GPU Contention
Cause:
- compositor and AI workloads racing for ownership
Mitigation:
- explicit GPU ownership model
- guard checks before Compute promotion
18.3 Partial Transition
Cause:
- GUI exits but vLLM fails to restart
- desired state written but current state never converges
Mitigation:
- reconciliation loop
- bounded retries
- failed-transition state
18.4 Configuration Drift
Cause:
- policy split across ad hoc scripts and dotfiles
Mitigation:
- keep mode policy in Nix + systemd-controlled scripts
18.5 Capability Drift
Cause:
- Studio capability moved to Mac mini, but local state machine or guards still assume it is local
Mitigation:
- explicit capability placement model
- check_studio_capability_local
- ADR-backed deprecation path for studio-local
19. Open Questions
- Should vLLM be host-managed or profile-switched through separate unit templates?
- When should Compute graduate into a NixOS specialisation?
- How strict should auto-transition be about user jobs and unsaved work heuristics?
- Should current state be derived on demand only, or also cached to /run/mode-controller/current?
- At what point should local Studio capability be considered officially externalized to a Mac mini?
- What data/project sync model is required if Studio is split across machines?
19.1 Resolved Near-Term Decision
For v1:
- studio-local is not a first-class target
- studio-local is represented as a protected interactive policy overlay on desktop
- desktop and compute are the only first-class top-level target families
This keeps the first implementation smaller while preserving the higher-level operational model and leaving room to strengthen Studio semantics later if needed.
19.2 Future Alternatives
Alternative A — Keep studio-local as an overlay permanently
Pros:
- less target duplication
- easier future deprecation if Studio moves to a Mac mini
- simpler runtime switching model
Cons:
- weaker systemd-level separability
- more policy encoded in helper units and controller logic
Alternative B — Promote studio-local into a first-class target later
Pros:
- stronger explicitness in systemd
- easier inspection of Studio-specific dependencies
- potentially clearer resource-policy boundaries
Cons:
- higher maintenance cost
- more duplication with desktop
- less aligned with the likely future externalization path
Recommendation
Start with the overlay model. Revisit only if empirical evidence shows that audio-protection policy is too hard to express or validate without a dedicated target.
19.3 Resolved Near-Term Decision — vLLM Service Shape
Target architecture:
- vllm@desktop.service
- vllm@compute.service
However, for the first implementation pass, a single vllm.service is acceptable if:
- desktop and compute profiles are still modeled explicitly in configuration
- controller actions remain profile-aware
- observation logic can still determine which profile is active
This allows the first bootable milestone to stay small without locking the architecture into a monolithic service model.
19.4 Resolved Near-Term Decision — k3s Service Shape
For v1:
- k3s.service should remain stable across host-local modes
- mode differences should be expressed through:
- slice/resource budgets
- workload-placement or workload-intensity policy
- optional node labels/taints later
This keeps the control plane smaller and avoids coupling every host-mode transition to cluster-runtime teardown and recovery.
Future alternative
If empirical operation shows that stable-across-modes k3s still creates unacceptable interference or ambiguity, stronger k3s mode switching can be introduced later. That should be treated as a deliberate escalation, not the default starting point.
19.5 Resolved Near-Term Decision — Desktop AI Policy
For v1:
- keep vLLM off in desktop for the first convergence milestone
- prove desktop ↔ compute transitions before enabling bounded desktop-mode AI
Future alternative
After the control plane is reliable, bounded desktop-mode AI may be introduced as an explicit profile with clear GPU1 ownership and resource limits.
19.6 Resolved Near-Term Decision — studio-local Overlay Shape
For v1, represent studio-local with:
- studio-local-policy.service
- audio-priority.service
This gives the controller and observation logic a clear marker plus an explicit enforcement unit without promoting Studio into a first-class top-level target.
Future alternative
If this proves too implicit, studio-local can later be promoted into a stronger grouped target or target-like overlay.
19.7 Resolved Near-Term Decision — Capability Placement Source
For v1, capability-placement.json should be generated from Nix configuration rather than edited ad hoc at runtime.
Rationale
- keeps placement policy reproducible
- avoids silent runtime drift
- matches the design goal that machine-critical policy remain inspectable in Nix and systemd-managed artifacts
Future alternative
If operational experimentation later requires it, an explicit runtime override layer may be added with well-defined precedence and auditability.
19.8 Resolved Near-Term Decision — mode force
For v1, defer mode force.
Rationale
- keeps attention on making the ordinary reconciliation path correct
- avoids masking immature guard or transition logic
- reduces the chance of bypassing safety boundaries during initial bring-up
Future alternative
Add mode force later only after hard-vs-soft guard semantics are stable and well tested.
19.9 Resolved Near-Term Decision — GUI Teardown Semantics
For v1, compute promotion should require:
- graphical session absence
- explicit GPU-release verification
It should not initially depend on forcibly stopping every greeter or display-manager path unless empirical testing shows those components interfere with reliable GPU handoff.
19.10 Resolved Near-Term Decision — Desktop Target Ownership
For v1, desktop.target should not directly own the greeter/login path.
Rationale
- keeps mode ownership focused on operational policy rather than full session-manager orchestration
- reduces coupling to whichever login/session stack is chosen
- lets session presence remain an observed fact rather than an aggressively managed requirement
Future alternative
If desktop recovery proves unreliable without tighter control, greeter or display-manager paths can later be pulled under stronger mode ownership.
19.11 Resolved Near-Term Decision — studio-local-policy.service Scope
For v1, studio-local-policy.service should be:
- a reliable marker for observation/classification
- a light policy-application unit
- explicitly limited in scope
It should not become a giant all-in-one Studio behavior controller.
Rationale
- preserves clear observability
- avoids burying controller logic inside a catch-all helper unit
- keeps Studio overlay behavior inspectable and decomposable
19.12 Resolved Near-Term Decision — observe-current Implementation Language
For v1, implement observe-current in shell.
Constraints
- keep the output contract stable:
- plain mode name for shell use
- structured JSON for diagnostics
- structure the implementation so it can later be replaced by a typed helper without changing callers
Future alternative
If classifier complexity or JSON handling becomes unwieldy, replace only the classifier implementation with a small typed helper while keeping the same external contract.
19.13 Resolved Near-Term Decision — mode CLI Packaging
For v1:
- keep the script sources in the repository
- package them in pkgs/
- install them through the NixOS module
Rationale
- keeps the tool packaging clean and testable
- avoids scattering ad hoc scripts directly into module definitions
- preserves a clean path to reuse across hosts later
19.14 Resolved Near-Term Decision — Reconciler Trigger Model
For v1:
- use parameterized oneshot reconciliation only
- do not enable timer-driven or path-triggered background reconciliation yet
Rationale
- keeps failure behavior easier to understand during bring-up
- avoids masking transition bugs behind background retries
- lets manual transitions prove the model first
Future alternative
After manual transitions are reliable, add periodic or path-triggered reconciliation for self-healing behavior.
19.15 Resolved Near-Term Decision — Boot Policy
For v1:
- normalize to desktop on boot
- do not replay persistent desired mode across reboot
Rationale
- gives the system a predictable safe recovery posture
- avoids booting directly back into a problematic compute path while the controller is still maturing
- keeps early operational behavior easier to reason about
Future alternative
Once transitions are reliable, desired-state persistence across reboot can be introduced as an explicit policy feature.
19A. Architectural Decision Record — Potential Studio Externalization
Context
There is a realistic possibility that low-latency Studio/Audio workloads will migrate from the NixOS machine to a Mac mini.
Decision
The NixOS host architecture should treat Studio as a conditional local profile (studio-local) rather than a permanently central host mode.
Consequences
- the host-local state machine remains stable if Studio moves off-box
- Compute and Desktop remain the durable primary host-local modes
- Studio capability can be represented separately through workload placement decisions
- local audio support can still exist now without overcommitting the architecture to a permanent local Studio role
Follow-on Design Implications
- add check_studio_capability_local guard for any studio-local transition
- keep local audio policy isolated from core Compute/Desktop mechanics where practical
- document future sync, control, and workflow boundaries if Studio becomes externalized
20. Control Interface and Implementation Contract
20.1 mode CLI Contract
The system should expose a single operator-facing interface:
mode status
mode request <desktop|studio-local|compute>
mode reconcile
mode current
mode desired
mode explain <desktop|studio-local|compute>
mode dry-run <desktop|studio-local|compute>
mode force <desktop|studio-local|compute>
Command Semantics
mode status
Returns:
- desired mode
- observed current mode
- whether reconciliation is needed
- last transition result
- blocking guard failures, if any
mode request <mode>
Behavior:
- write desired state
- invoke reconciliation
- return success only if reconciliation converged
mode reconcile
Behavior:
- observe current state
- compare to desired
- select transition plan
- run guards
- execute actions
- record results
mode current
Returns only the observed current mode.
mode desired
Returns only the desired mode file contents.
mode explain <mode>
Prints:
- target state properties
- expected services
- resource ownership rules
- guards required for entering that mode
- capability placement assumptions, where relevant
mode dry-run <mode>
Simulates the full reconciliation plan without mutating state.
mode force <mode>
Privileged path that bypasses selected non-safety guards, but must never bypass hard safety guards such as GPU/display or active audio protections unless explicitly designed to allow that.
Implementation note:
- defer this command in v1
- keep it in the long-term interface contract so the design remains forward-compatible
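The command contract above can be sketched as a small dispatcher. This is a minimal sketch, not the actual tool: the function names is_valid_mode and mode_main are hypothetical, the handlers only print what they would do, return code 18 is borrowed from the "invalid request" guard code, and return code 2 for usage errors is an assumption.

```shell
# Validate a requested mode name against the three modes in the contract.
is_valid_mode() {
    case "$1" in
        desktop|studio-local|compute) return 0 ;;
        *) return 1 ;;
    esac
}

# Dispatch a `mode` subcommand (hypothetical helper). Handlers are stubs that
# print the action; a real implementation would call the controller.
mode_main() {
    local cmd="${1:-}" target="${2:-}"
    case "$cmd" in
        status|reconcile|current|desired)
            echo "would run: $cmd"
            ;;
        request|explain|dry-run)
            if ! is_valid_mode "$target"; then
                echo "invalid mode: ${target:-<none>}" >&2
                return 18   # assumption: reuse the invalid-request guard code
            fi
            echo "would run: $cmd $target"
            ;;
        force)
            # Deferred in v1 per the implementation note.
            echo "mode force is not implemented in v1" >&2
            return 2
            ;;
        *)
            echo "usage: mode <status|request|reconcile|current|desired|explain|dry-run|force> [mode]" >&2
            return 2
            ;;
    esac
}
```

Keeping force in the dispatcher but returning an error preserves the long-term interface while making the v1 deferral visible to operators.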
21. State Storage Layout
21.1 Runtime State Paths
/run/mode-controller/
desired
current
lock
last-transition.json
last-guards.json
reconcile.pid
capability-placement.json
hardware-topology.json
21.2 File Semantics
desired
Contains the requested mode:
- desktop
- studio-local
- compute
current
Cached observation of current state. This is convenience state only; it must be derivable from system facts.
lock
Used to serialize reconciliation so only one transition runs at a time.
last-transition.json
Stores:
- requested mode
- prior observed mode
- final observed mode
- success/failure
- guard results
- action results
- timestamps
last-guards.json
Stores latest guard results for diagnostics.
capability-placement.json
Stores environment-level placement facts, for example:
- studio: local
- studio: external-mac-mini
This file is not the host-local mode source of truth. It is an environment metadata input used by guards and planning logic.
hardware-topology.json
Stores the currently configured hardware view, for example:
- planned GPU count
- currently present GPU indexes
- display GPU assignment
- desktop-mode AI GPU set
- compute-mode AI GPU set
This allows the implementation to preserve the intended dual-GPU architecture while remaining tolerant of temporary single-GPU bring-up phases.
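Reading and writing the desired file can be sketched as follows. This is an illustrative sketch under stated assumptions: the helper names write_desired and read_desired, the MODE_STATE_DIR override, and the return codes are hypothetical and not part of the layout above.

```shell
# State directory; defaults to the path from 21.1. The MODE_STATE_DIR override
# is an assumption added so the sketch can be exercised without root.
STATE_DIR="${MODE_STATE_DIR:-/run/mode-controller}"

# Write the desired mode atomically: write a temp file in the same directory,
# then rename it over the target so readers never see a partial write.
write_desired() {
    local mode="$1" tmp
    case "$mode" in
        desktop|studio-local|compute) ;;
        *) echo "invalid mode: $mode" >&2; return 18 ;;
    esac
    mkdir -p "$STATE_DIR" || return 20
    tmp="$STATE_DIR/desired.tmp.$$"
    printf '%s\n' "$mode" > "$tmp" && mv -f "$tmp" "$STATE_DIR/desired"
}

# Read the desired mode; print nothing and return 1 if the file is missing.
read_desired() {
    [ -r "$STATE_DIR/desired" ] || return 1
    cat "$STATE_DIR/desired"
}
```

Validating before writing means an invalid request never replaces the previous desired value.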
22. systemd Unit and Target Layout
22.1 Targets
desktop.target
Wants:
- graphical-session target path
- bounded interactive services
- optional constrained AI services
First-pass implementation note:
- do not make desktop.target directly own greeter/login-manager startup in v1
- treat graphical session presence as an observed runtime fact
- strengthen ownership later only if empirical recovery behavior requires it
compute.target
Wants:
- headless service profile
- vLLM compute profile
- k3s compute-allowed policy/profile
22.2 Core Services
mode-controller@.service
Parameterized oneshot service.
Instance values:
- mode-controller@desktop.service
- mode-controller@studio-local.service
- mode-controller@compute.service
Responsibilities:
- load desired mode
- observe current mode
- run reconciliation
- update state files and logs
First-pass implementation note:
- use this parameterized oneshot service as the sole reconciler trigger in v1
- defer timer/path-triggered background reconciliation until manual operation is proven reliable
mode-observe.service
Optional oneshot helper to compute observed current mode and refresh /run/mode-controller/current.
vllm@.service
Optional templated service for profile-specific operation:
- vllm@desktop.service
- vllm@studio-local.service
- vllm@compute.service
Alternative:
- single vllm.service with environment file switching
First-pass implementation guidance:
- prefer separate desktop and compute profiles conceptually
- studio-local should not require its own dedicated vLLM unit in v1 if Studio is implemented as a desktop overlay
- a single vllm.service is acceptable initially if it preserves a clean migration path to templated units later
- keep desktop-mode vLLM disabled for the first transition-proof milestone
mode-guard@.service
Optional wrapper pattern for reusable guard execution, though plain scripts may be simpler initially.
studio-local overlay units
Recommended first-pass representation:
- audio-priority.service
- studio-local-policy.service
- optional environment/policy file consumed by observation and guard logic
These units should layer on top of desktop.target rather than replacing it with a distinct top-level target in v1.
Recommended scope for studio-local-policy.service:
- expose a clear mode marker
- apply only light, explicit Studio-specific policy
- delegate heavyweight orchestration to the controller or dedicated helper units
22.3 Suggested Slice Layout
system.slice
├── interactive.slice
│ ├── graphical-session scope/services
│ ├── audio-related helpers
│ └── bounded desktop workloads
├── ai.slice
│ ├── vllm service
│ └── AI helpers
└── platform.slice
├── k3s service
└── supporting infra services
Slice Intent
- interactive.slice gets priority and headroom in Desktop/Studio-Local
- ai.slice is heavily constrained in Studio-Local, moderately constrained in Desktop, relaxed in Compute
- platform.slice remains comparatively stable but may have tighter resource budgets in interactive modes and relaxed budgets in Compute
23. Current State Observation Logic
Current state must be observed, not assumed.
23.1 Observation Inputs
GUI Indicators
- graphical.target or session-specific equivalent active
- active user session via loginctl
- Hyprland process/session present
Audio Indicators
- PipeWire user service active
- active audio clients or REAPER process
- optional JACK graph activity
AI Indicators
- vllm*.service active
- environment/profile indicates single-GPU or dual-GPU mode
- optional nvidia-smi-based observation of active GPU usage
Platform Indicators
- k3s.service active
- optional workload-class indicators
23.2 Observation Heuristic
Observed mode should be derived using a deterministic classifier.
Proposed classifier logic
Observe compute
If all of the following are true:
- no active graphical session
- compute target active or compute service profile active
- vLLM compute profile active or both GPUs assigned to AI policy
Then observed current mode = compute
Observe studio-local
If all of the following are true:
- graphical session active
- audio stack active
- studio-local policy marker active
- AI profile disabled or highly constrained
Then observed current mode = studio-local
Observe desktop
If all of the following are true:
- graphical session active
- desktop policy marker active
- no studio-local policy marker
Then observed current mode = desktop
Observe transitioning
If:
- desired != inferred stable mode
- controller is running or lock file exists
Then observed current mode = transitioning
Observe failed-transition
If:
- last transition failed
- current does not match desired
- no controller currently reconciling
Then observed current mode = failed-transition
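The classifier rules above can be sketched as a pure function over already-collected facts. This is a simplified sketch, not the real observe-current: the function name classify_mode and its argument order are hypothetical, the compute conditions are collapsed into a single compute-profile fact, the "AI profile disabled or highly constrained" condition is omitted, and the "unknown" result for unmatched facts is an assumption.

```shell
# classify_mode GUI AUDIO STUDIO_MARKER DESKTOP_MARKER COMPUTE_PROFILE \
#               RECONCILING LAST_FAILED DESIRED
# Each fact is "yes" or "no"; DESIRED is a mode name.
classify_mode() {
    local gui="$1" audio="$2" studio="$3" desktop="$4" compute="$5"
    local reconciling="$6" last_failed="$7" desired="$8"
    local stable="unknown"

    # Infer the stable mode from the three "Observe ..." rules.
    if [ "$gui" = no ] && [ "$compute" = yes ]; then
        stable="compute"
    elif [ "$gui" = yes ] && [ "$audio" = yes ] && [ "$studio" = yes ]; then
        stable="studio-local"
    elif [ "$gui" = yes ] && [ "$desktop" = yes ] && [ "$studio" = no ]; then
        stable="desktop"
    fi

    # A mismatch with desired is either in progress or a recorded failure.
    if [ "$stable" != "$desired" ]; then
        if [ "$reconciling" = yes ]; then
            echo "transitioning"
            return 0
        fi
        if [ "$last_failed" = yes ]; then
            echo "failed-transition"
            return 0
        fi
    fi
    echo "$stable"
}
```

Separating fact collection (loginctl, systemctl, marker units) from classification keeps the decision logic deterministic and testable without a live system.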
23.3 Recommendation
Use a small classifier script:
/usr/local/libexec/mode-controller/observe-current
Outputs:
- plain mode name for shell use
- optional JSON with evidence for debugging
First-pass implementation note:
- implement this in shell first
- preserve a stable output contract so the implementation language can change later without changing the control plane
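The two output forms can be sketched as a single printing helper. The function name print_observation, the format argument, and the JSON field names are assumptions for illustration; the contract above only requires a plain mode name plus optional JSON evidence.

```shell
# print_observation MODE [FORMAT] [GUI] [STUDIO_MARKER]
# FORMAT is "plain" (default, for shell callers) or "json" (for diagnostics).
# Evidence values are assumed to be simple tokens that need no JSON escaping.
print_observation() {
    local mode="$1" format="${2:-plain}" gui="${3:-unknown}" studio="${4:-unknown}"
    case "$format" in
        plain)
            printf '%s\n' "$mode"
            ;;
        json)
            printf '{"mode":"%s","evidence":{"graphical_session":"%s","studio_marker":"%s"}}\n' \
                "$mode" "$gui" "$studio"
            ;;
        *)
            echo "unknown format: $format" >&2
            return 2
            ;;
    esac
}
```

Callers that only need the mode name never have to parse JSON, which is what allows the classifier to be replaced later without changing them.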
24. Guard Function Contract
24.1 Guard Naming
check_audio_idle
check_gpu_display_released
check_cpu_load_safe
check_user_jobs_safe
check_memory_headroom
check_vllm_drainable
check_graphical_session_absent
check_graphical_session_present
check_target_reachable
check_studio_capability_local
24.2 Exit Code Convention
0 pass
10 policy block: audio active
11 policy block: display GPU still owned
12 policy block: CPU load too high
13 policy block: user jobs active
14 policy block: insufficient memory headroom
15 policy block: vLLM not drainable
16 policy block: graphical session absent when required
17 policy block: graphical session present when forbidden
18 policy block: target unreachable / invalid request
19 policy block: requested local studio capability not available
20-29 execution/inspection errors
30+ internal controller misuse
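A caller can interpret these codes with two small lookups. This is a sketch: the function names guard_code_class and guard_code_reason are hypothetical, and it reads the execution/inspection range as 20 through 29 so that it does not overlap the 30+ range.

```shell
# guard_code_class CODE
# Prints pass | policy-block | execution-error | controller-misuse | unknown.
guard_code_class() {
    local code="$1"
    if [ "$code" -eq 0 ]; then
        echo "pass"
    elif [ "$code" -ge 10 ] && [ "$code" -le 19 ]; then
        echo "policy-block"
    elif [ "$code" -ge 20 ] && [ "$code" -le 29 ]; then
        echo "execution-error"
    elif [ "$code" -ge 30 ]; then
        echo "controller-misuse"
    else
        echo "unknown"
    fi
}

# guard_code_reason CODE
# Maps the policy-block codes from the convention to their stated reasons.
guard_code_reason() {
    case "$1" in
        0)  echo "pass" ;;
        10) echo "audio active" ;;
        11) echo "display GPU still owned" ;;
        12) echo "CPU load too high" ;;
        13) echo "user jobs active" ;;
        14) echo "insufficient memory headroom" ;;
        15) echo "vLLM not drainable" ;;
        16) echo "graphical session absent when required" ;;
        17) echo "graphical session present when forbidden" ;;
        18) echo "target unreachable / invalid request" ;;
        19) echo "requested local studio capability not available" ;;
        *)  echo "unclassified" ;;
    esac
}
```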
24.3 Guard Output Contract
Each guard should emit a concise structured line or JSON object such as:
{"guard":"check_audio_idle","ok":false,"code":10,"reason":"reaper process active"}
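One way to produce that line consistently is a shared emitter that every guard calls as its last step. The helper name emit_guard_result is hypothetical; the field names follow the example above.

```shell
# emit_guard_result GUARD CODE REASON
# Prints one JSON object in the guard output format and returns CODE, so a
# guard can finish with a single call to this function.
emit_guard_result() {
    local guard="$1" code="$2" reason="$3" ok="false"
    if [ "$code" -eq 0 ]; then
        ok="true"
    fi
    # Escape backslashes and double quotes so the reason stays valid JSON.
    reason=$(printf '%s' "$reason" | sed 's/\\/\\\\/g; s/"/\\"/g')
    printf '{"guard":"%s","ok":%s,"code":%s,"reason":"%s"}\n' \
        "$guard" "$ok" "$code" "$reason"
    return "$code"
}
```

Because the function returns the guard code, the exit status and the structured output cannot disagree.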
24.4 Hard vs Soft Guards
Hard guards
Must never be bypassed by ordinary automation:
- active audio protection for Studio-Local -> Compute or Desktop -> Compute
- GPU/display ownership guard
- target validity checks
- local Studio capability checks for studio-local
Soft guards
May be bypassed by privileged operator action or policy:
- generic CPU load threshold
- selected user-job heuristics
- non-critical memory thresholds
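The hard/soft distinction can be sketched as a blocking decision. This is an illustrative sketch: guard_kind and guard_blocks are hypothetical names, and treating every guard not listed as soft above as hard is a conservative assumption, since the lists do not classify every guard from 24.1.

```shell
# guard_kind NAME
# Prints "soft" for the guards listed as bypassable, "hard" for everything else.
guard_kind() {
    case "$1" in
        check_cpu_load_safe|check_user_jobs_safe|check_memory_headroom)
            echo soft ;;
        *)
            echo hard ;;   # assumption: unlisted guards default to hard
    esac
}

# guard_blocks NAME CODE [PRIVILEGED_BYPASS]
# Returns 0 if this guard result blocks the transition, 1 if it does not.
guard_blocks() {
    local name="$1" code="$2" bypass="${3:-no}"
    if [ "$code" -eq 0 ]; then
        return 1    # a passing guard never blocks
    fi
    if [ "$(guard_kind "$name")" = soft ] && [ "$bypass" = yes ]; then
        return 1    # soft failure bypassed by privileged operator action
    fi
    return 0        # hard failures always block
}
```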
25. Transition Plans with Exact Operations
This section normalizes each transition into explicit steps.
25.1 Common Transition Framework
All transitions should follow:
- acquire lock
- observe current state
- validate requested mode
- if current == desired, exit success
- select transition plan
- run transition guards
- execute pre-actions
- isolate or start target
- execute post-actions
- re-observe current state
- record success/failure
- release lock
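The framework steps can be sketched as one function with injected hooks. This is a sketch under stated assumptions: observe_current, run_guards, pre_actions, enter_target, post_actions, and record_result are hypothetical hooks the caller must define; an atomic mkdir on a lock directory stands in for flock so the sketch stays portable; MODE_LOCK_DIR and the return codes 20 and 30 are illustrative choices.

```shell
# run_transition DESIRED
run_transition() {
    local desired="$1" lock="${MODE_LOCK_DIR:-/run/mode-controller/lock.d}"
    local current final rc=0

    # Acquire lock. mkdir either creates the directory or fails, atomically.
    if ! mkdir "$lock" 2>/dev/null; then
        echo "another transition holds the lock" >&2
        return 30
    fi

    current=$(observe_current)                  # observe current state
    case "$desired" in                          # validate requested mode
        desktop|studio-local|compute) ;;
        *) rc=18 ;;
    esac

    # If current already equals desired, fall through and record success.
    if [ "$rc" -eq 0 ] && [ "$current" != "$desired" ]; then
        # The plan is keyed by the (current, desired) pair; the first failing
        # step short-circuits the chain and its status is kept in rc.
        run_guards "$current" "$desired" &&
            pre_actions "$current" "$desired" &&
            enter_target "$desired" &&
            post_actions "$current" "$desired" || rc=$?
        final=$(observe_current)                # re-observe current state
        if [ "$rc" -eq 0 ] && [ "$final" != "$desired" ]; then
            rc=20                               # actions ran but did not converge
        fi
    fi

    record_result "$current" "$desired" "$rc"   # record success/failure
    rmdir "$lock"                               # release lock
    return "$rc"
}
```

Recording and releasing the lock happen on every path after acquisition, so a failed guard cannot leave the controller locked.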
25.2 Plan: Desktop -> StudioLocal
Preconditions
- desktop currently observed
- request = studio-local
- local Studio capability is still hosted on the NixOS machine
Guards
- check_target_reachable
- check_studio_capability_local
- optional check_user_jobs_safe
Exact operations
write desired=studio-local
flock /run/mode-controller/lock
observe current
run guards
systemctl start audio-priority.service # if modeled separately
systemctl start studio-local-policy.service
observe current
record result
Notes
- GUI remains up
- audio policy is strengthened
- AI capacity is reduced or removed
- if Studio capability has been externalized, this transition must fail cleanly with an explanatory reason
25.3 Plan: StudioLocal -> Desktop
Guards
- check_target_reachable
Exact operations
write desired=desktop
flock /run/mode-controller/lock
observe current
run guards
systemctl stop audio-priority.service # if separate helper exists
systemctl stop studio-local-policy.service
systemctl isolate desktop.target
observe current
record result
25.4 Plan: Desktop -> Compute
Guards
- check_target_reachable
- check_audio_idle
- check_cpu_load_safe
- check_user_jobs_safe
- check_memory_headroom
Pre-actions
- terminate graphical session
- wait for GUI disappearance
- verify GPU/display release
Exact operations
write desired=compute
flock /run/mode-controller/lock
observe current
run initial guards
loginctl terminate-session <session-id>
wait until observe-current no longer sees graphical session
run check_gpu_display_released
systemctl isolate compute.target
systemctl start vllm@compute.service
observe current
record result
Additional notes
- compute.target (entered via systemctl isolate compute.target) should declare Conflicts= with interactive/graphical targets in your target design
- GPU release must be verified after GUI shutdown, not merely assumed
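The "wait until" steps in this plan need a bound so a stuck GUI teardown fails the transition instead of hanging it. A minimal sketch follows; the helper name wait_until and the whole-second timeout and interval parameters are assumptions.

```shell
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Polls COMMAND until it succeeds. Returns 0 on success, 1 if the timeout
# elapses first, 2 if the interval is not a positive whole number of seconds.
wait_until() {
    local timeout="$1" interval="$2" waited=0
    shift 2
    if [ "$interval" -lt 1 ]; then
        echo "interval must be at least 1 second" >&2
        return 2
    fi
    while ! "$@"; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}

# Example use in the Desktop -> Compute plan (guard functions assumed to exist):
#   wait_until 30 1 check_graphical_session_absent || exit 17
#   check_gpu_display_released || exit 11
```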
25.5 Plan: Compute -> Desktop
Guards
- check_target_reachable
- check_vllm_drainable
- check_memory_headroom
Exact operations
write desired=desktop
flock /run/mode-controller/lock
observe current
run guards
systemctl stop vllm@compute.service # or downscale path
systemctl isolate desktop.target
systemctl start vllm@desktop.service # optional bounded single-GPU profile
observe current
record result
Notes
- graphical session may be started by display manager or login path depending on design
- GPU0 becomes protected for display once Desktop converges
25.6 Plan: StudioLocal -> Compute
Preferred behavior
Treat as a direct guarded transition using the same compute-entry pipeline.
Guards
- check_target_reachable
- check_audio_idle
- check_cpu_load_safe
- check_user_jobs_safe
- check_memory_headroom
Exact operations
write desired=compute
flock /run/mode-controller/lock
observe current
run guards
loginctl terminate-session <session-id>
wait until graphical session absent
run check_gpu_display_released
systemctl isolate compute.target
systemctl start vllm@compute.service
observe current
record result
Policy note
Because Studio-Local is the most protected interactive mode, auto-promotion from Studio-Local to Compute should generally be disabled unless explicitly requested.
26. NixOS Specialisations vs Runtime Switching — Decision Guidance
26.1 Decision Matrix
| Criterion | Runtime Switching | Specialisations | Hybrid |
|---|---|---|---|
| Desktop <-> Studio-Local speed | Excellent | Poor | Excellent |
| Desktop <-> Compute isolation | Moderate | Strong | Stronger |
| Complexity | Lower | Moderate | Highest |
| Early experimentation | Best | Slower | Moderate |
| Deep kernel/boot divergence | Weak | Strong | Strong |
| Operational convenience | High | Lower | Moderate |
| Future externalization of Studio | Good | Good | Best |
26.2 Recommended Decision Rule
Adopt runtime switching now unless one or more of the following become true:
- compute mode needs materially different kernel parameters or boot-time config
- graphical/interactive teardown proves unreliable in practice
- GPU role handoff remains too leaky under runtime-only switching
- you want Compute to be operationally closer to a dedicated server persona than a temporary mode
If any two of the above become persistent problems, promote Compute into a specialisation.
26.3 Recommended Architecture Path
Phase 1
- single NixOS host definition
- runtime switching only
- targets + slices + controller + guards
Phase 2
- strengthen target separation
- gather empirical failure/latency data
Phase 3
- if needed, introduce specialisation.compute
- preserve the same desired/current/reconcile interface so operator UX does not change
Phase 4
- if Studio is externalized, deprecate or disable studio-local
- retain the same operator-facing control model for the host-local system
That means mode request compute could later choose:
- runtime reconcile, or
- request/reboot into compute specialisation
without changing the higher-level model.
27. Recommended Next Implementation Steps
- define exact systemd target dependencies/conflicts in Nix
- implement mode CLI wrapper script
- implement observe-current
- implement guard scripts with fixed exit-code contract
- choose between:
  - vllm@desktop.service / vllm@compute.service
  - one service with profile env file
- define slice resource policies for interactive vs AI
- wire idle detector to mode request compute
- validate transition behavior manually before enabling automation
- add a capability-placement flag/model for future Studio externalization
28. Summary
This system should behave like a reconciled state machine for host-local operational modes.
The core model is:
- desired mode is explicit runtime intent
- current mode is observed reality
- reconciliation closes the gap
- guards prevent unsafe transitions
- systemd targets/services perform the actual mode enactment
The implementation should start with runtime switching, but preserve a clean path to hybrid specialisation if operational evidence justifies stronger separation later.
Studio/Audio should be treated as a conditional local profile plus a capability-placement decision, so that a future move to a Mac mini does not invalidate the host-local architecture.
Mode State Machine Design (v0.1 — Living)
Purpose
Define an explicit, enforceable state machine governing operational modes for a dual-use NixOS system (desktop + AI compute), including states, transitions, guards, and actions.
1. State Definitions
S0: Boot
- Initial system state
- Minimal services active
- Transitions automatically to default mode
S1: Desktop (Dev)
- Interactive workstation mode
- Balanced resource usage
- GUI + audio enabled
- Limited AI workloads allowed
S2: Studio (Audio)
- Strict low-latency mode
- Audio prioritized
- AI workloads disabled or near-zero
S3: Compute (Headless)
- Throughput-oriented mode
- No GUI
- Full AI utilization (multi-GPU)
S4: Transitioning
- Temporary state
- Ensures safe handoff between modes
2. State Diagram
stateDiagram-v2
[*] --> Boot
Boot --> Desktop : default
Desktop --> Studio : enter_studio
Studio --> Desktop : exit_studio
Desktop --> Transitioning : to_compute
Transitioning --> Compute : success
Transitioning --> Desktop : abort
Compute --> Transitioning : to_desktop
Transitioning --> Desktop : success
Studio --> Desktop : enforced_exit
3. State Properties
Desktop
- GUI: ON
- Audio: ON
- GPU0: Display
- GPU1: AI (optional)
- vLLM: constrained (1 GPU)
- k3s: control plane only
Studio
- GUI: ON
- Audio: RT priority
- GPU0: Display (exclusive)
- GPU1: disabled or minimal
- vLLM: OFF
- k3s: minimal
Compute
- GUI: OFF
- Audio: OFF/minimal
- GPU0 + GPU1: AI
- vLLM: multi-GPU
- k3s: full workloads
4. Transitions
T1: Desktop → Studio
Trigger: user command
Guards:
- No active compute jobs above threshold
Actions:
- Reduce/stop vLLM
- Raise audio priority
- Restrict background jobs
T2: Studio → Desktop
Trigger: user command
Guards: none
Actions:
- Restore normal scheduling
- Allow background workloads
T3: Desktop → Compute
Trigger:
- manual command
- idle-triggered event
Guards:
- No active audio sessions (PipeWire graph empty)
- No REAPER process OR project inactive
- GPU not held by compositor
- CPU load below threshold
- No long-running user jobs
Actions:
- Notify user (if interactive)
- Terminate GUI session
- Wait for GPU release
- Stop audio services
- Expand vLLM to multi-GPU
- Enable compute services (k3s workloads)
T4: Compute → Desktop
Trigger: user command
Guards:
- vLLM can scale down OR be stopped
- GPU memory can be reclaimed
Actions:
- Drain or stop AI workloads
- Reduce vLLM to single GPU or stop
- Start graphical target
- Reassign GPU0 to display
- Start audio stack
T5: Studio → Compute
Trigger: (not allowed)
Policy:
- Must transition via Desktop
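The transition rules T1 through T5 can be sketched as a next-hop lookup over the three stable states. The function name next_hop is hypothetical, and routing Compute to Studio via Desktop is an inference from the state diagram, which has no direct edge between them.

```shell
# next_hop FROM TO
# Prints the next state to request on the way from FROM to TO.
# Returns 1 when no transition applies (same state or unknown state).
next_hop() {
    local from="$1" to="$2"
    case "$from>$to" in
        "desktop>studio"|"studio>desktop"|"desktop>compute"|"compute>desktop")
            echo "$to"        # direct transitions T1 through T4
            ;;
        "studio>compute"|"compute>studio")
            echo "desktop"    # T5: must transition via Desktop
            ;;
        *)
            return 1
            ;;
    esac
}
```

Calling next_hop repeatedly until it prints the target yields the full route, for example studio to desktop to compute.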
5. Guards (Detailed)
G1: Audio Idle
- PipeWire graph contains no active nodes
- No JACK clients
G2: GPU Availability
- No compositor process using GPU
- Low GPU utilization
G3: CPU Load
- Load average below threshold (configurable)
G4: User Workload Safety
- No known long-running dev tasks
- Optional: no foreground terminals
G5: Memory Headroom
- Sufficient free RAM for mode switch
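As one concrete instance, the CPU load guard (G3) can be sketched as below. The default threshold of 4.0 is a placeholder assumption, the return codes 12 and 20 follow an exit-code scheme that this section does not itself define, and /proc/loadavg is Linux-specific.

```shell
# load_below_threshold LOAD THRESHOLD
# Compares two decimal numbers with awk, because shell arithmetic is
# integer-only. Succeeds only when LOAD is strictly below THRESHOLD.
load_below_threshold() {
    awk -v load="$1" -v max="$2" 'BEGIN { exit !(load + 0 < max + 0) }'
}

# check_cpu_load_safe [THRESHOLD]
# Reads the 1-minute load average and applies the configurable threshold.
check_cpu_load_safe() {
    local threshold="${1:-4.0}" load
    # The first field of /proc/loadavg is the 1-minute load average.
    read -r load _ < /proc/loadavg || return 20
    if load_below_threshold "$load" "$threshold"; then
        return 0
    fi
    return 12
}
```

Keeping the numeric comparison separate from the /proc read lets the threshold logic be tested with fixed values.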
6. Actions (Atomic Steps)
A1: Stop GUI
loginctl terminate-session
A2: Release GPU
- Wait until no graphical processes hold GPU
A3: Adjust Services
- systemd isolate target
A4: Adjust Resource Limits
- Modify cgroups/slices
A5: Scale AI Services
- Adjust CUDA_VISIBLE_DEVICES
- Restart vLLM
7. Failure Handling
Abort Conditions
- Guard failure
- Timeout waiting for GPU release
- Service failure
Behavior
- Log reason
- Return to previous stable state
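The abort behavior can be sketched as a transition attempt with rollback. This is a sketch only: attempt_transition is a hypothetical name, enter_state is a hypothetical hook that activates a state and returns non-zero on failure, and reporting failed-transition when the rollback itself fails is an assumption beyond what this section specifies.

```shell
# attempt_transition PREVIOUS TARGET
# Prints the state the system ended in. Returns 0 if TARGET was reached,
# 1 if the system rolled back to PREVIOUS, 2 if the rollback also failed.
attempt_transition() {
    local previous="$1" target="$2"
    if enter_state "$target"; then
        echo "$target"
        return 0
    fi
    # Log the reason on stderr so stdout carries only the resulting state.
    echo "transition to $target failed; returning to $previous" >&2
    if enter_state "$previous"; then
        echo "$previous"
        return 1
    fi
    echo "failed-transition"
    return 2
}
```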
8. Observability
Required Signals
- Current mode
- Last transition
- Guard evaluation results
- Resource usage snapshot
Interfaces
- CLI: mode status
- Logs: journald
9. Extensibility
Future states may include:
- Maintenance mode
- Remote-only desktop mode
- GPU-partitioned mode
10. Notes
- This state machine should be implemented via systemd targets + controller script
- Transitions must be idempotent
- Guards should be configurable
- Prefer dry-run capability before execution
Summary
This system treats operational modes as a formal state machine with:
- explicit states
- guarded transitions
- deterministic actions
This enables safe coexistence of:
- low-latency desktop workloads
- high-throughput AI services
Dual-Mode NixOS Workstation AI Node — Unified Planning and Mode State Machine
Implementation Checklist Plan
This is structured to get you from doc → bootable system with minimal thrash.
Phase 0 — Ground Truth (before touching Nix)
Hardware + constraints
- Confirm GPU topology (which is GPU0 vs GPU1)
- Confirm display wiring (which GPU drives monitor)
- Confirm audio interface + latency requirements
- Validate NVIDIA driver compatibility with NixOS + Wayland/Hyprland
Decisions to lock
- Use runtime switching (no specialisations yet)
- Studio = studio-local (conditional policy overlay on desktop, not a first-class target in v1)
- Source of truth = /run/mode-controller/desired
- mode request is synchronous: return success only after convergence
- Choose vLLM unit model for v1:
  - v1 fast path: single compute-only vllm.service
  - target architecture: vllm@desktop.service and vllm@compute.service
- k3s policy for v1:
  - keep k3s.service running across modes
  - change slice budgets and allowed workload intensity by mode
  - defer full k3s mode switching unless operational evidence justifies it
- desktop-mode AI policy for v1:
  - keep vLLM off in desktop for the first convergence milestone
  - only add bounded desktop-mode AI after desktop ↔ compute switching is reliable
- studio-local overlay representation for v1:
  - studio-local-policy.service
  - audio-priority.service
- capability-placement.json source for v1:
  - generated from Nix configuration
  - no runtime override unless a real need emerges
- defer mode force in v1
- GUI teardown policy for compute transitions:
  - require graphical session absence
  - require explicit GPU-release verification
  - only add display-manager/greeter stop logic if testing proves it necessary
- desktop.target should not directly own greeter/login in v1
- studio-local-policy.service should be:
  - a reliable marker for observation
  - a light policy-application unit
  - not a giant all-in-one Studio controller
- observe-current implementation for v1:
  - shell first
  - stable plain-text + JSON output contract
  - replace with typed helper later only if complexity justifies it
- package mode tools in pkgs/ and install them through the module
- controller trigger model for v1:
  - parameterized oneshot only
  - no timer/path-triggered reconcile until manual transitions are proven
- boot policy for v1:
  - normalize to desktop on boot
  - defer persistent desired-state replay across reboot
- Define hard vs soft guards before automation
Phase 0.5 — Control Contract (before full workload integration)
Runtime state contract
- Define /run/mode-controller/
  - desired
  - current
  - lock
  - last-transition.json
  - last-guards.json
  - capability-placement.json
CLI contract
- Implement or stub:
  - mode request
  - mode status
  - mode reconcile
  - mode current
  - mode desired
  - mode dry-run
  - mode explain
- defer mode force until guard policy is battle-tested
Observation contract
- Classifier can return:
  - desktop
  - studio-local
  - compute
  - transitioning
  - failed-transition
Guard contract
- Add check_target_reachable
- Standardize exit codes
- Standardize structured output
- Mark guards as hard vs soft
Phase 1 — Base NixOS System
Core system
- Create flake repo (if not already)
- Install NixOS (minimal)
- Enable flakes + nix-command
- Add SSH + basic hardening
GPU + CUDA
- Install NVIDIA drivers (matching kernel)
- Validate
nvidia-smi - Validate CUDA runtime
Desktop
- Install Hyprland
- Configure login/session (greetd or similar)
- Validate Wayland stability with NVIDIA
Audio
- Install PipeWire + WirePlumber
- Validate low-latency config
- Test REAPER baseline
Phase 2 — systemd Mode Skeleton
Targets / policy markers
- Define first-class targets:
  - desktop.target
  - compute.target
- Define studio-local as a policy overlay on desktop
- Add explicit policy marker/service for studio-local
- Decide whether studio-local is represented by:
  - audio-priority.service
  - studio-local-policy.service layered over desktop
  - another lightweight marker unit
Relationships
- Add Conflicts= between:
  - compute ↔ graphical targets
- Add Wants= / After= dependencies
Slices
- Define:
  - interactive.slice
  - ai.slice
  - platform.slice
- Assign services to slices
Phase 3 — Mode Controller (Core)
Core controller
- mode-controller@.service
- observe-current
- reconcile
- lock handling
- state-file updates
- dry-run path
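Lock handling can be sketched with an atomic mkdir. The state contract lists a lock entry but does not say whether it is a file or a directory, so the directory form and both function names here are assumptions.

```bash
# Hypothetical lock helpers; the lock is a directory because mkdir is atomic.
acquire_mode_lock() {
    local lock_dir="$1"
    # mkdir fails if another controller instance already holds the lock.
    if mkdir "$lock_dir" 2>/dev/null; then
        printf '%s\n' "$$" > "$lock_dir/pid"
        return 0
    fi
    return 1
}

release_mode_lock() {
    local lock_dir="$1"
    rm -f "$lock_dir/pid"
    rmdir "$lock_dir" 2>/dev/null
}
```

A controller run would acquire the lock before reading the desired state and release it after the state files are updated, including on failure paths.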
Failure model
- Record failed-transition
- Record prior mode
- Record guard/action failures
- Verify abort-to-safe-state behavior
Phase 4 — Workload Layer
AI / vLLM
- Package or install vLLM
- Create profile-specific config/env for:
  - desktop profile
  - compute profile
- Implement either:
  - v1 fast path: single vllm.service
  - target path: vllm@desktop.service + vllm@compute.service
- keep vLLM disabled in desktop for the first bootable transition milestone
- Validate single-GPU mode
- Validate dual-GPU mode
- Keep controller actions profile-aware so later split is mechanical
Platform / k3s
- Install k3s
- Configure control node
- Validate cluster health
- Deploy minimal workload
- Keep k3s.service stable across desktop and compute in v1
- Express mode differences via:
  - platform.slice budgets
  - workload policy / allowed intensity
  - optional node labels / taints later
Phase 5 — State Observation
Implement classifier
- observe-current script
Detect:
- graphical session (loginctl / process)
- PipeWire / audio activity
- vLLM service state
- GPU usage (optional: nvidia-smi)
Output
- plain mode
- optional JSON (debug)
- classify transitioning
- classify failed-transition
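The classification step can be kept separate from detection by making it a pure function of already-collected facts. The sketch below assumes five yes/no inputs and one particular precedence order; both are illustrative choices, not the shipped observe-current logic.

```bash
# Hypothetical classifier: prints exactly one mode name on stdout.
# Usage: classify_mode GRAPHICAL STUDIO_POLICY COMPUTE_TARGET LOCK_HELD LAST_FAILED
# Every argument is "yes" or "no".
classify_mode() {
    local graphical="$1" studio_policy="$2" compute_target="$3"
    local lock_held="$4" last_failed="$5"
    if [ "$last_failed" = yes ]; then
        echo failed-transition
    elif [ "$lock_held" = yes ]; then
        echo transitioning
    elif [ "$compute_target" = yes ] && [ "$graphical" = no ]; then
        echo compute
    elif [ "$compute_target" = no ] && [ "$graphical" = yes ]; then
        if [ "$studio_policy" = yes ]; then
            echo studio-local
        else
            echo desktop
        fi
    else
        # Mixed or empty observations are treated as an in-between state.
        echo transitioning
    fi
}
```

The detection side (loginctl, unit state, optional nvidia-smi) would only need to reduce its findings to these yes/no facts.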
Phase 6 — Guards
Implement guards (scripts)
- check_target_reachable
- check_audio_idle
- check_gpu_display_released
- check_cpu_load_safe
- check_user_jobs_safe
- check_memory_headroom
- check_vllm_drainable
- check_studio_capability_local
Standardize
- exit codes
- JSON output
- logging
- hard vs soft guard policy
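A hard vs soft policy could be applied by a small aggregator over guard results. The policy sketched here is an assumption: any hard failure blocks the transition, while soft failures are only counted. The input format (one "NAME SEVERITY STATUS" line per guard) and the function name evaluate_guards are also hypothetical.

```bash
# Hypothetical aggregator: reads "NAME SEVERITY STATUS" lines on stdin and
# prints a one-line verdict. Returns 2 when blocked, 0 when allowed.
evaluate_guards() {
    local name severity status hard_fail=0 soft_fail=0
    while read -r name severity status; do
        [ -n "$name" ] || continue
        [ "$status" = fail ] || continue
        if [ "$severity" = hard ]; then
            hard_fail=$((hard_fail + 1))
        else
            soft_fail=$((soft_fail + 1))
        fi
    done
    if [ "$hard_fail" -gt 0 ]; then
        echo "blocked hard=$hard_fail soft=$soft_fail"
        return 2
    fi
    echo "allowed hard=0 soft=$soft_fail"
    return 0
}
```

The verdict line is the kind of summary that could be recorded alongside the per-guard output in last-guards.json.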
Phase 7 — Transition Execution
Implement transition flows
- Desktop → StudioLocal
- StudioLocal → Desktop
- Desktop → Compute
- Compute → Desktop
- StudioLocal → Compute
Verify explicitly
- graphical session absence before compute promotion
- GPU release after GUI shutdown
- vLLM profile switching
- audio protection works
- transitions are idempotent
- failed guard returns to prior safe state
- failed action records failed-transition
Phase 8 — Idle + Automation
Idle detection
- implement idle signal (input + audio + load)
- threshold tuning
Policy
- idle → mode request compute
- guard failures → no transition
Safety
- never auto-promote from studio-local
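The idle policy and its safety rule fit in one small decision function. This is a sketch under assumptions: the name idle_action and the yes/no idle flag are hypothetical, and the idle signal itself (input, audio, load) is computed elsewhere.

```bash
# Hypothetical policy helper: prints the action to take for an idle check.
# Usage: idle_action CURRENT_MODE IDLE(yes|no)
idle_action() {
    local current="$1" idle="$2"
    if [ "$idle" != yes ]; then
        echo none
    elif [ "$current" = desktop ]; then
        # Guards still run on this request and may block the transition.
        echo "mode request compute"
    else
        # studio-local is never auto-promoted; other states need no action.
        echo none
    fi
}
```

Because the function only prints an action, the automation can log the decision before executing it.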
Phase 9 — Observability
Logging
- structured logs for:
  - transitions
  - guards
  - failures
Status
- mode status shows:
  - desired
  - current
  - last transition
  - blocking guards
  - capability placement
Phase 10 — Hardening
Failure handling
- retry logic (bounded)
- failed-transition state handling
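Bounded retry can be sketched as a wrapper that gives up after a fixed number of attempts and leaves failure recording to the caller. The function name retry_bounded and its argument order are assumptions.

```bash
# Hypothetical bounded retry wrapper.
# Usage: retry_bounded MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_bounded() {
    local max="$1" delay="$2" attempt=1
    shift 2
    while :; do
        "$@" && return 0
        # Stop once the attempt budget is spent; the caller records the failure.
        [ "$attempt" -ge "$max" ] && return 1
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

On a non-zero return the caller would record failed-transition rather than retrying again, which keeps the retry budget in one place.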
Resource tuning
- CPU quotas per slice
- memory limits
- I/O priority
- tune platform.slice conservatively for desktop/studio-local, relaxed for compute
Security
- restrict mode controller to root
- audit transitions
- isolate AI services
Phase 11 — Optional Evolution
If runtime switching is insufficient
- introduce specialisation.compute
- keep same mode interface
- optionally promote studio-local overlay into a stronger first-class target only if operational evidence justifies the added complexity
- consider stronger k3s mode-switching only if slice-governed steady-state behavior is inadequate
If Studio moves to Mac mini
- set capability-placement.json
- disable studio-local
- keep controller intact
Critical Path (short version)
If you want the fastest path to something real:
- Base NixOS + GPU + Hyprland
- vLLM working (single GPU)
- Define targets (desktop, compute)
- Simple mode CLI + desired file
- Hardcoded transitions (no guards yet)
- Add guards + observation
- Add idle automation
- Add studio-local last
Where this can go wrong (worth calling out)
- GPU release is the hardest boundary → don’t assume, always verify
- Audio is fragile → treat StudioLocal invariants as strict
- systemd isolate can surprise you → test with minimal configs first
- too much cleverness early → get a dumb working version first, then refine
First Bring-Up Checklist
This is the shortest practical path to getting the first live build onto a real NixOS machine.
It assumes:
- this repo is available on the target machine
- the target machine is the intended workstation host
- the current v1 policy remains:
  - boot default = desktop
  - studio-local is an overlay on desktop
  - vLLM is compute-only when explicitly enabled
1. Put the Repo on the Target Machine
git clone <repo-url> /path/to/dubnium
cd /path/to/dubnium
If the repo is already local:
cd /path/to/dubnium
2. Generate Real Hardware Configuration
The scaffold currently contains a placeholder hardware file.
On the target NixOS machine:
sudo nixos-generate-config --dir ./hosts/workstation
This should populate:
hosts/workstation/hardware-configuration.nix
Review that file and make sure:
- it matches the actual boot disk/filesystem layout
- it does not remove the existing import structure in hosts/workstation/default.nix
3. Review Host-Specific Settings Before First Build
Check hosts/workstation/default.nix.
Important values to confirm:
- networking.hostName
- dubnium.hardware.presentGpus
- dubnium.hardware.displayGpu
- dubnium.hardware.computeGpus
- dubnium.vllm.enable
- dubnium.vllm.model
Current intended first live model:
Qwen/Qwen2.5-Coder-14B-Instruct
Current intended first hardware phase:
- planned architecture: 2 GPUs
- currently present: GPU 0
- compute GPU set: [ 0 ]
4. Build Without Switching First
Build the configuration first without activating it:
sudo nixos-rebuild build --flake .#workstation
If this fails:
- fix Nix evaluation issues first
- do not jump into switch
Common first-failure areas:
- hardware configuration mismatch
- NVIDIA options
- package evaluation problems
- typos in host-local settings
5. Switch to the New Configuration
If the build succeeds:
sudo nixos-rebuild switch --flake .#workstation
6. Verify Core Pieces After Switch
Check the mode CLI:
mode status
mode current
mode desired
Check runtime state files:
sudo ls -la /run/mode-controller
sudo cat /run/mode-controller/desired
sudo cat /run/mode-controller/current
sudo cat /run/mode-controller/capability-placement.json
sudo cat /run/mode-controller/hardware-topology.json
Check systemd units:
systemctl status desktop.target
systemctl status compute.target
systemctl status studio-local-policy.service
systemctl status audio-priority.service
systemctl status vllm.service
Notes:
- vllm.service should not be active in desktop
- with default workstation settings, vllm.service should not exist until dubnium.vllm.enable = true
- studio-local-policy.service and audio-priority.service should not be active unless studio-local is requested
7. Test desktop -> studio-local
sudo mode request studio-local
mode status
systemctl status studio-local-policy.service
systemctl status audio-priority.service
Expected result:
- current mode becomes studio-local
- studio-local-policy.service is active
- audio-priority.service is active
Then return:
sudo mode request desktop
mode status
8. Test desktop -> compute
Before testing:
- close REAPER
- avoid active audio work
- avoid long-running foreground development jobs
- seed the local model bundle from USB
- explicitly set dubnium.vllm.enable = true if this test should exercise the vLLM service
Then:
sudo mode request compute
mode status
systemctl status compute.target
systemctl status vllm.service
Expected result:
- graphical session is terminated
- system converges to compute
- if vLLM is enabled, vllm.service is started by compute.target
Important caveat:
Seed the local model bundle from USB before the first compute transition. If the bundle is absent, vLLM should fail clearly rather than relying on a first-run network download.
9. Test compute -> desktop
sudo mode request desktop
mode status
systemctl status vllm.service
Expected result:
- vllm.service is stopped
- system converges back to desktop
10. If Something Fails
Check:
mode status
sudo cat /run/mode-controller/last-transition.json
sudo cat /run/mode-controller/last-guards.json
journalctl -u 'mode-controller@*' -b
journalctl -u vllm.service -b
Most useful first diagnosis buckets:
- guard blocked transition
- graphical session did not terminate cleanly
- GPU did not look released
- vLLM service failed to start
- model/runtime/CUDA issue
11. First Successful Milestone
You should consider first bring-up successful when all of the following are true:
- nixos-rebuild switch --flake .#workstation succeeds
- mode status works
- desktop -> studio-local -> desktop works
- desktop -> compute -> desktop works
- last-transition.json and last-guards.json are useful for failures
At that point, the next iteration is:
- tighten NVIDIA/vLLM runtime behavior
- improve observe-current
- tune audio-priority.service
- refine slice policy
- add second GPU when ready
Fresh Install Checklist
This checklist is for installing dubnium onto a machine from scratch using a NixOS live USB.
Use this when:
- the target machine does not already run NixOS
- you are replacing the current OS
- you want the flake to be the source of truth from first boot
If the machine already runs NixOS, use docs/first-bring-up-checklist.md instead.
Each top-level step has:
- Start when: what must already be true before starting the step
- Outcomes: what should be true when the step is complete
1. Prepare a NixOS Installer USB
Current preferred path: use the Dubnium custom installer USB, not a stock ISO.
The custom installer bakes a source export of this private repo plus
external/dotfiles into the live image. Write it to USB as a raw disk image,
matching Rufus “DD image mode”. Use separate writable media for a local model
seed bundle.
Build the ISO and prepare the seed model:
scripts/build-installer-iso.sh \
--iso ./dubnium-installer.iso
This writes ./dubnium-installer.iso into the checkout. By default the helper
uses the current Dubnium default model bundle, but the USB layout only requires a
materialized model directory with config.json and SHA256SUMS. Pass
--seed-model when using a different local bundle.
Then prepare the USB with the guarded writer for the current platform.
Windows PowerShell:
.\scripts\write-installer-usb.ps1 `
-IsoPath .\dubnium-installer.iso `
-DiskNumber 7 `
-ExpectedFriendlyName "USB SanDisk 3.2Gen1"
The writer requires the disk identity check and final y/N confirmation. It
overwrites the whole USB disk with the ISO image.
Optional one-shot Windows path:
.\scripts\build-installer-usb.ps1 `
-DiskNumber 7 `
-ExpectedFriendlyName "USB SanDisk 3.2Gen1"
Optional one-shot Linux or macOS path:
bash scripts/build-installer-usb.sh \
--disk /dev/sdX \
--expected SanDisk
On macOS, use a whole disk such as /dev/diskN. On Linux, use a whole USB disk
such as /dev/sdX, not a partition.
Manual Linux USB write path:
scripts/write-installer-usb.sh \
--iso ./dubnium-installer.iso \
--disk /dev/sdX \
--expected SanDisk
Expected USB layout:
dubnium-installer.iso -> whole USB disk
Verify the installer media from whichever drive letter Windows assigns:
Test-Path I:\EFI\BOOT\BOOTX64.EFI
Test-Path I:\nix-store.squashfs
Get-Volume -DriveLetter I
Separate seed media should contain:
models/selected-model-bundle/
See docs/runbooks/custom-installer-iso.md for the full USB process and docs/runbooks/model-seeding.md for the seed bundle commands.
Start when
- existing Nix-capable build machine with the Dubnium repo checkout
- USB stick that can be erased
- materialized model bundle is available locally, if seeding the model now
- permission to run the guarded USB writer for the platform
Outcomes
- custom Dubnium ISO is built from the intended flake source
- USB device identity was checked by the platform helper before erase
- USB has a bootable raw-written Dubnium installer image
- separate seed media has the local model bundle, if seeding now
- the install path requires no GitHub token, private SSH key, or Hugging Face download in the live installer
1.1 USB Security And Drift Check
Before leaving the build machine, confirm:
git status --short
git -C external/dotfiles status --short
The ISO bakes tracked flake source, including external/dotfiles, into the
installer. Stage or commit intentional changes before building, and do not bake
decrypted secrets, long-lived tokens, SSH private keys, local caches, or model
weights into the repo.
The USB is private media. The installer payload contains private source, and separate seed media contains unencrypted model files.
1.2 Seamless USB Acceptance Check
Before booting the target, verify the prepared stick:
EFI/BOOT/BOOTX64.EFI
nix-store.squashfs
seed-media/models/selected-model-bundle/config.json
seed-media/models/selected-model-bundle/SHA256SUMS
If the model bundle is not on the USB yet, use docs/runbooks/model-seeding.md before booting the target.
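The acceptance check can be scripted once the installer and seed media are mounted. This is a sketch: the function name verify_installer_media is hypothetical, and the mount points passed to it are whatever paths the build machine assigns.

```bash
# Hypothetical acceptance check for mounted installer and seed media.
# Usage: verify_installer_media INSTALLER_ROOT [SEED_ROOT]
verify_installer_media() {
    local installer_root="$1" seed_root="${2:-}" missing=0 path
    for path in EFI/BOOT/BOOTX64.EFI nix-store.squashfs; do
        if [ ! -e "$installer_root/$path" ]; then
            echo "missing on installer media: $path" >&2
            missing=1
        fi
    done
    # The seed bundle is only checked when a seed mount point is given.
    if [ -n "$seed_root" ]; then
        for path in config.json SHA256SUMS; do
            if [ ! -e "$seed_root/models/selected-model-bundle/$path" ]; then
                echo "missing on seed media: models/selected-model-bundle/$path" >&2
                missing=1
            fi
        done
    fi
    return "$missing"
}
```

The function reports every missing path before returning, so one run shows the full gap instead of stopping at the first failure.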
1.3 Stock ISO Fallback
A stock NixOS ISO remains useful for rescue, but it is not the preferred fresh Dubnium install path. If using stock media, you must bring the Dubnium source and initialized dotfiles submodule on separate private media, then install from that local checkout. Do not depend on live-session GitHub credentials for a private-repo install.
2. Boot the Target Machine From USB
Start when
- prepared NixOS installer USB
- physical access to the target machine
- firmware access or boot-menu access
Outcomes
- target machine is booted into the NixOS live environment
- firmware boot mode and target disk visibility are confirmed
- keyboard, display, disk visibility, and network are usable
- source import tools are available before repo setup steps begin
- private repo source is reachable from the live environment
- optional SSH access to the live environment is available if needed
2.1 Confirm Firmware Settings
Before booting the installer, review firmware settings:
- boot mode should be UEFI, not legacy/CSM
- Secure Boot should be disabled unless you intentionally handle it
- internal install disk should be visible
- primary display GPU should be the one you expect
- Above 4G decoding should be enabled if the firmware exposes it and you plan to use multiple GPUs
- virtualization/IOMMU can be enabled if you expect to use it later
Do not proceed if the firmware cannot see the target install disk.
2.2 Enter the Boot Menu
Insert the USB stick into the target machine, power it on, and enter the firmware boot menu.
Common boot-menu keys:
- F8
- F11
- F12
- Esc
- Del
Choose the USB entry. Prefer the UEFI entry if the firmware shows both legacy and UEFI options.
2.3 Confirm Live Environment Basics
After the NixOS live environment boots, open a terminal.
Check that the machine sees CPU, memory, disks, and network devices:
lscpu | head
free -h
lsblk -o NAME,SIZE,MODEL,TYPE,MOUNTPOINTS
ip link
Check network connectivity:
ip addr
ping -c 3 1.1.1.1
ping -c 3 github.com
If networking is not up:
- connect Ethernet if available
- use the graphical network manager in the GNOME ISO
- on the minimal ISO, use nmtui if available:
sudo nmtui
Exit criteria:
- keyboard works
- display works
- target install disk is visible
- internet access works
2.4 Ensure Source Import Tools Are Available In the Live Environment
On the custom Dubnium installer USB, confirm the baked source helper exists:
command -v unpack-dubnium
tar --version
If unpack-dubnium is available, section 3 can use the baked source snapshot
directly and does not need GitHub credentials.
Before importing the repo source from the live USB session, confirm git and
basic archive tools are available:
git --version
tar --version
If git is missing and you need it for validation, install it in the live
environment:
nix-shell -p git --run 'git --version'
If you need git for more than one command, enter a shell with it available:
nix-shell -p git
git --version
Exit criteria:
- git --version succeeds in the current shell or in the shell you will use to inspect the repo
- tar --version succeeds if you are extracting an archive
2.5 Optional: Enable SSH Into the Live Environment
Use this if the target machine is easier to drive from another computer. Note: If using the Custom Installer, your SSH keys may already be authorized. Otherwise:
- Set a temporary password: passwd
- Or add your key: mkdir -p ~/.ssh && echo "ssh-ed25519 ..." >> ~/.ssh/authorized_keys
Start SSH:
sudo systemctl start sshd
ip addr
Then connect from another machine using the live environment IP address:
ssh nixos@<target-ip>
This access is temporary and only applies to the live USB environment.
3. Make the Repo Available in the Live Environment
Start when
- live NixOS environment is running
- the custom installer source snapshot is available, or a separate private source export is attached to the machine
Outcomes
- Dubnium repo exists in the live environment
- repo contains flake.nix
- repo contains hosts/workstation/default.nix
- repo contains external/dotfiles/flake.nix
- commands are being run from the repo root
3.0 Preferred: Unpack From Custom Installer Media
For the current one-shot install path, run the guarded installer helper:
install-dubnium-from-usb
This replaces the manual section 3 through section 9 flow for the simple
unencrypted layout. The helper prints lsblk, prompts for the target whole
disk, and asks for final y/N confirmation before erasing anything. Defaults
are btrfs, dubnium home profile, passwd password mode, and copying the
install snapshot into the installed system. Use --password-mode hash to write
a host-local initial password hash before install, or --password-mode skip
when another login path already exists. Use --dry-run first if disk identity
is not yet obvious.
If booted from the Dubnium custom installer USB, use the baked source snapshot:
unpack-dubnium
cd ~/local/src/dubnium
This is the token-free private repo path. It does not clone from GitHub during install.
To choose the installed normal user, create hosts/workstation/user.nix before
install:
{
dubnium.user.name = "alice";
dubnium.user.description = "Example User";
}
3.1 Alternate: Copy Source From Local Media
If you brought the repo on separate media, attach it now and identify it:
lsblk -o NAME,SIZE,MODEL,TRAN,TYPE,MOUNTPOINTS
Mount the removable media read-only if practical, then extract or copy the exported source into your working directory.
Example for a separate git archive-style export:
mkdir -p ~/installer-src
cd ~/installer-src
tar -xzf /path/to/dubnium-installer-src.tgz
cd dubnium
Example for a plain copied export tree:
mkdir -p ~/Projects
cp -a /path/to/dubnium ~/Projects/dubnium
cd ~/Projects/dubnium
This path avoids depending on live-session GitHub credentials. Prefer the custom installer payload when available, because it keeps the source path and helper behavior consistent.
3.2 Alternate: Extract A Separate Source Archive
If you are not using the current custom installer payload, bring a separate source archive and extract it to the same live-session path:
mkdir -p ~/local/src
tar -xzf /path/to/dubnium-installer-src.tgz -C ~/local/src
cd ~/local/src/dubnium
3.3 Verify Repo Contents
pwd
ls
git status --short
test -f flake.nix
test -f hosts/workstation/default.nix
test -f external/dotfiles/flake.nix
Exit criteria:
- repo is present locally, whether copied, extracted, or imported
- flake.nix exists
- hosts/workstation/default.nix exists
- external/dotfiles/flake.nix exists
4. Partition the Target Disk
Start when
- target install disk is visible in lsblk
- disk encryption decision is made
- swap/hibernation decision is made
- target disk has been positively identified and is safe to erase
Outcomes
- target disk has a new GPT partition table
- EFI system partition exists
- root partition exists
- EFI_PART and ROOT_PART point to real block devices
- no partitioning commands have touched the USB installer
This repo does not yet prescribe a disk layout.
The example below uses a simple UEFI layout:
- EFI system partition: 1 GiB, FAT32, mounted at /boot
- root partition: rest of disk, ext4, mounted at /
This example does not create a separate /home partition or a swap partition.
Add those only if you deliberately want them.
This example does not enable disk encryption. If you want LUKS or a separate encrypted data layout, stop here and use a different partition/filesystem plan.
If hibernation is required, stop here and design swap explicitly. If hibernation is not required, zram can be handled later in NixOS configuration.
4.1 Identify the Install Disk
List disks:
lsblk -o NAME,SIZE,MODEL,TRAN,TYPE,MOUNTPOINTS
Example NVMe disk:
nvme0n1 1.8T Samsung_SSD disk
Example SATA/SAS/USB-style disk:
sda 1.8T Samsung_SSD disk
Set the target disk variable:
DISK=/dev/nvme0n1
or:
DISK=/dev/sda
Important:
- this must be the internal install disk
- this must not be the USB installer
- all data on this disk will be destroyed once partitioning begins
4.2 Confirm Existing Layout
Before touching the disk:
echo "$DISK"
lsblk -o NAME,SIZE,MODEL,TYPE,FSTYPE,MOUNTPOINTS "$DISK"
sudo fdisk -l "$DISK"
Before touching disks, decide:
- disk device name
- EFI size
- root filesystem choice
- whether you want swap or zram only
- whether you want a separate /home
Minimum sane layout:
- EFI system partition
- root partition
Example tools:
- lsblk
- blkid
- fdisk
- parted
- gdisk
Do not proceed until you are sure which disk you are installing to.
4.3 Preview and Clear Existing Signatures
Preview existing filesystem and partition signatures:
sudo wipefs -n "$DISK"
If the disk is definitely the install target, clear old signatures:
sudo wipefs -a "$DISK"
This is destructive. Do not run it against the USB installer or any disk you intend to preserve.
4.4 Create a GPT Partition Table
This is destructive. Only run it after confirming DISK.
echo "About to partition: $DISK"
lsblk -o NAME,SIZE,MODEL,TYPE,MOUNTPOINTS "$DISK"
Create the partition table and partitions:
sudo parted "$DISK" -- mklabel gpt
sudo parted "$DISK" -- mkpart ESP fat32 1MiB 1025MiB
sudo parted "$DISK" -- set 1 esp on
sudo parted "$DISK" -- mkpart primary ext4 1025MiB 100%
Ask the kernel to re-read the partition table:
sudo partprobe "$DISK"
sleep 2
lsblk -o NAME,SIZE,MODEL,TYPE,FSTYPE,MOUNTPOINTS "$DISK"
4.5 Set Partition Variables
For NVMe disks, partitions are usually named with p1 / p2:
EFI_PART="${DISK}p1"
ROOT_PART="${DISK}p2"
For SATA/SAS-style disks, partitions are usually named 1 / 2:
EFI_PART="${DISK}1"
ROOT_PART="${DISK}2"
Verify:
echo "EFI_PART=$EFI_PART"
echo "ROOT_PART=$ROOT_PART"
test -b "$EFI_PART"
test -b "$ROOT_PART"
lsblk -o NAME,SIZE,MODEL,TYPE,FSTYPE,MOUNTPOINTS "$DISK"
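The two naming cases can be folded into one helper so the variables are set the same way for either disk type. The function name part_path is hypothetical. It relies on the Linux convention that disk names ending in a digit (such as nvme0n1 or mmcblk0) take a "p" before the partition number.

```bash
# Hypothetical helper: print the partition device path for a disk.
# Usage: part_path DISK NUMBER
part_path() {
    local disk="$1" number="$2"
    case "$disk" in
        # Names ending in a digit need a "p" separator before the number.
        *[0-9]) printf '%sp%s\n' "$disk" "$number" ;;
        *)      printf '%s%s\n' "$disk" "$number" ;;
    esac
}
```

With it, the variables become EFI_PART="$(part_path "$DISK" 1)" and ROOT_PART="$(part_path "$DISK" 2)". The test -b checks above are still needed, because the helper only builds a name and does not confirm that the device exists.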
5. Create Filesystems and Mount Them
Start when
- EFI_PART points to the EFI partition
- ROOT_PART points to the root partition
- both partition variables have been verified with test -b
Outcomes
- EFI partition is formatted FAT32
- root partition is formatted ext4
- root partition is mounted at /mnt
- EFI partition is mounted at /mnt/boot
- mount layout matches the future NixOS filesystem config
5.1 Format the Partitions
This is destructive to the selected partitions.
sudo mkfs.fat -F 32 -n NIXBOOT "$EFI_PART"
sudo mkfs.ext4 -L nixos "$ROOT_PART"
5.2 Mount the Root Filesystem
sudo mount "$ROOT_PART" /mnt
5.3 Mount the EFI Filesystem
The current host config expects systemd-boot, so mount the EFI filesystem at
/mnt/boot:
sudo mkdir -p /mnt/boot
sudo mount "$EFI_PART" /mnt/boot
5.4 Verify Mount Layout
Once mounted, verify:
findmnt /mnt
findmnt /mnt/boot
lsblk -o NAME,SIZE,FSTYPE,LABEL,MOUNTPOINTS "$DISK"
Expected:
- root partition mounted at /mnt
- EFI partition mounted at /mnt/boot
6. Generate Hardware Configuration Into the Repo
Start when
- repo is available and current shell is at repo root
- target root filesystem is mounted at /mnt
- target EFI filesystem is mounted at /mnt/boot
Outcomes
- hosts/workstation/hardware-configuration.nix reflects the target hardware
- generated filesystem entries match /mnt and /mnt/boot
- placeholder hardware config has been replaced
- git diff shows the hardware config change
6.1 Generate Config
From the repo root:
sudo nixos-generate-config --root /mnt --dir ./hosts/workstation
This should populate:
hosts/workstation/hardware-configuration.nix
Important:
- this file must reflect the real disk layout you just mounted
- this replaces the scaffold placeholder currently in the repo
6.2 Review Generated Hardware Config
sed -n '1,220p' hosts/workstation/hardware-configuration.nix
Confirm:
- root filesystem points at the root partition or its filesystem label/UUID
- /boot points at the EFI partition
- generated imports look normal
- no obvious reference to the USB installer disk exists
6.3 Confirm Git Diff
git diff -- hosts/workstation/hardware-configuration.nix
Exit criteria:
- hardware config changed from placeholder to real host config
- filesystem entries match the mounted target disk
7. Review Host Config Before Install
Start when
- generated hardware config exists
- host config exists at hosts/workstation/default.nix
- hardware facts are known well enough to set GPU options accurately
- login/access strategy is known
Outcomes
- hostname, bootloader, SSH, GPU, vLLM, and k3s settings are reviewed
- GPU settings reference only installed/visible GPUs
- vLLM first-install stance is explicit
- k3s first-install stance is explicit
- at least one installed-system login path is known
7.1 Inspect Host Config
Check hosts/workstation/default.nix.
sed -n '1,240p' hosts/workstation/default.nix
At minimum confirm:
- hostname
- current GPU assumptions
- vLLM model choice
- any network or SSH expectations
- bootloader settings
- k3s enablement
Current scaffold assumptions:
- boot default is desktop
- studio-local is a desktop overlay
- vLLM is compute-only
- planned topology is 2 GPUs
- currently present GPU set defaults to [ 0 ]
7.2 Confirm GPU Settings
If the target currently has only one NVIDIA GPU:
dubnium.hardware.presentGpus = [ 0 ];
dubnium.hardware.displayGpu = 0;
dubnium.hardware.computeGpus = [ 0 ];
If the target has two NVIDIA GPUs and you are ready to expose both to compute,
update only after confirming nvidia-smi ordering:
dubnium.hardware.presentGpus = [ 0 1 ];
dubnium.hardware.displayGpu = 0;
dubnium.hardware.computeGpus = [ 0 1 ];
For first bring-up, prefer the most conservative accurate setting. Do not list a GPU that is not installed and visible.
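The rule that no listed GPU may be absent can be checked before editing the host config. The helper below is a sketch: validate_gpu_sets is a hypothetical name, and it takes the three values as plain strings rather than reading the Nix options.

```bash
# Hypothetical pre-check for the three GPU options.
# Usage: validate_gpu_sets "PRESENT" DISPLAY "COMPUTE"
# PRESENT and COMPUTE are space-separated GPU indices, e.g. "0 1".
validate_gpu_sets() {
    local present=" $1 " display="$2" compute="$3" gpu
    case "$present" in
        *" $display "*) ;;
        *) echo "displayGpu $display is not in presentGpus" >&2; return 1 ;;
    esac
    for gpu in $compute; do
        case "$present" in
            *" $gpu "*) ;;
            *) echo "computeGpus entry $gpu is not in presentGpus" >&2; return 1 ;;
        esac
    done
    return 0
}
```

For example, validate_gpu_sets "0" 0 "0 1" fails because GPU 1 is not present. The present list itself should come from what nvidia-smi actually reports on the target, not from the planned topology.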
7.3 Confirm vLLM Settings
Current host config disables vLLM by default so the workstation can prove the base desktop system before model/runtime work:
dubnium.vllm.enable = false;
If opting into vLLM for compute testing, set dubnium.vllm.enable = true and
consider explicit first-run guardrails:
dubnium.vllm.extraArgs = [
"--max-model-len" "8192"
"--gpu-memory-utilization" "0.70"
"--enforce-eager"
];
7.4 Confirm k3s Settings
Current host config has:
dubnium.k3s.enable = false;
Keep k3s disabled for the first install unless you specifically want to validate k3s during the first boot.
7.5 Confirm User and Access Settings
Before installing, confirm how you will log into the installed system:
rg -n "users\\.users|openssh|authorizedKeys|initialPassword|hashedPassword" hosts modules
The current host config enables SSH, but this checklist should not assume a normal user account exists unless the NixOS config declares it.
Choose one access strategy before install:
- root password set by nixos-install
- declared normal user with password or SSH key
- SSH key access configured in NixOS
For the default workstation user, keep the password hash local by adding
hosts/workstation/user.nix before install:
{
users.users.ryjen.initialHashedPassword = "$y$j9T$...";
}
Generate the hash in the live environment with:
mkpasswd -m yescrypt
Do not reboot into the installed system without knowing at least one login path.
8. Optional Dry Evaluation Before Install
Start when
- repo is at install-ready state
- generated hardware config exists
- network access is working in the live environment
- Nix can evaluate flakes in the live environment
Outcomes
- flake evaluation has been attempted
- mode-tools package build has been attempted
- any evaluation/build failure is understood before install
- no unknown evaluation error is carried into nixos-install
8.1 Build the Target System
If the live environment has working Nix daemon support and networking, try:
sudo nixos-rebuild build --flake .#workstation
This is optional but useful.
If it fails:
- fix evaluation problems before running the installer
8.2 Build the Mode Tools Package
nix build .#packages.x86_64-linux.mode-tools
8.3 Inspect Common Evaluation Failures
Common buckets:
- hardware configuration references the wrong disk
- NVIDIA package/options fail to evaluate
- vLLM package is unavailable or expensive to build in the live environment
- unfree packages are blocked
- host option assertions fail
Exit criteria:
- the flake evaluates
- the system build either succeeds or fails for a known reason you have decided to accept before nixos-install
9. Install From the Flake
Start when
- /mnt and /mnt/boot are mounted correctly
- hardware config and host config are reviewed
- dirty repo state is intentional
- installed-system login path is known
- repo persistence plan is explicit
Outcomes
- NixOS is installed from .#workstation
- bootloader installation result is known
- root password or equivalent access path is established
- repo is copied into the installed filesystem or a post-boot source import plan is explicit
9.1 Final Preinstall Check
Before installing:
findmnt /mnt
findmnt /mnt/boot
lsblk -o NAME,SIZE,FSTYPE,LABEL,MOUNTPOINTS "$DISK"
git status --short
Confirm:
- /mnt is the target root filesystem
- /mnt/boot is the target EFI filesystem
- generated hardware config is present
- host config is reviewed
- any dirty repo state is intentional
9.2 Confirm Repo Persistence Plan
The live USB environment is temporary. The install itself uses the live checkout
at ~/local/src/dubnium, but that path does not automatically become an
installed-system checkout.
If you want the flake source to be available immediately after first boot, copy the current repo into the target filesystem before installing.
Example target location:
sudo mkdir -p /mnt/home/<user>/Projects
sudo cp -a "$(pwd)" /mnt/home/<user>/Projects/dubnium
If the installed system will have a different user or home path, adjust the destination.
If you prefer not to copy from the live environment, plan how you will import the repo source again after first boot. Do not assume the live-environment checkout survives reboot.
The custom installer source payload belongs to the USB live system. It is enough to install from, but it does not automatically become a checkout on the installed system. If install-time changes need to go back to the private Dubnium repo, reconcile them after first boot using Post-Install Source Reconciliation.
9.3 Run Installer
From the repo root:
sudo nixos-install --flake .#workstation
If the installer asks for a root password, set one unless you have already configured another access path.
9.4 Capture Install Result
If install succeeds, note:
- whether bootloader installation succeeded
- whether any warnings appeared
- whether a root password was set
If install fails, do not reboot yet. Inspect the error while still in the live environment.
10. Reboot Into the Installed System
Start when
- nixos-install --flake .#workstation completed successfully
- bootloader result is known
- root password or other access path exists
- no unresolved install error remains
Outcomes
- machine boots from the internal disk
- USB installer is removed or not selected
- installed NixOS system reaches a login/session path
- if boot fails, rescue path is known and documented
10.1 Unmount and Reboot
If install succeeded:
sync
sudo umount -R /mnt
sudo reboot
Remove the USB stick when appropriate so the machine boots from disk.
10.2 Select Installed Disk
If the machine boots back into the USB installer:
- remove the USB stick
- enter firmware boot menu
- select the internal disk or Linux Boot Manager
10.3 Recovery If Boot Fails
If the installed system does not boot:
- boot the USB installer again
- mount root and EFI partitions back under /mnt
- inspect /mnt/etc/nixos and the generated hardware config
- check firmware boot entries with bootctl from a chroot if needed
Concrete rescue mount:
sudo mount "$ROOT_PART" /mnt
sudo mount "$EFI_PART" /mnt/boot
Enter the installed system:
sudo nixos-enter --root /mnt
Inside the chroot:
bootctl status
nixos-rebuild boot --flake /home/<user>/Projects/dubnium#workstation
exit
If the repo was not copied into the installed filesystem, use the path where it actually exists or import it again from your prepared source media.
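If it is unclear where the checkout ended up, it can be located from the live environment before entering the chroot. The sketch below assumes the `home/<user>/Projects/dubnium` layout used earlier in this guide; `find_flake_checkout` is a hypothetical helper.

```shell
# Sketch: locate a Dubnium checkout under the mounted installed root.
# The Projects/dubnium layout is an assumption carried over from this guide.
find_flake_checkout() {
    root=$1   # installed root as mounted from the live system, e.g. /mnt
    for candidate in "$root"/home/*/Projects/dubnium; do
        if [ -f "$candidate/flake.nix" ]; then
            echo "$candidate"
            return 0
        fi
    done
    echo "no dubnium checkout with flake.nix under $root/home" >&2
    return 1
}
```

Running `find_flake_checkout /mnt` prints a path such as /mnt/home/<user>/Projects/dubnium. Inside `nixos-enter` the same directory is visible without the /mnt prefix, which is the form the `nixos-rebuild boot --flake` command expects.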
11. First Boot Verification
Start when
- installed system has booted from internal disk
- operator can log in locally or over SSH
- repo exists on the installed system or can be imported immediately
Outcomes
- installed system identity is verified
- repo source is available on the installed system
- mode CLI works
- runtime state files exist
- first observed mode is desktop
- vLLM and studio overlay services are inactive in desktop
- NVIDIA basics are verified before any compute testing
11.1 Verify Basic System Identity
After booting the installed system:
hostname
uname -a
ip addr
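The identity check can also be made explicit instead of read by eye. This is a sketch under two assumptions: that the installed hostname matches the flake host name `workstation`, and that /etc/os-release carries `ID=nixos` as it does on stock NixOS. `check_identity` is a hypothetical helper.

```shell
# Sketch: confirm the booted system is the expected NixOS host.
# check_identity is a hypothetical helper; the expected name is an assumption.
check_identity() {
    expected=$1                       # flake host name, "workstation" in this guide
    os_release=${2:-/etc/os-release}  # overridable for testing
    actual=$(hostname)
    if [ "$actual" != "$expected" ]; then
        echo "hostname is '$actual', expected '$expected'" >&2
        return 1
    fi
    # NixOS identifies itself with ID=nixos in os-release.
    grep -Eq '^ID="?nixos"?$' "$os_release" || {
        echo "$os_release does not report NixOS" >&2
        return 1
    }
}
```

Usage would be `check_identity workstation`, which is silent on success and prints the mismatch otherwise.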
11.2 Verify Repo Location
If you copied the repo before install:
test -d ~/Projects/dubnium
cd ~/Projects/dubnium
git status --short
If the repo is missing, import it now before treating the system as fully owned by the flake source.
11.3 Verify Mode CLI
mode status
mode current
mode desired
11.4 Verify systemd Units
systemctl status desktop.target
systemctl status compute.target
systemctl status vllm.service
sudo ls -la /run/mode-controller
11.5 Verify Runtime State Files
sudo cat /run/mode-controller/desired
sudo cat /run/mode-controller/current
sudo cat /run/mode-controller/capability-placement.json
sudo cat /run/mode-controller/hardware-topology.json
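The four reads above can be condensed into a presence check that reports every missing file at once. This sketch only tests that each file exists and is non-empty; it makes no assumption about the file contents. `state_files_ok` is a hypothetical helper.

```shell
# Sketch: verify the mode-controller runtime state files exist and are non-empty.
# state_files_ok is a hypothetical helper; file names come from this guide.
state_files_ok() {
    dir=${1:-/run/mode-controller}
    missing=0
    for name in desired current capability-placement.json hardware-topology.json; do
        if [ ! -s "$dir/$name" ]; then
            echo "missing or empty: $dir/$name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Since the commands above read these files with sudo, the function likely needs to run from a root shell as well.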
Expected first-boot posture:
- current mode should be desktop
- vLLM should not be active in desktop
- studio-local-policy.service should not be active
- audio-priority.service should not be active
11.6 Verify NVIDIA Before Compute Testing
Before testing compute, verify NVIDIA basics:
nvidia-smi
lsmod | grep nvidia
Do not run mode request compute from the fresh-install checklist. Compute
transition testing belongs in the bring-up and transition-testing runbooks after
the desktop baseline, observer, and NVIDIA runtime all look correct.
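The two NVIDIA checks can be combined into a single gate that later runbooks could call before any compute request. This is a sketch only: `nvidia_basics_ok` is a hypothetical helper, and it verifies just the module and a driver query, not the observer or the desktop baseline.

```shell
# Sketch: gate compute testing on a loaded kernel module and a working driver query.
# nvidia_basics_ok is a hypothetical helper, not part of the mode CLI.
nvidia_basics_ok() {
    # The kernel module must be loaded.
    lsmod | grep -q '^nvidia' || {
        echo "nvidia kernel module not loaded" >&2
        return 1
    }
    # The driver must also answer a query from user space.
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader || {
        echo "nvidia-smi failed" >&2
        return 1
    }
}
```

On success the function prints one line per GPU with its name and driver version, which is also a convenient value to record in the install notes.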
12. Continue With Bring-Up
Start when
- fresh install success criteria are satisfied
- desktop baseline is usable
- mode CLI and runtime state files work
Outcomes
- ownership transfers to the first bring-up checklist
- transition testing is not started from the fresh-install checklist
- compute testing is gated behind the bring-up/transition runbooks
After the machine is installed and boots correctly, continue with First Bring-Up.
That covers:
- dry build vs switch
- mode transition tests
- studio-local checks
- compute checks
- failure inspection paths
13. Common Failure Areas
Start when
- an install, boot, or first verification step failed
- error output or observed failure is available
Outcomes
- failure is categorized before more changes are made
- recovery work targets the likely failure bucket
- repeated failures are recorded with evidence
Fresh installs usually fail in one of these buckets:
- wrong disk selected during partitioning
- incorrect mount layout before nixos-generate-config
- hardware config not regenerated into the repo
- bootloader/EFI mismatch
- NVIDIA/runtime issues after first boot
- vLLM/model/runtime issues once compute mode is exercised
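The mount-layout bucket is cheap to rule out from the live environment. The sketch below assumes the layout used by the rescue mounts in this guide (root at /mnt, EFI at /mnt/boot) and relies on `mountpoint -q` from util-linux; `check_mount_layout` is a hypothetical helper.

```shell
# Sketch: confirm root and EFI are mounted where the installer expects them.
# check_mount_layout is a hypothetical helper; the /mnt and /mnt/boot layout
# follows the rescue mount commands in this guide.
check_mount_layout() {
    root=${1:-/mnt}
    for dir in "$root" "$root/boot"; do
        # mountpoint -q exits 0 only if the directory is itself a mount point.
        if ! mountpoint -q "$dir"; then
            echo "not a mount point: $dir" >&2
            return 1
        fi
    done
}
```

Running `check_mount_layout /mnt` before `nixos-generate-config` or `nixos-install` catches the case where the EFI partition was never mounted and boot files would land on the root filesystem instead.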
14. Success Criteria
Start when
- all previous steps either passed or were intentionally skipped with a reason
- first boot verification has been completed
Outcomes
- the machine is installed from the flake
- the system boots from disk
- the repo-based configuration owns the machine
- the machine is ready for first bring-up, not yet full compute operation
A successful fresh install means:
- the machine boots from disk into the flake-managed system
- mode status works
- the repo-based configuration owns the system from first boot
- you can move on to the bring-up checklist without reinstalling