Dual-Mode NixOS Workstation / AI Node

flowchart TD
    HW[Hardware]

    subgraph BaseOS[NixOS Base Layer]
        Kernel
        Drivers[NVIDIA / CUDA]
        Network
        Storage
        Nix
        systemd
    end

    subgraph Control[Mode Control Plane]
        Desired[Desired State]
        Current[Current State]
        Reconcile[Reconciler]
        Guards[Guard Checks]
    end

    subgraph LocalModes[Host-Local Modes]
        Desktop[Desktop / Dev]
        StudioLocal[Studio-Local / Audio-Priority]
        Compute[Compute / Headless]
    end

    subgraph Placement[Capability Placement]
        StudioCap[Studio Capability]
        AICap[AI Capability]
        PlatformCap[Platform Capability]
    end

    subgraph Workloads[Workloads]
        Hyprland
        PipeWire
        Reaper
        vLLM
        k3s
    end

    HW --> BaseOS
    BaseOS --> Control
    Control --> LocalModes
    LocalModes --> Workloads
    LocalModes --> Placement

4. Mode Definitions and Capability Placement

This document distinguishes between:

host-local operational modes for the NixOS machine
capability placement for functions that may later move to another machine

4.1 Host-Local Modes

Desktop / Dev Mode

Intent

Balanced interactive mode for programming, office work, light desktop use, and bounded AI.

Properties

GUI enabled
audio enabled for ordinary desktop use
GPU0 reserved for display/compositor
GPU1 may be used by AI workloads
vLLM constrained to single-GPU operation or disabled
k3s control plane may remain active
CPU/RAM contention must remain bounded

Studio-Local / Audio-Priority Profile

Intent

A stricter local operating profile for low-latency audio work when Studio remains on the NixOS host.

Properties

modeled as a protected interactive profile closely related to Desktop
GUI enabled
audio stack prioritized
display GPU reserved exclusively for desktop responsibilities
AI workloads disabled or reduced to near-zero
heavy I/O and background maintenance jobs disallowed
scheduler and system policy biased toward stable audio behavior

This profile is conditional and potentially temporary. It exists so the NixOS host can support local audio/studio workflows now, without assuming that Studio will remain a first-class local mode permanently.

Implementation note

For the first implementation pass, studio-local should be modeled as a policy overlay on desktop, not as a first-class top-level systemd target. The operational state still exists in the controller/state model, but its enactment should initially be handled by marker/helper units layered onto the desktop path.

Compute / Headless Mode

Intent

Throughput-oriented headless mode for AI serving and platform duties.

Properties

GUI disabled
audio stack off or irrelevant
both GPUs available to AI workloads
vLLM may use both GPUs
k3s workloads may run more aggressively
CPU/RAM/storage can be utilized much more aggressively than in interactive modes

4.2 Capability Placement Model

Certain capabilities may be placed either:

locally on the NixOS host
externally on another machine

Capability: Studio / Audio

Possible placements:

local
external-mac-mini

Capability: AI / Inference

Expected placement:

primarily local-nixos-host

Capability: Platform / k3s Control

Expected placement:

primarily local-nixos-host

4.3 Design Implication

The host-local state machine should remain valid even if Studio/Audio is moved to a Mac mini. That means Studio-specific policy should be represented as a local profile or conditional mode, not as the permanent center of the entire host architecture.

5. Resource Ownership Model

5.0 Implementation Note — Hardware-Tolerant Bring-Up

The architecture should continue to plan for the intended dual-GPU topology, but the NixOS implementation should remain tolerant of transitional hardware states while the second GPU is not yet installed or configured.

That means:

the policy model may still describe the intended two-GPU end state
module options should encode planned GPU ownership explicitly
active service profiles must only reference GPUs that are currently present
missing future hardware must not cause ordinary evaluation or steady-state services to fail unnecessarily

5.1 GPU Ownership

Mode	GPU0	GPU1
Desktop	Display / compositor	AI optional
Studio-Local	Display / compositor (protected)	AI off or minimal
Compute	AI	AI
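
A minimal sketch of how the ownership table and the hardware-tolerant bring-up rule in 5.0 could be combined. The function name ai_gpus_for_mode is hypothetical, GPU indices 0 and 1 are assumed to match GPU0 and GPU1 above, and Studio-Local is treated as "AI off" even though the table also allows "minimal".

```sh
# Print the GPU indices AI workloads may use (CUDA_VISIBLE_DEVICES-style list),
# given the host-local mode and the number of GPUs currently present.
# Usage: ai_gpus_for_mode <desktop|studio-local|compute> <present_gpu_count>
ai_gpus_for_mode() {
    case "$1:$2" in
        compute:2)      echo "0,1" ;;  # intended end state: both GPUs serve AI
        compute:1)      echo "0" ;;    # transitional host: only GPU0 exists
        desktop:2)      echo "1" ;;    # GPU0 stays with the compositor
        desktop:1)      echo "" ;;     # the only GPU is the display GPU
        studio-local:1|studio-local:2)
                        echo "" ;;     # assumption: AI fully off in this sketch
        *)              return 1 ;;    # unknown mode or unsupported GPU count
    esac
}
```

An empty result means no GPU is available to AI in that mode, so a caller would leave the AI service stopped rather than start it against missing hardware.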

5.2 CPU Ownership

shared via cgroups/systemd slices
interactive slices retain priority/headroom in Desktop and Studio-Local
compute slices may saturate cores in Compute

5.3 Memory Ownership

bounded AI memory usage in Desktop
stricter constraints in Studio-Local
relaxed/high utilization in Compute

5.4 Storage Ownership

heavy background I/O restricted in Studio-Local
permitted but bounded in Desktop
broadly permitted in Compute

5.5 Audio Ownership

effectively exclusive in Studio-Local
protected in Desktop
not guaranteed in Compute

6. Invariants

These are system-level properties that must remain true regardless of transition path or future Studio placement.

6.1 Safety Invariants

At most one host-local operational mode is authoritative at a time.
A transition must either complete to a stable target state or abort back to a known-safe prior state.
Mode transitions must be idempotent. Re-running a transition toward an already-satisfied state must not cause harm.
When Studio-Local is active, heavyweight compute workloads must not materially jeopardize audio latency.
Compute mode must not require a running graphical session.
GPU0 must not be simultaneously treated as both protected display GPU and unrestricted compute GPU.
The controller must not promote the system into Compute if guard failures indicate active user/audio risk.
The system must always expose a way to determine current mode, desired mode, and last transition result.
The host-local mode model must remain coherent if Studio/Audio capability is externalized to another machine.

6.2 State Invariants

Desired state is authoritative intent.
Current state is observed runtime fact.
Reconciliation moves current state toward desired state; it never rewrites observed state to match wishful intent.
A guard failure blocks transition, but does not silently change desired state unless policy explicitly says so.

6.3 Operational Invariants

Models and mutable runtime data must live outside the Nix store.
Dotfiles may influence user experience, not machine-critical mode policy.
Mode policy must remain expressible and inspectable via systemd and Nix configuration.
Capability placement decisions must not silently invalidate host-local invariants.

7. Desired State vs Current State

7.1 Desired State

The host-local mode the user or automation wants the system to be in.

Examples:

desktop
studio-local
compute

7.2 Current State

The host-local mode the system is actually in, as determined by observation.

Examples:

graphical target active, PipeWire active, vLLM limited → likely desktop
graphical target inactive, compute services active, both GPUs exposed to AI → likely compute
GUI active, audio priority raised, compute services reduced → likely studio-local
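
The examples above can be sketched as a small classifier. This is only an illustration: the function name classify_mode and the three boolean inputs are assumptions, and a real classifier would likely weigh more facts (GPU exposure, vLLM profile) than shown here.

```sh
# Classify the host-local mode from three observed facts, each passed as 1 or 0:
#   $1 graphical target active
#   $2 studio-local overlay marker active (e.g. studio-local-policy.service)
#   $3 compute target active
classify_mode() {
    case "$1$2$3" in
        100) echo desktop ;;
        110) echo studio-local ;;
        001) echo compute ;;
        *)   echo unknown; return 1 ;;  # contradictory or incomplete facts
    esac
}
```

A caller might derive each input from something like `systemctl is-active --quiet desktop.target`, converting the exit status to 1 or 0 before calling the classifier.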

7.3 Why This Split Matters

Without this split, the system can lie to itself:

a command says “switch to compute”
but GPU is still held by compositor
vLLM failed to scale up
audio services are still active

In that case:

desired state = compute
current state = transitioning or desktop (degraded)

The control plane must detect and reconcile this rather than assuming success.

8. Source of Truth for Mode

The system needs one authoritative representation of requested host-local mode.

8.1 Options Considered

Option A — File-Based Source of Truth

Example:

/run/mode-controller/desired
/var/lib/mode-controller/desired

Pros

simple
easy to inspect
works outside active user session
easy for scripts and systemd units

Cons

can drift from actual runtime state
needs permissions and lifecycle handling

Option B — Environment Variable Source of Truth

Example:

MODE=compute

Pros

simple for one-shot commands
easy in shell contexts

Cons

poor system-wide authority
ephemeral
fragile across sessions/reboots
bad fit for authoritative machine state

Option C — systemd State as Source of Truth

Example:

compute.target active implies desired mode is compute

Pros

tightly aligned with implementation
introspectable
avoids duplicate state stores

Cons

desired state and current state can become conflated
harder to represent “requested but not yet achieved”
recovery/abort semantics become more awkward

8.2 Recommended Model

Use a hybrid model:

Desired state source of truth: the file at /run/mode-controller/desired
Current state source of truth: observed systemd/runtime facts
Transition machinery: systemd targets + controller service

This cleanly separates:

intent
observation
enforcement

8.3 Proposed Files

/run/mode-controller/desired
/run/mode-controller/current
/run/mode-controller/last-transition.json

The current file may hold a cached observation, but the observed mode should always be re-derivable from live system state.
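
A minimal sketch of reading and writing the desired-state file. The MODE_STATE_DIR override and the function names are assumptions added for illustration and testing; only the /run/mode-controller/desired path and the three mode names come from this document.

```sh
# Directory holding controller state; the override exists only for testing.
MODE_STATE_DIR=${MODE_STATE_DIR:-/run/mode-controller}

is_valid_mode() {
    case "$1" in
        desktop|studio-local|compute) return 0 ;;
        *) return 1 ;;
    esac
}

# Write the desired mode atomically: write a temp file, then rename it,
# so readers never see a partially written value.
write_desired() {
    is_valid_mode "$1" || { echo "invalid mode: $1" >&2; return 2; }
    mkdir -p "$MODE_STATE_DIR" || return 1
    tmp="$MODE_STATE_DIR/.desired.$$"
    printf '%s\n' "$1" > "$tmp" && mv -f "$tmp" "$MODE_STATE_DIR/desired"
}

# Print the desired mode, or fail if no request has been recorded yet.
read_desired() {
    [ -r "$MODE_STATE_DIR/desired" ] || return 1
    IFS= read -r mode < "$MODE_STATE_DIR/desired" && printf '%s\n' "$mode"
}
```

Rejecting unknown names at write time keeps the desired file restricted to values the reconciler can act on.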

9. State Machine

9.1 States

S0: Boot

Initial state before default operating mode is established.

S1: Desktop

Interactive general-purpose mode.

S2: StudioLocal

Strict interactive low-latency local audio profile.

S3: Compute

Headless throughput-oriented mode.

S4: Transitioning

Ephemeral reconciliation state while moving toward desired mode.

S5: FailedTransition

A recoverable error state indicating that desired state was not achieved.

9.2 State Diagram

stateDiagram-v2
    [*] --> Boot

    Boot --> Desktop : default boot

    Desktop --> StudioLocal : request(studio-local)
    StudioLocal --> Desktop : request(desktop)

    Desktop --> Transitioning : request(compute)
    StudioLocal --> Transitioning : request(compute)
    Compute --> Transitioning : request(desktop)
    Compute --> Transitioning : request(studio-local)
    Desktop --> Transitioning : request(desktop) / reconcile
    StudioLocal --> Transitioning : request(studio-local) / reconcile
    Compute --> Transitioning : request(compute) / reconcile

    Transitioning --> Desktop : reached(desktop)
    Transitioning --> StudioLocal : reached(studio-local)
    Transitioning --> Compute : reached(compute)
    Transitioning --> FailedTransition : guard_fail / action_fail / timeout

    FailedTransition --> Desktop : recover(previous=desktop)
    FailedTransition --> StudioLocal : recover(previous=studio-local)
    FailedTransition --> Compute : recover(previous=compute)

9.3 Notes

Direct StudioLocal -> Compute may be allowed only through guarded reconciliation, not blind immediate promotion.
Reconciliation should be able to handle “already in desired mode” as a no-op success.
Externalized Studio capability must not require redesign of the host-local state machine; it should only disable or deprecate studio-local usage.
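
The diagram can be sketched as a pure transition function. The name next_state and the event spelling (request:<mode>, reached:<mode>, recover:<mode>) are assumptions; the Compute -> StudioLocal request is included because the guard policy table in 10.3 lists that transition.

```sh
# Print the next state for a (state, event) pair, or fail if the edge is not allowed.
next_state() {
    state=$1 event=$2
    case "$event" in
        request:desktop|request:studio-local|request:compute) kind=request ;;
        reached:desktop|reached:studio-local|reached:compute) kind=reached ;;
        recover:desktop|recover:studio-local|recover:compute) kind=recover ;;
        *) kind=$event ;;
    esac
    target=${event#*:}
    case "$state/$kind" in
        boot/default) echo desktop ;;
        desktop/request|studio-local/request)
            # Desktop <-> StudioLocal switch directly; everything else reconciles.
            case "$state>$target" in
                "desktop>studio-local"|"studio-local>desktop") echo "$target" ;;
                *) echo transitioning ;;
            esac ;;
        compute/request) echo transitioning ;;
        transitioning/reached) echo "$target" ;;
        transitioning/guard_fail|transitioning/action_fail|transitioning/timeout)
            echo failed-transition ;;
        failed-transition/recover) echo "$target" ;;
        *) return 1 ;;
    esac
}
```

Requests for the mode already held (for example desktop with request:desktop) go through transitioning, which matches the "reconcile" edges and lets the no-op case be handled by the reconciler.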

10. Guards

Guards are explicit check functions. They return exit codes and optionally structured diagnostics.

10.1 Guard Interface

Each guard function should follow a predictable interface:

check_<name>
exit 0     = pass
exit 10-19 = policy failure / guard blocked
exit 20-29 = check execution error / indeterminate

Guards should ideally emit structured diagnostics (JSON or key=value) on stdout/stderr for logging.
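
A minimal sketch of how a caller might interpret this exit-code contract. The function names classify_guard_exit and run_guard are hypothetical, and treating codes outside the documented ranges as "unknown" is an assumption of this sketch (a controller would probably fail closed on them).

```sh
# Map a guard exit code to its meaning under the interface above.
classify_guard_exit() {
    rc=$1
    if [ "$rc" -eq 0 ]; then echo pass
    elif [ "$rc" -ge 10 ] && [ "$rc" -le 19 ]; then echo blocked
    elif [ "$rc" -ge 20 ] && [ "$rc" -le 29 ]; then echo error
    else echo unknown
    fi
}

# Run one guard, log a key=value diagnostic line to stderr, and
# return the guard's own exit code unchanged.
run_guard() {
    guard=$1; shift
    rc=0
    "$guard" "$@" || rc=$?
    printf 'guard=%s rc=%s result=%s\n' \
        "$guard" "$rc" "$(classify_guard_exit "$rc")" >&2
    return "$rc"
}
```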

10.2 Guard Set

G1: check_audio_idle

Purpose:

verify no active low-latency local audio session that would make compute transition unsafe

Possible checks:

no active REAPER process
no active PipeWire/JACK graph beyond baseline

Exit codes:

0 pass
10 audio active
20 unable to inspect audio graph
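
A partial sketch of this guard covering only the REAPER process check. It assumes the REAPER executable appears in the process list as reaper, and PROC_LIST_CMD is a hypothetical hook for testing; inspecting the PipeWire/JACK graph is not shown.

```sh
# Return 0 if no REAPER process is visible, 10 if one is running,
# 20 if the process list cannot be obtained.
check_audio_idle() {
    # PROC_LIST_CMD is a test hook; the default lists bare command names.
    procs=$(${PROC_LIST_CMD:-ps -e -o comm=}) || {
        echo "reason=process_list_unavailable" >&2
        return 20
    }
    if printf '%s\n' "$procs" | grep -qix 'reaper'; then
        echo "reason=reaper_running" >&2
        return 10
    fi
    return 0
}
```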

G2: check_gpu_display_released

Purpose:

verify display/compositor has released GPU before compute promotion

Possible checks:

no active Hyprland session
no relevant graphical GPU consumers

Exit codes:

0 pass
11 display GPU still owned by GUI
21 GPU inspection failure

G3: check_cpu_load_safe

Purpose:

ensure transition is not occurring during obviously unsafe heavy local activity when policy requires quieting first

Exit codes:

0 pass
12 CPU load too high
22 unable to inspect load
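
A minimal sketch of this guard, reading the 1-minute load average from /proc/loadavg. The default threshold of 4.0 and the optional file argument are assumptions for illustration; a real threshold would depend on core count and policy.

```sh
# Usage: check_cpu_load_safe [max_1min_load] [loadavg_file]
# Returns 0 if load is at or below the threshold, 12 if above,
# 22 if the load average cannot be read.
check_cpu_load_safe() {
    max_load=${1:-4.0}
    loadavg_file=${2:-/proc/loadavg}
    [ -r "$loadavg_file" ] || { echo "reason=loadavg_unreadable" >&2; return 22; }
    read -r load1 _rest < "$loadavg_file" || return 22
    # awk does the floating-point comparison; it exits 0 when load1 > max_load.
    if awk -v l="$load1" -v m="$max_load" 'BEGIN { exit !(l + 0 > m + 0) }'; then
        echo "reason=load_too_high load1=$load1 max=$max_load" >&2
        return 12
    fi
    return 0
}
```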

G4: check_user_jobs_safe

Purpose:

detect known long-running interactive/user jobs that should block auto-transition

Possible checks:

selected process patterns
optional allowlist/denylist

Exit codes:

0 pass
13 user jobs active
23 inspection failure

G5: check_memory_headroom

Purpose:

ensure sufficient memory exists to perform transition or launch target services

Exit codes:

0 pass
14 insufficient headroom
24 inspection failure

G6: check_vllm_drainable

Purpose:

ensure compute workloads can be safely reduced when returning to Desktop/Studio-Local

Exit codes:

0 pass
15 compute workload not drainable
25 inspection failure

G7: check_studio_capability_local

Purpose:

verify that local Studio capability is still available on the NixOS host before allowing studio-local

Possible checks:

local policy flag indicates studio capability still hosted locally
local audio stack and workflow prerequisites are not intentionally disabled due to externalization

Exit codes:

0 pass
19 requested local studio capability not available
29 inspection failure

10.3 Guard Policy by Transition

Transition	Required Guards
Desktop -> StudioLocal	check_target_reachable, check_studio_capability_local, check_user_jobs_safe (optional policy), compute downscale checks (e.g. check_vllm_drainable)
StudioLocal -> Desktop	check_target_reachable
Desktop -> Compute	check_target_reachable, check_audio_idle, check_gpu_display_released, check_cpu_load_safe, check_user_jobs_safe, check_memory_headroom
StudioLocal -> Compute	check_target_reachable, check_audio_idle, check_gpu_display_released, check_cpu_load_safe, check_user_jobs_safe, check_memory_headroom
Compute -> Desktop	check_target_reachable, check_vllm_drainable, check_memory_headroom
Compute -> StudioLocal	check_target_reachable, check_studio_capability_local, check_vllm_drainable, check_memory_headroom
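
The table can be sketched as a lookup plus a runner that stops at the first failing guard. The function names are hypothetical, each guard is assumed to exist as a command or shell function with the names used in the table, and the optional check_user_jobs_safe and the compute downscale checks for Desktop -> StudioLocal are left out of this sketch.

```sh
# Print the required guards for a transition, in the order they should run.
guards_for_transition() {
    common="check_target_reachable"
    to_compute="check_audio_idle check_gpu_display_released check_cpu_load_safe check_user_jobs_safe check_memory_headroom"
    case "$1>$2" in
        "desktop>studio-local") echo "$common check_studio_capability_local" ;;
        "studio-local>desktop") echo "$common" ;;
        "desktop>compute"|"studio-local>compute") echo "$common $to_compute" ;;
        "compute>desktop") echo "$common check_vllm_drainable check_memory_headroom" ;;
        "compute>studio-local") echo "$common check_studio_capability_local check_vllm_drainable check_memory_headroom" ;;
        *) return 1 ;;
    esac
}

# Run the guards for a transition; return the first non-zero guard exit code.
# An unknown transition returns 20 so that it never passes by accident.
run_guards() {
    guard_list=$(guards_for_transition "$1" "$2") || return 20
    for g in $guard_list; do
        "$g" || return $?
    done
    return 0
}
```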

11. Actions and Transition Semantics

Actions are the concrete operations used to move from one state to another.

11.1 Action Vocabulary

stop/terminate GUI session
isolate a target
stop/start units
wait for quiescence
update desired/current state files
restart services with different environment/policies

11.2 Action Interface

Each action should return:

0 success
non-zero failure with logged reason

12. Exact Transition Mapping to systemd Operations

This is the implementation-oriented mapping.

12.1 Assumptions

Systemd targets:

desktop.target
compute.target

studio-local is intentionally not a first-class target in v1. It is represented as a desktop overlay through studio-local-policy.service and audio-priority.service.

Supporting services:

mode-controller@.service (template unit, instantiated per requested mode, e.g. mode-controller@compute.service)
vllm.service
k3s.service
pipewire.service / user session services
graphical session manager or direct Hyprland session

Helper oneshot services/scripts:

mode-prepare-compute.service
mode-prepare-desktop.service
mode-prepare-studio-local.service
mode-observe.service

12.2 Desktop -> StudioLocal

Desired change

desired mode file = studio-local

systemd operations

systemctl start mode-controller@studio-local.service (template instance carrying the requested mode)
controller runs guard set for Desktop -> StudioLocal
controller verifies local Studio capability still exists
controller stops or constrains AI workloads as needed
- v1 policy: systemctl stop vllm.service
controller isolates or verifies desktop.target
controller starts studio-local-policy.service
controller starts audio-priority.service
controller updates current state observation

Example exact operations

write /run/mode-controller/desired = studio-local
systemctl start mode-controller@studio-local.service
systemctl stop vllm.service
systemctl isolate desktop.target
systemctl start studio-local-policy.service
systemctl start audio-priority.service

12.3 StudioLocal -> Desktop

Desired change

desired mode file = desktop

systemd operations

write desired state
start controller
restore normal interactive policies
optionally allow bounded AI services
stop audio-priority.service
stop studio-local-policy.service
systemctl isolate desktop.target
update current observation

Example exact operations

write /run/mode-controller/desired = desktop
systemctl start mode-controller@desktop.service
systemctl stop audio-priority.service
systemctl stop studio-local-policy.service
systemctl isolate desktop.target

12.4 Desktop -> Compute

Desired change

desired mode file = compute

systemd operations

write desired state
start controller for compute
run guards:
- check_target_reachable
- check_audio_idle
- check_gpu_display_released (or prepare to release)
- check_cpu_load_safe
- check_user_jobs_safe
- check_memory_headroom
if interactive session exists, controller requests/forces session termination
- loginctl terminate-session <id>
wait until compositor releases GPU
stop or de-prioritize audio services if needed
stop desktop-specific services not wanted in compute
set service environment/profile for dual-GPU vLLM
systemctl isolate compute.target
start/restart vllm.service
verify current state

Example exact operations

write /run/mode-controller/desired = compute
systemctl start mode-controller@compute.service
loginctl terminate-session <desktop-session>
systemctl stop graphical-session.target   # only if the design defines a system-level target of this name; the stock graphical-session.target belongs to the user manager
systemctl isolate compute.target
systemctl restart vllm.service
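
A minimal sketch of how the controller-side steps above might be sequenced so that any failure aborts the promotion. It assumes guards have already passed, the DRY_RUN switch and function names are hypothetical, and the session ID is supplied by the caller.

```sh
# Run a command, or only print it when DRY_RUN is non-empty.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: promote_to_compute [desktop-session-id]
# Stops at the first failing step so the reconciler can record the failure.
promote_to_compute() {
    session_id=${1:-}
    if [ -n "$session_id" ]; then
        run loginctl terminate-session "$session_id" || return 1
    fi
    run systemctl isolate compute.target || return 1
    run systemctl restart vllm.service || return 1
}
```

Running it with DRY_RUN=1 prints the planned commands without touching the system, which is useful while the transition logic is still being brought up.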

12.5 Compute -> Desktop

Desired change

desired mode file = desktop

systemd operations

write desired state
start controller for desktop
run guards:
- check_target_reachable
- check_vllm_drainable
- check_memory_headroom
drain/stop or downscale vLLM
constrain compute workloads
systemctl isolate desktop.target
start GUI path
ensure GPU0 reserved for display
start/restore audio path
verify current state

Example exact operations

write /run/mode-controller/desired = desktop
systemctl start mode-controller@desktop.service
systemctl stop vllm.service              # or restart single-GPU profile
systemctl isolate desktop.target

12.6 StudioLocal -> Compute

Two possible policies:

Policy A — direct guarded transition

Allowed if all compute guards pass and Studio-Local resources are cleanly relinquished.

Policy B — normalize through Desktop first

Transition path:

studio-local -> desktop -> compute

Recommendation: Use Policy A in implementation, but conceptually treat it as the same reconciliation pipeline with stricter guards.

13. Reconciliation Model

13.1 Motivation

A single `mode request compute` command should not blindly assume success. The system should:

record desired mode
observe current state
compare desired vs current
compute required transition plan
execute actions
re-observe
either declare success or enter failed transition state

13.2 Reconciliation Loop

flowchart TD
    Req[Request mode] --> Write[Write desired state]
    Write --> Observe[Observe current state]
    Observe --> Compare{Desired == Current?}
    Compare -->|Yes| Done[No-op success]
    Compare -->|No| Plan[Select transition plan]
    Plan --> Guards[Run guards]
    Guards -->|Fail| Fail[Record failure]
    Guards -->|Pass| Act[Execute actions]
    Act --> Reobserve[Observe current state again]
    Reobserve --> Verify{Reached desired?}
    Verify -->|Yes| Success[Record success]
    Verify -->|No| RetryOrFail[Retry boundedly or fail]
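
The loop above can be sketched as a single function with bounded retries. The three hooks observe_current, guards_pass, and apply_transition are hypothetical and must be provided by the caller; the default of 3 attempts and the key=value result line are assumptions.

```sh
# Usage: reconcile <desired-mode> [max_attempts]
# Hooks expected from the caller:
#   observe_current           prints the observed mode
#   guards_pass FROM TO       returns 0 if all guards pass
#   apply_transition FROM TO  executes the actions, returns 0 on success
reconcile() {
    desired=$1
    max_attempts=${2:-3}
    prior=$(observe_current) || return 20
    if [ "$prior" = "$desired" ]; then
        echo "result=noop mode=$desired"
        return 0
    fi
    # A guard failure blocks the transition outright; it is not retried.
    if ! guards_pass "$prior" "$desired"; then
        echo "result=failed step=guards desired=$desired prior=$prior"
        return 1
    fi
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # Success requires both the actions and the re-observation to agree.
        if apply_transition "$prior" "$desired" &&
           [ "$(observe_current)" = "$desired" ]; then
            echo "result=success mode=$desired attempts=$attempt"
            return 0
        fi
        attempt=$((attempt + 1))
    done
    echo "result=failed step=actions desired=$desired prior=$prior attempts=$max_attempts"
    return 1
}
```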

13.3 Reconciliation Semantics

bounded retries only
no infinite loops
every failure is logged with:
- desired state
- prior state
- failing guard or action
- timestamp
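
A minimal sketch of recording those fields to last-transition.json. The JSON key names, the TRANSITION_LOG override, and the function name are assumptions; the values are assumed to be simple identifiers that need no JSON escaping.

```sh
# Usage: record_transition <result> <desired> <prior> [failing-guard-or-action]
record_transition() {
    result=$1 desired=$2 prior=$3 failing=${4:-}
    out=${TRANSITION_LOG:-/run/mode-controller/last-transition.json}
    ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)  # UTC timestamp in ISO 8601 form
    printf '{"result":"%s","desired":"%s","prior":"%s","failing":"%s","timestamp":"%s"}\n' \
        "$result" "$desired" "$prior" "$failing" "$ts" > "$out"
}
```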

13.4 Why This Matters

This lets you support:

manual requests
idle-triggered auto-switching
boot-time default mode
recovery after partial failures

all through one mechanism.

14. Specialisations vs Runtime Switching

This is the main architectural fork.

14.1 Option A — Runtime Switching Only

Use one host definition with multiple systemd targets and runtime policies.

Pros

fast transitions
no reboot required
best UX for switching between Desktop and Studio-Local
simpler for day-to-day operation

Cons

weaker isolation
harder to fully guarantee all services/resources are cleanly re-bound
risk of state leakage between modes
some kernel/driver tuning differences are awkward live

Best fit

Desktop <-> Studio-Local
Desktop <-> Compute where flexibility matters more than hard isolation

14.2 Option B — NixOS Specialisations Only

Use separate NixOS specialisations for Desktop and Compute (and possibly Studio-Local).

Pros

stronger isolation between role profiles
easier to vary deeper system settings, kernel params, service sets
clearer recovery story
closer to “logical separate machines”

Cons

slower transitions, often reboot-oriented in practice
poorer UX for frequent switching
more configuration duplication risk if not structured well

Best fit

Desktop vs Compute if you want very strong separation
not ideal for rapid Studio-Local toggling

14.3 Option C — Hybrid Model

Use:

runtime switching for Desktop <-> Studio-Local
specialisation boundary between Interactive and Compute families

Example:

default specialisation = interactive
- runtime modes inside it: desktop, studio-local
compute specialisation = headless compute

Pros

strongest overall architecture
preserves good UX for Studio-Local transitions
lets Compute differ more deeply if needed
handles future externalization of Studio more cleanly than treating Studio as a permanent top-level host identity

Cons

more design complexity
transition from interactive to compute may become reboot-oriented or at least heavier
more machinery to maintain

14.4 Recommendation

For your current goal, use runtime switching first, with the design shaped so it can later evolve into a hybrid model.

Reasoning

you need to learn actual contention boundaries first
Desktop <-> Studio-Local benefits heavily from live switching
Desktop <-> Compute can start as runtime-switched
if the system proves too “sticky” or leaky, you can later promote Compute into a specialisation without redesigning the higher-level state machine
if Studio moves to a Mac mini, the host-local model remains intact

Practical recommendation

Phase the design like this:

Phase 1: one host, runtime switching only
Phase 2: strong slices/targets/guards
Phase 3: evaluate whether Compute should become a specialisation
Phase 4: if Studio is externalized, deprecate or disable studio-local without changing the operator-facing control model

This preserves velocity while keeping the abstraction clean.

15. Service Placement

15.1 Host-Level Services

Hyprland
PipeWire
Reaper
NVIDIA drivers/runtime
mode controller
possibly vLLM initially
SSH / system services

15.2 k3s-Level Services

Hyperion services
platform/orchestration services
dashboards and supporting workloads
possibly model-serving abstractions later

First-pass implementation note

In v1, prefer keeping k3s.service continuously available while varying:

platform.slice resource budgets
which workloads are allowed to run aggressively
how much local compute capacity cluster workloads may consume

This is preferable to stopping and starting the cluster runtime during ordinary mode transitions.

15.3 Externalized Services (Possible Future)

Studio/Audio workflows on Mac mini
DAW/plugin-heavy sessions
live audio interfaces and controllers

15.4 Recommendation

Keep hardware-near, latency-sensitive, and GPU-debug-sensitive components on the host first. Move services into k3s only after the host-level mode model is stable. Treat Mac mini externalization as a placement decision, not as a redesign trigger for the host-local state machine.

16. Idle Detection Policy

16.1 Role of Idle Detection

Idle detection is an input signal to the reconciler, not authority on its own.

16.2 Signals

input inactivity
audio activity
GPU utilization / ownership
CPU load
selected user-job checks

16.3 Policy

Idle-triggered promotion to Compute should:

update desired state to compute
run the normal reconciliation pipeline
abort safely if guards fail

It must never bypass guards.

16.4 Studio-Local Policy

Auto-promotion from studio-local to compute should generally be disabled unless explicitly requested. This remains true even if Studio capability later moves off-box.
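
A minimal sketch of the idle policy in 16.3 and 16.4. The function name and the opt-in flag are hypothetical; the function only decides which mode an idle trigger should request, and the request still goes through the normal reconciliation pipeline and its guards.

```sh
# Usage: idle_promotion_target <current-mode> [allow_from_studio: 1|0]
# Prints the mode an idle trigger should request, or nothing if it should not act.
idle_promotion_target() {
    current=$1
    allow_from_studio=${2:-0}
    case "$current" in
        desktop)
            echo compute ;;
        studio-local)
            # Disabled by default; only promoted when explicitly allowed.
            if [ "$allow_from_studio" = 1 ]; then echo compute; fi ;;
        *)
            : ;;  # already in compute, or state unknown: do nothing
    esac
    return 0
}
```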

17. Security Boundaries

Zones

user desktop zone
system service zone
AI workload zone
cluster service zone
optional external Studio zone

Controls

bind services to appropriate interfaces
keep secrets outside dotfiles, e.g. SOPS/agenix
keep mode control operations privileged and auditable
do not let externalized capability assumptions silently weaken host-local controls

18. Risks and Failure Modes

18.1 Audio Degradation

Cause:

background contention

Mitigation:

Studio-Local invariants
strict guard/action policy

18.2 GPU Contention

Cause:

compositor and AI workloads racing for ownership

Mitigation:

explicit GPU ownership model
guard checks before Compute promotion

18.3 Partial Transition

Cause:

GUI exits but vLLM fails to restart
desired state written but current state never converges

Mitigation:

reconciliation loop
bounded retries
failed-transition state

18.4 Configuration Drift

Cause:

policy split across ad hoc scripts and dotfiles

Mitigation:

keep mode policy in Nix + systemd-controlled scripts

18.5 Capability Drift

Cause:

Studio capability moved to Mac mini, but local state machine or guards still assume it is local

Mitigation:

explicit capability placement model
check_studio_capability_local
ADR-backed deprecation path for studio-local

19. Open Questions

Should vLLM be host-managed or profile-switched through separate unit templates?
When should Compute graduate into a NixOS specialisation?
How strict should auto-transition be about user jobs and unsaved work heuristics?
Should current state be derived on demand only, or also cached to /run/mode-controller/current?
At what point should local Studio capability be considered officially externalized to a Mac mini?
What data/project sync model is required if Studio is split across machines?

19.1 Resolved Near-Term Decision

For v1:

studio-local is not a first-class target
studio-local is represented as a protected interactive policy overlay on desktop
desktop and compute are the only first-class top-level target families

This keeps the first implementation smaller while preserving the higher-level operational model and leaving room to strengthen Studio semantics later if needed.

19.2 Future Alternatives

Alternative A — Keep `studio-local` as an overlay permanently

Pros:

less target duplication
easier future deprecation if Studio moves to a Mac mini
simpler runtime switching model

Cons:

weaker systemd-level separability
more policy encoded in helper units and controller logic

Alternative B — Promote `studio-local` into a first-class target later

Pros:

stronger explicitness in systemd
easier inspection of Studio-specific dependencies
potentially clearer resource-policy boundaries

Cons:

higher maintenance cost
more duplication with desktop
less aligned with the likely future externalization path

Recommendation

Start with the overlay model. Revisit only if empirical evidence shows that audio-protection policy is too hard to express or validate without a dedicated target.

19.3 Resolved Near-Term Decision — vLLM Service Shape

Target architecture:

vllm@desktop.service
vllm@compute.service

However, for the first implementation pass, a single vllm.service is acceptable if:

desktop and compute profiles are still modeled explicitly in configuration
controller actions remain profile-aware
observation logic can still determine which profile is active

This allows the first bootable milestone to stay small without locking the architecture into a monolithic service model.

19.4 Resolved Near-Term Decision — k3s Service Shape

For v1:

k3s.service should remain stable across host-local modes
mode differences should be expressed through:
- slice/resource budgets
- workload-placement or workload-intensity policy
- optional node labels/taints later

This keeps the control plane smaller and avoids coupling every host-mode transition to cluster-runtime teardown and recovery.

Future alternative

If empirical operation shows that stable-across-modes k3s still creates unacceptable interference or ambiguity, stronger k3s mode switching can be introduced later. That should be treated as a deliberate escalation, not the default starting point.

19.5 Resolved Near-Term Decision — Desktop AI Policy

For v1:

keep vLLM off in desktop for the first convergence milestone
prove desktop ↔ compute transitions before enabling bounded desktop-mode AI

Future alternative

After the control plane is reliable, bounded desktop-mode AI may be introduced as an explicit profile with clear GPU1 ownership and resource limits.

19.6 Resolved Near-Term Decision — `studio-local` Overlay Shape

For v1, represent studio-local with:

studio-local-policy.service
audio-priority.service

This gives the controller and observation logic a clear marker plus an explicit enforcement unit without promoting Studio into a first-class top-level target.

Future alternative

If this proves too implicit, studio-local can later be promoted into a stronger grouped target or target-like overlay.

19.7 Resolved Near-Term Decision — Capability Placement Source

For v1, capability-placement.json should be generated from Nix configuration rather than edited ad hoc at runtime.

Rationale

keeps placement policy reproducible
avoids silent runtime drift
matches the design goal that machine-critical policy remain inspectable in Nix and systemd-managed artifacts

Future alternative

If operational experimentation later requires it, an explicit runtime override layer may be added with well-defined precedence and auditability.

19.8 Resolved Near-Term Decision — `mode force`

For v1, defer `mode force`.

Rationale

keeps attention on making the ordinary reconciliation path correct
avoids masking immature guard or transition logic
reduces the chance of bypassing safety boundaries during initial bring-up

Future alternative

Add mode force later only after hard-vs-soft guard semantics are stable and well tested.

19.9 Resolved Near-Term Decision — GUI Teardown Semantics

For v1, compute promotion should require:

graphical session absence
explicit GPU-release verification

It should not initially depend on forcibly stopping every greeter or display-manager path unless empirical testing shows those components interfere with reliable GPU handoff.

19.10 Resolved Near-Term Decision — Desktop Target Ownership

For v1, desktop.target should not directly own the greeter/login path.

Rationale

keeps mode ownership focused on operational policy rather than full session-manager orchestration
reduces coupling to whichever login/session stack is chosen
lets session presence remain an observed fact rather than an aggressively managed requirement

Future alternative

If desktop recovery proves unreliable without tighter control, greeter or display-manager paths can later be pulled under stronger mode ownership.

19.11 Resolved Near-Term Decision — `studio-local-policy.service` Scope

For v1, studio-local-policy.service should be:

a reliable marker for observation/classification
a light policy-application unit
explicitly limited in scope

It should not become a giant all-in-one Studio behavior controller.

Rationale

preserves clear observability
avoids burying controller logic inside a catch-all helper unit
keeps Studio overlay behavior inspectable and decomposable

19.12 Resolved Near-Term Decision — `observe-current` Implementation Language

For v1, implement observe-current in shell.

Constraints

keep the output contract stable:
- plain mode name for shell use
- structured JSON for diagnostics
structure the implementation so it can later be replaced by a typed helper without changing callers
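
A minimal sketch of the two-format output contract. The function name, argument order, and JSON field names are assumptions; the point is that callers depend only on the format, so the classifier behind it can be replaced later.

```sh
# Usage: emit_observation <plain|json> <mode> <gui> <studio_overlay> <compute_target>
# The three facts are passed as the literal strings true or false.
emit_observation() {
    format=$1 mode=$2
    case "$format" in
        plain)
            printf '%s\n' "$mode" ;;
        json)
            printf '{"mode":"%s","gui":%s,"studio_overlay":%s,"compute_target":%s}\n' \
                "$mode" "$3" "$4" "$5" ;;
        *)
            echo "unknown format: $format" >&2
            return 2 ;;
    esac
}
```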

Future alternative

If classifier complexity or JSON handling becomes unwieldy, replace only the classifier implementation with a small typed helper while keeping the same external contract.

19.13 Resolved Near-Term Decision — `mode` CLI Packaging

For v1:

keep the script sources in the repository
package them in pkgs/
install them through the NixOS module

Rationale

keeps the tool packaging clean and testable
avoids scattering ad hoc scripts directly into module definitions
preserves a clean path to reuse across hosts later

19.14 Resolved Near-Term Decision — Reconciler Trigger Model

For v1:

use parameterized oneshot reconciliation only
do not enable timer-driven or path-triggered background reconciliation yet

Rationale

keeps failure behavior easier to understand during bring-up
avoids masking transition bugs behind background retries
lets manual transitions prove the model first

Future alternative

After manual transitions are reliable, add periodic or path-triggered reconciliation for self-healing behavior.

19.15 Resolved Near-Term Decision — Boot Policy

For v1:

normalize to desktop on boot
do not replay persistent desired mode across reboot

Rationale

gives the system a predictable safe recovery posture
avoids booting directly back into a problematic compute path while the controller is still maturing
keeps early operational behavior easier to reason about

Future alternative

Once transitions are reliable, desired-state persistence across reboot can be introduced as an explicit policy feature.
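
A minimal sketch of the boot normalization step, assuming it runs from an early oneshot unit. The function name and the MODE_STATE_DIR override are hypothetical and exist only to make the sketch testable outside /run.

```shell
# Hypothetical helper: reset the desired mode to "desktop" at boot.
MODE_STATE_DIR="${MODE_STATE_DIR:-/run/mode-controller}"

normalize_boot_mode() {
    mkdir -p "$MODE_STATE_DIR" || return 1
    # /run is a tmpfs, so no desired mode survives a reboot anyway; writing
    # the value explicitly makes the boot posture visible to `mode desired`.
    printf '%s\n' desktop > "$MODE_STATE_DIR/desired"
}
```

Because the state directory lives under /run, the "do not replay" half of the policy holds by construction; only a later persistence feature would need storage outside /run.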

19A. Architectural Decision Record — Potential Studio Externalization

Context

There is a realistic possibility that low-latency Studio/Audio workloads will migrate from the NixOS machine to a Mac mini.

Decision

The NixOS host architecture should treat Studio as a conditional local profile (studio-local) rather than a permanently central host mode.

Consequences

the host-local state machine remains stable if Studio moves off-box
Compute and Desktop remain the durable primary host-local modes
Studio capability can be represented separately through workload placement decisions
local audio support can still exist now without overcommitting the architecture to a permanent local Studio role

Follow-on Design Implications

add check_studio_capability_local guard for any studio-local transition
keep local audio policy isolated from core Compute/Desktop mechanics where practical
document future sync, control, and workflow boundaries if Studio becomes externalized

20. Control Interface and Implementation Contract

20.1 `mode` CLI Contract

The system should expose a single operator-facing interface:

mode status
mode request <desktop|studio-local|compute>
mode reconcile
mode current
mode desired
mode explain <desktop|studio-local|compute>
mode dry-run <desktop|studio-local|compute>
mode force <desktop|studio-local|compute>

Command Semantics

`mode status`

Returns:

desired mode
observed current mode
whether reconciliation is needed
last transition result
blocking guard failures, if any

`mode request <mode>`

Behavior:

write desired state
invoke reconciliation
return success only if reconciliation converged

`mode reconcile`

Behavior:

observe current state
compare to desired
select transition plan
run guards
execute actions
record results

`mode current`

Returns only the observed current mode.

`mode desired`

Returns only the contents of the desired mode file.

`mode explain <mode>`

Prints:

target state properties
expected services
resource ownership rules
guards required for entering that mode
capability placement assumptions, where relevant

`mode dry-run <mode>`

Simulates the full reconciliation plan without mutating state.

`mode force <mode>`

Privileged path that bypasses selected soft guards. It must never bypass hard guards such as GPU/display ownership or active audio protection, unless a specific override has been explicitly designed for that guard.

Implementation note:

defer this command in v1
keep it in the long-term interface contract so the design remains forward-compatible
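
The command surface above can be sketched as a small dispatcher. This is an illustration only: the subcommand bodies are stubs that echo what would run, and the return codes are assumptions (18 is borrowed from the "invalid request" guard code in 24.2, 2 is an arbitrary usage-error code).

```shell
# Hypothetical dispatcher for the `mode` CLI; real subcommand logic is omitted.
is_valid_mode() {
    case "$1" in
        desktop|studio-local|compute) return 0 ;;
        *) return 1 ;;
    esac
}

mode_main() {
    local cmd="${1:-}" target="${2:-}"
    case "$cmd" in
        status|reconcile|current|desired)
            echo "run:$cmd" ;;
        request|explain|dry-run)
            if ! is_valid_mode "$target"; then
                echo "usage: mode $cmd <desktop|studio-local|compute>" >&2
                return 18   # assumption: reuse the invalid-request code
            fi
            echo "run:$cmd:$target" ;;
        force)
            # Kept in the interface, but deferred in v1.
            echo "mode force is deferred in v1" >&2
            return 2 ;;
        *)
            echo "usage: mode <status|request|reconcile|current|desired|explain|dry-run|force>" >&2
            return 2 ;;
    esac
}
```

Validating the mode name in the dispatcher keeps every subcommand from having to repeat the check.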

21. State Storage Layout

21.1 Runtime State Paths

/run/mode-controller/
  desired
  current
  lock
  last-transition.json
  last-guards.json
  reconcile.pid
  capability-placement.json
  hardware-topology.json

21.2 File Semantics

`desired`

Contains the requested mode:

desktop
studio-local
compute
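
A sketch of how the desired file could be written safely, assuming readers may open it at any moment. The helper names and the MODE_STATE_DIR override are hypothetical; the return codes follow the convention in 24.2 only loosely.

```shell
# Hypothetical helpers for the `desired` state file.
MODE_STATE_DIR="${MODE_STATE_DIR:-/run/mode-controller}"

write_desired() {
    local mode="$1" tmp
    case "$mode" in
        desktop|studio-local|compute) ;;
        *) echo "invalid mode: $mode" >&2; return 18 ;;
    esac
    mkdir -p "$MODE_STATE_DIR" || return 20
    # Write to a temp file in the same directory, then rename. A rename
    # within one filesystem is atomic, so readers never see a partial file.
    tmp="$(mktemp "$MODE_STATE_DIR/desired.XXXXXX")" || return 20
    printf '%s\n' "$mode" > "$tmp" \
        && chmod 0644 "$tmp" \
        && mv -f "$tmp" "$MODE_STATE_DIR/desired"
}

read_desired() {
    cat "$MODE_STATE_DIR/desired" 2>/dev/null
}
```

Rejecting unknown names before touching the file means an invalid request leaves the previous desired mode intact.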

`current`

Cached observation of current state. This is convenience state only; it must be derivable from system facts.

`lock`

Used to serialize reconciliation so only one transition runs at a time.

`last-transition.json`

Stores:

requested mode
prior observed mode
final observed mode
success/failure
guard results
action results
timestamps

`last-guards.json`

Stores latest guard results for diagnostics.

`capability-placement.json`

Stores environment-level placement facts, for example:

studio: local
studio: external-mac-mini

This file is not the host-local mode source of truth. It is an environment metadata input used by guards and planning logic.

`hardware-topology.json`

Stores the currently configured hardware view, for example:

planned GPU count
currently present GPU indexes
display GPU assignment
desktop-mode AI GPU set
compute-mode AI GPU set

This allows the implementation to preserve the intended dual-GPU architecture while remaining tolerant of temporary single-GPU bring-up phases.
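
As an illustration of that tolerance, the AI GPU set can be derived from the present GPU indexes rather than hard-coded. The function below is a hypothetical sketch: it assumes interactive modes exclude the display GPU and Compute uses every present GPU, and it prints a comma-separated list in the form CUDA_VISIBLE_DEVICES accepts.

```shell
# ai_gpu_set MODE DISPLAY_GPU PRESENT_GPU...
# Prints the GPU indexes AI workloads may use in MODE (hypothetical helper).
ai_gpu_set() {
    local mode="$1" display_gpu="$2" gpu out=""
    shift 2
    for gpu in "$@"; do
        # Interactive modes keep the display GPU away from AI workloads.
        if [ "$mode" != "compute" ] && [ "$gpu" = "$display_gpu" ]; then
            continue
        fi
        out="${out:+$out,}$gpu"
    done
    printf '%s\n' "$out"
}
```

With both GPUs present, `ai_gpu_set desktop 0 0 1` yields GPU1 only. During single-GPU bring-up the same call with one present index yields an empty set, which a caller can treat as "AI disabled in Desktop". Whether Studio-Local uses the set at all is a separate policy decision.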

22. systemd Unit and Target Layout

22.1 Targets

`desktop.target`

Wants:

graphical-session target path
bounded interactive services
optional constrained AI services

First-pass implementation note:

do not make desktop.target directly own greeter/login-manager startup in v1
treat graphical session presence as an observed runtime fact
strengthen ownership later only if empirical recovery behavior requires it

`compute.target`

Wants:

headless service profile
vLLM compute profile
k3s compute-allowed policy/profile

22.2 Core Services

`mode-controller@.service`

Parameterized oneshot service.

Instance values:

mode-controller@desktop.service
mode-controller@studio-local.service
mode-controller@compute.service

Responsibilities:

load desired mode
observe current mode
run reconciliation
update state files and logs

First-pass implementation note:

use this parameterized oneshot service as the sole reconciler trigger in v1
defer timer/path-triggered background reconciliation until manual operation is proven reliable

`mode-observe.service`

Optional oneshot helper to compute observed current mode and refresh /run/mode-controller/current.

`vllm@.service`

Optional templated service for profile-specific operation:

vllm@desktop.service
vllm@studio-local.service
vllm@compute.service

Alternative:

single vllm.service with environment file switching

First-pass implementation guidance:

prefer separate desktop and compute profiles conceptually
studio-local should not require its own dedicated vLLM unit in v1 if Studio is implemented as a desktop overlay
a single vllm.service is acceptable initially if it preserves a clean migration path to templated units later
keep desktop-mode vLLM disabled for the first transition-proof milestone

`mode-guard@.service`

Optional wrapper pattern for reusable guard execution, though plain scripts may be simpler initially.

`studio-local` overlay units

Recommended first-pass representation:

audio-priority.service
studio-local-policy.service
optional environment/policy file consumed by observation and guard logic

These units should layer on top of desktop.target rather than replacing it with a distinct top-level target in v1.

Recommended scope for studio-local-policy.service:

expose a clear mode marker
apply only light, explicit Studio-specific policy
delegate heavyweight orchestration to the controller or dedicated helper units

22.3 Suggested Slice Layout

-.slice (root; top-level slices such as ai.slice are siblings of system.slice, not children of it)
├── interactive.slice
│   ├── graphical-session scope/services
│   ├── audio-related helpers
│   └── bounded desktop workloads
├── ai.slice
│   ├── vllm service
│   └── AI helpers
└── platform.slice
    ├── k3s service
    └── supporting infra services

Slice Intent

interactive.slice gets priority and headroom in Desktop/Studio-Local
ai.slice is heavily constrained in Studio-Local, moderately constrained in Desktop, relaxed in Compute
platform.slice remains comparatively stable but may have tighter resource budgets in interactive modes and relaxed budgets in Compute
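
One way to express this intent at runtime is through per-mode CPUWeight values applied with systemctl set-property. The sketch below only prints the commands so the policy can be reviewed before anything runs. The weights are placeholders rather than tuned values, and the function name is hypothetical.

```shell
# slice_policy_cmds MODE: print (not run) the systemctl calls for MODE.
# CPUWeight accepts 1..10000 with a default of 100; values here are placeholders.
slice_policy_cmds() {
    local mode="$1" ai_weight interactive_weight
    case "$mode" in
        studio-local) ai_weight=10;  interactive_weight=1000 ;;
        desktop)      ai_weight=50;  interactive_weight=500 ;;
        compute)      ai_weight=500; interactive_weight=100 ;;
        *) echo "unknown mode: $mode" >&2; return 18 ;;
    esac
    # --runtime keeps the change out of persistent configuration.
    echo "systemctl set-property --runtime ai.slice CPUWeight=$ai_weight"
    echo "systemctl set-property --runtime interactive.slice CPUWeight=$interactive_weight"
}
```

Printing the plan also fits `mode dry-run`, which must not mutate state.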

23. Current State Observation Logic

Current state must be observed, not assumed.

23.1 Observation Inputs

GUI Indicators

graphical.target or session-specific equivalent active
active user session via loginctl
Hyprland process/session present

Audio Indicators

PipeWire user service active
active audio clients or REAPER process
optional JACK graph activity

AI Indicators

vllm*.service active
environment/profile indicates single-GPU or dual-GPU mode
optional nvidia-smi-based observation of active GPU usage

Platform Indicators

k3s.service active
optional workload-class indicators

23.2 Observation Heuristic

Observed mode should be derived using a deterministic classifier.

Proposed classifier logic

Observe `compute`

If all of the following are true:

no active graphical session
compute target active or compute service profile active
vLLM compute profile active or both GPUs assigned to AI policy

Then observed current mode = compute

Observe `studio-local`

If all of the following are true:

graphical session active
audio stack active
studio-local policy marker active
AI profile disabled or highly constrained

Then observed current mode = studio-local

Observe `desktop`

If all of the following are true:

graphical session active
desktop policy marker active
no studio-local policy marker

Then observed current mode = desktop

Observe `transitioning`

If:

desired != inferred stable mode
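controller is running or the lock is currently held (with flock, the lock file itself persists after release, so its mere existence is not evidence of an active transition)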

Then observed current mode = transitioning

Observe `failed-transition`

If:

last transition failed
current does not match desired
no controller currently reconciling

Then observed current mode = failed-transition
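
The classifier rules above can be sketched as two pure functions over already-collected facts. Fact gathering (loginctl, systemctl, marker units) is deliberately left out. The argument order and the "unknown" fallback are assumptions of this sketch, not part of the design.

```shell
# classify_stable GUI AUDIO STUDIO_MARKER DESKTOP_MARKER COMPUTE_ACTIVE AI_COMPUTE AI_CONSTRAINED
# Every argument is 1 (true) or 0 (false).
classify_stable() {
    local gui="$1" audio="$2" studio="$3" desktop="$4"
    local compute="$5" ai_compute="$6" ai_constrained="$7"
    if [ "$gui" = 0 ] && [ "$compute" = 1 ] && [ "$ai_compute" = 1 ]; then
        echo compute
    elif [ "$gui" = 1 ] && [ "$audio" = 1 ] && [ "$studio" = 1 ] \
         && [ "$ai_constrained" = 1 ]; then
        echo studio-local
    elif [ "$gui" = 1 ] && [ "$desktop" = 1 ] && [ "$studio" = 0 ]; then
        echo desktop
    else
        echo unknown   # assumption: facts match no stable mode
    fi
}

# classify_mode STABLE DESIRED LOCK_HELD LAST_FAILED
# Overlays the transitioning / failed-transition states on the stable result.
classify_mode() {
    local stable="$1" desired="$2" lock_held="$3" last_failed="$4"
    if [ "$stable" != "$desired" ] && [ "$lock_held" = 1 ]; then
        echo transitioning
    elif [ "$stable" != "$desired" ] && [ "$last_failed" = 1 ] \
         && [ "$lock_held" = 0 ]; then
        echo failed-transition
    else
        echo "$stable"
    fi
}
```

Keeping the rules free of system calls makes the classifier deterministic and easy to test against a table of fact combinations.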

23.3 Recommendation

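Use a small classifier script. The path below is illustrative; under the packaging decision in 19.13 the script is installed through the NixOS module, so its final location is determined by that package: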

/usr/local/libexec/mode-controller/observe-current

Outputs:

plain mode name for shell use
optional JSON with evidence for debugging

First-pass implementation note:

implement this in shell first
preserve a stable output contract so the implementation language can change later without changing the control plane
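
A sketch of the two output forms, assuming the caller selects the format with a first argument and passes evidence as key=value pairs whose values are the literals true or false. The function name and argument layout are hypothetical.

```shell
# emit_observation FORMAT MODE [key=true|false ...]
# FORMAT is "plain" or "json" (hypothetical interface).
emit_observation() {
    local format="$1" mode="$2" pair sep="" out
    shift 2
    if [ "$format" != json ]; then
        printf '%s\n' "$mode"
        return 0
    fi
    out="{\"mode\":\"$mode\",\"evidence\":{"
    for pair in "$@"; do
        # Values are restricted to true/false and keys to plain identifiers,
        # so no JSON string escaping is needed in this sketch.
        out="$out$sep\"${pair%%=*}\":${pair#*=}"
        sep=","
    done
    printf '%s}}\n' "$out"
}
```

Callers that only need the mode name never parse JSON, which is what lets a typed helper replace the shell implementation later without touching them.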

24. Guard Function Contract

24.1 Guard Naming

check_audio_idle
check_gpu_display_released
check_cpu_load_safe
check_user_jobs_safe
check_memory_headroom
check_vllm_drainable
check_graphical_session_absent
check_graphical_session_present
check_target_reachable
check_studio_capability_local

24.2 Exit Code Convention

0   pass
10  policy block: audio active
11  policy block: display GPU still owned
12  policy block: CPU load too high
13  policy block: user jobs active
14  policy block: insufficient memory headroom
15  policy block: vLLM not drainable
16  policy block: graphical session absent when required
17  policy block: graphical session present when forbidden
18  policy block: target unreachable / invalid request
19  policy block: requested local studio capability not available
20-29  execution/inspection errors
30+    internal controller misuse

24.3 Guard Output Contract

Each guard should emit a concise structured line or JSON object such as:

{"guard":"check_audio_idle","ok":false,"code":10,"reason":"reaper process active"}
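
A minimal guard following this contract might look like the sketch below. It reads MemAvailable from /proc/meminfo. The 8 GiB threshold is a placeholder, and MEMINFO_FILE and MIN_AVAILABLE_KB are hypothetical overrides added so the guard can be exercised against a fixture file.

```shell
# Sketch of check_memory_headroom; threshold and variable names are assumptions.
MEMINFO_FILE="${MEMINFO_FILE:-/proc/meminfo}"
MIN_AVAILABLE_KB="${MIN_AVAILABLE_KB:-8388608}"

check_memory_headroom() {
    local avail
    avail="$(awk '/^MemAvailable:/ {print $2}' "$MEMINFO_FILE" 2>/dev/null)"
    if [ -z "$avail" ]; then
        # Inspection failure, not a policy block: use the execution-error range.
        echo '{"guard":"check_memory_headroom","ok":false,"code":20,"reason":"MemAvailable not readable"}'
        return 20
    fi
    if [ "$avail" -lt "$MIN_AVAILABLE_KB" ]; then
        printf '{"guard":"check_memory_headroom","ok":false,"code":14,"reason":"%s kB available, %s kB required"}\n' \
            "$avail" "$MIN_AVAILABLE_KB"
        return 14
    fi
    echo '{"guard":"check_memory_headroom","ok":true,"code":0,"reason":"headroom sufficient"}'
}
```

The exit status and the "code" field carry the same value, so shell callers and JSON consumers see a consistent result.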

24.4 Hard vs Soft Guards

Hard guards

Must never be bypassed by ordinary automation:

active audio protection for Studio-Local -> Compute or Desktop -> Compute
GPU/display ownership guard
target validity checks
local Studio capability checks for studio-local

Soft guards

May be bypassed by privileged operator action or policy:

generic CPU load threshold
selected user-job heuristics
non-critical memory thresholds
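
The hard/soft split can be enforced in the guard runner rather than in each guard. In the sketch below the hard-guard list mirrors the bullets above. The runner name, the bypass flag, and the rule that only policy blocks (codes below 20) are bypassable are assumptions.

```shell
# Guards that a bypass request must never skip (from the hard-guard list).
HARD_GUARDS="check_audio_idle check_gpu_display_released check_target_reachable check_studio_capability_local"

is_hard_guard() {
    case " $HARD_GUARDS " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# run_guards ALLOW_SOFT_BYPASS GUARD...
# Runs each guard in order; returns the first blocking exit code, or 0.
run_guards() {
    local bypass="$1" guard rc
    shift
    for guard in "$@"; do
        "$guard"; rc=$?
        [ "$rc" -eq 0 ] && continue
        # Only soft policy blocks are bypassable; execution errors never are.
        if [ "$bypass" = 1 ] && [ "$rc" -lt 20 ] && ! is_hard_guard "$guard"; then
            echo "bypassing soft guard $guard (code $rc)" >&2
            continue
        fi
        return "$rc"
    done
    return 0
}
```

Ordinary automation would always call `run_guards 0 ...`; only a privileged operator path would pass 1.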

25. Transition Plans with Exact Operations

This section normalizes each transition into explicit steps.

25.1 Common Transition Framework

All transitions should follow:

acquire lock
observe current state
validate requested mode
if current == desired, exit success
select transition plan
run transition guards
execute pre-actions
isolate or start target
execute post-actions
re-observe current state
record success/failure
release lock
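
The framework can be sketched as a lock wrapper plus a transition skeleton. The wrapper uses flock(1) from util-linux on a file descriptor, so the lock is released automatically when the subshell exits, including on failure. The hooks observe, guards_for, and enact are hypothetical and supplied elsewhere; exit code 30 for lock contention is an assumption.

```shell
MODE_STATE_DIR="${MODE_STATE_DIR:-/run/mode-controller}"

# with_mode_lock CMD [ARGS...]: run CMD while holding the exclusive lock.
with_mode_lock() {
    mkdir -p "$MODE_STATE_DIR" || return 20
    (
        # Non-blocking: fail fast if another reconciliation holds the lock.
        flock -n 9 || { echo "another transition is in progress" >&2; exit 30; }
        "$@"
    ) 9>>"$MODE_STATE_DIR/lock"
}

# run_transition TARGET: observe, short-circuit, guard, enact, re-observe.
run_transition() {
    local target="$1" current
    current="$(observe)" || return 20
    [ "$current" = "$target" ] && return 0      # already converged
    guards_for "$current" "$target" || return $?
    enact "$current" "$target" || return $?
    [ "$(observe)" = "$target" ]                # non-zero if not converged
}
```

A reconciler would invoke this as `with_mode_lock run_transition compute`. Result recording is omitted here and would wrap the call.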

25.2 Plan: Desktop -> Studio-Local

Preconditions

desktop currently observed
request = studio-local
local Studio capability is still hosted on the NixOS machine

Guards

check_target_reachable
check_studio_capability_local
optional check_user_jobs_safe

Exact operations

write desired=studio-local
flock /run/mode-controller/lock
observe current
run guards
systemctl start audio-priority.service      # if modeled separately
systemctl start studio-local-policy.service
observe current
record result

Notes

GUI remains up
audio policy is strengthened
AI capacity is reduced or removed
if Studio capability has been externalized, this transition must fail cleanly with an explanatory reason

25.3 Plan: Studio-Local -> Desktop

Guards

check_target_reachable

Exact operations

write desired=desktop
flock /run/mode-controller/lock
observe current
run guards
systemctl stop audio-priority.service       # if separate helper exists
systemctl stop studio-local-policy.service
systemctl isolate desktop.target
observe current
record result

25.4 Plan: Desktop -> Compute

Guards

check_target_reachable
check_audio_idle
check_cpu_load_safe
check_user_jobs_safe
check_memory_headroom

Pre-actions

terminate graphical session
wait for GUI disappearance
verify GPU/display release

Exact operations

write desired=compute
flock /run/mode-controller/lock
observe current
run initial guards
loginctl terminate-session <session-id>
wait until observe-current no longer sees graphical session
run check_gpu_display_released
systemctl isolate compute.target
systemctl start vllm@compute.service
observe current
record result

Additional notes

compute.target should set AllowIsolate=yes and declare Conflicts= against the interactive/graphical targets, so that systemctl isolate compute.target is permitted and reliably stops them
GPU release must be verified after GUI shutdown, not merely assumed
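
The "wait, then verify" steps need a bounded poll rather than a fixed sleep. A minimal sketch, assuming a one-second poll interval; the helper name is hypothetical.

```shell
# wait_until TIMEOUT_SECONDS CMD [ARGS...]
# Polls CMD once per second until it succeeds.
# Returns 0 on success, 1 if TIMEOUT_SECONDS elapse first.
wait_until() {
    local timeout="$1" waited=0
    shift
    until "$@"; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
}
```

In the plan above this would be used as `wait_until 30 check_graphical_session_absent && check_gpu_display_released`, so the GPU check runs only after the session is confirmed gone and a stuck session fails the transition instead of hanging it. The 30-second budget is a placeholder.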

25.5 Plan: Compute -> Desktop

Guards

check_target_reachable
check_vllm_drainable
check_memory_headroom

Exact operations

write desired=desktop
flock /run/mode-controller/lock
observe current
run guards
systemctl stop vllm@compute.service         # or downscale path
systemctl isolate desktop.target
systemctl start vllm@desktop.service        # optional bounded single-GPU profile
observe current
record result

Notes

graphical session may be started by display manager or login path depending on design
GPU0 becomes protected for display once Desktop converges

25.6 Plan: Studio-Local -> Compute

Preferred behavior

Treat as a direct guarded transition using the same compute-entry pipeline.

Guards

check_target_reachable
check_audio_idle
check_cpu_load_safe
check_user_jobs_safe
check_memory_headroom

Exact operations

write desired=compute
flock /run/mode-controller/lock
observe current
run guards
loginctl terminate-session <session-id>
wait until graphical session absent
run check_gpu_display_released
systemctl isolate compute.target
systemctl start vllm@compute.service
observe current
record result

Policy note

Because Studio-Local is the most protected interactive mode, auto-promotion from Studio-Local to Compute should generally be disabled unless explicitly requested.

26. NixOS Specialisations vs Runtime Switching — Decision Guidance

26.1 Decision Matrix

Criterion                          Runtime Switching   Specialisations   Hybrid
Desktop <-> Studio-Local speed     Excellent           Poor              Excellent
Desktop <-> Compute isolation      Moderate            Strong            Stronger
Complexity                         Lower               Moderate          Highest
Early experimentation              Best                Slower            Moderate
Deep kernel/boot divergence        Weak                Strong            Strong
Operational convenience            High                Lower             Moderate
Future externalization of Studio   Good                Good              Best

26.2 Recommended Decision Rule

Adopt runtime switching now, and revisit that choice if any of the following become true:

compute mode needs materially different kernel parameters or boot-time config
graphical/interactive teardown proves unreliable in practice
GPU role handoff remains too leaky under runtime-only switching
you want Compute to be operationally closer to a dedicated server persona than a temporary mode

If any two of the above become persistent problems, promote Compute into a specialisation.

26.3 Recommended Architecture Path

Phase 1

single NixOS host definition
runtime switching only
targets + slices + controller + guards

Phase 2

strengthen target separation
gather empirical failure/latency data

Phase 3

if needed, introduce specialisation.compute
preserve same desired/current/reconcile interface so operator UX does not change

Phase 4

if Studio is externalized, deprecate or disable studio-local
retain the same operator-facing control model for the host-local system

That means mode request compute could later choose:

runtime reconcile, or
request/reboot into compute specialisation

without changing the higher-level model.
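
A sketch of that choice, assuming a specialisation named compute. NixOS exposes built specialisations under /run/current-system/specialisation/<name>, each with its own switch-to-configuration. The function name and the SPECIALISATION_ROOT override are hypothetical.

```shell
# Decide how `mode request compute` should be enacted (hypothetical helper).
SPECIALISATION_ROOT="${SPECIALISATION_ROOT:-/run/current-system/specialisation}"

compute_strategy() {
    # If the running system was built with a compute specialisation, prefer it;
    # otherwise fall back to runtime reconciliation.
    if [ -e "$SPECIALISATION_ROOT/compute/bin/switch-to-configuration" ]; then
        echo specialisation
    else
        echo runtime
    fi
}
```

The operator still types `mode request compute`; only the enactment path behind it differs.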

27. Recommended Next Implementation Steps

define exact systemd target dependencies/conflicts in Nix
implement mode CLI wrapper script
implement observe-current
implement guard scripts with fixed exit-code contract
choose between:
- vllm@desktop.service / vllm@compute.service
- one service with profile env file
define slice resource policies for interactive vs AI
wire idle detector to mode request compute
validate transition behavior manually before enabling automation
add a capability-placement flag/model for future Studio externalization

28. Summary

This system should behave like a reconciled state machine for host-local operational modes.

The core model is:

desired mode is explicit runtime intent
current mode is observed reality
reconciliation closes the gap
guards prevent unsafe transitions
systemd targets/services perform the actual mode enactment

The implementation should start with runtime switching, but preserve a clean path to hybrid specialisation if operational evidence justifies stronger separation later.

Studio/Audio should be treated as a conditional local profile plus a capability-placement decision, so that a future move to a Mac mini does not invalidate the host-local architecture.
