Dual-Mode NixOS Workstation / AI Node
Unified Planning + Mode State Machine Document (v0.3 — Living)
1. Purpose
Design a single NixOS system that operates as a policy-driven multi-mode host with support for future workload externalization:
- Desktop / Dev workstation
- Optional local Studio / Audio profile
- Compute / Headless AI node
The broader workstation environment may also externalize selected capabilities, especially Studio/Audio, to a separate machine such as a Mac mini.
The system must support:
- low-latency audio workloads (DAW / live)
- GUI desktop usage via Hyprland
- GPU inference via vLLM
- k3s control-plane duties for Micrantha Laboratory / Hyperion
- explicit, auditable, reproducible transitions between modes
This document defines:
- planning assumptions
- architectural boundaries
- host-local mode definitions
- capability placement model
- invariants
- state machine
- guards and guard functions
- source-of-truth model
- reconciliation model
- implementation mapping to systemd
- design alternatives and tradeoffs
2. Core Principles
2.1 Modes Are Operational Contracts
A mode is not just a set of enabled services. A mode defines:
- resource ownership
- permitted workloads
- latency/throughput expectations
- security posture
- transition preconditions
2.2 Explicit Over Implicit
Mode transitions should be:
- explicit when possible
- observable
- reversible
- logged
- idempotent
Automation may request a transition, but the controller must decide whether it is safe.
2.3 Latency and Throughput Are Competing Objectives
- Desktop / Studio-Local optimize for responsiveness and bounded latency
- Compute optimizes for throughput and hardware utilization
The design must not pretend both can be maximized simultaneously.
2.4 One Physical Host, Multiple Logical Planes
This system is treated as:
one shared substrate hosting multiple logical operating modes
2.5 Declarative First, Runtime Reconciliation Second
- NixOS declares steady-state intent and system structure
- a mode controller reconciles runtime state toward desired operational mode
2.6 Host-Local Modes Must Survive Capability Relocation
The host-local state model should remain coherent even if some capabilities, especially Studio/Audio, move to another machine.
3. System Overview
flowchart TD
HW[Hardware]
subgraph BaseOS[NixOS Base Layer]
Kernel
Drivers[NVIDIA / CUDA]
Network
Storage
Nix
systemd
end
subgraph Control[Mode Control Plane]
Desired[Desired State]
Current[Current State]
Reconcile[Reconciler]
Guards[Guard Checks]
end
subgraph LocalModes[Host-Local Modes]
Desktop[Desktop / Dev]
StudioLocal[Studio-Local / Audio-Priority]
Compute[Compute / Headless]
end
subgraph Placement[Capability Placement]
StudioCap[Studio Capability]
AICap[AI Capability]
PlatformCap[Platform Capability]
end
subgraph Workloads[Workloads]
Hyprland
PipeWire
Reaper
vLLM
k3s
end
HW --> BaseOS
BaseOS --> Control
Control --> LocalModes
LocalModes --> Workloads
LocalModes --> Placement
4. Mode Definitions and Capability Placement
This document distinguishes between:
- host-local operational modes for the NixOS machine
- capability placement for functions that may later move to another machine
4.1 Host-Local Modes
Desktop / Dev Mode
Intent
Balanced interactive mode for programming, office work, light desktop use, and bounded AI.
Properties
- GUI enabled
- audio enabled for ordinary desktop use
- GPU0 reserved for display/compositor
- GPU1 may be used by AI workloads
- vLLM constrained to single-GPU operation or disabled
- k3s control plane may remain active
- CPU/RAM contention must remain bounded
Studio-Local / Audio-Priority Profile
Intent
A stricter local operating profile for low-latency audio work when Studio remains on the NixOS host.
Properties
- modeled as a protected interactive profile closely related to Desktop
- GUI enabled
- audio stack prioritized
- display GPU reserved exclusively for desktop responsibilities
- AI workloads disabled or reduced to near-zero
- heavy I/O and background maintenance jobs disallowed
- scheduler and system policy biased toward stable audio behavior
Design note
This profile is considered conditional and potentially temporary. It exists so the NixOS host can support local audio/studio workflows now, without assuming that Studio remains a permanent first-class local mode forever.
Implementation note
For the first implementation pass, studio-local should be modeled as a policy overlay on desktop, not as a first-class top-level systemd target. The operational state still exists in the controller/state model, but its enactment should initially be handled by marker/helper units layered onto the desktop path.
Compute / Headless Mode
Intent
Throughput-oriented headless mode for AI serving and platform duties.
Properties
- GUI disabled
- audio stack off or irrelevant
- both GPUs available to AI workloads
- vLLM may use both GPUs
- k3s workloads may run more aggressively
- CPU/RAM/storage can be utilized much more aggressively than in interactive modes
4.2 Capability Placement Model
Certain capabilities may be placed either:
- locally on the NixOS host
- externally on another machine
Capability: Studio / Audio
Possible placements:
- local
- external-mac-mini
Capability: AI / Inference
Expected placement:
- primarily local-nixos-host
Capability: Platform / k3s Control
Expected placement:
- primarily local-nixos-host
4.3 Design Implication
The host-local state machine should remain valid even if Studio/Audio is moved to a Mac mini. That means Studio-specific policy should be represented as a local profile or conditional mode, not as the permanent center of the entire host architecture.
5. Resource Ownership Model
5.0 Implementation Note — Hardware-Tolerant Bring-Up
The architecture should continue to plan for the intended dual-GPU topology, but the NixOS implementation should remain tolerant of transitional hardware states while the second GPU is not yet installed or configured.
That means:
- the policy model may still describe the intended two-GPU end state
- module options should encode planned GPU ownership explicitly
- active service profiles must only reference GPUs that are currently present
- missing future hardware must not cause ordinary evaluation or steady-state services to fail unnecessarily
5.1 GPU Ownership
| Mode | GPU0 | GPU1 |
|---|---|---|
| Desktop | Display / compositor | AI optional |
| Studio-Local | Display / compositor (protected) | AI off or minimal |
| Compute | AI | AI |
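The ownership table, combined with the hardware-tolerant bring-up rule in 5.0, can be sketched as a small lookup. This is a minimal sketch, not the document's implementation: the function name ai_gpus_for_mode is hypothetical, GPU indices 0 and 1 are assumed to match the ordering the AI runtime sees, and Studio-Local is modeled as "AI off" (the v1 policy) rather than "minimal".

```shell
# ai_gpus_for_mode MODE GPU_COUNT   (hypothetical helper)
# Prints the comma-separated GPU indices AI workloads may use.
# Empty output means "no GPU for AI". GPU_COUNT is the number of GPUs
# actually present, so a missing GPU1 never appears in the result.
ai_gpus_for_mode() {
    case "$1" in
        desktop)
            # GPU0 belongs to the compositor; GPU1 is optional for AI.
            if [ "$2" -ge 2 ]; then echo "1"; else echo ""; fi ;;
        studio-local)
            # Assumption: AI is off entirely in the audio-priority profile.
            echo "" ;;
        compute)
            # Both GPUs in the end state; only GPU0 during single-GPU bring-up.
            if [ "$2" -ge 2 ]; then echo "0,1"
            elif [ "$2" -eq 1 ]; then echo "0"
            else echo ""; fi ;;
        *)
            echo "unknown mode: $1" >&2
            return 2 ;;
    esac
}
```

A service wrapper could export the result as CUDA_VISIBLE_DEVICES before launching an inference process, and skip the launch when the result is empty.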
5.2 CPU Ownership
- Shared via cgroups/systemd slices
- interactive slices retain priority/headroom in Desktop and Studio-Local
- compute slices may saturate cores in Compute
5.3 Memory Ownership
- bounded AI memory usage in Desktop
- stricter constraints in Studio-Local
- relaxed/high utilization in Compute
5.4 Storage Ownership
- heavy background I/O restricted in Studio-Local
- permitted but bounded in Desktop
- broadly permitted in Compute
5.5 Audio Ownership
- effectively exclusive in Studio-Local
- protected in Desktop
- not guaranteed in Compute
6. Invariants
These are system-level properties that must remain true regardless of transition path or future Studio placement.
6.1 Safety Invariants
- At most one host-local operational mode is authoritative at a time.
- A transition must either complete to a stable target state or abort back to a known-safe prior state.
- Mode transitions must be idempotent. Re-running a transition toward an already-satisfied state must not cause harm.
- When Studio-Local is active, heavyweight compute workloads must not materially jeopardize audio latency.
- Compute mode must not require a running graphical session.
- GPU0 must not be simultaneously treated as both protected display GPU and unrestricted compute GPU.
- The controller must not promote the system into Compute if guard failures indicate active user/audio risk.
- The system must always expose a way to determine current mode, desired mode, and last transition result.
- The host-local mode model must remain coherent if Studio/Audio capability is externalized to another machine.
6.2 State Invariants
- Desired state is authoritative intent.
- Current state is observed runtime fact.
- Reconciliation moves current state toward desired state; it never rewrites observed state to match wishful intent.
- A guard failure blocks transition, but does not silently change desired state unless policy explicitly says so.
6.3 Operational Invariants
- Models and mutable runtime data must live outside the Nix store.
- Dotfiles may influence user experience, not machine-critical mode policy.
- Mode policy must remain expressible and inspectable via systemd and Nix configuration.
- Capability placement decisions must not silently invalidate host-local invariants.
7. Desired State vs Current State
7.1 Desired State
The host-local mode the user or automation wants the system to be in.
Examples:
- desktop
- studio-local
- compute
7.2 Current State
The host-local mode the system is actually in, as determined by observation.
Examples:
- graphical target active, PipeWire active, vLLM limited → likely desktop
- graphical target inactive, compute services active, both GPUs exposed to AI → likely compute
- GUI active, audio priority raised, compute services reduced → likely studio-local
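The classification examples above can be sketched as a pure function over already-collected facts. The function name classify_mode and the yes/no argument convention are assumptions made for illustration; in practice each fact might come from something like systemctl is-active --quiet on the relevant unit.

```shell
# classify_mode GUI STUDIO_MARKER COMPUTE   (hypothetical helper)
# Each argument is "yes" or "no":
#   GUI            a graphical target/session was observed active
#   STUDIO_MARKER  the studio-local marker/policy unit was observed active
#   COMPUTE        compute services were observed active on the AI GPUs
classify_mode() {
    if [ "$1" = yes ] && [ "$2" = yes ]; then
        echo studio-local
    elif [ "$1" = yes ]; then
        # Also covers the degraded case where compute services are up
        # but the compositor still holds the display GPU.
        echo desktop
    elif [ "$3" = yes ]; then
        echo compute
    else
        echo unknown
    fi
}
```

Keeping the classifier free of side effects means the same facts always yield the same mode, which makes the observation step easy to test in isolation.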
7.3 Why This Split Matters
Without this split, the system can lie to itself:
- a command says “switch to compute”
- but GPU is still held by compositor
- vLLM failed to scale up
- audio services are still active
In that case:
- desired state = compute
- current state = transitioning or desktop (degraded)
The control plane must detect and reconcile this rather than assuming success.
8. Source of Truth for Mode
The system needs one authoritative representation of requested host-local mode.
8.1 Options Considered
Option A — File-Based Source of Truth
Example:
- /run/mode-controller/desired
- /var/lib/mode-controller/desired
Pros
- simple
- easy to inspect
- works outside active user session
- easy for scripts and systemd units
Cons
- can drift from actual runtime state
- needs permissions and lifecycle handling
Option B — Environment Variable Source of Truth
Example:
MODE=compute
Pros
- simple for one-shot commands
- easy in shell contexts
Cons
- poor system-wide authority
- ephemeral
- fragile across sessions/reboots
- bad fit for authoritative machine state
Option C — systemd State as Source of Truth
Example:
compute.target active implies desired mode is compute
Pros
- tightly aligned with implementation
- introspectable
- avoids duplicate state stores
Cons
- desired state and current state can become conflated
- harder to represent “requested but not yet achieved”
- recovery/abort semantics become more awkward
8.2 Recommended Model
Use a hybrid model:
- Desired state source of truth: file in /run/mode-controller/desired
- Current state source of truth: observed systemd/runtime facts
- Transition machinery: systemd targets + controller service
This cleanly separates:
- intent
- observation
- enforcement
8.3 Proposed Files
- /run/mode-controller/desired
- /run/mode-controller/current
- /run/mode-controller/last-transition.json
current may be a cached observation, but observation should always be derivable from system state.
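Writing the desired-state file can be sketched as a validate-then-rename operation so that readers never see a partial write. This is a sketch under assumptions: the names write_desired, read_desired, and MODE_STATE_DIR are hypothetical, and the file is assumed to hold just the mode name on one line.

```shell
# Directory holding the state files; overridable for testing.
MODE_STATE_DIR="${MODE_STATE_DIR:-/run/mode-controller}"

# write_desired MODE   (hypothetical helper)
# Validates MODE, then atomically replaces the "desired" file.
write_desired() {
    case "$1" in
        desktop|studio-local|compute) ;;
        *) echo "invalid mode: $1" >&2; return 1 ;;
    esac
    mkdir -p "$MODE_STATE_DIR" || return 1
    # Write to a temp file in the same directory, then rename it into place.
    desired_tmp="$MODE_STATE_DIR/.desired.$$"
    printf '%s\n' "$1" > "$desired_tmp" &&
        mv -f "$desired_tmp" "$MODE_STATE_DIR/desired"
}

# read_desired prints the requested mode; fails if none is recorded.
read_desired() {
    [ -r "$MODE_STATE_DIR/desired" ] && cat "$MODE_STATE_DIR/desired"
}
```

Rejecting unknown mode names at write time keeps the file a closed vocabulary, so every later consumer can compare strings without re-validating.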
9. State Machine
9.1 States
S0: Boot
Initial state before default operating mode is established.
S1: Desktop
Interactive general-purpose mode.
S2: StudioLocal
Strict interactive low-latency local audio profile.
S3: Compute
Headless throughput-oriented mode.
S4: Transitioning
Ephemeral reconciliation state while moving toward desired mode.
S5: FailedTransition
A recoverable error state indicating that desired state was not achieved.
9.2 State Diagram
stateDiagram-v2
[*] --> Boot
Boot --> Desktop : default boot
Desktop --> StudioLocal : request(studio-local)
StudioLocal --> Desktop : request(desktop)
Desktop --> Transitioning : request(compute)
StudioLocal --> Transitioning : request(compute)
Compute --> Transitioning : request(desktop)
Desktop --> Transitioning : request(desktop) / reconcile
StudioLocal --> Transitioning : request(studio-local) / reconcile
Compute --> Transitioning : request(compute) / reconcile
Transitioning --> Desktop : reached(desktop)
Transitioning --> StudioLocal : reached(studio-local)
Transitioning --> Compute : reached(compute)
Transitioning --> FailedTransition : guard_fail / action_fail / timeout
FailedTransition --> Desktop : recover(previous=desktop)
FailedTransition --> StudioLocal : recover(previous=studio-local)
FailedTransition --> Compute : recover(previous=compute)
9.3 Notes
- Direct StudioLocal -> Compute may be allowed only through guarded reconciliation, not blind immediate promotion.
- Reconciliation should be able to handle “already in desired mode” as a no-op success.
- Externalized Studio capability must not require redesign of the host-local state machine; it should only disable or deprecate studio-local usage.
10. Guards
Guards are explicit check functions. They return exit codes and optionally structured diagnostics.
10.1 Guard Interface
Each guard function should follow a predictable interface:
check_<name>
exit 0 = pass
exit 10-19 = policy failure / guard blocked
exit 20-29 = check execution error / indeterminate
Structured output should ideally emit JSON or key=value diagnostics to stdout/stderr for logs.
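A controller needs to turn a guard's exit status into a decision. The sketch below assumes the two failure bands are read as 10-19 (blocked) and 20-29 (indeterminate), which matches the per-guard codes in the guard set; the function name guard_verdict is hypothetical, and treating any other non-zero status as an error is a conservative choice made here, not something the interface specifies.

```shell
# guard_verdict EXIT_CODE   (hypothetical helper)
# Maps a guard's exit status to one of: pass | blocked | error.
guard_verdict() {
    if [ "$1" -eq 0 ]; then
        echo pass
    elif [ "$1" -ge 10 ] && [ "$1" -le 19 ]; then
        # Policy said no: the transition must not proceed.
        echo blocked
    else
        # 20-29 (check could not run) and any unexpected status are
        # both treated as indeterminate rather than as a pass.
        echo error
    fi
}
```

Distinguishing blocked from error matters for logging and retry policy: a blocked guard reflects real system state, while an error usually means the check itself needs attention.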
10.2 Guard Set
G1: check_audio_idle
Purpose:
- verify no active low-latency local audio session that would make compute transition unsafe
Possible checks:
- no active REAPER process
- no active PipeWire/JACK graph beyond baseline
Exit codes:
- 0: pass
- 10: audio active
- 20: unable to inspect audio graph
G2: check_gpu_display_released
Purpose:
- verify display/compositor has released GPU before compute promotion
Possible checks:
- no active Hyprland session
- no relevant graphical GPU consumers
Exit codes:
- 0: pass
- 11: display GPU still owned by GUI
- 21: GPU inspection failure
G3: check_cpu_load_safe
Purpose:
- ensure transition is not occurring during obviously unsafe heavy local activity when policy requires quieting first
Exit codes:
- 0: pass
- 12: CPU load too high
- 22: unable to inspect load
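One possible shape for this guard reads the 1-minute load average from /proc/loadavg and compares it to a threshold. This is a sketch under assumptions: the threshold variable MAX_LOAD1 and its default, the LOADAVG_FILE override, and the key=value diagnostic format are all illustrative choices, and the document does not say which load metric the guard should use.

```shell
LOADAVG_FILE="${LOADAVG_FILE:-/proc/loadavg}"
MAX_LOAD1="${MAX_LOAD1:-4.0}"   # hypothetical threshold; tune per host

check_cpu_load_safe() {
    [ -r "$LOADAVG_FILE" ] || { echo "guard=cpu_load result=error"; return 22; }
    # The first field of /proc/loadavg is the 1-minute load average.
    load1=$(awk '{ print $1; exit }' "$LOADAVG_FILE")
    case "$load1" in
        ''|*[!0-9.]*) echo "guard=cpu_load result=error"; return 22 ;;
    esac
    # awk performs the floating-point comparison that test(1) cannot.
    if awk -v l="$load1" -v m="$MAX_LOAD1" 'BEGIN { exit !(l <= m) }'; then
        echo "guard=cpu_load result=pass load1=$load1"
        return 0
    fi
    echo "guard=cpu_load result=blocked load1=$load1 max=$MAX_LOAD1"
    return 12
}
```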
G4: check_user_jobs_safe
Purpose:
- detect known long-running interactive/user jobs that should block auto-transition
Possible checks:
- selected process patterns
- optional allowlist/denylist
Exit codes:
- 0: pass
- 13: user jobs active
- 23: inspection failure
G5: check_memory_headroom
Purpose:
- ensure sufficient memory exists to perform transition or launch target services
Exit codes:
- 0: pass
- 14: insufficient headroom
- 24: inspection failure
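A minimal sketch of this guard can read MemAvailable from /proc/meminfo. The minimum-headroom variable MIN_AVAILABLE_KB and its 8 GiB default are placeholders rather than values from this document, and MEMINFO_FILE exists only so the check can be exercised against a fixture.

```shell
MEMINFO_FILE="${MEMINFO_FILE:-/proc/meminfo}"
MIN_AVAILABLE_KB="${MIN_AVAILABLE_KB:-8388608}"   # hypothetical: 8 GiB in kB

check_memory_headroom() {
    # MemAvailable is the kernel's estimate of memory usable without swapping.
    avail_kb=$(awk '$1 == "MemAvailable:" { print $2; exit }' \
        "$MEMINFO_FILE" 2>/dev/null) || avail_kb=""
    case "$avail_kb" in
        ''|*[!0-9]*) echo "guard=memory_headroom result=error"; return 24 ;;
    esac
    if [ "$avail_kb" -ge "$MIN_AVAILABLE_KB" ]; then
        echo "guard=memory_headroom result=pass available_kb=$avail_kb"
        return 0
    fi
    echo "guard=memory_headroom result=blocked available_kb=$avail_kb min_kb=$MIN_AVAILABLE_KB"
    return 14
}
```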
G6: check_vllm_drainable
Purpose:
- ensure compute workloads can be safely reduced when returning to Desktop/Studio-Local
Exit codes:
- 0: pass
- 15: compute workload not drainable
- 25: inspection failure
G7: check_studio_capability_local
Purpose:
- verify that local Studio capability is still available on the NixOS host before allowing
studio-local
Possible checks:
- local policy flag indicates studio capability still hosted locally
- local audio stack and workflow prerequisites are not intentionally disabled due to externalization
Exit codes:
- 0: pass
- 19: requested local studio capability not available
- 29: inspection failure
10.3 Guard Policy by Transition
| Transition | Required Guards |
|---|---|
| Desktop -> StudioLocal | check_target_reachable, check_studio_capability_local, check_user_jobs_safe (optional policy), compute downscale checks |
| StudioLocal -> Desktop | check_target_reachable |
| Desktop -> Compute | check_target_reachable, check_audio_idle, check_gpu_display_released, check_cpu_load_safe, check_user_jobs_safe, check_memory_headroom |
| StudioLocal -> Compute | check_target_reachable, check_audio_idle, check_gpu_display_released, check_cpu_load_safe, check_user_jobs_safe, check_memory_headroom |
| Compute -> Desktop | check_target_reachable, check_vllm_drainable, check_memory_headroom |
| Compute -> StudioLocal | check_target_reachable, check_studio_capability_local, check_vllm_drainable, check_memory_headroom |
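The table can be encoded as a lookup plus a runner that stops at the first failing guard. This is a sketch: guards_for and run_guards are hypothetical names, the unnamed "compute downscale checks" are omitted because the table does not name a function for them, and check_user_jobs_safe is included for Desktop -> StudioLocal even though the table marks it as optional policy.

```shell
# guards_for FROM TO   (hypothetical helper)
# Prints the space-separated guard functions required for a transition.
guards_for() {
    case "$1->$2" in
        "desktop->studio-local")
            echo "check_target_reachable check_studio_capability_local check_user_jobs_safe" ;;
        "studio-local->desktop")
            echo "check_target_reachable" ;;
        "desktop->compute"|"studio-local->compute")
            echo "check_target_reachable check_audio_idle check_gpu_display_released check_cpu_load_safe check_user_jobs_safe check_memory_headroom" ;;
        "compute->desktop")
            echo "check_target_reachable check_vllm_drainable check_memory_headroom" ;;
        "compute->studio-local")
            echo "check_target_reachable check_studio_capability_local check_vllm_drainable check_memory_headroom" ;;
        *)
            echo "no guard policy for $1 -> $2" >&2
            return 1 ;;
    esac
}

# run_guards FROM TO
# Runs each guard in order and returns the first non-zero exit status.
# A missing policy is reported as 20 (indeterminate), never as a pass.
run_guards() {
    guards=$(guards_for "$1" "$2") || return 20
    for g in $guards; do
        "$g" || return $?
    done
}
```

Because the guard's own exit code is propagated unchanged, the caller can still tell which guard blocked the transition from the code alone.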
11. Actions and Transition Semantics
Actions are the concrete operations used to move from one state to another.
11.1 Action Vocabulary
- stop/terminate GUI session
- isolate a target
- stop/start units
- wait for quiescence
- update desired/current state files
- restart services with different environment/policies
11.2 Action Interface
Each action should return:
0success- non-zero failure with logged reason
12. Exact Transition Mapping to systemd Operations
This is the implementation-oriented mapping.
12.1 Assumptions
Systemd targets:
- desktop.target
- compute.target
studio-local is intentionally not a first-class target in v1. It is represented
as a desktop overlay through studio-local-policy.service and
audio-priority.service.
Supporting services:
- mode-controller.service
- vllm.service
- k3s.service
- pipewire.service / user session services
- graphical session manager or direct Hyprland session
Helper oneshot services/scripts:
- mode-prepare-compute.service
- mode-prepare-desktop.service
- mode-prepare-studio-local.service
- mode-observe.service
12.2 Desktop -> StudioLocal
Desired change
- desired mode file = studio-local
systemd operations
- systemctl start mode-controller.service (with target=studio-local)
- controller runs guard set for Desktop -> StudioLocal
- controller verifies local Studio capability still exists
- controller stops or constrains AI workloads as needed
  - v1 policy: systemctl stop vllm.service
- controller isolates or verifies desktop.target
- controller starts studio-local-policy.service
- controller starts audio-priority.service
- controller updates current state observation
Example exact operations
write /run/mode-controller/desired = studio-local
systemctl start mode-controller@studio-local.service
systemctl stop vllm.service
systemctl isolate desktop.target
systemctl start studio-local-policy.service
systemctl start audio-priority.service
12.3 StudioLocal -> Desktop
Desired change
- desired mode file = desktop
systemd operations
- write desired state
- start controller
- restore normal interactive policies
- optionally allow bounded AI services
- stop audio-priority.service
- stop studio-local-policy.service
- systemctl isolate desktop.target
- update current observation
Example exact operations
write /run/mode-controller/desired = desktop
systemctl start mode-controller@desktop.service
systemctl stop audio-priority.service
systemctl stop studio-local-policy.service
systemctl isolate desktop.target
12.4 Desktop -> Compute
Desired change
- desired mode file = compute
systemd operations
- write desired state
- start controller for compute
- run guards:
- check_target_reachable
- check_audio_idle
- check_gpu_display_released (or prepare to release)
- check_cpu_load_safe
- check_user_jobs_safe
- check_memory_headroom
- if interactive session exists, controller requests/forces session termination
  - loginctl terminate-session <id>
- wait until compositor releases GPU
- stop or de-prioritize audio services if needed
- stop desktop-specific services not wanted in compute
- set service environment/profile for dual-GPU vLLM
- systemctl isolate compute.target
- start/restart vllm.service
- verify current state
Example exact operations
write /run/mode-controller/desired = compute
systemctl start mode-controller@compute.service
loginctl terminate-session <desktop-session>
systemctl stop graphical-session.target # if such target exists in design
systemctl isolate compute.target
systemctl restart vllm.service
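The "wait until compositor releases GPU" step needs a bounded wait rather than an open-ended one. A minimal sketch of such a helper follows; the name wait_until and the whole-second timeout/interval arguments are assumptions for illustration.

```shell
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or TIMEOUT_SECONDS have elapsed.
# Returns 0 on success and 1 on timeout. Both durations are whole seconds.
wait_until() {
    timeout_s=$1
    interval_s=$2
    shift 2
    elapsed=0
    while ! "$@"; do
        # Give up once the time budget is spent.
        [ "$elapsed" -ge "$timeout_s" ] && return 1
        sleep "$interval_s"
        elapsed=$((elapsed + interval_s))
    done
    return 0
}
```

For example, a controller might call wait_until 30 1 check_gpu_display_released between terminating the session and isolating compute.target, and treat a timeout as an action failure.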
12.5 Compute -> Desktop
Desired change
- desired mode file = desktop
systemd operations
- write desired state
- start controller for desktop
- run guards:
- check_target_reachable
- check_vllm_drainable
- check_memory_headroom
- drain/stop or downscale vLLM
- constrain compute workloads
- systemctl isolate desktop.target
- start GUI path
- ensure GPU0 reserved for display
- start/restore audio path
- verify current state
Example exact operations
write /run/mode-controller/desired = desktop
systemctl start mode-controller@desktop.service
systemctl stop vllm.service # or restart single-GPU profile
systemctl isolate desktop.target
12.6 StudioLocal -> Compute
Two possible policies:
Policy A — direct guarded transition
Allowed if all compute guards pass and Studio-Local resources are cleanly relinquished.
Policy B — normalize through Desktop first
Transition path:
studio-local -> desktop -> compute
Recommendation: Use Policy A in implementation, but conceptually treat it as the same reconciliation pipeline with stricter guards.
13. Reconciliation Model
13.1 Motivation
A single “mode request compute” command should not blindly assume success. The system should:
- record desired mode
- observe current state
- compare desired vs current
- compute required transition plan
- execute actions
- re-observe
- either declare success or enter failed transition state
13.2 Reconciliation Loop
flowchart TD
Req[Request mode] --> Write[Write desired state]
Write --> Observe[Observe current state]
Observe --> Compare{Desired == Current?}
Compare -->|Yes| Done[No-op success]
Compare -->|No| Plan[Select transition plan]
Plan --> Guards[Run guards]
Guards -->|Fail| Fail[Record failure]
Guards -->|Pass| Act[Execute actions]
Act --> Reobserve[Observe current state again]
Reobserve --> Verify{Reached desired?}
Verify -->|Yes| Success[Record success]
Verify -->|No| RetryOrFail[Retry boundedly or fail]
13.3 Reconciliation Semantics
- bounded retries only
- no infinite loops
- every failure is logged with:
- desired state
- prior state
- failing guard or action
- timestamp
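The loop and its bounded-retry rule can be sketched as one function. This is an illustrative skeleton, not the controller itself: observe_current, run_guards, and apply_transition are assumed to be supplied by the caller, MAX_ATTEMPTS is a hypothetical retry bound, and the failure record here only distinguishes "guards" from "actions" rather than naming the specific failing guard or action.

```shell
MAX_ATTEMPTS="${MAX_ATTEMPTS:-3}"   # hypothetical retry bound

# reconcile DESIRED
# Caller-supplied functions (hypothetical names):
#   observe_current            prints the observed mode
#   run_guards FROM TO         returns 0 when the transition is allowed
#   apply_transition FROM TO   performs the actions for the transition
reconcile() {
    desired=$1
    prior=$(observe_current)
    if [ "$prior" = "$desired" ]; then
        echo "result=noop desired=$desired"
        return 0
    fi
    now=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    if ! run_guards "$prior" "$desired"; then
        echo "result=failed desired=$desired prior=$prior failing=guards at=$now"
        return 1
    fi
    attempt=1
    while [ "$attempt" -le "$MAX_ATTEMPTS" ]; do
        # Success is declared only after re-observing, never assumed.
        if apply_transition "$prior" "$desired" &&
            [ "$(observe_current)" = "$desired" ]; then
            echo "result=success desired=$desired prior=$prior attempts=$attempt"
            return 0
        fi
        attempt=$((attempt + 1))
    done
    echo "result=failed desired=$desired prior=$prior failing=actions at=$now"
    return 1
}
```

The key=value line printed on each path carries the fields listed above and could be written to the last-transition record or the journal.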
13.4 Why This Matters
This lets you support:
- manual requests
- idle-triggered auto-switching
- boot-time default mode
- recovery after partial failures
all through one mechanism.
14. Specialisations vs Runtime Switching
This is the main architectural fork.
14.1 Option A — Runtime Switching Only
Use one host definition with multiple systemd targets and runtime policies.
Pros
- fast transitions
- no reboot required
- best UX for switching between Desktop and Studio-Local
- simpler for day-to-day operation
Cons
- weaker isolation
- harder to fully guarantee all services/resources are cleanly re-bound
- risk of state leakage between modes
- some kernel/driver tuning differences are awkward live
Best fit
- Desktop <-> Studio-Local
- Desktop <-> Compute where flexibility matters more than hard isolation
14.2 Option B — NixOS Specialisations Only
Use separate NixOS specialisations for Desktop and Compute (and possibly Studio-Local).
Pros
- stronger isolation between role profiles
- easier to vary deeper system settings, kernel params, service sets
- clearer recovery story
- closer to “logical separate machines”
Cons
- slower transitions, often reboot-oriented in practice
- poorer UX for frequent switching
- more configuration duplication risk if not structured well
Best fit
- Desktop vs Compute if you want very strong separation
- not ideal for rapid Studio-Local toggling
14.3 Option C — Hybrid Model
Use:
- runtime switching for Desktop <-> Studio-Local
- specialisation boundary between Interactive and Compute families
Example:
- default specialisation = interactive
- runtime modes inside it: desktop, studio-local
- compute specialisation = headless compute
Pros
- strongest overall architecture
- preserves good UX for Studio-Local transitions
- lets Compute differ more deeply if needed
- handles future externalization of Studio more cleanly than treating Studio as a permanent top-level host identity
Cons
- more design complexity
- transition from interactive to compute may become reboot-oriented or at least heavier
- more machinery to maintain
14.4 Recommendation
For your current goal, use runtime switching first, with the design shaped so it can later evolve into a hybrid model.
Reasoning
- you need to learn actual contention boundaries first
- Desktop <-> Studio-Local benefits heavily from live switching
- Desktop <-> Compute can start as runtime-switched
- if the system proves too “sticky” or leaky, you can later promote Compute into a specialisation without redesigning the higher-level state machine
- if Studio moves to a Mac mini, the host-local model remains intact
Practical recommendation
Phase the design like this:
- Phase 1: one host, runtime switching only
- Phase 2: strong slices/targets/guards
- Phase 3: evaluate whether Compute should become a specialisation
- Phase 4: if Studio is externalized, deprecate or disable studio-local without changing the operator-facing control model
This preserves velocity while keeping the abstraction clean.
15. Service Placement
15.1 Host-Level Services
- Hyprland
- PipeWire
- Reaper
- NVIDIA drivers/runtime
- mode controller
- possibly vLLM initially
- SSH / system services
15.2 k3s-Level Services
- Hyperion services
- platform/orchestration services
- dashboards and supporting workloads
- possibly model-serving abstractions later
First-pass implementation note
In v1, prefer keeping k3s.service continuously available while varying:
- platform.slice resource budgets
- which workloads are allowed to run aggressively
- how much local compute capacity cluster workloads may consume
This is preferable to stopping and starting the cluster runtime during ordinary mode transitions.
15.3 Externalized Services (Possible Future)
- Studio/Audio workflows on Mac mini
- DAW/plugin-heavy sessions
- live audio interfaces and controllers
15.4 Recommendation
Keep hardware-near, latency-sensitive, and GPU-debug-sensitive components on the host first. Move services into k3s only after the host-level mode model is stable. Treat Mac mini externalization as a placement decision, not as a redesign trigger for the host-local state machine.
16. Idle Detection Policy
16.1 Role of Idle Detection
Idle detection is an input signal to the reconciler, not authority on its own.
16.2 Signals
- input inactivity
- audio activity
- GPU utilization / ownership
- CPU load
- selected user-job checks
16.3 Policy
Idle-triggered promotion to Compute should:
- update desired state to compute
- run the normal reconciliation pipeline
- abort safely if guards fail
It must never bypass guards.
16.4 Studio-Local Policy
Auto-promotion from studio-local to compute should generally be disabled unless explicitly requested. This remains true even if Studio capability later moves off-box.
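A sketch of an idle handler that respects both rules: it only requests Compute through the ordinary pipeline, and it ignores idle signals while in studio-local. The handler name on_idle_signal is hypothetical, and write_desired and reconcile are assumed to be provided elsewhere by the controller.

```shell
# on_idle_signal CURRENT_MODE   (hypothetical handler)
# Caller-supplied functions (hypothetical names):
#   write_desired MODE   records the requested mode
#   reconcile MODE       runs guards and actions, non-zero on failure
on_idle_signal() {
    case "$1" in
        desktop)
            # Idle is only a request: guards still run inside reconcile,
            # and a non-zero status here means the promotion was aborted.
            write_desired compute && reconcile compute ;;
        *)
            # studio-local: auto-promotion disabled by policy.
            # compute or anything else: nothing to request.
            echo "idle signal ignored in mode: $1"
            return 0 ;;
    esac
}
```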
17. Security Boundaries
Zones
- user desktop zone
- system service zone
- AI workload zone
- cluster service zone
- optional external Studio zone
Controls
- bind services to appropriate interfaces
- keep secrets outside dotfiles, e.g. SOPS/agenix
- keep mode control operations privileged and auditable
- do not let externalized capability assumptions silently weaken host-local controls
18. Risks and Failure Modes
18.1 Audio Degradation
Cause:
- background contention
Mitigation:
- Studio-Local invariants
- strict guard/action policy
18.2 GPU Contention
Cause:
- compositor and AI workloads racing for ownership
Mitigation:
- explicit GPU ownership model
- guard checks before Compute promotion
18.3 Partial Transition
Cause:
- GUI exits but vLLM fails to restart
- desired state written but current state never converges
Mitigation:
- reconciliation loop
- bounded retries
- failed-transition state
18.4 Configuration Drift
Cause:
- policy split across ad hoc scripts and dotfiles
Mitigation:
- keep mode policy in Nix + systemd-controlled scripts
18.5 Capability Drift
Cause:
- Studio capability moved to Mac mini, but local state machine or guards still assume it is local
Mitigation:
- explicit capability placement model
- check_studio_capability_local
- ADR-backed deprecation path for studio-local
19. Open Questions
- Should vLLM be host-managed or profile-switched through separate unit templates?
- When should Compute graduate into a NixOS specialisation?
- How strict should auto-transition be about user jobs and unsaved work heuristics?
- Should current state be derived on demand only, or also cached to /run/mode-controller/current?
- At what point should local Studio capability be considered officially externalized to a Mac mini?
- What data/project sync model is required if Studio is split across machines?
19.1 Resolved Near-Term Decision
For v1:
- studio-local is not a first-class target
- studio-local is represented as a protected interactive policy overlay on desktop
- desktop and compute are the only first-class top-level target families
This keeps the first implementation smaller while preserving the higher-level operational model and leaving room to strengthen Studio semantics later if needed.
19.2 Future Alternatives
Alternative A — Keep studio-local as an overlay permanently
Pros:
- less target duplication
- easier future deprecation if Studio moves to a Mac mini
- simpler runtime switching model
Cons:
- weaker systemd-level separability
- more policy encoded in helper units and controller logic
Alternative B — Promote studio-local into a first-class target later
Pros:
- stronger explicitness in systemd
- easier inspection of Studio-specific dependencies
- potentially clearer resource-policy boundaries
Cons:
- higher maintenance cost
- more duplication with desktop
- less aligned with the likely future externalization path
Recommendation
Start with the overlay model. Revisit only if empirical evidence shows that audio-protection policy is too hard to express or validate without a dedicated target.
19.3 Resolved Near-Term Decision — vLLM Service Shape
Target architecture:
- vllm@desktop.service
- vllm@compute.service
However, for the first implementation pass, a single vllm.service is acceptable if:
- desktop and compute profiles are still modeled explicitly in configuration
- controller actions remain profile-aware
- observation logic can still determine which profile is active
This allows the first bootable milestone to stay small without locking the architecture into a monolithic service model.
19.4 Resolved Near-Term Decision — k3s Service Shape
For v1:
- k3s.service should remain stable across host-local modes
- mode differences should be expressed through:
- slice/resource budgets
- workload-placement or workload-intensity policy
- optional node labels/taints later
This keeps the control plane smaller and avoids coupling every host-mode transition to cluster-runtime teardown and recovery.
Future alternative
If empirical operation shows that stable-across-modes k3s still creates unacceptable interference or ambiguity, stronger k3s mode switching can be introduced later. That should be treated as a deliberate escalation, not the default starting point.
19.5 Resolved Near-Term Decision — Desktop AI Policy
For v1:
- keep vLLM off in desktop for the first convergence milestone
- prove desktop ↔ compute transitions before enabling bounded desktop-mode AI
Future alternative
After the control plane is reliable, bounded desktop-mode AI may be introduced as an explicit profile with clear GPU1 ownership and resource limits.
19.6 Resolved Near-Term Decision — studio-local Overlay Shape
For v1, represent studio-local with:
- studio-local-policy.service
- audio-priority.service
This gives the controller and observation logic a clear marker plus an explicit enforcement unit without promoting Studio into a first-class top-level target.
Future alternative
If this proves too implicit, studio-local can later be promoted into a stronger grouped target or target-like overlay.
19.7 Resolved Near-Term Decision — Capability Placement Source
For v1, capability-placement.json should be generated from Nix configuration rather than edited ad hoc at runtime.
Rationale
- keeps placement policy reproducible
- avoids silent runtime drift
- matches the design goal that machine-critical policy remain inspectable in Nix and systemd-managed artifacts
Future alternative
If operational experimentation later requires it, an explicit runtime override layer may be added with well-defined precedence and auditability.
19.8 Resolved Near-Term Decision — mode force
For v1, defer mode force.
Rationale
- keeps attention on making the ordinary reconciliation path correct
- avoids masking immature guard or transition logic
- reduces the chance of bypassing safety boundaries during initial bring-up
Future alternative
Add mode force later only after hard-vs-soft guard semantics are stable and well tested.
19.9 Resolved Near-Term Decision — GUI Teardown Semantics
For v1, compute promotion should require:
- graphical session absence
- explicit GPU-release verification
It should not initially depend on forcibly stopping every greeter or display-manager path unless empirical testing shows those components interfere with reliable GPU handoff.
19.10 Resolved Near-Term Decision — Desktop Target Ownership
For v1, desktop.target should not directly own the greeter/login path.
Rationale
- keeps mode ownership focused on operational policy rather than full session-manager orchestration
- reduces coupling to whichever login/session stack is chosen
- lets session presence remain an observed fact rather than an aggressively managed requirement
Future alternative
If desktop recovery proves unreliable without tighter control, greeter or display-manager paths can later be pulled under stronger mode ownership.
19.11 Resolved Near-Term Decision — studio-local-policy.service Scope
For v1, studio-local-policy.service should be:
- a reliable marker for observation/classification
- a light policy-application unit
- explicitly limited in scope
It should not become a giant all-in-one Studio behavior controller.
Rationale
- preserves clear observability
- avoids burying controller logic inside a catch-all helper unit
- keeps Studio overlay behavior inspectable and decomposable
19.12 Resolved Near-Term Decision — observe-current Implementation Language
For v1, implement observe-current in shell.
Constraints
- keep the output contract stable:
  - plain mode name for shell use
  - structured JSON for diagnostics
- structure the implementation so it can later be replaced by a typed helper without changing callers
Future alternative
If classifier complexity or JSON handling becomes unwieldy, replace only the classifier implementation with a small typed helper while keeping the same external contract.
19.13 Resolved Near-Term Decision — mode CLI Packaging
For v1:
- keep the script sources in the repository
- package them in pkgs/
- install them through the NixOS module
Rationale
- keeps the tool packaging clean and testable
- avoids scattering ad hoc scripts directly into module definitions
- preserves a clean path to reuse across hosts later
19.14 Resolved Near-Term Decision — Reconciler Trigger Model
For v1:
- use parameterized oneshot reconciliation only
- do not enable timer-driven or path-triggered background reconciliation yet
Rationale
- keeps failure behavior easier to understand during bring-up
- avoids masking transition bugs behind background retries
- lets manual transitions prove the model first
Future alternative
After manual transitions are reliable, add periodic or path-triggered reconciliation for self-healing behavior.
19.15 Resolved Near-Term Decision — Boot Policy
For v1:
- normalize to desktop on boot
- do not replay persistent desired mode across reboot
Rationale
- gives the system a predictable safe recovery posture
- avoids booting directly back into a problematic compute path while the controller is still maturing
- keeps early operational behavior easier to reason about
Future alternative
Once transitions are reliable, desired-state persistence across reboot can be introduced as an explicit policy feature.
19A. Architectural Decision Record — Potential Studio Externalization
Context
There is a realistic possibility that low-latency Studio/Audio workloads will migrate from the NixOS machine to a Mac mini.
Decision
The NixOS host architecture should treat Studio as a conditional local profile (studio-local) rather than a permanently central host mode.
Consequences
- the host-local state machine remains stable if Studio moves off-box
- Compute and Desktop remain the durable primary host-local modes
- Studio capability can be represented separately through workload placement decisions
- local audio support can still exist now without overcommitting the architecture to a permanent local Studio role
Follow-on Design Implications
- add check_studio_capability_local guard for any studio-local transition
- keep local audio policy isolated from core Compute/Desktop mechanics where practical
- document future sync, control, and workflow boundaries if Studio becomes externalized
20. Control Interface and Implementation Contract
20.1 mode CLI Contract
The system should expose a single operator-facing interface:
mode status
mode request <desktop|studio-local|compute>
mode reconcile
mode current
mode desired
mode explain <desktop|studio-local|compute>
mode dry-run <desktop|studio-local|compute>
mode force <desktop|studio-local|compute>
Command Semantics
mode status
Returns:
- desired mode
- observed current mode
- whether reconciliation is needed
- last transition result
- blocking guard failures, if any
mode request <mode>
Behavior:
- write desired state
- invoke reconciliation
- return success only if reconciliation converged
mode reconcile
Behavior:
- observe current state
- compare to desired
- select transition plan
- run guards
- execute actions
- record results
mode current
Returns only the observed current mode.
mode desired
Returns only the desired mode file contents.
mode explain <mode>
Prints:
- target state properties
- expected services
- resource ownership rules
- guards required for entering that mode
- capability placement assumptions, where relevant
mode dry-run <mode>
Simulates the full reconciliation plan without mutating state.
mode force <mode>
Privileged path that bypasses selected non-safety guards, but must never bypass hard safety guards such as GPU/display or active audio protections unless explicitly designed to allow that.
Implementation note:
- defer this command in v1
- keep it in the long-term interface contract so the design remains forward-compatible
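The command surface above can be sketched as an argument parser. This is a minimal illustration in Python, not the planned v1 shell wrapper; the function names build_parser and parse are hypothetical, and rejecting mode force with NotImplementedError is one assumed way to honor the v1 deferral while keeping the command in the interface.

```python
import argparse

# The three host-local modes named in the CLI contract.
MODES = ("desktop", "studio-local", "compute")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the operator-facing `mode` commands."""
    parser = argparse.ArgumentParser(prog="mode")
    sub = parser.add_subparsers(dest="command", required=True)
    # Commands that take no arguments.
    for name in ("status", "reconcile", "current", "desired"):
        sub.add_parser(name)
    # Commands that take a target mode.
    for name in ("request", "explain", "dry-run", "force"):
        cmd = sub.add_parser(name)
        cmd.add_argument("mode", choices=MODES)
    return parser


def parse(argv):
    """Return (command, mode-or-None); `force` is parsed but refused in v1."""
    args = build_parser().parse_args(argv)
    if args.command == "force":
        # Assumption: v1 keeps the command in the contract but does not run it.
        raise NotImplementedError("mode force is deferred in v1")
    return args.command, getattr(args, "mode", None)
```

For example, parse(["request", "compute"]) yields ("request", "compute"), while an unknown mode name is rejected by argparse before any state is touched.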
21. State Storage Layout
21.1 Runtime State Paths
/run/mode-controller/
  desired
  current
  lock
  last-transition.json
  last-guards.json
  reconcile.pid
  capability-placement.json
  hardware-topology.json
21.2 File Semantics
desired
Contains the requested mode:
- desktop
- studio-local
- compute
current
Cached observation of current state. This is convenience state only; it must be derivable from system facts.
lock
Used to serialize reconciliation so only one transition runs at a time.
last-transition.json
Stores:
- requested mode
- prior observed mode
- final observed mode
- success/failure
- guard results
- action results
- timestamps
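A possible shape for this record, sketched in Python. The JSON key names are hypothetical (the document lists the fields but not their spelling), and treating success as "final observed mode equals requested mode" is an assumption. The write goes through a temporary file and os.replace so a reader never sees a half-written record.

```python
import json
import os
import tempfile


def write_transition_record(state_dir, requested, prior, final,
                            guard_results, action_results,
                            started_at, finished_at):
    """Write last-transition.json atomically and return the record."""
    record = {
        "requested_mode": requested,
        "prior_observed_mode": prior,
        "final_observed_mode": final,
        # Assumption: a transition succeeded iff it converged on the request.
        "success": final == requested,
        "guard_results": guard_results,
        "action_results": action_results,
        "started_at": started_at,
        "finished_at": finished_at,
    }
    # Write to a temp file in the same directory, then rename over the target.
    fd, tmp_path = tempfile.mkstemp(dir=state_dir, prefix=".last-transition.")
    with os.fdopen(fd, "w") as handle:
        json.dump(record, handle, indent=2)
    os.replace(tmp_path, os.path.join(state_dir, "last-transition.json"))
    return record
```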
last-guards.json
Stores latest guard results for diagnostics.
capability-placement.json
Stores environment-level placement facts, for example:
- studio: local
- studio: external-mac-mini
This file is not the host-local mode source of truth. It is an environment metadata input used by guards and planning logic.
hardware-topology.json
Stores the currently configured hardware view, for example:
- planned GPU count
- currently present GPU indexes
- display GPU assignment
- desktop-mode AI GPU set
- compute-mode AI GPU set
This allows the implementation to preserve the intended dual-GPU architecture while remaining tolerant of temporary single-GPU bring-up phases.
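As an illustration of that tolerance, the sketch below derives the two AI GPU sets from whichever GPUs are present. It is written in Python with hypothetical key names, and it assumes the display GPU is index 0 and that Compute may use every present GPU because no graphical session owns one in that mode.

```python
def build_topology(planned_gpu_count, present_gpus, display_gpu=0):
    """Derive a hardware-topology view from the GPUs actually present."""
    present = sorted(set(present_gpus))
    return {
        "planned_gpu_count": planned_gpu_count,
        "present_gpu_indexes": present,
        "display_gpu": display_gpu,
        # Interactive modes: the display GPU is never handed to AI workloads.
        "desktop_ai_gpus": [g for g in present if g != display_gpu],
        # Compute: assumed free to use every present GPU.
        "compute_ai_gpus": present,
    }
```

With both GPUs present the desktop AI set is [1]; during single-GPU bring-up it is empty, while Compute still gets [0].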
22. systemd Unit and Target Layout
22.1 Targets
desktop.target
Wants:
- graphical-session target path
- bounded interactive services
- optional constrained AI services
First-pass implementation note:
- do not make desktop.target directly own greeter/login-manager startup in v1
- treat graphical session presence as an observed runtime fact
- strengthen ownership later only if empirical recovery behavior requires it
compute.target
Wants:
- headless service profile
- vLLM compute profile
- k3s compute-allowed policy/profile
22.2 Core Services
mode-controller@.service
Parameterized oneshot service.
Instance values:
- mode-controller@desktop.service
- mode-controller@studio-local.service
- mode-controller@compute.service
Responsibilities:
- load desired mode
- observe current mode
- run reconciliation
- update state files and logs
First-pass implementation note:
- use this parameterized oneshot service as the sole reconciler trigger in v1
- defer timer/path-triggered background reconciliation until manual operation is proven reliable
mode-observe.service
Optional oneshot helper to compute observed current mode and refresh /run/mode-controller/current.
vllm@.service
Optional templated service for profile-specific operation:
- vllm@desktop.service
- vllm@studio-local.service
- vllm@compute.service
Alternative:
- single vllm.service with environment file switching
First-pass implementation guidance:
- prefer separate desktop and compute profiles conceptually
- studio-local should not require its own dedicated vLLM unit in v1 if Studio is implemented as a desktop overlay
- a single vllm.service is acceptable initially if it preserves a clean migration path to templated units later
- keep desktop-mode vLLM disabled for the first transition-proof milestone
mode-guard@.service
Optional wrapper pattern for reusable guard execution, though plain scripts may be simpler initially.
studio-local overlay units
Recommended first-pass representation:
- audio-priority.service
- studio-local-policy.service
- optional environment/policy file consumed by observation and guard logic
These units should layer on top of desktop.target rather than replacing it with a distinct top-level target in v1.
Recommended scope for studio-local-policy.service:
- expose a clear mode marker
- apply only light, explicit Studio-specific policy
- delegate heavyweight orchestration to the controller or dedicated helper units
22.3 Suggested Slice Layout
system.slice
├── interactive.slice
│   ├── graphical-session scope/services
│   ├── audio-related helpers
│   └── bounded desktop workloads
├── ai.slice
│   ├── vllm service
│   └── AI helpers
└── platform.slice
    ├── k3s service
    └── supporting infra services
Slice Intent
- interactive.slice gets priority and headroom in Desktop/Studio-Local
- ai.slice is heavily constrained in Studio-Local, moderately constrained in Desktop, relaxed in Compute
- platform.slice remains comparatively stable but may have tighter resource budgets in interactive modes and relaxed budgets in Compute
23. Current State Observation Logic
Current state must be observed, not assumed.
23.1 Observation Inputs
GUI Indicators
- graphical.target or session-specific equivalent active
- active user session via loginctl
- Hyprland process/session present
Audio Indicators
- PipeWire user service active
- active audio clients or REAPER process
- optional JACK graph activity
AI Indicators
- vllm*.service active
- environment/profile indicates single-GPU or dual-GPU mode
- optional nvidia-smi-based observation of active GPU usage
Platform Indicators
- k3s.service active
- optional workload-class indicators
23.2 Observation Heuristic
Observed mode should be derived using a deterministic classifier.
Proposed classifier logic
Observe compute
If all of the following are true:
- no active graphical session
- compute target active or compute service profile active
- vLLM compute profile active or both GPUs assigned to AI policy
Then observed current mode = compute
Observe studio-local
If all of the following are true:
- graphical session active
- audio stack active
- studio-local policy marker active
- AI profile disabled or highly constrained
Then observed current mode = studio-local
Observe desktop
If all of the following are true:
- graphical session active
- desktop policy marker active
- no studio-local policy marker
Then observed current mode = desktop
Observe transitioning
If:
- desired != inferred stable mode
- controller is running or lock file exists
Then observed current mode = transitioning
Observe failed-transition
If:
- last transition failed
- current does not match desired
- no controller currently reconciling
Then observed current mode = failed-transition
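The classifier rules above can be sketched as a pure function over already-collected facts. The sketch is in Python for readability; v1 calls for a shell implementation, so read it as the shape of the logic rather than the implementation. The Facts field names, the rule precedence (transitioning, then failed-transition, then the stable mode), and the "unknown" fallback are assumptions not stated in the rules themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Facts:
    graphical_session: bool       # active graphical session observed
    audio_stack_active: bool
    studio_marker: bool           # studio-local policy marker active
    desktop_marker: bool          # desktop policy marker active
    ai_constrained: bool          # AI profile disabled or highly constrained
    compute_profile_active: bool  # compute target or compute service profile
    ai_owns_gpus: bool            # vLLM compute profile or both GPUs on AI
    controller_busy: bool         # controller running or lock file exists
    last_transition_failed: bool


def stable_mode(f):
    """Return the stable mode the facts support, or None if none matches."""
    if not f.graphical_session and f.compute_profile_active and f.ai_owns_gpus:
        return "compute"
    if (f.graphical_session and f.audio_stack_active
            and f.studio_marker and f.ai_constrained):
        return "studio-local"
    if f.graphical_session and f.desktop_marker and not f.studio_marker:
        return "desktop"
    return None


def classify(f, desired):
    """Deterministically map facts plus desired mode to an observed mode."""
    inferred = stable_mode(f)
    if inferred != desired and f.controller_busy:
        return "transitioning"
    if f.last_transition_failed and inferred != desired:
        return "failed-transition"
    # Assumption: facts that match no rule are reported as "unknown".
    return inferred or "unknown"
```

Keeping fact collection (loginctl, systemctl, nvidia-smi) outside the function makes the decision logic testable without a live system.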
23.3 Recommendation
Use a small classifier script:
/usr/local/libexec/mode-controller/observe-current
Outputs:
- plain mode name for shell use
- optional JSON with evidence for debugging
First-pass implementation note:
- implement this in shell first
- preserve a stable output contract so the implementation language can change later without changing the control plane
24. Guard Function Contract
24.1 Guard Naming
check_audio_idle
check_gpu_display_released
check_cpu_load_safe
check_user_jobs_safe
check_memory_headroom
check_vllm_drainable
check_graphical_session_absent
check_graphical_session_present
check_target_reachable
check_studio_capability_local
24.2 Exit Code Convention
0 pass
10 policy block: audio active
11 policy block: display GPU still owned
12 policy block: CPU load too high
13 policy block: user jobs active
14 policy block: insufficient memory headroom
15 policy block: vLLM not drainable
16 policy block: graphical session absent when required
17 policy block: graphical session present when forbidden
18 policy block: target unreachable / invalid request
19 policy block: requested local studio capability not available
20-29 execution/inspection errors
30+ internal controller misuse
24.3 Guard Output Contract
Each guard should emit a concise structured line or JSON object such as:
{"guard":"check_audio_idle","ok":false,"code":10,"reason":"reaper process active"}
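A minimal sketch of the exit-code table and the output line, in Python. The enum member names and the helper names code_class and guard_result are hypothetical. Reading "20+" as the range 20 to 29 and labeling codes 1 to 9 "unassigned" are assumptions, since the table does not say how the ranges divide.

```python
import json
from enum import IntEnum


class GuardCode(IntEnum):
    PASS = 0
    AUDIO_ACTIVE = 10
    DISPLAY_GPU_OWNED = 11
    CPU_LOAD_HIGH = 12
    USER_JOBS_ACTIVE = 13
    MEMORY_HEADROOM_LOW = 14
    VLLM_NOT_DRAINABLE = 15
    GRAPHICAL_SESSION_ABSENT = 16
    GRAPHICAL_SESSION_PRESENT = 17
    TARGET_UNREACHABLE = 18
    STUDIO_NOT_LOCAL = 19


def code_class(code):
    """Bucket an exit code by the ranges in the convention above."""
    if code == 0:
        return "pass"
    if 10 <= code <= 19:
        return "policy-block"
    if 20 <= code <= 29:      # assumption: "20+" stops where "30+" starts
        return "execution-error"
    if code >= 30:
        return "controller-misuse"
    return "unassigned"       # 1-9 are not defined by the convention


def guard_result(guard, code, reason=""):
    """Render one guard outcome as a compact single-line JSON object."""
    payload = {"guard": guard, "ok": code == 0, "code": int(code), "reason": reason}
    return json.dumps(payload, separators=(",", ":"))
```

Calling guard_result("check_audio_idle", GuardCode.AUDIO_ACTIVE, "reaper process active") reproduces the example line shown above.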
24.4 Hard vs Soft Guards
Hard guards
Must never be bypassed by ordinary automation:
- active audio protection for Studio-Local -> Compute or Desktop -> Compute
- GPU/display ownership guard
- target validity checks
- local Studio capability checks for studio-local
Soft guards
May be bypassed by privileged operator action or policy:
- generic CPU load threshold
- selected user-job heuristics
- non-critical memory thresholds
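The hard/soft split can be expressed as a small decision function, sketched here in Python. The mapping of the four hard-guard descriptions onto guard names from section 24.1 is my reading of the list above, and may_proceed and privileged_override are hypothetical names.

```python
# Assumed mapping of the hard-guard descriptions to section 24.1 names.
HARD_GUARDS = frozenset({
    "check_audio_idle",
    "check_gpu_display_released",
    "check_target_reachable",
    "check_studio_capability_local",
})


def may_proceed(results, privileged_override=False):
    """Decide whether a transition may continue.

    results maps guard name -> True (passed) or False (failed).
    Returns (allowed, blocking_guards).
    """
    failed = [name for name, ok in results.items() if not ok]
    hard_failed = sorted(name for name in failed if name in HARD_GUARDS)
    soft_failed = sorted(name for name in failed if name not in HARD_GUARDS)
    if hard_failed:
        # Hard failures block regardless of any override.
        return False, hard_failed
    if soft_failed and not privileged_override:
        return False, soft_failed
    return True, []
```

The override only ever widens what soft guards allow; a failed hard guard is checked first, so no combination of flags reaches the bypass path.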
25. Transition Plans with Exact Operations
This section normalizes each transition into explicit steps.
25.1 Common Transition Framework
All transitions should follow:
- acquire lock
- observe current state
- validate requested mode
- if current == desired, exit success
- select transition plan
- run transition guards
- execute pre-actions
- isolate or start target
- execute post-actions
- re-observe current state
- record success/failure
- release lock
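The framework above can be sketched as one function with the system-touching parts injected. This is a Python illustration under stated assumptions: observe, run_guards, and execute are hypothetical callables standing in for the classifier, the guard scripts, and the pre-action/target/post-action steps, and the returned dictionary stands in for the "record success/failure" step. Locking uses flock(2) via fcntl, so it is Unix-only.

```python
import fcntl
import os

MODES = ("desktop", "studio-local", "compute")


def reconcile(desired, lock_path, observe, run_guards, execute):
    """Run one serialized reconciliation pass and return a result record."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)       # acquire lock (blocks if held)
        prior = observe()                     # observe current state

        def result(ok, final, reason, failures=()):
            return {"requested": desired, "prior": prior, "final": final,
                    "ok": ok, "reason": reason, "guard_failures": list(failures)}

        if desired not in MODES:              # validate requested mode
            return result(False, prior, "invalid mode")
        if prior == desired:                  # nothing to do
            return result(True, prior, "already converged")
        plan = (prior, desired)               # select transition plan
        failures = run_guards(plan)           # names of failed guards
        if failures:
            return result(False, prior, "blocked by guards", failures)
        execute(plan)                         # pre-actions, target, post-actions
        final = observe()                     # re-observe current state
        converged = final == desired
        return result(converged, final,
                      "converged" if converged else "did not converge")
    finally:
        os.close(fd)                          # closing the fd releases the lock
```

Because success is decided by the second observation rather than by the exit status of the actions, a transition whose commands all succeed but which leaves the host in the wrong state is still reported as a failure.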
25.2 Plan: Desktop -> StudioLocal
Preconditions
- desktop currently observed
- request = studio-local
- local Studio capability is still hosted on the NixOS machine
Guards
- check_target_reachable
- check_studio_capability_local
- optional check_user_jobs_safe
Exact operations
write desired=studio-local
flock /run/mode-controller/lock
observe current
run guards
systemctl start audio-priority.service # if modeled separately
systemctl start studio-local-policy.service
observe current
record result
Notes
- GUI remains up
- audio policy is strengthened
- AI capacity is reduced or removed
- if Studio capability has been externalized, this transition must fail cleanly with an explanatory reason
25.3 Plan: StudioLocal -> Desktop
Guards
check_target_reachable
Exact operations
write desired=desktop
flock /run/mode-controller/lock
observe current
run guards
systemctl stop audio-priority.service # if separate helper exists
systemctl stop studio-local-policy.service
systemctl isolate desktop.target
observe current
record result
25.4 Plan: Desktop -> Compute
Guards
- check_target_reachable
- check_audio_idle
- check_cpu_load_safe
- check_user_jobs_safe
- check_memory_headroom
Pre-actions
- terminate graphical session
- wait for GUI disappearance
- verify GPU/display release
Exact operations
write desired=compute
flock /run/mode-controller/lock
observe current
run initial guards
loginctl terminate-session <session-id>
wait until observe-current no longer sees graphical session
run check_gpu_display_released
systemctl isolate compute.target
systemctl start vllm@compute.service
observe current
record result
Additional notes
- systemctl isolate compute.target should conflict with interactive/graphical targets in your target design
- GPU release must be verified after GUI shutdown, not merely assumed
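The "wait until" step needs a bound so a stuck session cannot hang the transition. A minimal polling sketch in Python follows; wait_until is a hypothetical helper, and the 30-second timeout and 0.5-second interval are placeholder values, not figures from this design.

```python
import time


def wait_until(predicate, timeout_s=30.0, interval_s=0.5,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate until it returns True or the timeout elapses.

    Returns True if the condition was met, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

In the compute-entry pipeline the predicate would report that the graphical session is gone; a False return should fail the transition before check_gpu_display_released runs, rather than proceeding on the assumption that the GUI has exited.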
25.5 Plan: Compute -> Desktop
Guards
- check_target_reachable
- check_vllm_drainable
- check_memory_headroom
Exact operations
write desired=desktop
flock /run/mode-controller/lock
observe current
run guards
systemctl stop vllm@compute.service # or downscale path
systemctl isolate desktop.target
systemctl start vllm@desktop.service # optional bounded single-GPU profile
observe current
record result
Notes
- graphical session may be started by display manager or login path depending on design
- GPU0 becomes protected for display once Desktop converges
25.6 Plan: StudioLocal -> Compute
Preferred behavior
Treat as a direct guarded transition using the same compute-entry pipeline.
Guards
- check_target_reachable
- check_audio_idle
- check_cpu_load_safe
- check_user_jobs_safe
- check_memory_headroom
Exact operations
write desired=compute
flock /run/mode-controller/lock
observe current
run guards
loginctl terminate-session <session-id>
wait until graphical session absent
run check_gpu_display_released
systemctl isolate compute.target
systemctl start vllm@compute.service
observe current
record result
Policy note
Because Studio-Local is the most protected interactive mode, auto-promotion from Studio-Local to Compute should generally be disabled unless explicitly requested.
26. NixOS Specialisations vs Runtime Switching — Decision Guidance
26.1 Decision Matrix
| Criterion | Runtime Switching | Specialisations | Hybrid |
|---|---|---|---|
| Desktop <-> Studio-Local speed | Excellent | Poor | Excellent |
| Desktop <-> Compute isolation | Moderate | Strong | Stronger |
| Complexity | Lower | Moderate | Highest |
| Early experimentation | Best | Slower | Moderate |
| Deep kernel/boot divergence | Weak | Strong | Strong |
| Operational convenience | High | Lower | Moderate |
| Future externalization of Studio | Good | Good | Best |
26.2 Recommended Decision Rule
Adopt runtime switching now, and revisit that choice if one or more of the following become true:
- compute mode needs materially different kernel parameters or boot-time config
- graphical/interactive teardown proves unreliable in practice
- GPU role handoff remains too leaky under runtime-only switching
- you want Compute to be operationally closer to a dedicated server persona than a temporary mode
If any two of the above become persistent problems, promote Compute into a specialisation.
26.3 Recommended Architecture Path
Phase 1
- single NixOS host definition
- runtime switching only
- targets + slices + controller + guards
Phase 2
- strengthen target separation
- gather empirical failure/latency data
Phase 3
- if needed, introduce specialisation.compute
- preserve same desired/current/reconcile interface so operator UX does not change
Phase 4
- if Studio is externalized, deprecate or disable studio-local
- retain the same operator-facing control model for the host-local system
That means mode request compute could later choose:
- runtime reconcile, or
- request/reboot into compute specialisation
without changing the higher-level model.
27. Recommended Next Implementation Steps
- define exact systemd target dependencies/conflicts in Nix
- implement mode CLI wrapper script
- implement observe-current
- implement guard scripts with fixed exit-code contract
- choose between:
  - vllm@desktop.service / vllm@compute.service
  - one service with profile env file
- define slice resource policies for interactive vs AI
- wire idle detector to mode request compute
- validate transition behavior manually before enabling automation
- add a capability-placement flag/model for future Studio externalization
28. Summary
This system should behave like a reconciled state machine for host-local operational modes.
The core model is:
- desired mode is explicit runtime intent
- current mode is observed reality
- reconciliation closes the gap
- guards prevent unsafe transitions
- systemd targets/services perform the actual mode enactment
The implementation should start with runtime switching, but preserve a clean path to hybrid specialisation if operational evidence justifies stronger separation later.
Studio/Audio should be treated as a conditional local profile plus a capability-placement decision, so that a future move to a Mac mini does not invalidate the host-local architecture.