ADR-0003: vLLM Is Compute-Only in V1

Status: accepted

Context

Desktop-mode AI is possible in the target architecture, especially when a second GPU is installed. For the first reliable control-loop milestone, however, running AI in desktop mode would add resource contention and make the observer more complex.

Decision

Keep vLLM compute-only in v1.

Use a single vllm.service tied to compute mode. Shape options and controller actions so that a templated vllm@compute.service and a future bounded desktop profile can be added later.
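
One way to shape the options so the later additions stay cheap is to carry a profile name from the start, even though v1 accepts only one value. The sketch below is illustrative, not the project's actual controller code: VllmOptions, V1_PROFILES, and unit_name are hypothetical names, and the templated flag is an assumption about how a later migration to vllm@compute.service might be switched on.

```python
from dataclasses import dataclass

# Hypothetical: the set of profiles the controller accepts in v1.
# A bounded desktop profile would be added here later.
V1_PROFILES = frozenset({"compute"})


@dataclass(frozen=True)
class VllmOptions:
    """Hypothetical options object; the profile field exists from day one."""
    profile: str = "compute"


def unit_name(options: VllmOptions, templated: bool = False) -> str:
    """Resolve the systemd unit that a controller action should target."""
    if options.profile not in V1_PROFILES:
        raise ValueError(f"profile {options.profile!r} is not enabled in v1")
    if templated:
        # Future layout: one template unit, one instance per profile.
        return f"vllm@{options.profile}.service"
    # v1 layout: a single plain unit.
    return "vllm.service"
```

Because callers pass a profile rather than a hard-coded unit name, moving to the templated unit or enabling a desktop profile would change this function and the allowed set, not every call site.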

Consequences

The desktop and studio-local modes leave vLLM inactive.
The compute mode owns vLLM activation.
The first milestone can focus on mode transitions and observation.
Bounded desktop AI is deferred until desktop <-> compute switching is reliable on real hardware.
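
The consequences above amount to a small policy table: given a target mode, vLLM is either wanted or not. A minimal sketch follows, assuming the three mode names mentioned in this ADR; the Mode enum, VLLM_ACTIVE_IN, and vllm_action are hypothetical names, not part of the existing controller.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    DESKTOP = "desktop"
    STUDIO_LOCAL = "studio-local"
    COMPUTE = "compute"


# v1 policy: only compute mode activates vLLM.
VLLM_ACTIVE_IN = frozenset({Mode.COMPUTE})


def vllm_action(target: Mode, currently_active: bool) -> Optional[str]:
    """Return "start", "stop", or None if vLLM is already in the wanted state."""
    wanted = target in VLLM_ACTIVE_IN
    if wanted == currently_active:
        return None
    return "start" if wanted else "stop"
```

Entering compute with vLLM stopped yields "start", and leaving compute for desktop or studio-local with vLLM running yields "stop". Adding a bounded desktop profile later would mean extending the policy set rather than rewriting the transition logic.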
