ADR-0003: vLLM Is Compute-Only in V1

Status: accepted

Context

Desktop-mode AI is possible in the target architecture, especially when a second GPU is installed. For the first reliable control-loop milestone, however, it would add resource contention and observer complexity.

Decision

Keep vLLM compute-only in v1.

Use a single vllm.service tied to compute behavior. Shape options and controller actions so that vllm@compute.service and a future bounded desktop profile can be added later.
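
One way to keep that door open is to have the controller derive the unit name from an optional profile instead of hard-coding it. The following is a minimal sketch, not the project's controller code: the function name `vllm_unit_name` and the `profile` parameter are hypothetical, and the only names taken from this ADR are vllm.service and vllm@compute.service.

```python
from typing import Optional

BASE_UNIT = "vllm"  # v1 ships a single vllm.service


def vllm_unit_name(profile: Optional[str] = None) -> str:
    """Return the systemd unit name for a vLLM profile (hypothetical helper).

    With no profile this is the plain v1 unit. With a profile name it is an
    instance of a template unit, e.g. vllm@compute.service.
    """
    if profile is None:
        return f"{BASE_UNIT}.service"
    # Keep instance names simple: letters, digits and hyphens only.
    if not profile or not profile.replace("-", "").isalnum():
        raise ValueError(f"invalid profile name: {profile!r}")
    return f"{BASE_UNIT}@{profile}.service"
```

Under this assumption, v1 callers pass no profile, and a later move to per-profile units only changes what the callers pass in.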

Consequences

  • desktop and studio-local should leave vLLM inactive.
  • compute owns vLLM activation.
  • The first milestone can focus on mode transitions and observation.
  • Bounded desktop AI is deferred until desktop <-> compute switching is reliable on real hardware.
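
The consequences above amount to a small policy table: each mode either wants vLLM active or inactive, and only compute wants it active. The sketch below illustrates that under stated assumptions. The mode names and vllm.service come from this ADR; the table `VLLM_ACTIVE_IN_MODE` and the function `plan_vllm_command` are hypothetical, and the function only builds a `systemctl` command line without running it.

```python
from typing import List, Optional

VLLM_UNIT = "vllm.service"

# v1 policy from this ADR: only compute activates vLLM.
VLLM_ACTIVE_IN_MODE = {
    "desktop": False,
    "studio-local": False,
    "compute": True,
}


def plan_vllm_command(mode: str, currently_active: bool) -> Optional[List[str]]:
    """Return the systemctl argv needed to reach the mode's policy, or None.

    Hypothetical helper: it plans the action and leaves execution to the caller.
    """
    if mode not in VLLM_ACTIVE_IN_MODE:
        raise ValueError(f"unknown mode: {mode!r}")
    want_active = VLLM_ACTIVE_IN_MODE[mode]
    if want_active == currently_active:
        return None  # already in the desired state
    verb = "start" if want_active else "stop"
    return ["systemctl", verb, VLLM_UNIT]
```

For example, `plan_vllm_command("desktop", currently_active=True)` yields a stop command, while the same call with `currently_active=False` yields `None`. Adding a bounded desktop profile later would mean changing the table, not the transition logic.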