ADR-0008: Seed Local vLLM Model Bundles

Status: accepted

Context

Dubnium’s first compute workload uses vLLM with a locally served model bundle. The exact model is host configuration, not part of the USB seed format.

Model weights are large mutable runtime artifacts. Keeping them in Git would inflate the repository and blur source policy with runtime state. Keeping them in the Nix store would make first install, rebuild, and recovery depend on large model fetches during system activation and would couple model bytes to immutable system generations.

Fresh install and recovery should work even when the machine does not yet have reliable network access. The seed format should not depend on Hugging Face hub cache internals such as refs, blobs, snapshots, or symlinks.

Decision

Keep model weights out of Git and out of the Nix store.

Treat /var/lib/dubnium/models as the Dubnium-owned runtime model store. For v1, the preferred provisioning path is to seed ordinary local model bundle directories from removable media.

Use a materialized bundle directory for the selected compute model. The workstation vLLM service serves a path under:

/var/lib/dubnium/models
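
A minimal sketch of how a configured bundle name could be resolved to a path that is guaranteed to stay inside the model store. The function name `bundle_path` and the bundle name used in the usage note are hypothetical; only the store root comes from this ADR.

```python
from pathlib import Path

# Store root taken from this ADR.
MODEL_STORE = Path("/var/lib/dubnium/models")


def bundle_path(bundle_name: str, store: Path = MODEL_STORE) -> Path:
    """Resolve a bundle name to a directory under the model store.

    Hypothetical helper: raises ValueError if the resolved path
    would fall outside the store (for example via "..").
    """
    root = store.resolve()
    candidate = (root / bundle_name).resolve()
    if candidate == root or not candidate.is_relative_to(root):
        raise ValueError(f"bundle path escapes model store: {bundle_name!r}")
    return candidate
```

For example, `bundle_path("example-bundle")` yields a directory directly under the store, while `bundle_path("../etc")` raises `ValueError`. The resulting path is what the service configuration would hand to vLLM as its local model path.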

If a Hugging Face cache is the source of the seed, materialize the snapshot once, copying it so that its symlinks become regular files, before putting it on the USB seed media. The runtime seed and the installed model store should both be ordinary directories containing the model files and a SHA256SUMS file.
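
The materialization step could be sketched as follows. This is an illustration under stated assumptions, not Dubnium's actual tooling: the function names are hypothetical, and the SHA256SUMS layout is assumed to be the `sha256sum` text format (`<hex digest>`, two spaces, relative path), which this ADR does not specify.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weight files are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def materialize_bundle(snapshot_dir: Path, bundle_dir: Path) -> Path:
    """Copy a snapshot into an ordinary directory and write SHA256SUMS.

    With symlinks=False, copytree copies the contents of linked files,
    so the result no longer points into the cache's blob storage.
    bundle_dir must not exist yet.
    """
    shutil.copytree(snapshot_dir, bundle_dir, symlinks=False)
    # Collect files before SHA256SUMS exists so it does not list itself.
    files = sorted(p for p in bundle_dir.rglob("*") if p.is_file())
    lines = [
        f"{sha256_of(p)}  {p.relative_to(bundle_dir).as_posix()}\n"
        for p in files
    ]
    sums = bundle_dir / "SHA256SUMS"
    sums.write_text("".join(lines), encoding="utf-8")
    return sums
```

A checksum file in this assumed format can also be checked on the target host with `sha256sum -c SHA256SUMS` from inside the bundle directory.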

Consequences

  • The Dubnium repository stays small and source-only.
  • Nix continues to own service policy and runtime configuration, not model artifact storage.
  • Fresh install and recovery can avoid depending on a large network download.
  • Runtime no longer depends on Hugging Face cache layout or symlink behavior.
  • Operators must manage the seed media and verify the local bundle before entering compute mode.
  • Reproducibility of model bytes depends on the seed contents until a specific model revision is selected and recorded.
  • vLLM startup failures may indicate an absent, incomplete, misplaced, or revision-mismatched local model bundle.
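
The operator-side verification mentioned above could look like the sketch below. It assumes SHA256SUMS uses the `sha256sum` text format (`<hex digest>`, whitespace, relative path); the function name `verify_bundle` and the wording of the reported problems are hypothetical.

```python
import hashlib
from pathlib import Path


def verify_bundle(bundle_dir: Path) -> list[str]:
    """Return a list of problems; an empty list means the bundle matches SHA256SUMS."""
    sums_file = bundle_dir / "SHA256SUMS"
    if not sums_file.is_file():
        return ["SHA256SUMS is missing"]
    problems = []
    for line in sums_file.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            problems.append(f"malformed line: {line!r}")
            continue
        expected, name = parts
        target = bundle_dir / name
        if not target.is_file():
            problems.append(f"missing: {name}")
            continue
        # Hash in chunks so large weight files are not loaded into memory.
        digest = hashlib.sha256()
        with target.open("rb") as fh:
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                digest.update(chunk)
        if digest.hexdigest() != expected:
            problems.append(f"checksum mismatch: {name}")
    return problems
```

Running a check like this before entering compute mode separates an absent or incomplete bundle from other vLLM startup failures. It does not detect a revision mismatch on its own, since a bundle seeded from the wrong revision still matches its own checksums.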

Escalation Criteria

Reconsider this policy if:

  • model revision pinning becomes mandatory for reproducible evaluation
  • a dedicated artifact mirror or cache service becomes available
  • install-time network access becomes reliable enough to remove the USB seed path
  • model storage needs to support multiple served models, quantized variants, or per-mode model selection