ADR-0008: Seed Local vLLM Model Bundles
Status: accepted
Context
Dubnium’s first compute workload uses vLLM with a locally served model bundle. The exact model is host configuration, not part of the USB seed format.
Model weights are large, mutable runtime artifacts. Keeping them in Git would inflate the repository and blur the line between source policy and runtime state. Keeping them in the Nix store would make first install, rebuild, and recovery depend on large model fetches during system activation, and would couple model bytes to immutable system generations.
Fresh install and recovery should work even when the machine does not yet have
reliable network access. The seed format should not depend on Hugging Face hub
cache internals such as refs, blobs, snapshots, or symlinks.
Decision
Keep model weights out of Git and out of the Nix store.
Treat /var/lib/dubnium/models as the Dubnium-owned runtime model store. Seed
normal local model bundle directories from removable media as the preferred v1
provisioning path.
Use a materialized bundle directory for the selected compute model. The workstation vLLM service serves a path under:
/var/lib/dubnium/models
If a Hugging Face cache is used as the source of the seed, materialize the
snapshot once before putting it on the USB: copy it so that symlinks become
regular files. The runtime seed and installed model store should be ordinary
directories containing the model files and a SHA256SUMS manifest.
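
The materialization step can be sketched as follows. This is a minimal illustration, not the project's tooling: the function names `materialize_bundle` and `sha256_of` are hypothetical, and the manifest is assumed to use the text-mode line format of `sha256sum` (hex digest, two spaces, relative path).

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weight files are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def materialize_bundle(snapshot_dir: Path, bundle_dir: Path) -> Path:
    """Copy a snapshot into an ordinary directory and write SHA256SUMS (hypothetical helper)."""
    # symlinks=False makes copytree follow links and copy the files they point to,
    # so the result contains regular files only. copytree raises if bundle_dir exists.
    shutil.copytree(snapshot_dir, bundle_dir, symlinks=False)

    # Collect the file list before the manifest exists so it does not list itself.
    files = sorted(p for p in bundle_dir.rglob("*") if p.is_file())
    lines = [f"{sha256_of(p)}  {p.relative_to(bundle_dir).as_posix()}\n" for p in files]

    manifest = bundle_dir / "SHA256SUMS"
    manifest.write_text("".join(lines), encoding="utf-8")
    return manifest
```

Run once on the machine that holds the cache, this yields a directory that can be copied to the USB seed as-is and later checked with `sha256sum -c SHA256SUMS` from inside the bundle directory.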
Consequences
- The Dubnium repository stays small and source-only.
- Nix continues to own service policy and runtime configuration, not model artifact storage.
- Fresh install and recovery can avoid depending on a large network download.
- The runtime no longer depends on Hugging Face cache layout or symlink behavior.
- Operators must manage the seed media and verify the local bundle before entering compute mode.
- Reproducibility of model bytes depends on the seed contents until a specific model revision is selected and recorded.
- vLLM startup failures may indicate an absent, incomplete, misplaced, or revision-mismatched local model bundle.
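
The operator-side bundle check mentioned above could look roughly like this. It is a sketch under assumptions: `verify_bundle` is a hypothetical name, and the manifest is assumed to use text-mode `sha256sum` lines (hex digest, two spaces, relative path). It does not check the model revision, which this ADR leaves unrecorded for now.

```python
import hashlib
from pathlib import Path


def verify_bundle(bundle_dir: Path) -> list[str]:
    """Return a list of problems found in a bundle; an empty list means it verified."""
    manifest = bundle_dir / "SHA256SUMS"
    if not manifest.is_file():
        return ["SHA256SUMS is missing"]

    problems = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # Assumed line format: "<hex digest>  <relative path>"
        expected, _, name = line.partition("  ")
        if not name:
            problems.append(f"malformed manifest line: {line!r}")
            continue
        target = bundle_dir / name
        if not target.is_file():
            problems.append(f"missing: {name}")
            continue
        digest = hashlib.sha256()
        with target.open("rb") as handle:
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        if digest.hexdigest() != expected:
            problems.append(f"checksum mismatch: {name}")
    return problems
```

Calling it on a bundle directory under /var/lib/dubnium/models before entering compute mode separates absent or incomplete bundles from other vLLM startup failures.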
Escalation Criteria
Reconsider this policy if:
- model revision pinning becomes mandatory for reproducible evaluation
- a dedicated artifact mirror or cache service becomes available
- install-time network access becomes reliable enough to remove the USB seed path
- model storage needs to support multiple served models, quantized variants, or per-mode model selection