Memory Phase 2: Governed Structured Memory
Status: planning
Phase 1 prepared the local memory substrate: package, API, Postgres/pgvector schema, Redis support, retrieval events, tests, and an opt-in workstation runbook.
Phase 2 makes memory useful for governed agent workflows without turning the memory service into the governance authority.
Goal
Build a structured, policy-aware memory layer that Anthesis or another orchestrator can govern explicitly.
The Phase 2 target is:
Anthesis decides what memory may be used.
Dubnium stores, retrieves, filters, and records memory events.
vLLM remains the inference runtime.
Non-Goals
Do not implement these in Phase 2:
- autonomous self-editing memory
- global always-on personal memory injection
- durable transformer KV-cache persistence
- multi-agent memory federation
- Temporal or complex workflow orchestration
- MinIO or OCI memory bundles
- raw artifact extraction pipelines
- public or Tailscale-exposed memory API
- Anthesis itself inside the Dubnium memory service
Boundary
flowchart TD
A[Anthesis / Orchestrator] --> B[Memory Policy Decision]
B --> C[Dubnium Memory API]
C --> D[(Postgres)]
C --> E[(pgvector)]
C --> F[(Redis)]
C --> G[Retrieval Event]
G --> A
A --> H[Execution Envelope]
A --> I[vLLM / Agent Prompt]
Dubnium must expose enough structure for Anthesis to audit and replay memory use, but Dubnium must not silently decide that retrieved memory belongs in a prompt.
Phase 2 Capabilities
1. Memory Namespaces
Add explicit namespace concepts on top of the existing scope field.
Suggested namespace shape:
personal:<name>
project:<repo-or-system>
session:<uuid>
agent:<agent-id>
workflow:<workflow-id>
The existing scope field can remain the primary filter, but Phase 2 should document and validate accepted scope patterns.
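A minimal sketch of what validating these scope patterns could look like. The helper names (`parse_scope`, `is_valid_scope`) and the identifier rules are assumptions for illustration; the plan only fixes the five prefixes and that a session is identified by a UUID.

```python
import re

# Prefixes come from the suggested namespace shape above.
SCOPE_PREFIXES = ("personal", "project", "session", "agent", "workflow")

# Assumption: session ids are canonical hyphenated UUIDs; every other
# identifier is a simple slug. The plan does not pin down these rules.
_UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)
_SLUG_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def parse_scope(scope: str) -> tuple[str, str]:
    """Split 'prefix:identifier' and validate both halves (hypothetical helper)."""
    prefix, sep, ident = scope.partition(":")
    if not sep or prefix not in SCOPE_PREFIXES:
        raise ValueError(f"unknown scope prefix in {scope!r}")
    pattern = _UUID_RE if prefix == "session" else _SLUG_RE
    if not pattern.fullmatch(ident):
        raise ValueError(f"invalid identifier in {scope!r}")
    return prefix, ident


def is_valid_scope(scope: str) -> bool:
    """Non-raising variant, usable for warn-only checks before enforcement."""
    try:
        parse_scope(scope)
    except ValueError:
        return False
    return True
```

A non-raising check like `is_valid_scope` lets callers log unexpected scopes first and reject them only once existing data has been migrated.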
2. Memory Classes
Keep the current memory types:
- working
- episodic
- semantic
Add operational guidance:
| Type | Meaning | Default retention |
|---|---|---|
| working | transient task/session context | short TTL |
| episodic | event/session summaries | medium or explicit TTL |
| semantic | normalized stable facts/decisions | long-lived but reviewable |
Semantic memory should require stronger provenance and confidence than working memory.
3. Governance Metadata
Each memory row already carries sensitivity, validation_status, source, provenance, and ttl. Phase 2 should standardize expected provenance fields.
Recommended provenance shape:
{
"origin": "agent|operator|system|import",
"source_uri": "optional source reference",
"source_event_id": "optional event id",
"extractor": "manual|summary-worker|agent",
"extractor_version": "1",
"governance": "manual|anthesis|none",
"envelope_id": "optional Anthesis envelope id"
}
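One way to check a provenance object against this shape is sketched below. Treating `origin`, `extractor`, and `governance` as required is an assumption, and `provenance_problems` is a hypothetical helper name; the shape above only lists the fields and their allowed values.

```python
from typing import Any

# Allowed values mirror the recommended provenance shape.
ENUM_FIELDS = {
    "origin": {"agent", "operator", "system", "import"},
    "extractor": {"manual", "summary-worker", "agent"},
    "governance": {"manual", "anthesis", "none"},
}
# Assumption: these fields may be absent or null, but must be strings if set.
OPTIONAL_STRINGS = ("source_uri", "source_event_id", "extractor_version", "envelope_id")


def provenance_problems(provenance: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means the shape is acceptable."""
    problems = []
    for name, allowed in ENUM_FIELDS.items():
        value = provenance.get(name)
        if not isinstance(value, str) or value not in allowed:
            problems.append(f"{name}: expected one of {sorted(allowed)}, got {value!r}")
    for name in OPTIONAL_STRINGS:
        value = provenance.get(name)
        if value is not None and not isinstance(value, str):
            problems.append(f"{name}: expected a string or null")
    unknown = set(provenance) - set(ENUM_FIELDS) - set(OPTIONAL_STRINGS)
    problems.extend(f"{name}: unknown field" for name in sorted(unknown))
    return problems
```

Returning a list of problems instead of raising keeps the check usable both for hard validation and for the audit-only reporting that an early rollout might prefer.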
4. Retrieval Policy Contract
Add a policy-facing retrieval request contract:
{
"query": "string",
"scope": "project:dubnium",
"allowed_sensitivity": ["internal"],
"require_verified": false,
"limit": 8,
"purpose": "ask|plan|patch|review|test",
"requester": {
"actor_type": "human|agent|system",
"actor_id": "string"
},
"envelope_id": "optional Anthesis envelope id"
}
The existing API can continue accepting the Phase 1 shape, but Phase 2 should add optional fields and preserve backward compatibility.
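The backward-compatibility requirement can be sketched with standard-library dataclasses. This is an illustration only: the real models in models.py may use a different library, and the default values shown here are assumptions borrowed from the example request, not confirmed Phase 1 defaults.

```python
from dataclasses import dataclass
from typing import Any, Optional

PURPOSES = {"ask", "plan", "patch", "review", "test"}
ACTOR_TYPES = {"human", "agent", "system"}


@dataclass(frozen=True)
class Requester:
    actor_type: str
    actor_id: str


@dataclass(frozen=True)
class RetrievalRequest:
    # Fields assumed to be part of the Phase 1 shape.
    query: str
    scope: str
    allowed_sensitivity: tuple[str, ...] = ("internal",)
    require_verified: bool = False
    limit: int = 8
    # Phase 2 additions: all optional, so older payloads still parse.
    purpose: Optional[str] = None
    requester: Optional[Requester] = None
    envelope_id: Optional[str] = None


def parse_retrieval_request(payload: dict[str, Any]) -> RetrievalRequest:
    """Build a request from a JSON-like dict, validating only fields that are present."""
    purpose = payload.get("purpose")
    if purpose is not None and purpose not in PURPOSES:
        raise ValueError(f"unsupported purpose: {purpose!r}")
    requester = None
    raw = payload.get("requester")
    if raw is not None:
        if raw.get("actor_type") not in ACTOR_TYPES or not raw.get("actor_id"):
            raise ValueError("requester needs a valid actor_type and an actor_id")
        requester = Requester(raw["actor_type"], str(raw["actor_id"]))
    return RetrievalRequest(
        query=payload["query"],
        scope=payload["scope"],
        allowed_sensitivity=tuple(payload.get("allowed_sensitivity", ("internal",))),
        require_verified=bool(payload.get("require_verified", False)),
        limit=int(payload.get("limit", 8)),
        purpose=purpose,
        requester=requester,
        envelope_id=payload.get("envelope_id"),
    )
```

A payload containing only `query` and `scope` parses to a request whose three governance fields are `None`, which is the behavior a Phase 1 client would rely on.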
5. Retrieval Event Completeness
Retrieval events should eventually record:
- query
- scope
- returned memory ids
- returned artifact ids
- allowed sensitivities
- require_verified
- requester
- purpose
- envelope id
- timestamp
This is the key replay hook for Anthesis.
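A retrieval event carrying every field in the list above might be assembled as follows. `build_retrieval_event` and the `event_id` key are hypothetical; the actual event schema lives in the Phase 1 code and may differ.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Any, Optional, Sequence


def build_retrieval_event(
    *,
    query: str,
    scope: str,
    memory_ids: Sequence[str],
    artifact_ids: Sequence[str],
    allowed_sensitivity: Sequence[str],
    require_verified: bool,
    requester: Optional[dict[str, str]] = None,
    purpose: Optional[str] = None,
    envelope_id: Optional[str] = None,
    now: Optional[datetime] = None,
) -> dict[str, Any]:
    """Assemble a JSON-serializable record of one retrieval (hypothetical helper)."""
    timestamp = now if now is not None else datetime.now(timezone.utc)
    return {
        "event_id": str(uuid.uuid4()),  # assumed id format
        "query": query,
        "scope": scope,
        "memory_ids": list(memory_ids),
        "artifact_ids": list(artifact_ids),
        "allowed_sensitivity": list(allowed_sensitivity),
        "require_verified": require_verified,
        "requester": requester,
        "purpose": purpose,
        "envelope_id": envelope_id,
        "timestamp": timestamp.isoformat(),
    }


event = build_retrieval_event(
    query="how are scopes validated?",
    scope="project:dubnium",
    memory_ids=["mem-1", "mem-2"],
    artifact_ids=[],
    allowed_sensitivity=["internal"],
    require_verified=False,
    requester={"actor_type": "agent", "actor_id": "planner"},
    purpose="plan",
    envelope_id="env-123",
)
print(json.dumps(event, indent=2))
```

Recording the filters (`allowed_sensitivity`, `require_verified`) alongside the returned ids is what lets an orchestrator later re-run the same query under the same constraints and compare results.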
6. Memory Promotion
Add an explicit promotion workflow:
working -> episodic -> semantic -> repo doc / ADR / runbook
Rules:
- working memory can be generated freely inside a session
- episodic memory requires summarization and provenance
- semantic memory requires confidence, review status, and scope
- repo docs remain the highest-authority source for durable project truth
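The promotion rules above can be read as a small gate in front of each transition. The sketch below is one possible encoding: the `Memory` dataclass, its `summarized` flag, and the function names are assumptions for illustration, and "review status" is mapped onto the existing validation_status field.

```python
from dataclasses import dataclass, field, replace
from typing import Optional

PROMOTION_ORDER = ("working", "episodic", "semantic")


@dataclass(frozen=True)
class Memory:
    memory_type: str
    content: str
    scope: Optional[str] = None
    provenance: dict = field(default_factory=dict)
    confidence: Optional[float] = None
    validation_status: Optional[str] = None
    summarized: bool = False  # hypothetical flag set by a summarization step


def promotion_blockers(memory: Memory) -> list[str]:
    """Reasons the memory cannot move to the next type; empty means allowed."""
    if memory.memory_type == "working":
        blockers = []
        if not memory.summarized:
            blockers.append("episodic memory requires summarization")
        if not memory.provenance:
            blockers.append("episodic memory requires provenance")
        return blockers
    if memory.memory_type == "episodic":
        blockers = []
        if memory.confidence is None:
            blockers.append("semantic memory requires confidence")
        if not memory.validation_status:
            blockers.append("semantic memory requires review status")
        if not memory.scope:
            blockers.append("semantic memory requires scope")
        return blockers
    if memory.memory_type == "semantic":
        # The next step is a repo doc, ADR, or runbook, which lives outside the store.
        return ["semantic memory has no further in-store promotion target"]
    return [f"unknown memory type: {memory.memory_type!r}"]


def promote(memory: Memory) -> Memory:
    """Return a copy at the next memory type, or raise listing what is missing."""
    blockers = promotion_blockers(memory)
    if blockers:
        raise ValueError("; ".join(blockers))
    next_type = PROMOTION_ORDER[PROMOTION_ORDER.index(memory.memory_type) + 1]
    return replace(memory, memory_type=next_type)
```

Listing every blocker at once, instead of failing on the first, gives a reviewer the full set of missing fields in one pass.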
7. Memory Rejection
Add a clear rejection path:
candidate memory -> rejected -> never retrieved unless explicitly requested for audit
Rejection reasons should include:
- secret-like content
- cross-scope contamination
- hallucinated or unsupported claim
- stale fact
- prompt-injection residue
- unsupported provenance
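A sketch of the rejection path, using the reasons listed above. Storing the outcome as a `"rejected"` validation_status plus a `rejection_reason` field is an assumption, as are the helper names; the point is that rejected rows are kept but hidden from normal retrieval.

```python
from enum import Enum
from typing import Any, Iterable


class RejectionReason(str, Enum):
    SECRET_LIKE_CONTENT = "secret-like content"
    CROSS_SCOPE_CONTAMINATION = "cross-scope contamination"
    UNSUPPORTED_CLAIM = "hallucinated or unsupported claim"
    STALE_FACT = "stale fact"
    PROMPT_INJECTION_RESIDUE = "prompt-injection residue"
    UNSUPPORTED_PROVENANCE = "unsupported provenance"


def reject(memory: dict[str, Any], reason: RejectionReason) -> dict[str, Any]:
    """Return a copy marked rejected; the row is kept so audits can still see it."""
    return {**memory, "validation_status": "rejected", "rejection_reason": reason.value}


def retrievable(memories: Iterable[dict[str, Any]], *, audit: bool = False) -> list[dict[str, Any]]:
    """Hide rejected rows unless the caller explicitly asks for an audit view."""
    return [m for m in memories if audit or m.get("validation_status") != "rejected"]
```

Making the audit view an explicit keyword argument means a caller cannot receive rejected memory by accident.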
8. Prompt Assembly Boundary
The memory service should never return a final prompt. It should return candidates and event metadata.
The orchestrator owns:
- prompt assembly
- context ordering
- final redaction
- policy enforcement
- provider selection
- execution envelope capture
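To make the boundary concrete, the sketch below separates what the memory service might return from what an orchestrator does with it. All names (`Candidate`, `RetrievalResult`, `assemble_prompt`), the score-based ordering, and the prompt layout are illustrative assumptions, not part of the Dubnium or Anthesis APIs.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Candidate:
    memory_id: str
    content: str
    sensitivity: str
    score: float


@dataclass(frozen=True)
class RetrievalResult:
    """Memory-service side: candidates plus an event id, never a prompt."""
    event_id: str
    candidates: tuple[Candidate, ...]


def assemble_prompt(
    question: str,
    result: RetrievalResult,
    *,
    allowed_sensitivity: set[str],
    max_items: int = 3,
) -> tuple[str, dict[str, Any]]:
    """Orchestrator side: filter, order, and assemble, then record what was used."""
    permitted = [c for c in result.candidates if c.sensitivity in allowed_sensitivity]
    selected = sorted(permitted, key=lambda c: c.score, reverse=True)[:max_items]
    context = "\n".join(f"- {c.content}" for c in selected)
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    envelope = {
        "retrieval_event_id": result.event_id,
        "memory_ids": [c.memory_id for c in selected],
    }
    return prompt, envelope
```

The envelope records only the memory ids that actually reached the prompt, so a replay can distinguish "retrieved" from "used".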
Implementation Tasks
Task 1: Add Governance-Oriented Request Metadata
Files:
- pkgs/memory-service/src/dubnium_memory/models.py
- pkgs/memory-service/src/dubnium_memory/serialization.py
- pkgs/memory-service/tests/test_models.py
- pkgs/memory-service/tests/test_api.py
Add optional fields to retrieval requests:
- purpose
- requester
- envelope_id
Keep them optional so Phase 1 clients do not break.
Task 2: Extend Retrieval Events
Files:
- pkgs/memory-service/src/dubnium_memory/migrations/003_retrieval_event_metadata.sql
- pkgs/memory-service/src/dubnium_memory/postgres.py
- pkgs/memory-service/tests/test_migrations.py
- pkgs/memory-service/tests/test_postgres.py
Add nullable metadata columns or a metadata jsonb field to retrieval_events.
Recommended initial shape:
ALTER TABLE retrieval_events
ADD COLUMN IF NOT EXISTS metadata jsonb NOT NULL DEFAULT '{}'::jsonb;
This avoids premature schema churn while keeping replay metadata available.
Task 3: Add Scope Validation Helpers
Files:
- pkgs/memory-service/src/dubnium_memory/scopes.py
- pkgs/memory-service/tests/test_scopes.py
Add validation for scope prefixes:
- personal:
- project:
- session:
- agent:
- workflow:
Do not enforce globally until existing tests and callers are migrated.
Task 4: Add Promotion/Rejection Contract Docs
Files:
- docs/specs/memory-governance-contract.md
- docs/runbooks/memory-service.md
Document:
- memory promotion rules
- rejection reasons
- semantic memory expectations
- Anthesis envelope handoff
Task 5: Add Policy Examples
Files:
- docs/examples/memory-policy.project-dubnium.json
- docs/examples/memory-retrieval-request.json
- docs/examples/memory-retrieval-event.json
These examples should be data contracts, not active enforcement.
Acceptance Criteria
Phase 2 is complete when:
- retrieval requests can carry optional governance metadata
- retrieval events preserve that metadata for replay
- scope conventions are documented and testable
- memory promotion/rejection rules are documented
- Anthesis can use memory ids and retrieval event ids in execution envelopes
- no prompt assembly happens inside the memory service
- memory remains opt-in on the workstation host
Risks
| Risk | Mitigation |
|---|---|
| Memory poisoning | require scope, provenance, validation status, and retrieval event logging |
| Cross-project leakage | enforce scoped retrieval and explicit sensitivity filters |
| Silent context injection | keep prompt assembly outside memory service |
| Governance coupling | expose metadata; let Anthesis decide policy |
| Schema churn | prefer additive migrations and metadata JSON for early governance fields |
| Stale semantic facts | use confidence, validation status, TTL, and promotion workflow |
Recommended First PR
The first Phase 2 PR should be small:
- Add metadata jsonb to retrieval_events
- Add optional retrieval request metadata fields
- Preserve metadata in retrieval event responses
- Add tests
- Add governance contract docs
Do not add Anthesis runtime wiring yet.