Memory Phase 2: Governed Structured Memory

Status: planning

Phase 1 prepared the local memory substrate: package, API, Postgres/pgvector schema, Redis support, retrieval events, tests, and an opt-in workstation runbook.

Phase 2 makes memory useful for governed agent workflows without turning the memory service into the governance authority.

Goal

Build a structured, policy-aware memory layer that Anthesis or another orchestrator can govern explicitly.

The Phase 2 target is:

Anthesis decides what memory may be used.
Dubnium stores, retrieves, filters, and records memory events.
vLLM remains the inference runtime.

Non-Goals

Do not implement these in Phase 2:

autonomous self-editing memory
global always-on personal memory injection
durable transformer KV-cache persistence
multi-agent memory federation
Temporal or complex workflow orchestration
MinIO or OCI memory bundles
raw artifact extraction pipelines
public or Tailscale-exposed memory API
running Anthesis itself inside the Dubnium memory service

Boundary

flowchart TD
    A[Anthesis / Orchestrator] --> B[Memory Policy Decision]
    B --> C[Dubnium Memory API]
    C --> D[(Postgres)]
    C --> E[(pgvector)]
    C --> F[(Redis)]
    C --> G[Retrieval Event]
    G --> A
    A --> H[Execution Envelope]
    A --> I[vLLM / Agent Prompt]

Dubnium must expose enough structure for Anthesis to audit and replay memory use, but Dubnium must not silently decide that retrieved memory belongs in a prompt.

Phase 2 Capabilities

1. Memory Namespaces

Add explicit namespace concepts on top of the existing scope field.

Suggested namespace shape:

personal:<name>
project:<repo-or-system>
session:<uuid>
agent:<agent-id>
workflow:<workflow-id>

The existing scope field can remain the primary filter, but Phase 2 should document and validate accepted scope patterns.
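
The accepted scope patterns could be checked with a small parsing helper. This is a minimal sketch, not the project's implementation: `parse_scope` is a hypothetical name, and the characters allowed after the prefix are an assumption. Only the five prefixes come from this plan. The sketch does not check that a `session:` value is a well-formed UUID.

```python
import re

# Prefixes taken from the suggested namespace shapes above.
ALLOWED_PREFIXES = ("personal", "project", "session", "agent", "workflow")

# Assumption: the name after the colon is a non-empty run of
# letters, digits, '.', '_' or '-'. The plan does not specify this.
_SCOPE_RE = re.compile(
    r"(?P<prefix>" + "|".join(ALLOWED_PREFIXES) + r"):(?P<name>[A-Za-z0-9._-]+)"
)


def parse_scope(scope: str) -> tuple[str, str]:
    """Split a scope string into (prefix, name) or raise ValueError."""
    match = _SCOPE_RE.fullmatch(scope)
    if match is None:
        raise ValueError(f"unsupported scope: {scope!r}")
    return match.group("prefix"), match.group("name")
```

Returning the prefix separately lets callers filter by namespace kind without re-parsing the string.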

2. Memory Classes

Keep the current memory types:

working
episodic
semantic

Add operational guidance:

| Type | Meaning | Default retention |
| --- | --- | --- |
| `working` | transient task/session context | short TTL |
| `episodic` | event/session summaries | medium or explicit TTL |
| `semantic` | normalized stable facts/decisions | long-lived but reviewable |

Semantic memory should require stronger provenance and confidence than working memory.

3. Governance Metadata

Each memory row already carries sensitivity, validation_status, source, provenance, and ttl. Phase 2 should standardize expected provenance fields.

Recommended provenance shape:

{
  "origin": "agent|operator|system|import",
  "source_uri": "optional source reference",
  "source_event_id": "optional event id",
  "extractor": "manual|summary-worker|agent",
  "extractor_version": "1",
  "governance": "manual|anthesis|none",
  "envelope_id": "optional Anthesis envelope id"
}
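
A provenance object of this shape could be checked before a row is stored. The sketch below is illustrative only: `provenance_problems` is a hypothetical helper, and treating every field not marked "optional" above as required is an assumption, not something this plan states.

```python
# Allowed values copied from the recommended provenance shape.
ENUM_FIELDS = {
    "origin": {"agent", "operator", "system", "import"},
    "extractor": {"manual", "summary-worker", "agent"},
    "governance": {"manual", "anthesis", "none"},
}
# Fields the shape marks as optional.
OPTIONAL_FIELDS = ("source_uri", "source_event_id", "envelope_id")


def provenance_problems(provenance: dict) -> list[str]:
    """Return human-readable problems; an empty list means the shape is OK."""
    problems = []
    for key, allowed in ENUM_FIELDS.items():
        value = provenance.get(key)
        if value not in allowed:
            problems.append(f"{key}: expected one of {sorted(allowed)}, got {value!r}")
    # Assumption: extractor_version is a required string.
    if not isinstance(provenance.get("extractor_version"), str):
        problems.append("extractor_version: expected a string")
    for key in OPTIONAL_FIELDS:
        if key in provenance and not isinstance(provenance[key], str):
            problems.append(f"{key}: expected a string when present")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to record a full rejection reason for a candidate memory.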

4. Retrieval Policy Contract

Add a policy-facing retrieval request contract:

{
  "query": "string",
  "scope": "project:dubnium",
  "allowed_sensitivity": ["internal"],
  "require_verified": false,
  "limit": 8,
  "purpose": "ask|plan|patch|review|test",
  "requester": {
    "actor_type": "human|agent|system",
    "actor_id": "string"
  },
  "envelope_id": "optional Anthesis envelope id"
}

The existing API can continue accepting the Phase 1 shape, but Phase 2 should add optional fields and preserve backward compatibility.
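
Backward compatibility can be kept by giving the new fields a default of `None`. The following is a sketch under assumptions: the class and function names are hypothetical, the actual models live in `models.py` and may look different, and the defaults for `allowed_sensitivity` and `limit` are borrowed from the example values above rather than from the Phase 1 API.

```python
from dataclasses import dataclass
from typing import Any, Optional

# Purposes listed in the contract above.
PURPOSES = {"ask", "plan", "patch", "review", "test"}


@dataclass(frozen=True)
class Requester:
    actor_type: str
    actor_id: str


@dataclass(frozen=True)
class RetrievalRequest:
    query: str
    scope: str
    allowed_sensitivity: tuple[str, ...] = ("internal",)  # assumed default
    require_verified: bool = False
    limit: int = 8  # assumed default
    # Phase 2 additions: optional so Phase 1 payloads still parse.
    purpose: Optional[str] = None
    requester: Optional[Requester] = None
    envelope_id: Optional[str] = None


def request_from_payload(payload: dict[str, Any]) -> RetrievalRequest:
    """Build a request from a JSON-like dict, tolerating missing Phase 2 fields."""
    purpose = payload.get("purpose")
    if purpose is not None and purpose not in PURPOSES:
        raise ValueError(f"unknown purpose: {purpose!r}")
    raw = payload.get("requester")
    requester = Requester(raw["actor_type"], raw["actor_id"]) if raw else None
    return RetrievalRequest(
        query=payload["query"],
        scope=payload["scope"],
        allowed_sensitivity=tuple(payload.get("allowed_sensitivity", ("internal",))),
        require_verified=bool(payload.get("require_verified", False)),
        limit=int(payload.get("limit", 8)),
        purpose=purpose,
        requester=requester,
        envelope_id=payload.get("envelope_id"),
    )
```

A payload that carries only `query` and `scope` yields a request whose governance fields are all `None`, which is the behavior a Phase 1 client would rely on.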

5. Retrieval Event Completeness

Retrieval events should eventually record:

query
scope
returned memory ids
returned artifact ids
allowed sensitivities
require_verified
requester
purpose
envelope id
timestamp

This is the key replay hook for Anthesis.
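
As an illustration of the fields listed above, a retrieval event record might be assembled like this. `build_retrieval_event` and its key names are hypothetical; the request is treated as a plain dict so the sketch stays independent of the real models.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def build_retrieval_event(
    request: dict[str, Any],
    memory_ids: list[str],
    artifact_ids: list[str],
    now: Optional[datetime] = None,
) -> dict[str, Any]:
    """Collect everything an orchestrator would need to replay a retrieval."""
    # A caller-supplied timestamp keeps the function deterministic in tests.
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    return {
        "query": request["query"],
        "scope": request["scope"],
        "memory_ids": list(memory_ids),
        "artifact_ids": list(artifact_ids),
        "allowed_sensitivity": list(request.get("allowed_sensitivity", [])),
        "require_verified": bool(request.get("require_verified", False)),
        # Governance fields stay None when a Phase 1 client omits them.
        "requester": request.get("requester"),
        "purpose": request.get("purpose"),
        "envelope_id": request.get("envelope_id"),
        "timestamp": timestamp,
    }
```

Recording the filters that were applied (`allowed_sensitivity`, `require_verified`) alongside the returned ids is what makes the event replayable rather than just a log line.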

6. Memory Promotion

Add an explicit promotion workflow:

working -> episodic -> semantic -> repo doc / ADR / runbook

Rules:

working memory can be generated freely inside a session
episodic memory requires summarization and provenance
semantic memory requires confidence, review status, and scope
repo docs remain the highest-authority source for durable project truth
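
The first two promotion steps can be expressed as a precondition check. This is a sketch only: `promotion_blockers` is a hypothetical helper, the field names `type`, `summary`, and `confidence` are assumptions, and "review status" is mapped onto the existing `validation_status` field as a guess. The final step to a repo doc, ADR, or runbook is a human action and is not modeled.

```python
# Promotion order from the workflow above, excluding the repo-doc step.
PROMOTION_ORDER = ("working", "episodic", "semantic")


def promotion_blockers(memory: dict, target: str) -> list[str]:
    """Return reasons a memory cannot be promoted to `target` (empty if it can)."""
    current = memory.get("type")
    if current not in PROMOTION_ORDER or target not in PROMOTION_ORDER:
        return [f"unknown memory type: {current!r} -> {target!r}"]
    # Only single-step promotions are allowed in this sketch.
    if PROMOTION_ORDER.index(target) != PROMOTION_ORDER.index(current) + 1:
        return [f"cannot promote {current} directly to {target}"]

    blockers = []
    if target == "episodic":
        if not memory.get("summary"):
            blockers.append("episodic memory requires a summary")
        if not memory.get("provenance"):
            blockers.append("episodic memory requires provenance")
    if target == "semantic":
        if memory.get("confidence") is None:
            blockers.append("semantic memory requires a confidence value")
        if not memory.get("validation_status"):
            blockers.append("semantic memory requires a review status")
        if not memory.get("scope"):
            blockers.append("semantic memory requires a scope")
    return blockers
```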

7. Memory Rejection

Add a clear rejection path:

candidate memory -> rejected -> never retrieved unless explicitly requested for audit

Rejection reasons should include:

secret-like content
cross-scope contamination
hallucinated or unsupported claim
stale fact
prompt-injection residue
unsupported provenance
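
The rejection path could be modeled with an enumeration of reasons and a retrieval filter. In this sketch the names `RejectionReason` and `retrievable` are hypothetical, and storing a rejection as `validation_status == "rejected"` is an assumption; the plan does not say how rejection is persisted.

```python
from enum import Enum


class RejectionReason(str, Enum):
    """Reasons listed above, as stable string values."""

    SECRET_LIKE_CONTENT = "secret-like content"
    CROSS_SCOPE_CONTAMINATION = "cross-scope contamination"
    UNSUPPORTED_CLAIM = "hallucinated or unsupported claim"
    STALE_FACT = "stale fact"
    PROMPT_INJECTION_RESIDUE = "prompt-injection residue"
    UNSUPPORTED_PROVENANCE = "unsupported provenance"


def retrievable(candidates: list[dict], include_rejected_for_audit: bool = False) -> list[dict]:
    """Drop rejected memories unless the caller explicitly asks for them."""
    if include_rejected_for_audit:
        return list(candidates)
    # Assumption: a rejected row carries validation_status == "rejected".
    return [c for c in candidates if c.get("validation_status") != "rejected"]
```

Making the audit path an explicit opt-in argument keeps the default behavior safe: a caller that forgets the flag never sees rejected rows.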

8. Prompt Assembly Boundary

The memory service should never return a final prompt. It should return candidates and event metadata.

The orchestrator owns:

prompt assembly
context ordering
final redaction
policy enforcement
provider selection
execution envelope capture
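
One way to picture this boundary: the memory service's response type has no prompt field at all, and prompt assembly lives in orchestrator code. The types and the `assemble_prompt` function below are hypothetical illustrations, not the Dubnium or Anthesis API, and the prompt layout is arbitrary.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class MemoryCandidate:
    memory_id: str
    content: str
    sensitivity: str
    validation_status: str


@dataclass(frozen=True)
class RetrievalResponse:
    """What the memory service returns: candidates plus an event reference."""

    retrieval_event_id: str
    candidates: tuple[MemoryCandidate, ...]


def assemble_prompt(question: str, response: RetrievalResponse, max_items: int = 3) -> str:
    """Orchestrator-side: choose, order, and join candidates into a prompt."""
    chosen = response.candidates[:max_items]
    context = "\n".join(f"- {c.content}" for c in chosen)
    return f"Context:\n{context}\n\nQuestion: {question}"
```

Because `RetrievalResponse` carries the event id, the orchestrator can record both the memory ids it used and the retrieval event that produced them in its execution envelope.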

Implementation Tasks

Task 1: Add Governance-Oriented Request Metadata

Files:

pkgs/memory-service/src/dubnium_memory/models.py
pkgs/memory-service/src/dubnium_memory/serialization.py
pkgs/memory-service/tests/test_models.py
pkgs/memory-service/tests/test_api.py

Add optional fields to retrieval requests:

purpose
requester
envelope_id

Keep them optional so Phase 1 clients do not break.

Task 2: Extend Retrieval Events

Files:

pkgs/memory-service/src/dubnium_memory/migrations/003_retrieval_event_metadata.sql
pkgs/memory-service/src/dubnium_memory/postgres.py
pkgs/memory-service/tests/test_migrations.py
pkgs/memory-service/tests/test_postgres.py

Add either nullable metadata columns or a single metadata jsonb column to retrieval_events.

Recommended initial shape:

ALTER TABLE retrieval_events
  ADD COLUMN IF NOT EXISTS metadata jsonb NOT NULL DEFAULT '{}'::jsonb;

This avoids premature schema churn while keeping replay metadata available.

Task 3: Add Scope Validation Helpers

Files:

pkgs/memory-service/src/dubnium_memory/scopes.py
pkgs/memory-service/tests/test_scopes.py

Add validation for scope prefixes:

personal:
project:
session:
agent:
workflow:

Do not enforce globally until existing tests and callers are migrated.
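
Staged enforcement could look like a check that warns by default and raises only when a caller opts in. This is a sketch under assumptions: `check_scope` and its `enforce` flag are hypothetical and are not the planned contents of `scopes.py`.

```python
import warnings

# Prefixes listed above.
SCOPE_PREFIXES = ("personal:", "project:", "session:", "agent:", "workflow:")


def check_scope(scope: str, enforce: bool = False) -> bool:
    """Return True for a recognized scope; warn or raise otherwise."""
    # A bare prefix with nothing after the colon does not count as valid.
    ok = any(scope.startswith(p) and len(scope) > len(p) for p in SCOPE_PREFIXES)
    if ok:
        return True
    message = f"scope {scope!r} does not use a recognized prefix"
    if enforce:
        raise ValueError(message)
    # Warn-only mode lets unmigrated callers keep working while surfacing the issue.
    warnings.warn(message, stacklevel=2)
    return False
```

Once existing tests and callers are migrated, the default for `enforce` could be flipped without changing any call sites.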

Task 4: Add Promotion/Rejection Contract Docs

Files:

docs/specs/memory-governance-contract.md
docs/runbooks/memory-service.md

Document:

memory promotion rules
rejection reasons
semantic memory expectations
Anthesis envelope handoff

Task 5: Add Policy Examples

Files:

docs/examples/memory-policy.project-dubnium.json
docs/examples/memory-retrieval-request.json
docs/examples/memory-retrieval-event.json

These examples should be data contracts, not active enforcement.

Acceptance Criteria

Phase 2 is complete when:

retrieval requests can carry optional governance metadata
retrieval events preserve that metadata for replay
scope conventions are documented and testable
memory promotion/rejection rules are documented
Anthesis can use memory ids and retrieval event ids in execution envelopes
no prompt assembly happens inside the memory service
memory remains opt-in on the workstation host

Risks

| Risk | Mitigation |
| --- | --- |
| Memory poisoning | require scope, provenance, validation status, and retrieval event logging |
| Cross-project leakage | enforce scoped retrieval and explicit sensitivity filters |
| Silent context injection | keep prompt assembly outside memory service |
| Governance coupling | expose metadata; let Anthesis decide policy |
| Schema churn | prefer additive migrations and metadata JSON for early governance fields |
| Stale semantic facts | use confidence, validation status, TTL, and promotion workflow |

Recommended First PR

The first Phase 2 PR should be small:

Add metadata jsonb to retrieval_events
Add optional retrieval request metadata fields
Preserve metadata in retrieval event responses
Add tests
Add governance contract docs

Do not add Anthesis runtime wiring yet.
