Memory
Short-term, working, long-term — each with different governance.
Memory is how delegation becomes experience
Delegation builds trust through experience. Memory is how an agent accumulates that experience. There are three tiers, and each raises a different governance question.
What you've said, what the agent has done, what's been decided. Lost when the session ends.
Active project state. The agent remembers it is halfway through a multi-step task.
Preferences, history, institutional knowledge that persists across sessions.
Short-term memory
The current conversation. Within a single session, the agent remembers what you said three messages ago and what it decided two tool calls ago. This is just context window, but managed — you decide what to keep and what to summarise as the conversation grows.
Governance questions are light: privacy for the user's current turn, and whether the conversation gets logged for auditability.
Working memory
Task context across a multi-step task. An agent researching a topic remembers which sources it has already consulted, which contradictions it has flagged, which questions are still open. This is usually stored outside the model (a JSON blob, a task record) and re-injected each turn.
The governance question becomes: how do we audit what the agent did within a task? Working memory is typically where reasoning traces live, and those are what you review after the fact.
Long-term memory
Persistent state across sessions. The agent knows your role, your preferences, your last project, the decision your team made three weeks ago. This is usually a database or vector store keyed by user or project.
Long-term memory is where governance gets hard.
- What should it remember? A customer's preference for formal language — yes. Their home address — only with consent.
- What should it forget? Outdated decisions, stale preferences, data that was deleted under a right-to-erasure request.
- Who controls it? Memory that the agent writes to itself is a black box. Memory that the user explicitly curates is accountable but slow.
Practical patterns
- Session-scoped: keep short-term memory in the request. No storage, no governance burden.
- Task-scoped: store working memory in a task record, wipe on completion. Audit the trace, not the persistent store.
- User-scoped: store long-term memory keyed by user, with explicit read/write/delete API surfaces, exposed to the user (and to auditors).
Start with the least amount of memory that works. Add tiers as the job demands, not as "it would be nice if the agent remembered".