Memory¶
Purpose¶
Enable the agent to “remember facts across sessions” while preventing the conversation context from growing without bound. Harness splits memory into two layers: high-frequency, lightly curated “daily logs” and low-frequency, heavily curated “long-term memory”, supplemented by FTS5 full-text search and background maintenance.
Trigger Points¶
| When | Action |
|---|---|
| Before reasoning ( | |
| End of | |
| Context overflow ( | |
| Oversized tool result ( | |
| Background schedule | |
Key Logic¶
Two-Layer Memory Model¶
graph LR
Conv[conversation messages] -->|over threshold| Compactor[ConversationCompactor]
Compactor -->|offload| Sess[sessions/<id>.log.jsonl]
Compactor -->|flushMemories| Flush[MemoryFlushManager]
Flush -->|append + index| Daily[memory/YYYY-MM-DD.md]
Daily -. background processing .-> Cons[MemoryConsolidator]
MEM[MEMORY.md as context] -->|read for deduplication| Cons
Cons -->|rewrite| MEM
MEM -->|injected each reasoning turn| Hook[WorkspaceContextHook]
Daily -.not injected directly.- Hook
Daily --> Idx[(MemoryIndex<br/>SQLite FTS5)]
MEM --> Idx
- Layer 1, daily log (memory/YYYY-MM-DD.md): owned by MemoryFlushManager, append-only, no deduplication; a raw record of “what was just being discussed”.
- Layer 2, curated long-term memory (MEMORY.md): owned by MemoryConsolidator, complete rewrite; MemoryFlushManager never touches it. Injected into the system prompt by WorkspaceContextHook on every reasoning turn.
- Index (MemoryIndex): fully indexed at startup with indexAllFromWorkspace; incrementally rebuilt for today’s file after each flush; SQLite file at <workspace_parent>/memory_index.db.
Conversation Compaction (ConversationCompactor)¶
check thresholds → find cutoff (don't split ASSISTANT/TOOL pairs)
→ (optional) flushMemories(prefix)
→ (optional) offloadMessages(messages → sessions/.../<id>.jsonl)
→ LLM distills summary
→ [summaryUserMsg] + tail returned to hook to reload memory
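The “find cutoff” step can be illustrated with a minimal sketch. It assumes a simplified message model in which each message is reduced to its role; `CutoffSketch`, `Role`, and `findCutoff` are hypothetical names used for illustration only, not Harness APIs, and the real cutoff logic may differ.

```java
import java.util.List;

/** Hypothetical sketch: choose a cutoff that never separates a tool result from its call. */
final class CutoffSketch {
    enum Role { USER, ASSISTANT, TOOL }

    /**
     * Returns the index of the first message to keep. Messages before the index
     * would be summarized; messages from the index onward form the retained tail.
     */
    static int findCutoff(List<Role> roles, int keepMessages) {
        int cutoff = Math.max(0, Math.min(roles.size(), roles.size() - keepMessages));
        // A TOOL message answers the ASSISTANT message before it, so a tail that
        // starts with TOOL would be orphaned. Move the cutoff back until it is not.
        while (cutoff > 0 && cutoff < roles.size() && roles.get(cutoff) == Role.TOOL) {
            cutoff--;
        }
        return cutoff;
    }

    public static void main(String[] args) {
        List<Role> roles = List.of(
                Role.USER, Role.ASSISTANT, Role.TOOL, Role.ASSISTANT, Role.USER);
        // Keeping 3 messages would start the tail at the TOOL message (index 2),
        // so the cutoff moves back to the ASSISTANT message at index 1.
        System.out.println(findCutoff(roles, 3)); // prints 1
    }
}
```

In this sketch the tail can end up slightly longer than keepMessages, which is the price of keeping each tool call together with its result.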
Default values (all configurable):
| Parameter | Default | Description |
|---|---|---|
| | | Trigger by message count ( |
| | | Trigger by estimated token count ( |
| | | Number of tail messages to retain |
| | | Non-zero: scan from back by token budget, overrides |
| | | Extract memory to daily log before compacting |
| | | Append raw messages to session |
| | Built-in template | Four-section format: SESSION INTENT / SUMMARY / ARTIFACTS / NEXT STEPS |
CompactionConfig.builder()
.triggerMessages(30)
.keepMessages(10)
.build(); // flush/offload both default to true
TruncateArgsConfig — Lightweight Pre-processing (Optional)¶
Before LLM summarization, an optional pre-pass that makes no LLM call can truncate ToolUseBlock arguments in older messages (default threshold: 25 messages / 40k tokens; arguments exceeding 2000 characters are trimmed). This is useful for tools such as write_file, whose large argument bodies are not needed later.
CompactionConfig.builder()
.triggerMessages(80)
.truncateArgs(TruncateArgsConfig.builder().build())
.build();
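The idea behind the pre-pass can be sketched as follows. This is an illustration under stated assumptions, not the Harness implementation: arguments are modeled as plain strings, and `TruncateArgsSketch`, `truncate`, `truncateOlder`, and the marker text are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: trim oversized tool-call arguments in older messages without an LLM. */
final class TruncateArgsSketch {
    /** Keeps the first maxChars characters and appends a marker describing what was dropped. */
    static String truncate(String argument, int maxChars) {
        if (argument.length() <= maxChars) {
            return argument;
        }
        int dropped = argument.length() - maxChars;
        return argument.substring(0, maxChars) + "... [truncated " + dropped + " chars]";
    }

    /** Truncates every argument except the most recent keepRecent entries. */
    static List<String> truncateOlder(List<String> arguments, int keepRecent, int maxChars) {
        int firstKept = Math.max(0, arguments.size() - keepRecent);
        List<String> out = new ArrayList<>(arguments.size());
        for (int i = 0; i < arguments.size(); i++) {
            String arg = arguments.get(i);
            out.add(i < firstKept ? truncate(arg, maxChars) : arg);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> arguments = List.of("x".repeat(5000), "short", "y".repeat(5000));
        // Only the entries before the most recent one are candidates for trimming.
        for (String arg : truncateOlder(arguments, 1, 2000)) {
            System.out.println(arg.length());
        }
    }
}
```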
Automatic Context Overflow Recovery¶
When the model returns a context_length_exceeded or “maximum context”-style error, HarnessAgent.recoverFromOverflow → forceCompactAndRetry builds a temporary CompactionConfig with triggerMessages=1, runs one compaction round, clears Memory, and retries. Prerequisite: compaction(...) must be configured; otherwise the error is rethrown directly.
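The control flow of this recovery can be sketched in isolation. The sketch assumes the overflow is recognized by matching the error message text, and it stands in for the model call and the forced compaction with a `Supplier` and a `Runnable`; `OverflowRetrySketch`, `isContextOverflow`, and `callWithRecovery` are hypothetical names, not Harness APIs.

```java
import java.util.Locale;
import java.util.function.Supplier;

/** Hypothetical sketch: compact once and retry when the model reports a context overflow. */
final class OverflowRetrySketch {
    /** Assumption: overflow errors are recognized by their message text. */
    static boolean isContextOverflow(RuntimeException e) {
        String message = e.getMessage() == null ? "" : e.getMessage().toLowerCase(Locale.ROOT);
        return message.contains("context_length_exceeded") || message.contains("maximum context");
    }

    static <T> T callWithRecovery(Supplier<T> modelCall, Runnable forceCompact,
                                  boolean compactionConfigured) {
        try {
            return modelCall.get();
        } catch (RuntimeException e) {
            // Without a compaction config, or for unrelated errors, rethrow unchanged.
            if (!compactionConfigured || !isContextOverflow(e)) {
                throw e;
            }
            forceCompact.run();     // one forced compaction round
            return modelCall.get(); // single retry; a second failure propagates
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String reply = callWithRecovery(() -> {
            if (calls[0]++ == 0) {
                throw new IllegalStateException("context_length_exceeded");
            }
            return "ok";
        }, () -> System.out.println("compacting"), true);
        System.out.println(reply);
    }
}
```

Retrying only once keeps a persistent overflow from looping: if the compacted context still does not fit, the second error reaches the caller.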
Memory Extraction (MemoryFlushManager)¶
- flushMemories(messages): hands the current MEMORY.md and today’s log to the LLM as a “deduplication reference”, requesting only newly added bullets. “NO_REPLY” means nothing to write.
- Write location is always memory/YYYY-MM-DD.md, never MEMORY.md (to prevent layer 1 from overwriting layer 2).
- After writing, immediately calls indexFromString to rebuild the file index, then calls MemoryMaintenanceScheduler.requestConsolidation() to signal “consolidate when you can”.
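The write rules above (skip on “NO_REPLY”, append only, always target the dated file) can be sketched with the standard file APIs. The LLM call, indexing, and consolidation request are left out; `DailyLogSketch`, `dailyLogPath`, and `appendIfNew` are hypothetical names, and the assumption is that the memory directory sits directly under the workspace root.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.time.LocalDate;

/** Hypothetical sketch: append extracted bullets to the daily log, never to MEMORY.md. */
final class DailyLogSketch {
    /** LocalDate.toString() yields the ISO form YYYY-MM-DD used in the file name. */
    static Path dailyLogPath(Path workspace, LocalDate date) {
        return workspace.resolve("memory").resolve(date + ".md");
    }

    /** Returns true if something was appended, false if the reply meant "nothing new". */
    static boolean appendIfNew(Path workspace, LocalDate date, String llmReply) {
        String reply = llmReply == null ? "" : llmReply.trim();
        if (reply.isEmpty() || reply.equals("NO_REPLY")) {
            return false;
        }
        Path file = dailyLogPath(workspace, date);
        try {
            Files.createDirectories(file.getParent());
            // CREATE + APPEND: existing content is never rewritten.
            Files.write(file, (reply + System.lineSeparator()).getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path workspace = Files.createTempDirectory("workspace");
        LocalDate today = LocalDate.now();
        System.out.println(appendIfNew(workspace, today, "NO_REPLY"));
        System.out.println(appendIfNew(workspace, today, "- User prefers PostgreSQL"));
        System.out.println(Files.readString(dailyLogPath(workspace, today)));
    }
}
```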
Secondary Consolidation (MemoryConsolidator)¶
- Reads daily logs with mtime exceeding the watermark, plus the current MEMORY.md, and calls the LLM to merge, deduplicate, and trim.
- Output limit: default maxMemoryTokens=4000 (~16k characters); the prompt communicates this as a character budget to the LLM.
- After writing, advances the watermark stored in memory/.consolidation_state; the next run only looks at files with mtime past the watermark.
- Consolidation only runs on the background executor: triggered by a periodic tick or requestConsolidation(), never blocking the reasoning loop.
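The watermark bookkeeping and the token-to-character conversion can be sketched as below. Two assumptions are made: the character budget uses roughly 4 characters per token (consistent with 4000 tokens being about 16k characters), and the new watermark is the newest mtime among the processed files. `WatermarkSketch` and its methods are hypothetical; how Harness actually stores and advances the watermark is not shown in this document.

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical sketch: select daily logs past the watermark and derive a character budget. */
final class WatermarkSketch {
    /** Assumed heuristic: about 4 characters per token. */
    static int charBudget(int maxMemoryTokens) {
        return maxMemoryTokens * 4;
    }

    /** Returns the names of files modified strictly after the watermark, sorted by name. */
    static List<String> pending(Map<String, Long> mtimeByFile, long watermarkMillis) {
        return mtimeByFile.entrySet().stream()
                .filter(entry -> entry.getValue() > watermarkMillis)
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    /** Moves the watermark to the newest processed mtime; it never moves backwards. */
    static long advance(long watermarkMillis, Collection<Long> processedMtimes) {
        long next = watermarkMillis;
        for (long mtime : processedMtimes) {
            next = Math.max(next, mtime);
        }
        return next;
    }

    public static void main(String[] args) {
        Map<String, Long> mtimes = Map.of("2024-01-01.md", 100L, "2024-01-02.md", 200L);
        System.out.println(pending(mtimes, 150L));          // prints [2024-01-02.md]
        System.out.println(advance(150L, mtimes.values())); // prints 200
        System.out.println(charBudget(4000));               // prints 16000
    }
}
```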
Background Maintenance (MemoryMaintenanceScheduler)¶
Auto-created and start()ed inside HarnessAgent.build(); each tick runs in sequence:
1. expireDailyFiles: archive daily files older than dailyFileRetentionDays to memory/archive/ (default 90 days)
2. consolidateMemory: call MemoryConsolidator.consolidate()
3. pruneOldSessions: delete session files with mtime older than sessionRetentionDays (default 180 days)
4. reindex: MemoryIndex.indexAllFromWorkspace
Default interval: Duration.ofHours(6); opportunistic calls are throttled to 30-minute intervals to avoid hammering the LLM with frequent flushes.
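One way to combine a periodic tick with throttled opportunistic requests is sketched below using the standard `ScheduledExecutorService`. It is an illustration only: `ConsolidationThrottleSketch` and `tryAcquire` are hypothetical, the clock is passed in explicitly so the throttle can be checked deterministically, and the scheduler's real internals may be organized differently.

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: throttle opportunistic consolidation requests. */
final class ConsolidationThrottleSketch {
    private final long minIntervalMillis;
    private boolean hasRun = false;
    private long lastRunMillis;

    ConsolidationThrottleSketch(Duration minInterval) {
        this.minIntervalMillis = minInterval.toMillis();
    }

    /** Returns true if a run may start at nowMillis, and records it as the latest run. */
    synchronized boolean tryAcquire(long nowMillis) {
        if (hasRun && nowMillis - lastRunMillis < minIntervalMillis) {
            return false; // too soon after the previous run
        }
        hasRun = true;
        lastRunMillis = nowMillis;
        return true;
    }

    public static void main(String[] args) {
        ConsolidationThrottleSketch throttle =
                new ConsolidationThrottleSketch(Duration.ofMinutes(30));
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        // Periodic tick every 6 hours; the body stands in for the four maintenance steps.
        executor.scheduleWithFixedDelay(() -> System.out.println("tick"), 6, 6, TimeUnit.HOURS);
        // Opportunistic request: submitted only when the throttle allows it.
        if (throttle.tryAcquire(System.currentTimeMillis())) {
            executor.execute(() -> System.out.println("consolidate"));
        }
        executor.shutdown(); // lets the submitted task finish and cancels the periodic tick
    }
}
```

Running both kinds of work on a single-threaded executor keeps maintenance off the reasoning thread and serializes ticks and opportunistic runs.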
Tool Result Eviction (ToolResultEvictionConfig)¶
Independent of compaction. When the text returned by a tool call exceeds the threshold, the full content is written to a file under evictionPath, and the result is replaced in place with a “head+tail preview + path” placeholder. The agent calls read_file when it needs the full content.
| Parameter | Default | Description |
|---|---|---|
| | | Evict if exceeded |
| | | Number of head and tail preview characters |
| | | Root path for evicted files |
| | Built-in set (includes | Tools excluded from eviction |
HarnessAgent.builder()
...
.toolResultEviction(ToolResultEvictionConfig.defaults())
.build();
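The “head+tail preview + path” placeholder can be sketched as a pure string transformation. Writing the full content to disk is omitted here; `EvictionPreviewSketch`, `placeholder`, and the placeholder wording are hypothetical and only illustrate the shape of the replacement text.

```java
/** Hypothetical sketch: replace an oversized tool result with a head+tail preview and a path. */
final class EvictionPreviewSketch {
    static String placeholder(String result, int maxChars, int previewChars, String evictedPath) {
        if (result.length() <= maxChars) {
            return result; // small results stay inline
        }
        // Never let head and tail overlap, even for a large preview size.
        int n = Math.min(previewChars, result.length() / 2);
        String head = result.substring(0, n);
        String tail = result.substring(result.length() - n);
        int omitted = result.length() - 2 * n;
        return head
                + "\n... [" + omitted + " chars omitted; full output: " + evictedPath + "] ...\n"
                + tail;
    }

    public static void main(String[] args) {
        String big = "a".repeat(50) + "b".repeat(900) + "c".repeat(50);
        // The path below is a made-up example of where the full text would be stored.
        System.out.println(placeholder(big, 200, 50, "evicted/result-1.txt"));
    }
}
```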
Configuration and Code Examples¶
HarnessAgent agent = HarnessAgent.builder()
.name("MyAgent")
.model(model)
.workspace(workspace)
.compaction(CompactionConfig.builder()
.triggerMessages(30)
.keepMessages(10)
.build())
.toolResultEviction(ToolResultEvictionConfig.defaults())
.build();
// Agent can call memory_search at any time
MemoryIndex index = new MemoryIndex(workspaceAgentScopeDir);
index.open();
List<MemoryIndex.SearchHit> hits = index.search("database migration", 10);
// hit: { path, lineNumber, content, rank }