Memory¶
Role¶
Lets the agent “remember facts across sessions” while keeping the conversation context bounded. Harness splits memory into two layers:
Layer 1 · daily log (memory/YYYY-MM-DD.md): append-only each day, raw and not deduped;
Layer 2 · curated long-term (MEMORY.md): periodically merged and deduped by the LLM, then injected into the system prompt at every reasoning step as long-term memory.
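The split between the two layers can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: the class `TwoLayerMemorySketch` and its methods are hypothetical, the layers are held in memory rather than in files, and a case-insensitive exact-match dedupe stands in for the LLM merge that Harness actually performs.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not the Harness implementation.
final class TwoLayerMemorySketch {
    // Layer 1: raw daily entries, appended in arrival order and never deduped.
    private final List<String> dailyLog = new ArrayList<>();
    // Layer 2: curated long-term entries, replaced wholesale on each merge.
    private List<String> curated = new ArrayList<>();

    void appendDaily(String fact) {
        dailyLog.add(fact);
    }

    // Stand-in for the LLM merge: trims entries and drops case-insensitive duplicates.
    void merge() {
        List<String> all = new ArrayList<>(curated);
        all.addAll(dailyLog);
        Set<String> seen = new LinkedHashSet<>();
        List<String> merged = new ArrayList<>();
        for (String fact : all) {
            String trimmed = fact.trim();
            if (!trimmed.isEmpty() && seen.add(trimmed.toLowerCase(Locale.ROOT))) {
                merged.add(trimmed);
            }
        }
        curated = merged; // Layer 2 is rewritten as a whole; Layer 1 is left untouched.
    }

    List<String> dailyLog() {
        return List.copyOf(dailyLog);
    }

    List<String> curated() {
        return List.copyOf(curated);
    }
}
```

Note that `merge()` never modifies the daily log: the raw layer stays append-only, and only the curated layer is replaced.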
Three companion mechanisms:
Conversation compaction — summarizes history and keeps a recent tail when context is too long;
Overflow safety net — when the model actually errors, force a compaction and retry;
Large tool-result offloading — offload to disk + placeholder when a single tool returns too much.
How the two layers work¶
graph LR
Conv["conversation messages"] -->|over threshold| Compactor["conversation compaction"]
Compactor -->|offload| Sess["sessions/<id>.log.jsonl"]
Compactor -->|extract new facts| Daily["memory/YYYY-MM-DD.md"]
Daily -. periodic background merge .-> MEM["MEMORY.md"]
MEM -->|injected each reasoning step| SYS["system prompt"]
Key points:
Layer 1 only appends, never dedupes; Layer 2 is periodically rewritten as a whole; the two layers never overwrite each other.
Layer 2 is the only one injected into the prompt; Layer 1 waits to be merged.
Raw messages dropped during compaction are also saved into a never-compacted log file (*.log.jsonl) for later audit or session_search.
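The flow in the diagram can be approximated as follows. This is a minimal sketch, not the framework's code: `CompactionFlowSketch`, its `trigger` and `keep` parameters, and the in-memory `auditLog` (standing in for the `*.log.jsonl` file) are assumptions made for illustration, and a fixed placeholder string replaces the LLM-written summary. Fact extraction to the daily log is omitted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; class, method, and parameter names are illustrative.
final class CompactionFlowSketch {
    // Stand-in for the never-compacted session log: receives every dropped message.
    final List<String> auditLog = new ArrayList<>();

    // Returns the compacted conversation: one summary placeholder plus the recent tail.
    List<String> compact(List<String> messages, int trigger, int keep) {
        if (messages.size() < trigger) {
            return new ArrayList<>(messages); // under the threshold: nothing to do
        }
        int cut = Math.max(0, messages.size() - keep);
        List<String> dropped = messages.subList(0, cut);
        auditLog.addAll(dropped); // raw messages survive for later audit
        List<String> result = new ArrayList<>();
        result.add("[summary of " + dropped.size() + " earlier messages]");
        result.addAll(messages.subList(cut, messages.size()));
        return result;
    }
}
```

With 30 messages, `trigger = 30`, and `keep = 10`, the result holds 11 entries (the placeholder plus the last 10 messages), and the 20 dropped messages land in `auditLog`.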
Enable compaction¶
HarnessAgent agent = HarnessAgent.builder()
    .name("MyAgent")
    .model(model)
    .workspace(workspace)
    .compaction(CompactionConfig.builder()
        .triggerMessages(30) // fire at 30 messages
        .keepMessages(10)    // keep the last 10 after compaction
        .build())
    .build();
Common options:
| Field | Default | Meaning |
|---|---|---|
| | | Trigger by message count ( |
| | | Trigger by estimated tokens ( |
| | | Number of tail messages to keep |
| | | When non-zero, walk back by token budget; overrides |
| | | Extract new facts to the daily log before compacting |
| | | Append raw messages to the never-compacted log before compacting |
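The "walk back by token budget" option can be pictured like this. This is a sketch under assumptions: the class `TailBudgetSketch` is hypothetical, and the 4-characters-per-token estimate is a common rough heuristic rather than the estimator Harness uses.

```java
import java.util.List;

// Hypothetical sketch of choosing the kept tail by token budget instead of message count.
final class TailBudgetSketch {
    // Rough heuristic assumed for this sketch: about 4 characters per token, rounded up.
    static int estimateTokens(String message) {
        return (message.length() + 3) / 4;
    }

    // Walks backwards from the newest message and returns the index where the kept tail starts.
    // Returns messages.size() when not even the newest message fits the budget.
    static int tailStart(List<String> messages, int tokenBudget) {
        int used = 0;
        int start = messages.size();
        for (int i = messages.size() - 1; i >= 0; i--) {
            int cost = estimateTokens(messages.get(i));
            if (used + cost > tokenBudget) {
                break; // the next older message would exceed the budget
            }
            used += cost;
            start = i;
        }
        return start;
    }
}
```

Everything from `tailStart(...)` to the end of the list is kept; everything before it is summarized.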
Auto-recovery on overflow: when the model returns context_length_exceeded (or similar), the framework forces one compaction and retries — but only when compaction(...) is configured; otherwise the error propagates.
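The recovery behavior amounts to a single guarded retry. The sketch below assumes a hypothetical `ContextOverflowException` standing in for the model's context_length_exceeded error, and represents "compaction is configured" by a non-null `Runnable`. None of these names come from the Harness API.

```java
import java.util.function.Supplier;

// Hypothetical sketch of "force one compaction and retry".
final class OverflowRetrySketch {
    // Stand-in for a context_length_exceeded error from the model.
    static final class ContextOverflowException extends RuntimeException {
        ContextOverflowException(String message) {
            super(message);
        }
    }

    // Calls the model; on overflow, compacts once and retries once.
    // With no compaction configured (null), the error propagates unchanged.
    static <T> T callWithRecovery(Supplier<T> modelCall, Runnable forceCompaction) {
        try {
            return modelCall.get();
        } catch (ContextOverflowException e) {
            if (forceCompaction == null) {
                throw e;
            }
            forceCompaction.run();
            return modelCall.get(); // a second overflow is not caught here
        }
    }
}
```

Retrying only once keeps the failure mode simple: if compaction did not free enough room, the second error surfaces to the caller instead of looping.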
Want it lighter? Trim arguments first¶
Tool calls like write_file carry huge arguments that nobody reads later. Before LLM summarization you can run a non-LLM string truncation:
CompactionConfig.builder()
    .triggerMessages(80)
    .truncateArgs(CompactionConfig.TruncateArgsConfig.builder()
        .maxArgLength(2000)
        .truncationText("... [truncated] ...")
        .build())
    .build();
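As a rough picture of what such a non-LLM truncation does, the sketch below keeps the first `maxArgLength` characters and appends the marker text. The class `TruncateArgsSketch` is hypothetical, and keeping the head (rather than, say, head and tail) is an assumption; the framework's exact cut may differ.

```java
// Hypothetical sketch of plain string truncation for oversized tool-call arguments.
final class TruncateArgsSketch {
    // Returns the argument unchanged when it fits, otherwise its head plus the marker text.
    static String truncate(String arg, int maxArgLength, String truncationText) {
        if (arg.length() <= maxArgLength) {
            return arg;
        }
        return arg.substring(0, maxArgLength) + truncationText;
    }
}
```

Because this is a pure string operation, it costs no model call and can run before summarization on every oversized argument.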
Large tool-result offloading¶
Independent of compaction. When a single tool call returns more than the threshold, the full text is written to a directory and only a head/tail preview plus a placeholder is left in context. The agent can then call read_file to retrieve the full content:
HarnessAgent.builder()
    ...
    .toolResultEviction(ToolResultEvictionConfig.defaults())
    .build();
Defaults:
Triggered at 80K characters
Keeps ~2K chars at head + tail + a line “full content at {path}”
read_file is excluded by default (to avoid re-offloading what was just read back)
Customize threshold or destination via ToolResultEvictionConfig.builder()...build().
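The shape of the placeholder left in context can be sketched as below. This is an illustration under assumptions: `EvictionPreviewSketch` is hypothetical, `previewChars` is applied per side (the source does not say whether ~2K is per side or total), the separator and wording of the last line are invented for the example, and actually writing the full text to disk is omitted.

```java
// Hypothetical sketch of building the head/tail preview for an offloaded tool result.
final class EvictionPreviewSketch {
    // Returns the result unchanged when it is under the threshold; otherwise a preview
    // made of the head, the tail, and a pointer to where the full text was stored.
    static String preview(String result, int threshold, int previewChars, String path) {
        if (result.length() <= threshold) {
            return result;
        }
        // Never let head and tail overlap on short inputs.
        int side = Math.min(previewChars, result.length() / 2);
        String head = result.substring(0, side);
        String tail = result.substring(result.length() - side);
        return head + "\n...\n" + tail + "\n[full content at " + path + "]";
    }
}
```

The agent sees enough of the beginning and end to judge relevance, and the path on the last line tells it what to pass to `read_file` if it needs the rest.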
Tools the agent can use itself¶
When memory is enabled, the agent gets two tools:
memory_search query="...": keyword scan over MEMORY.md + memory/*.md, up to 30 hits
memory_get path="memory/2026-06-02.md" startLine=10 endLine=40: read a specific line range
When the model sees a “MEMORY truncated” note in the prompt, it typically calls memory_search to look further back.
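A keyword scan with a hit cap, in the spirit of memory_search, might look like the following. This is a sketch under assumptions: `MemorySearchSketch` is hypothetical, file contents are passed in as an in-memory map instead of being read from the workspace, matching is a case-insensitive substring test, and the `path:line: text` hit format is invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of a capped keyword scan over memory files.
final class MemorySearchSketch {
    // files maps a path (for example "MEMORY.md") to that file's lines.
    static List<String> search(Map<String, List<String>> files, String query, int maxHits) {
        List<String> hits = new ArrayList<>();
        if (maxHits <= 0) {
            return hits;
        }
        String needle = query.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, List<String>> file : files.entrySet()) {
            List<String> lines = file.getValue();
            for (int i = 0; i < lines.size(); i++) {
                if (lines.get(i).toLowerCase(Locale.ROOT).contains(needle)) {
                    // Line numbers are 1-based so they can be fed to a line-range read.
                    hits.add(file.getKey() + ":" + (i + 1) + ": " + lines.get(i));
                    if (hits.size() >= maxHits) {
                        return hits; // stop at the cap
                    }
                }
            }
        }
        return hits;
    }
}
```

Reporting the path and line number with each hit is what makes a follow-up memory_get call with `startLine` and `endLine` practical.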
Background maintenance¶
When memory is enabled, a throttled background job also runs (triggered at the end of each call(), subject to a minimum gap between runs, ~30 minutes by default):
Archives daily logs older than 90 days to memory/archive/
Runs one MEMORY.md consolidation pass
Prunes session logs older than 180 days
All numbers are tunable, but most projects don’t need to touch them.
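The throttling idea (check at the end of every call, but actually run only when enough time has passed) can be sketched as follows. `ThrottleSketch` is a hypothetical class, and passing the current time in explicitly is a choice made here to keep the example deterministic; it says nothing about how Harness schedules the job internally.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch of a minimum-gap throttle for a maintenance job.
final class ThrottleSketch {
    private final Duration minGap;
    private Instant lastRun; // null until the job has run once

    ThrottleSketch(Duration minGap) {
        this.minGap = minGap;
    }

    // Runs the job and returns true only if at least minGap has elapsed since the last run.
    boolean tryRun(Instant now, Runnable job) {
        if (lastRun != null && Duration.between(lastRun, now).compareTo(minGap) < 0) {
            return false; // too soon: skip this trigger
        }
        lastRun = now;
        job.run();
        return true;
    }
}
```

Calling `tryRun` after every `call()` is cheap, since most invocations return immediately without doing any maintenance work.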
Turn it off entirely¶
If you want to handle memory yourself or wire your own tools:
HarnessAgent.builder()
    ...
    .disableMemoryHooks() // disables flush + background maintenance
    .disableMemoryTools() // skips memory_search / memory_get / session_search registration
    .build();