Context Compaction

How Anthracode manages context window limits through tiered compaction

Large language models have finite context windows. Anthracode uses a multi-phase compaction system to manage this limit, preserving the most important information while discarding what is no longer needed.

Compaction runs automatically when the context window approaches its limit, and can also be triggered manually via the /compact slash command in the TUI.

Phases

Compaction follows a multi-stage pipeline, with each stage operating at a different level of granularity:

Pre-phase — Overflow prune

When the conversation exceeds the context window budget, surplus messages are trimmed from the middle of the conversation before compaction proceeds. This prevents overflow while preserving the most recent and most relevant turns.
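
A minimal sketch of middle-trimming, assuming each message carries a precomputed token count. The `Message` type, the `keep_head` and `keep_tail` parameters, and the oldest-first drop order are illustrative assumptions rather than Anthracode's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    text: str
    tokens: int  # assumed to be precomputed by a tokenizer


def prune_middle(messages, budget, keep_head=1, keep_tail=4):
    """Drop messages from the middle until the total fits the budget.

    The first `keep_head` and last `keep_tail` messages are never dropped
    (both values are hypothetical, not documented defaults).
    """
    n = len(messages)
    protected = set(range(min(keep_head, n))) | set(range(max(n - keep_tail, 0), n))
    total = sum(m.tokens for m in messages)
    dropped = set()
    # Walk the unprotected middle oldest-first, dropping until the total fits.
    for i in range(n):
        if total <= budget:
            break
        if i in protected:
            continue
        dropped.add(i)
        total -= messages[i].tokens
    return [m for i, m in enumerate(messages) if i not in dropped]
```

If the protected head and tail alone exceed the budget, this sketch returns them unchanged and leaves the remaining reduction to the later phases.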

Phase 1 — LLM Summarization

Generates a compressed summary of the conversation history. The summary preserves:

Decisions made and the rationale
File paths that were modified or read
Commands that were executed
Any user preferences or corrections

L0 — In-memory snapshot

Before any irreversible compaction, Anthracode saves a complete snapshot of the current conversation into the compaction_snapshot table. The snapshot captures the raw conversation messages (user, assistant, and tool messages with their full parts), which enables the /uncompact command to restore the exact previous state.
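
The save-and-restore cycle can be sketched with SQLite as follows. Only the table name `compaction_snapshot` comes from this page; the column layout, the JSON encoding, and the function names are assumptions made for illustration.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    db = sqlite3.connect(path)
    # Hypothetical schema: one row per snapshot, messages stored as JSON text.
    db.execute(
        "CREATE TABLE IF NOT EXISTS compaction_snapshot ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " session_id TEXT NOT NULL,"
        " messages TEXT NOT NULL)"
    )
    return db


def save_snapshot(db, session_id, messages):
    """Store the raw messages before compaction rewrites the conversation."""
    cur = db.execute(
        "INSERT INTO compaction_snapshot (session_id, messages) VALUES (?, ?)",
        (session_id, json.dumps(messages)),
    )
    db.commit()
    return cur.lastrowid


def restore_latest(db, session_id):
    """Return the newest snapshot's messages and delete it, or None if absent."""
    row = db.execute(
        "SELECT id, messages FROM compaction_snapshot"
        " WHERE session_id = ? ORDER BY id DESC LIMIT 1",
        (session_id,),
    ).fetchone()
    if row is None:
        return None  # nothing to restore
    db.execute("DELETE FROM compaction_snapshot WHERE id = ?", (row[0],))
    db.commit()
    return json.loads(row[1])
```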

L1 — Structured state parsing

Extracts a structured summary from the conversation into typed fields: goal, constraints, progress, decisions, and key references. This structured state is combined with the compaction summary to produce a compact but semantically rich representation.
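
One way to picture the typed fields is a small record type plus a parser. The field names follow the list above; the `Label:` and `- item` text layout that the parser expects is a hypothetical format, since the real extraction format is not described here.

```python
from dataclasses import dataclass, field


@dataclass
class StructuredState:
    goal: str = ""
    constraints: list = field(default_factory=list)
    progress: list = field(default_factory=list)
    decisions: list = field(default_factory=list)
    key_references: list = field(default_factory=list)


# Hypothetical section labels mapped to StructuredState attribute names.
_SECTIONS = {
    "goal": "goal",
    "constraints": "constraints",
    "progress": "progress",
    "decisions": "decisions",
    "key references": "key_references",
}


def parse_state(text):
    """Parse 'Label:' sections with '- item' bullets into a StructuredState."""
    state = StructuredState()
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        label = line.rstrip(":").lower()
        if line.endswith(":") and label in _SECTIONS:
            current = _SECTIONS[label]
            continue
        if current is None:
            continue  # ignore text before the first recognized section
        item = line[2:].strip() if line.startswith("- ") else line
        if current == "goal":
            state.goal = item
        else:
            getattr(state, current).append(item)
    return state
```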

L2 — High-signal content selection

Instead of pruning, L2 scores individual message parts by informational value and retains the highest-signal ones verbatim alongside the summary. Retention factors:

Higher signal	Lower signal
Tool errors (score: 100)	Status messages
User corrections (score: 90)	Acknowledgments
User text input (score: 70)	Prose between decisions
Decision prose (score: 30)	Repetitive output

The selected content is appended as a “Retained Verbatim” section to the compaction summary, ensuring critical details are not lost.
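
The selection step might look like the sketch below. The four scores are taken from the table above; the kind names, the score of 0 for lower-signal parts, and the greedy fill strategy are assumptions, not the documented algorithm.

```python
# Scores for the higher-signal kinds come from the table above; the kind
# names and the implicit score of 0 for everything else are assumptions.
SCORES = {
    "tool_error": 100,
    "user_correction": 90,
    "user_text": 70,
    "decision_prose": 30,
}


def select_high_signal(parts, token_budget=2048):
    """Greedily keep the highest-scoring parts that fit the token budget.

    `parts` is a list of (kind, text, tokens) tuples. The kept parts are
    returned in their original conversation order.
    """
    ranked = sorted(
        enumerate(parts),
        key=lambda item: SCORES.get(item[1][0], 0),
        reverse=True,
    )
    kept, used = [], 0
    for index, (kind, text, tokens) in ranked:
        if SCORES.get(kind, 0) == 0:
            break  # everything after this point is low signal
        if used + tokens > token_budget:
            continue  # too large; a smaller part may still fit
        kept.append(index)
        used += tokens
    return [parts[i] for i in sorted(kept)]
```

The default budget of 2048 mirrors the `retain_high_signal_tokens` default listed under Configuration.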

L3 — Full-text recall indexing

Creates an FTS5 full-text search index of the conversation. This index powers the recall tool, enabling plain-text search across all session history, including messages that were compacted away, with results ranked by BM25 relevance.
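
A minimal sketch of such an index using Python's `sqlite3` module, assuming the underlying SQLite build includes the FTS5 extension. The table name `recall_index` and its columns are hypothetical.

```python
import sqlite3


def build_index(messages):
    """Index (message_id, text) pairs in an in-memory FTS5 table."""
    db = sqlite3.connect(":memory:")
    # message_id is stored but not searched; body holds the searchable text.
    db.execute(
        "CREATE VIRTUAL TABLE recall_index USING fts5(message_id UNINDEXED, body)"
    )
    db.executemany(
        "INSERT INTO recall_index (message_id, body) VALUES (?, ?)", messages
    )
    return db


def search(db, query, limit=8):
    """Return matching message ids, best match first.

    FTS5's bm25() yields smaller values for better matches, so an
    ascending ORDER BY puts the most relevant rows first.
    """
    rows = db.execute(
        "SELECT message_id FROM recall_index WHERE recall_index MATCH ?"
        " ORDER BY bm25(recall_index) LIMIT ?",
        (query, limit),
    ).fetchall()
    return [r[0] for r in rows]
```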

Compaction guards

Compaction is deferred whenever running it would leave the context in an unsafe state. It is blocked when:

A subagent task is actively running: compacting would lose the parent agent’s dispatch state
A user response is pending: compacting before the user replies would lose context needed to respond
An error state is present in the conversation: error diagnostics should not be compacted away

Additionally, a circuit breaker tracks consecutive compaction failures. After 3 consecutive failures (MAX_COMPACTION_STALLS), compaction is suspended so that a failing compaction is not retried indefinitely.
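
The guards and the circuit breaker can be combined into a single gate, sketched below. Only the constant `MAX_COMPACTION_STALLS` and its value come from this page; the `SessionState` flags, the class name, and the reset-on-success policy are assumptions.

```python
from dataclasses import dataclass

MAX_COMPACTION_STALLS = 3  # value stated in the text above


@dataclass
class SessionState:
    # Hypothetical flags describing the conditions listed above.
    subagent_running: bool = False
    awaiting_user: bool = False
    has_error: bool = False


class CompactionGate:
    def __init__(self):
        self.consecutive_failures = 0

    def blocked_reason(self, state):
        """Return why compaction must be deferred, or None if it may run."""
        if self.consecutive_failures >= MAX_COMPACTION_STALLS:
            return "circuit breaker open"
        if state.subagent_running:
            return "subagent task running"
        if state.awaiting_user:
            return "user response pending"
        if state.has_error:
            return "error state present"
        return None

    def record(self, succeeded):
        """Track the outcome; a success resets the counter (assumed policy)."""
        self.consecutive_failures = 0 if succeeded else self.consecutive_failures + 1
```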

Verified compaction

After compaction completes, a verification check extracts high-signal markers (file paths, code identifiers, error codes) from the pruned head and confirms that they still appear in the compacted state.

Verification is a read-only self-check: it never changes what compaction produced. If verification detects that critical markers were lost, the compaction is still applied (no rollback mechanism exists), but a warning is logged.
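
A sketch of such a check, assuming markers are found with regular expressions. The two patterns below are illustrative stand-ins; the actual marker definitions are not documented here.

```python
import logging
import re

log = logging.getLogger("compaction.verify")

# Illustrative patterns only (assumptions, not Anthracode's real rules).
MARKER_PATTERNS = [
    re.compile(r"(?:[\w.-]+/)+[\w.-]+"),    # file paths such as src/config.json
    re.compile(r"\b[A-Z][A-Z0-9_]{3,}\b"),  # error codes or constants such as ENOENT
]


def extract_markers(text):
    """Collect every substring that matches one of the marker patterns."""
    found = set()
    for pattern in MARKER_PATTERNS:
        found.update(pattern.findall(text))
    return found


def verify(pruned_text, compacted_text):
    """Return markers present in the pruned head but absent afterwards.

    Read-only: it reports and logs, and never modifies either input.
    """
    missing = sorted(
        m for m in extract_markers(pruned_text) if m not in compacted_text
    )
    if missing:
        log.warning("compaction lost %d marker(s): %s", len(missing), ", ".join(missing))
    return missing
```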

The `recall` tool

When compaction removes messages, the information is not lost forever. The recall tool provides full-text search across the entire session history — including messages that were compacted away and are no longer visible in the active context.

{
  "permission": {
    "recall": "allow"
  }
}

Parameter	Description
`query`	Plain-text words to search for (punctuation is ignored)
`limit`	Maximum number of results to return (default: 8)

The tool uses BM25 full-text search to rank matching excerpts by relevance. This is especially useful for finding precise file paths, error strings, commands, or user instructions that were summarized away.

# Example: find a specific error message from earlier in the session
recall({ query: "ENOENT src/config.json" })
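
Because punctuation in the query is ignored, a query such as the one above is effectively reduced to plain word terms. The sketch below shows one way that normalization could work before the terms reach a full-text engine; the function names and the choice to quote every term are assumptions.

```python
import re


def normalize_query(query):
    """Split a plain-text query into lowercase word terms, dropping punctuation."""
    return [term.lower() for term in re.findall(r"[A-Za-z0-9_]+", query)]


def to_match_expression(query):
    """Quote each term so it is treated as a literal string, not query syntax."""
    return " ".join(f'"{term}"' for term in normalize_query(query))
```

With this approach, `"ENOENT src/config.json"` becomes the terms `enoent`, `src`, `config`, and `json`. Whether the real tool requires all terms to match or any of them is not stated on this page.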

Uncompact

The uncompact operation reverses the most recent compaction by restoring the L0 snapshot. In the TUI, trigger it with the /uncompact command, or use the /compact keybind (ctrl+x c) followed by the compact button.

/uncompact      # Undo the last compaction

Behavior:

Restores the full pre-compaction context from the saved snapshot
Removes the compaction summary message
Deletes the L0 snapshot entry it restored, so the same snapshot cannot be restored twice
Each call undoes only one level of compaction — call repeatedly to undo multiple compactions
Safe to call when no compaction has occurred — it is a no-op if no snapshot exists

Configuration

Compaction behavior can be configured in anthracode.json:

{
  "compaction": {
    "enabled": true,
    "preserve_recent_tokens": 4096,
    "tail_turns": 6,
    "retain_high_signal": true,
    "retain_high_signal_tokens": 2048
  }
}

Option	Type	Default	Description
`enabled`	boolean	`true`	Enable or disable automatic compaction
`preserve_recent_tokens`	number	`4096`	Number of recent tokens to keep verbatim after compaction
`tail_turns`	number	`6`	Number of most recent conversation turns to preserve unchanged
`retain_high_signal`	boolean	`true`	Enable high-signal content retention (L2)
`retain_high_signal_tokens`	number	`2048`	Token budget for retained high-signal verbatim content

Disable compaction entirely:

{
  "compaction": {
    "enabled": false
  }
}
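
For tooling that reads this file, the documented defaults can be merged with user settings as in the sketch below. The defaults are copied from the options table in this section; the function name and the decision to reject unknown keys are assumptions.

```python
import json

# Defaults taken from the options table in this section.
DEFAULTS = {
    "enabled": True,
    "preserve_recent_tokens": 4096,
    "tail_turns": 6,
    "retain_high_signal": True,
    "retain_high_signal_tokens": 2048,
}


def load_compaction_config(raw_json):
    """Merge the "compaction" object of an anthracode.json string over the defaults."""
    user = json.loads(raw_json).get("compaction", {})
    unknown = set(user) - set(DEFAULTS)
    if unknown:
        # Rejecting unknown keys is a choice made for this sketch.
        raise ValueError(f"unknown compaction option(s): {sorted(unknown)}")
    return {**DEFAULTS, **user}
```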