Nanobot Memory Enhancement — Implementation Plan

Date: 2026-03-25
Based on: OpenClaw memory research
Goal: Bring OpenClaw-style long-term memory into nanobot with minimal core loop changes

Current State (What We Have)

File-based memory (nanobot/agent/memory.py)

  • MemoryStore: manages memory/MEMORY.md (long-term curated facts)
  • MemoryConsolidator: LLM-driven consolidation that archives old messages into MEMORY.md when context exceeds half the window
  • MEMORY.md content is injected into system prompt via ContextBuilder
  • Consolidation uses forced save_memory tool call with fallback to raw archiving

Vector memory (nanobot/agent/vector_memory.py)

  • VectorMemoryManager: indexes workspace/docs/<project>/ with hybrid BM25 + vector search
  • SQLite + sqlite-vec + FTS5, embeddings via litellm
  • Background sync loop (60s interval), project-scoped search
  • VectorSearchTool: exposes vector_search tool to the agent

Sessions (nanobot/session/manager.py)

  • JSONL files in workspace/sessions/ (one per channel:chat_id)
  • Metadata line + append-only message dicts
  • get_history() returns messages[last_consolidated:]

What’s Missing (vs OpenClaw)

  1. No dated memory files → ✅ Implemented (Phase 1)
  2. No pre-compaction memory flush → ✅ Implemented (Phase 2)
  3. Memory files not searchable → ✅ Implemented (Phase 3)
  4. No memory_search tool → ✅ Implemented (Phase 4)
  5. HISTORY.md redundant → ✅ Removed (replaced by dated memory files + memory_search)
  6. No temporal decay → ✅ Implemented (separate config for memory vs docs)

What We’re Building

Four features, in order of implementation:
| # | Feature | Inspired By | Core Loop Impact |
|---|---------|-------------|------------------|
| 1 | Session-end dated memory files | OpenClaw session-memory hook | 1 line in /new handler |
| 2 | Pre-compaction memory flush | OpenClaw memory-flush.ts | 1 call before consolidation |
| 3 | Memory file indexing | OpenClaw memory-core indexing | Extend VectorMemoryManager |
| 4 | memory_search tool | OpenClaw memory_search tool | 1 tool registration |

What we keep unchanged:
  • Current hybrid BM25 + vector search algorithm
  • Current embedding pipeline (litellm)
  • Current consolidation logic (MemoryConsolidator)
  • Current session JSONL format
  • MEMORY.md in system prompt

Phase 1: Session-End Dated Memory Files

Goal: When /new is called, summarize the ending session into memory/YYYY-MM-DD-slug.md.

New class: SessionMemoryWriter

File: nanobot/agent/memory.py (add to existing file)
class SessionMemoryWriter:
    """Creates dated memory files from session transcripts on session end."""

    def __init__(self, workspace: Path, provider: LLMProvider, model: str):
        self.memory_dir = workspace / "memory"
        self.provider = provider
        self.model = model

    async def write_session_memory(
        self, session_key: str, messages: list[dict], max_messages: int = 20,
    ) -> Path | None:
        """Summarize recent messages and write memory/YYYY-MM-DD-slug.md."""

Logic

  1. Take the last max_messages user/assistant messages from the session snapshot (skip tool calls/results for the LLM summary input)
  2. If fewer than 3 user messages, skip (trivial session, not worth persisting)
  3. Call the LLM with a dedicated prompt:
    • System: “You are a memory writer. Generate a concise summary of this conversation.”
    • Include a create_session_memory forced tool call with parameters:
      • slug: short kebab-case filename slug (max 40 chars, e.g. “api-design-review”)
      • summary: markdown summary of key topics, decisions, and facts
  4. Write to memory/YYYY-MM-DD-slug.md:
# Session Memory: 2026-03-25 14:30

- **Session**: telegram:12345
- **Date**: 2026-03-25 14:30:00

## Summary

[LLM-generated summary here]

## Key Messages

user: [first user message excerpt]
assistant: [first assistant response excerpt]
...
  5. If the LLM call fails, write a raw transcript excerpt as a fallback (same pattern as _raw_archive in MemoryStore)

Integration point in loop.py

Current /new handler (line 476):
if cmd == "/new":
    snapshot = session.messages[session.last_consolidated:]
    session.clear()
    self.sessions.save(session)
    self.sessions.invalidate(session.key)
    if snapshot:
        self._schedule_background(self.memory_consolidator.archive_messages(snapshot))
    return OutboundMessage(...)
Add one line — schedule session memory write as background task:
if cmd == "/new":
    snapshot = session.messages[session.last_consolidated:]
    session.clear()
    self.sessions.save(session)
    self.sessions.invalidate(session.key)
    if snapshot:
        self._schedule_background(self.memory_consolidator.archive_messages(snapshot))
        self._schedule_background(self.session_memory_writer.write_session_memory(
            session.key, snapshot,
        ))
    return OutboundMessage(...)

Config addition (schema.py)

class SessionMemoryConfig(Base):
    """Configuration for session-end memory files."""
    enabled: bool = True
    max_messages: int = 20       # Max messages to include in summary
    min_user_messages: int = 3   # Skip trivial sessions
Nest under ToolsConfig alongside vector_memory:
class ToolsConfig(Base):
    ...
    session_memory: SessionMemoryConfig = Field(default_factory=SessionMemoryConfig)

Files modified

  • nanobot/agent/memory.py — add SessionMemoryWriter class (~80 lines)
  • nanobot/agent/loop.py — add self.session_memory_writer init + 1 line in /new handler
  • nanobot/config/schema.py — add SessionMemoryConfig

Phase 2: Pre-Compaction Memory Flush

Goal: Before maybe_consolidate_by_tokens starts archiving messages, run a dedicated LLM turn that writes durable memories to memory/YYYY-MM-DD.md.

New class: MemoryFlusher

File: nanobot/agent/memory.py (add to existing file)
class MemoryFlusher:
    """Runs a dedicated LLM turn to extract durable memories before context compaction."""

    def __init__(self, workspace: Path, provider: LLMProvider, model: str):
        self.memory_dir = workspace / "memory"
        self.provider = provider
        self.model = model
        self._flushed_sessions: set[str] = set()  # track per-compaction-cycle

    async def maybe_flush(
        self,
        session_key: str,
        messages: list[dict],
        estimated_tokens: int,
        context_window: int,
        threshold_tokens: int = 4000,
    ) -> bool:
        """Flush memories if tokens are near the limit. Returns True if flushed."""

Logic

  1. Trigger condition: estimated_tokens > context_window - threshold_tokens
  2. Guard: Skip if already flushed for this session key in the current compaction cycle (reset on session clear)
  3. Deduplicate: Read existing memory/YYYY-MM-DD.md to avoid writing duplicate content
  4. Call LLM with system prompt:
    You are a memory extraction agent. Review the conversation below and extract
    important facts, decisions, preferences, and action items worth remembering.
    
    Write them as bullet points. If there's nothing important to remember, respond
    with exactly "[silent]".
    
    Target file: memory/YYYY-MM-DD.md (append only, never overwrite existing content)
    
  5. If LLM returns [silent], skip write
  6. Otherwise, append to memory/YYYY-MM-DD.md (create if not exists):
## Flush at 14:30 (session: telegram:12345)

- User prefers REST over GraphQL for the new API
- Decision: use PostgreSQL for the analytics service
- Action item: review the vendor proposal by Friday
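The trigger/guard/append flow above can be sketched as follows, under stated assumptions: `provider.extract` stands in for the real LLM call, and the step-3 deduplication read is omitted for brevity.

```python
import datetime as dt
from pathlib import Path


class MemoryFlusher:
    """Sketch of the Phase 2 flow. "[silent]" follows the prompt contract above."""

    def __init__(self, workspace: Path, provider, model: str,
                 threshold_tokens: int = 4000):
        self.memory_dir = workspace / "memory"
        self.provider = provider
        self.model = model
        self.threshold_tokens = threshold_tokens
        self._flushed_sessions: set[str] = set()

    def reset(self, session_key: str) -> None:
        """Called on session clear so the next compaction cycle can flush again."""
        self._flushed_sessions.discard(session_key)

    async def maybe_flush(self, session_key: str, messages: list[dict],
                          estimated_tokens: int, context_window: int) -> bool:
        # Step 1: only flush when within threshold_tokens of the limit.
        if estimated_tokens <= context_window - self.threshold_tokens:
            return False
        # Step 2: one flush per compaction cycle per session key.
        if session_key in self._flushed_sessions:
            return False
        self._flushed_sessions.add(session_key)
        # Steps 4-5: extract memories; "[silent]" means nothing worth keeping.
        extracted = (await self.provider.extract(self.model, messages)).strip()
        if extracted == "[silent]":
            return False
        # Step 6: append to today's dated file (create if missing, never overwrite).
        now = dt.datetime.now()
        self.memory_dir.mkdir(parents=True, exist_ok=True)
        path = self.memory_dir / f"{now:%Y-%m-%d}.md"
        with path.open("a", encoding="utf-8") as f:
            f.write(f"\n## Flush at {now:%H:%M} (session: {session_key})\n\n"
                    f"{extracted}\n")
        return True
```

Opening the file in append mode keeps the "append only, never overwrite" contract even when multiple flushes land on the same day.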

Integration point

In MemoryConsolidator.maybe_consolidate_by_tokens() (memory.py line 302), add a flush call before the consolidation loop:
async def maybe_consolidate_by_tokens(self, session: Session) -> None:
    if not session.messages or self.context_window_tokens <= 0:
        return
    lock = self.get_lock(session.key)
    async with lock:
        target = self.context_window_tokens // 2
        estimated, source = self.estimate_session_prompt_tokens(session)
        if estimated <= 0 or estimated < self.context_window_tokens:
            return

        # NEW: Pre-compaction memory flush
        if self.memory_flusher:
            await self.memory_flusher.maybe_flush(
                session.key,
                session.messages[session.last_consolidated:],
                estimated,
                self.context_window_tokens,
            )

        # ... existing consolidation loop continues ...

Config addition (schema.py)

class MemoryFlushConfig(Base):
    """Configuration for pre-compaction memory flush."""
    enabled: bool = True
    threshold_tokens: int = 4000  # Flush when within this many tokens of the limit

Files modified

  • nanobot/agent/memory.py — add MemoryFlusher class (~70 lines), modify MemoryConsolidator.__init__ to accept it
  • nanobot/agent/loop.py — pass MemoryFlusher instance to MemoryConsolidator
  • nanobot/config/schema.py — add MemoryFlushConfig

Phase 3: Memory File Indexing

Goal: Extend VectorMemoryManager to also index memory/MEMORY.md and memory/*.md files, making them searchable alongside project docs.

Changes to VectorMemoryManager

File: nanobot/agent/vector_memory.py

1. Extend _discover_files() to include memory files

Current behavior: walks docs/<project>/ directories only. New behavior: also discovers memory/MEMORY.md and memory/*.md, using a reserved project name _memory.
def _discover_files(self) -> list[tuple[Path, str]]:
    results: list[tuple[Path, str]] = []

    # Existing: docs/<project>/ files
    if self.docs_dir.exists():
        # ... existing docs discovery code (unchanged) ...

    # NEW: memory files (memory/MEMORY.md is picked up by the glob)
    if self.memory_dir.exists():
        for fpath in sorted(self.memory_dir.glob("**/*.md")):
            if not fpath.name.startswith("."):
                results.append((fpath, "_memory"))

    return results

2. Add self.memory_dir in __init__

def __init__(self, workspace: Path, config: VectorMemoryConfig):
    ...
    self.memory_dir = workspace / "memory"
    ...

3. Add search_memory() convenience method

async def search_memory(self, query: str, max_results: int = 10) -> list[dict[str, Any]]:
    """Search memory files specifically (project='_memory')."""
    return await self.search(query, "_memory", max_results)

No changes to search algorithm

The existing hybrid BM25 + vector search works identically — memory files are just another “project” in the same SQLite database.
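The project scoping can be illustrated with a stand-in manager: the `search(query, project, max_results)` signature mirrors the description above, while the stub rows and filter replace the real hybrid BM25 + vector ranking.

```python
import asyncio


class StubManager:
    """Stand-in for VectorMemoryManager; rows are illustrative."""

    _rows = [
        {"path": "docs/acme/design.md", "project": "acme"},
        {"path": "memory/2026-03-25-api-design.md", "project": "_memory"},
        {"path": "memory/MEMORY.md", "project": "_memory"},
    ]

    async def search(self, query: str, project: str,
                     max_results: int = 10) -> list[dict]:
        # Real code runs hybrid BM25 + vector ranking over SQLite; the stub
        # only demonstrates that results are scoped by project name.
        return [r for r in self._rows if r["project"] == project][:max_results]

    async def search_memory(self, query: str, max_results: int = 10) -> list[dict]:
        # Phase 3 convenience wrapper: fixed "_memory" project.
        return await self.search(query, "_memory", max_results)


async def demo():
    m = StubManager()
    docs = await m.search("api design", "acme")        # docs/<project>/ hits
    mem = await m.search_memory("api design")          # memory file hits
    return docs, mem

docs, mem = asyncio.run(demo())
```

Because "_memory" is just another project value in the same tables, both tools share one database, one sync loop, and one ranking path.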

Files modified

  • nanobot/agent/vector_memory.py — extend _discover_files(), add memory_dir, add search_memory()

Phase 4: memory_search Tool

Goal: Give the agent a dedicated tool to search its own memory files.

New file: nanobot/agent/tools/memory_search.py

class MemorySearchTool(Tool):
    """Search agent memory files using hybrid semantic + keyword search."""

    name = "memory_search"
    description = (
        "Search your memory files (MEMORY.md and memory/*.md dated logs) "
        "using hybrid semantic + keyword search. Use this to recall past "
        "conversations, decisions, facts, and preferences."
    )
    parameters = {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Search query — a question, keywords, or description of what you're looking for",
            },
        },
        "required": ["query"],
    }

    def __init__(self, manager: VectorMemoryManager):
        self._manager = manager

    async def execute(self, query: str, **kwargs: Any) -> str:
        query = query.strip()
        if not query:
            return "Error: query is required."
        results = await self._manager.search_memory(query, max_results=10)
        if not results:
            return f"No memories found for '{query}'."
        # Format results (same style as VectorSearchTool)
        lines = [f"Found {len(results)} memory result(s) for '{query}':\n"]
        for i, r in enumerate(results, 1):
            score = r.get("score", 0)
            path = r.get("path", "?")
            start = r.get("start_line", "?")
            end = r.get("end_line", "?")
            snippet = r.get("snippet", "")
            lines.append(f"[{i}] {path} (lines {start}-{end}, score: {score:.2f})")
            lines.append(snippet.strip())
            lines.append("")
        return "\n".join(lines)

Registration in loop.py

In _register_default_tools(), after existing vector_search registration:
if self.vector_memory:
    from nanobot.agent.tools.vector_search import VectorSearchTool
    from nanobot.agent.tools.memory_search import MemorySearchTool
    self.tools.register(VectorSearchTool(self.vector_memory))
    self.tools.register(MemorySearchTool(self.vector_memory))

System prompt update (context.py)

Update the workspace description in _get_identity() to mention memory_search:
- Long-term memory: {workspace_path}/memory/MEMORY.md (write important facts here)
- Memory files: {workspace_path}/memory/*.md (dated session logs, searchable via memory_search tool)

Files modified

  • nanobot/agent/tools/memory_search.py — new file (~50 lines)
  • nanobot/agent/loop.py — 2 lines (import + register)
  • nanobot/agent/context.py — update system prompt text

Architecture After Implementation

┌──────────────────────────────────────────────────────────────┐
│                     Agent Loop                                │
│                                                               │
│  ┌─────────────┐  ┌───────────────┐  ┌───────────────────┐  │
│  │ System Prompt│  │ memory_search │  │ vector_search     │  │
│  │ (MEMORY.md) │  │ (tool)        │  │ (tool — docs/)    │  │
│  └──────┬──────┘  └───────┬───────┘  └────────┬──────────┘  │
│         │                 │                     │             │
│         │                 ▼                     ▼             │
│         │     ┌─────────────────────────────────────┐        │
│         │     │   VectorMemoryManager (SQLite)       │        │
│         │     │  ┌─────────┐  ┌──────────────────┐  │        │
│         │     │  │ FTS5    │  │ sqlite-vec       │  │        │
│         │     │  │ (BM25)  │  │ (cosine dist)    │  │        │
│         │     │  └─────────┘  └──────────────────┘  │        │
│         │     │  project="_memory" | project="xyz"   │        │
│         │     └─────────────────────────────────────┘        │
│         │                      ▲                              │
│         │                      │ indexes                      │
│         │     ┌────────────────┴────────────────────┐        │
│         │     │         Source Files                  │        │
│         │     │  MEMORY.md          (always in prompt)│        │
│         │     │  memory/*.md        (dated memories)  │        │
│         │     │  docs/<project>/*   (project docs)    │        │
│         │     └─────────────────────────────────────┘        │
│         │                      ▲                              │
│         │                      │ writes                       │
│         │     ┌────────────────┴────────────────────┐        │
│         │     │      Memory Writers                   │        │
│         │     │  • Agent (direct MEMORY.md edits)     │        │
│         │     │  • SessionMemoryWriter (/new hook)    │        │
│         │     │  • MemoryFlusher (pre-compaction)     │        │
│         │     │  • MemoryConsolidator (existing)      │        │
│         │     └─────────────────────────────────────┘        │
└──────────────────────────────────────────────────────────────┘

Data flow: session lifecycle

Session active (messages flowing)

  ├─ Context nears limit?
  │   YES → MemoryFlusher extracts memories → memory/YYYY-MM-DD.md
  │        → MemoryConsolidator archives old messages → MEMORY.md

  └─ User sends /new?
      YES → SessionMemoryWriter summarizes session → memory/YYYY-MM-DD-slug.md
           → MemoryConsolidator archives remainder → MEMORY.md
           → Session cleared

Next session starts:
  ├─ MEMORY.md loaded into system prompt (existing)
  ├─ memory/*.md indexed by VectorMemoryManager (new)
  └─ Agent can call memory_search to find past conversations (new)

Implementation Order & Effort

| Phase | Feature | New Code | Modified Files | Effort |
|-------|---------|----------|----------------|--------|
| 1 | Session-end dated memory files | ~80 lines | memory.py, loop.py, schema.py | 2-3 hours |
| 2 | Pre-compaction memory flush | ~70 lines | memory.py, loop.py, schema.py | 2-3 hours |
| 3 | Memory file indexing | ~20 lines | vector_memory.py | 1 hour |
| 4 | memory_search tool | ~50 lines | new file + loop.py, context.py | 1-2 hours |

Total: ~220 lines of new code, ~6-9 hours of work.

Dependencies

  • Phase 3 and 4 require vector memory to be enabled (tools.vector_memory.enabled: true)
  • Phase 1 and 2 are independent of vector memory — they write plain markdown files
  • Phase 4 depends on Phase 3 (memory files must be indexed to be searchable)
  • Phase 1 and 2 can be implemented in parallel

What We’re NOT Doing (and why)

| Feature | Reason to skip |
|---------|----------------|
| Multiple embedding providers | We already have litellm, which supports all providers |
| LanceDB plugin | Separate concern; the memory-core approach is sufficient |
| QMD backend | Experimental in OpenClaw; unnecessary complexity |
| File watcher (chokidar) | Background sync loop already catches changes on the next cycle |
| Embedding cache table | litellm handles caching; our hash-based skip logic is sufficient |
| Session JSONL indexing | Dated memory files already capture session content in better form |
| MMR re-ranking | Not enabled by default in OpenClaw either; keep it simple |
| Temporal decay | Not enabled by default in OpenClaw; our recency is implicit via dated files |
| Plugin slot system | Nanobot doesn't have plugins; direct integration is simpler |
| Auto-recall/auto-capture | Too invasive for the "minimal core loop changes" goal |

Config Reference (Final)

{
  "tools": {
    "vectorMemory": {
      "enabled": true,
      "embeddingModel": "openrouter/openai/text-embedding-3-small",
      "embeddingApiKey": "",
      "chunkTokens": 512,
      "chunkOverlap": 64,
      "syncIntervalSeconds": 60,
      "vectorWeight": 0.7,
      "textWeight": 0.3
    },
    "sessionMemory": {
      "enabled": true,
      "maxMessages": 20,
      "minUserMessages": 3
    },
    "memoryFlush": {
      "enabled": true,
      "thresholdTokens": 4000
    }
  }
}

Risk Mitigation

  1. LLM call failures: Both SessionMemoryWriter and MemoryFlusher have fallback to raw text dump (same pattern as existing _fail_or_raw_archive)
  2. Duplicate writes: MemoryFlusher tracks flushed sessions per-compaction-cycle; SessionMemoryWriter generates unique slugs per session
  3. Disk space: Dated memory files are small (1-5 KB each). At one session per day, that’s ~1.5 MB/year
  4. Index consistency: VectorMemoryManager’s periodic sync naturally picks up new memory files within 60 seconds
  5. Backward compatibility: All new features are opt-in via config. Defaults match current behavior except session_memory.enabled: true (which is safe — it only writes files on /new)