How we built a 3-layer memory architecture that bridges OpenClaw and Claude Code into a single brain — with real numbers from 33 days of operation.
Most AI agent setups have a memory problem: they either forget everything between sessions (stateless) or accumulate noise until the context window overflows. RAG helps with retrieval but doesn't build understanding. The LLM rediscovers knowledge from scratch on every query.
Karpathy's LLM Wiki proposes a compelling alternative: a persistent, compounding wiki maintained by the LLM. Great idea — but designed for a researcher browsing Obsidian. We needed something for an operational AI agent running a business with 8 stores, 20 cron jobs, 7 services, and two different AI platforms (OpenClaw + Claude Code).
This document describes what we built, what worked, what didn't, and the decisions behind each choice.
┌─────────────────────────────────────────────────────────┐
│ TWO ENTRY POINTS │
│ │
│ OpenClaw (Telegram) Claude Code (VS Code) │
│ │ │ │
│ ▼ ▼ │
│ JSONL transcriptions claude-mem plugin │
│ │ (auto-capture) │
│ ▼ │ │
│ Bridge (cron */30) │ │
│ │ │ │
│ └──────────┐ ┌──────────────┘ │
│ ▼ ▼ │
│ ┌──────────────┐ │
│ │ claude-mem │ Layer 1: Subconscious │
│ │ SQLite + DB │ 1,493 observations │
│ │ Chroma 46MB │ Vector search │
│ └──────┬───────┘ │
│ │ │
│ auto-precompact (Sonnet, daily 23h) │
│ │ │
│ ┌──────▼───────┐ │
│ │ Workspace MD │ Layer 2: Conscious │
│ │ decisions.md │ Curated, structured │
│ │ lessons.md │ Read at boot │
│ │ pending.md │ │
│ │ wip.md │ │
│ └──────┬───────┘ │
│ │ │
│ ┌──────▼───────┐ │
│ │ Auto-memory │ Layer 3: Persistent │
│ │ 16 .md files │ Cross-session identity │
│ │ (CC native) │ │
│ └──────────────┘ │
│ │
│ ┌──────────────┐ │
│ │ Schema files │ How to operate │
│ │ SOUL.md │ Who I am │
│ │ AGENTS.md │ Rules & protocols │
│ │ CLAUDE.md │ Boot sequence │
│ └──────────────┘ │
└─────────────────────────────────────────────────────────┘
**Layer 1 — Subconscious (claude-mem).** What: automatic capture of everything that happens in every session. No manual intervention.
How: The claude-mem plugin hooks into Claude Code's tool calls and extracts observations — structured summaries of what happened, what was decided, what changed. These go into a SQLite database and are indexed in ChromaDB for vector search.
Bridge: OpenClaw sessions happen in Telegram, not Claude Code. A Python bridge (openclaw_mem_bridge.py) reads OpenClaw's JSONL transcriptions and inserts observations into the same SQLite DB. Runs every 30 minutes via cron. Zero LLM tokens — pure heuristic extraction.
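A minimal sketch of what such a bridge can look like, assuming a simplified JSONL shape (`role`/`content` fields) and a toy table schema rather than claude-mem's real one:

```python
import json
import sqlite3

# Keyword heuristics standing in for the real extraction rules (assumed).
KEYWORDS = ("decided", "fixed", "deployed", "changed", "learned")

def extract_observations(jsonl_text: str) -> list[dict]:
    """Pull assistant messages that look like decisions or changes.
    Zero LLM tokens: plain keyword matching over each JSONL event."""
    observations = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        event = json.loads(line)
        text = event.get("content", "")
        if event.get("role") == "assistant" and any(k in text.lower() for k in KEYWORDS):
            observations.append({"project": "openclaw", "text": text})
    return observations

def insert_observations(db_path: str, observations: list[dict]) -> None:
    """Append observations to a SQLite table (toy schema, not claude-mem's)."""
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS observations (project TEXT, text TEXT)")
    con.executemany(
        "INSERT INTO observations (project, text) VALUES (:project, :text)",
        observations,
    )
    con.commit()
    con.close()
```

In the real pipeline, cron runs this every 30 minutes over new transcription files; deduplication against already-imported lines is omitted here.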
Numbers:
- 1,493 observations in 33 days
- 46 MB of vector embeddings
- 4 projects tracked (workspace, openclaw, root, tmux-executor)
Key decision: We chose claude-mem over OpenClaw's built-in memory-core because claude-mem had 1,200+ observations vs memory-core's empty database (Gemini embeddings were failing silently). OpenClaw's memory slot is exclusive — only one memory plugin can run at a time.
**Layer 2 — Conscious (workspace markdown).** What: curated, structured files that both OpenClaw and Claude Code read at boot. This is the shared brain.
| File | Purpose | Size |
|---|---|---|
| `MEMORY.md` | Index — points to everything else | 85 lines |
| `memory/decisions.md` | Permanent decisions — never revisit | 42 KB |
| `memory/lessons.md` | Lessons learned (strategic + tactical) | 89 KB |
| `memory/pending.md` | Open items waiting for action | 7 KB |
| `memory/wip.md` | Work in progress — where we stopped | ~1 KB |
| `memory/people.md` | Who is who | 2 KB |
| `memory/projects.md` | Project status | varies |
| `memory/YYYY-MM-DD.md` | Daily logs (39 files) | varies |
| `feedback/approved.json` | Patterns to repeat | 4 KB |
| `feedback/rejected.json` | Patterns to never repeat | 2 KB |
Key insight: These files sit on disk in the OpenClaw workspace. Both platforms read the same files. One source of truth, two consumers.
**Layer 3 — Persistent (auto-memory).** What: Claude Code's native memory system — 16 markdown files with YAML frontmatter that persist across conversations.
| Type | Count | Examples |
|---|---|---|
| user | 2 | Agent identity, user profile |
| feedback | 6 | "crons use bash not LLM", "backtest before deploy" |
| project | 4 | Credit scoring v3, CRM portal status |
| reference | 3 | Infrastructure ports, fiscal structure |
Key insight: This layer stores how to behave, not what happened. It tells future sessions things like "this user has ADHD — keep responses short" or "never mock the database in tests."
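As an illustration only, one of those feedback files might look like this; the frontmatter field names and date are invented for the example, not Claude Code's documented schema:

```markdown
---
type: feedback
created: 2026-03-20
---
Crons use bash, not LLM: 19 of 20 jobs are deterministic.
```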
**Auto-precompact.** The most important automation. It runs daily via `claude --print` (Claude Sonnet, subscription — zero extra cost), with Gemini Flash as fallback.
What it does:
- Reads last 12 hours of observations from claude-mem (both Claude Code AND OpenClaw sessions)
- Reads today's daily log and current state of decisions/lessons/pending
- Asks Sonnet to extract: decisions, lessons, resolved items, new pending items, work in progress
- Writes results to the appropriate files with deduplication
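The invocation step can be sketched as follows. The fallback structure is an assumption and the real script's prompt construction is elided; `cmd` is parameterized only so the sketch is testable:

```python
import subprocess

def gemini_fallback(prompt: str) -> str:
    # Placeholder for the Gemini Flash fallback path (not shown here).
    return ""

def run_precompact(prompt: str, cmd=("claude", "--print")) -> str:
    """Run extraction through the official Claude CLI; on any failure
    (non-zero exit, missing binary, timeout), fall back to Gemini."""
    try:
        result = subprocess.run(
            [*cmd, prompt],
            capture_output=True, text=True, timeout=300, check=True,
        )
        return result.stdout
    except (subprocess.CalledProcessError, FileNotFoundError,
            subprocess.TimeoutExpired):
        return gemini_fallback(prompt)
```

With `check=True`, any non-zero exit raises and routes straight to the fallback, so a flaky CLI never silently drops a consolidation run.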
What it extracts:
| Field | Max per run | Filter |
|---|---|---|
| Decisions | 2 | Must be permanent ("always do X"). Most sessions have 0. |
| Lessons | 3 | Must be an error that cost time. Not things that worked. |
| Pending new | 3 | Must not exist already. Key-phrase dedup. |
| Pending resolved | unlimited | Exact text match against existing items. |
| Work in progress | 3 | Tasks started but not finished. Preserves continuity. |
| Feedback | unlimited | Only explicit user approval. |
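The dedup rules in the table can be approximated in a few lines. This is a hedged sketch: the first-four-words key phrase in `dedupe_new_pending` is an assumption, not the script's actual heuristic:

```python
def dedupe_new_pending(existing: list[str], candidates: list[str],
                       max_new: int = 3) -> list[str]:
    """Key-phrase dedup for new pending items: skip a candidate whose opening
    words already appear in an existing (or just-added) item; cap at max_new."""
    added = []
    for cand in candidates:
        key = " ".join(cand.lower().split()[:4])
        if any(key in item.lower() for item in existing + added):
            continue
        added.append(cand)
        if len(added) == max_new:
            break
    return existing + added

def resolve_pending(pending: list[str], resolved: list[str]) -> list[str]:
    """Resolved items are removed only on exact text match (unlimited per run)."""
    resolved_set = set(resolved)
    return [item for item in pending if item not in resolved_set]
```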
Why not the OpenClaw gateway? Anthropic blocked OAuth-based API proxying. claude --print is the official CLI, uses the subscription directly, and produces better results (Sonnet > Haiku).
Tactical lessons (marked with ⏳) expire after 30 days. Strategic lessons (marked with 🔒) are permanent. The pruner also removes tactical lessons that duplicate strategic ones (60% word overlap threshold).
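A plausible reading of that 60% rule, sketched under the assumption that "word overlap" means shared words relative to the shorter lesson:

```python
def word_overlap(a: str, b: str) -> float:
    """Shared-word fraction relative to the shorter of the two lessons."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / min(len(wa), len(wb))

def prune_tactical(tactical: list[str], strategic: list[str],
                   threshold: float = 0.6) -> list[str]:
    """Keep only tactical lessons that don't duplicate a strategic one."""
    return [t for t in tactical
            if all(word_overlap(t, s) < threshold for s in strategic)]
```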
**The bridge (openclaw_mem_bridge.py).** Imports OpenClaw Telegram sessions into claude-mem. Heuristic extraction from JSONL — no LLM tokens consumed.
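Wiring both automations together is plain cron. A sketch of the two entries, with placeholder paths (the schedules come from the text: every 30 minutes for the bridge, daily at 23h for the precompact):

```
# Import OpenClaw JSONL transcriptions into claude-mem every 30 minutes
*/30 * * * * python3 /path/to/openclaw_mem_bridge.py
# Daily consolidation pass at 23:00
0 23 * * * /path/to/auto_precompact.sh
```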
**Boot sequence.** Every session starts by reading (in order):
1. `SOUL.md` — agent identity and values
2. `AGENTS.md` — operational rules
3. `MEMORY.md` — index of everything
4. `memory/YYYY-MM-DD.md` — today's log
5. `memory/pending.md` — open items
6. `memory/wip.md` — where we left off
7. `feedback/approved.json` — patterns to repeat
8. `feedback/rejected.json` — patterns to avoid
Token budget: ~8-10K tokens. Down from ~20K after pruning historical deliveries out of MEMORY.md.
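A loader for that sequence is a few lines. The helper below is illustrative: the `boot_context` name and the skip-missing-files behavior are assumptions, not the actual boot code:

```python
from pathlib import Path

# Boot files in read order; today's daily log is spliced in at position 4.
BOOT_FILES = ["SOUL.md", "AGENTS.md", "MEMORY.md",
              "memory/pending.md", "memory/wip.md",
              "feedback/approved.json", "feedback/rejected.json"]

def boot_context(workspace: str, today: str) -> str:
    """Concatenate the boot files in order, skipping any that don't exist
    yet (e.g. today's log on the first session of the day)."""
    names = BOOT_FILES[:3] + [f"memory/{today}.md"] + BOOT_FILES[3:]
    parts = []
    for name in names:
        path = Path(workspace) / name
        if path.exists():
            parts.append(f"## {name}\n{path.read_text()}")
    return "\n\n".join(parts)
```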
The insight that changed our architecture came from comparing Karpathy's LLM Wiki with OriginMind's Creative DNA critique.
Karpathy says: good query answers should be filed back into the wiki. OriginMind says: systems should preserve momentum, not just artifacts.
Both point to the same gap: what happens between sessions?
Our system captured what was decided and what was learned, but not what was in progress. Every new session started from zero conceptual state.
The fix: wip.md — a file that the auto-precompact overwrites each run with whatever was being worked on but not finished. The next session reads it at boot and knows exactly where to pick up.
# Work in Progress
*Updated: 2026-04-08 18:31 UTC — generated by auto-precompact*
## Agent Cris — integration tests post-boot
- **Status:** Systemd service running (PID 850282, port 8001), test list was cut before completion
- **Next step:** Test CV reception via Evolution API webhook and validate full screening flow
- **Context:** /etc/systemd/system/rh-agente-cris.service, http://0.0.0.0:8001

| Idea | Why we rejected it |
|---|---|
| Obsidian as UI | Our primary consumer is the LLM, not a human browsing. No one clicks wiki links. |
| Cross-referencing between files | Chroma does semantic cross-referencing on demand. Explicit [[links]] in markdown would be written but never read. |
| Chroma in boot | Boot runs before knowing what the user wants. Injecting random observations adds noise, not value. Chroma serves on-demand queries. |
| Entity pages (one per person/system) | Doesn't scale for an operation with 130 employees. One people.md file is enough. |
| LLM-powered dreaming/consolidation | OpenClaw's built-in dreaming ran 3 times, promoted 0 insights. Auto-precompact with Sonnet does better extraction. Dreaming delegated back to OpenClaw. |
| Flashcards (SM-2) | Built 453 cards. Nobody reviewed them. The knowledge already exists in decisions.md and lessons.md. Archived. |
| Decision | Why |
|---|---|
| `claude --print` over API keys | Official client, uses subscription, zero extra cost, better model (Sonnet). |
| Bash wrappers over LLM for crons | 19 of 20 crons are deterministic. Zero tokens, guaranteed execution. Only precompact needs LLM. |
| Single pending.md | Had two files diverging silently. One source of truth, no symlinks. |
| Tactical lessons expire | lessons.md was 89 KB and growing. Tactical items (⏳) auto-prune after 30 days. Strategic items (🔒) are permanent. |
| Include OpenClaw observations | The user makes decisions in Telegram too. Filtering them out meant losing context. |
| Metric | Value |
|---|---|
| Total observations | 1,493 |
| Permanent decisions | 39 |
| Lessons learned | 64 (39 strategic + 25 tactical) |
| Daily logs | 39 files |
| Auto-memory files | 16 |
| Vector embeddings | 46 MB |
| Boot tokens | ~8-10K |
| Crons running | 20 (1 uses LLM) |
| Services running | 7 |
| Cost of memory system | $0 (subscription + Gemini free tier) |
- Monitor WIP quality — new feature, needs validation over 5-10 sessions
- Lint for contradictions — pending.md says "CRM 0% execution" while projects.md says "CRM operational"
- Query → file back — when the agent synthesizes a good answer, save it as a wiki page (Karpathy's insight)
The architecture is platform-agnostic. You need:
- A capture layer — claude-mem, or any system that records what happens in sessions
- A consolidation process — our auto-precompact runs daily, uses `claude --print` for extraction
- Structured markdown files — decisions, lessons, pending, WIP
- A boot sequence — CLAUDE.md or equivalent that tells the agent what to read on startup
- A feedback loop — approved.json/rejected.json so the system learns from corrections
The key principle: the LLM is the primary consumer of memory, not the human. Design for machine reading (structured, deduplicated, minimal), not human browsing (interlinked, visual, explorable).
Built with OpenClaw + Claude Code + claude-mem. Running in production since March 7, 2026.