openclaw-workspace/docs/MEMORY_SYSTEM_ARCHITECTURE.md
2026-04-11 09:45:12 -05:00

# Memory System Architecture
*Diagram of how information flows and persists in the OpenClaw system*

---
## Overview
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ INFORMATION FLOW │
└─────────────────────────────────────────────────────────────────────────────┘
User Conversation
┌─────────────────────┐ ┌─────────────────────┐ ┌─────────────────────┐
│ ME (Main Agent) │────▶│ Memory Worker │────▶│ SQLite Database │
│ (Real-time) │ │ (Daily 3 AM) │ │ (Structured) │
└─────────────────────┘ └─────────────────────┘ └─────────────────────┘
│ │ │
│ │ │
▼ ▼ ▼
┌─────────────────────┐ ┌─────────────────────┐ ┌─────────────────────┐
│ Daily Notes │ │ Query Interface │ │ Stats/Search │
│ (memory/*.md) │◄────│ (On Demand) │◄────│ (SQL/FTS) │
└─────────────────────┘ └─────────────────────┘ └─────────────────────┘
┌─────────────────────┐
│ MEMORY.md │
│ (Curated) │
└─────────────────────┘
┌─────────────────────┐
│ Supermemory.ai │
│ (Cloud Backup) │
└─────────────────────┘
```
---
## Storage Layers (By Speed & Persistence)
### 1. ⚡ Session RAM (Transient)
| Aspect | Details |
|--------|---------|
| **What** | Current conversation context, tool outputs, working memory |
| **Writes** | Every message I process |
| **When** | Real-time during conversation |
| **Stored** | Until session ends or compaction (30 min - 4 hours) |
| **Survives** | ❌ Session crash ❌ Gateway restart |
| **Size** | ~100K-250K tokens |

**Risk:** Compaction clears this layer. The "danger zone" is the window between the last tool use and compaction, when anything not yet written to a file can be lost.

---
### 2. 📝 Daily Notes (Short-term)
| Aspect | Details |
|--------|---------|
| **What** | Raw daily activity, decisions, tasks, errors |
| **Writes** | Pre-compaction flush (automatic) + manual captures |
| **When** | End of productive sessions, before `/compact` |
| **Stored** | `memory/YYYY-MM-DD.md` |
| **Survives** | ✅ Session crash ✅ Gateway restart |
| **Retention** | ~30-90 days (manually reviewed) |
| **Format** | Free-form markdown |

**Written by:** Me (main agent), during a heartbeat or the end-of-day (EOD) ritual

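
A minimal sketch of a manual capture into the daily note, assuming entries are appended as timestamped bullets. The function name `append_daily_note` and the bullet format are hypothetical; only the `memory/YYYY-MM-DD.md` path comes from the table above.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def append_daily_note(workspace: Path, text: str, when: Optional[datetime] = None) -> Path:
    """Append one timestamped bullet to memory/YYYY-MM-DD.md and return its path."""
    when = when or datetime.now()
    note_path = workspace / "memory" / f"{when:%Y-%m-%d}.md"
    note_path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode keeps earlier entries from the same day intact.
    with note_path.open("a", encoding="utf-8") as fh:
        fh.write(f"- {when:%H:%M} {text}\n")
    return note_path
```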
---
### 3. 🧠 MEMORY.md (Long-term)
| Aspect | Details |
|--------|---------|
| **What** | Curated important info, distilled from daily notes |
| **Writes** | Manual review of daily notes, during heartbeats |
| **When** | Every few days, or when something critical happens |
| **Stored** | `MEMORY.md` in workspace root |
| **Survives** | ✅ Everything (file-based) |
| **Retention** | Permanent (manual curation) |
| **Format** | Human-readable markdown |

**Written by:** Me, after reviewing daily notes

---
### 4. 📊 SQLite Database (Structured)
| Aspect | Details |
|--------|---------|
| **What** | Structured: tasks, decisions, facts, projects with salience |
| **Writes** | Memory Worker (automated daily extraction) |
| **When** | Daily 3:00 AM (cron job) |
| **Stored** | `~/.openclaw/memory.db` |
| **Survives** | ✅ File-based |
| **Retention** | Permanent (until manual deletion) |
| **Format** | Relational: cells, scenes, FTS index |

**Written by:** Memory Worker agent (spawned via cron)

**Schema:**
```sql
memory_cells: id, scene, cell_type, salience, content, source_file, created_at
scenes: scene, summary, item_count, updated_at
memory_fts: full-text search index
```
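
The schema listing above names columns but not types. The sketch below shows one way it could be created and searched with Python's built-in `sqlite3` module. The column types, the defaults, the use of FTS5 for `memory_fts`, and the helper names (`open_memory_db`, `add_cell`, `search_cells`) are all assumptions for illustration, not the Memory Worker's actual code. Search hits are ordered by `salience` so the most important cells come first.

```python
import sqlite3

# Column names come from the schema listing; the types, defaults, and the
# choice of FTS5 are assumptions made for this sketch.
SCHEMA = """
CREATE TABLE IF NOT EXISTS memory_cells (
    id INTEGER PRIMARY KEY,
    scene TEXT,
    cell_type TEXT,
    salience REAL,
    content TEXT,
    source_file TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS scenes (
    scene TEXT PRIMARY KEY,
    summary TEXT,
    item_count INTEGER DEFAULT 0,
    updated_at TEXT
);
CREATE VIRTUAL TABLE IF NOT EXISTS memory_fts USING fts5(content);
"""


def open_memory_db(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_cell(conn, scene, cell_type, salience, content, source_file):
    """Insert one memory cell and index its content for full-text search."""
    cur = conn.execute(
        "INSERT INTO memory_cells (scene, cell_type, salience, content, source_file)"
        " VALUES (?, ?, ?, ?, ?)",
        (scene, cell_type, salience, content, source_file),
    )
    # Reuse the cell id as the FTS rowid so search hits can be joined back.
    conn.execute(
        "INSERT INTO memory_fts (rowid, content) VALUES (?, ?)",
        (cur.lastrowid, content),
    )
    conn.commit()
    return cur.lastrowid


def search_cells(conn, query):
    """Return (id, cell_type, content) rows matching query, most salient first."""
    return conn.execute(
        "SELECT c.id, c.cell_type, c.content"
        " FROM memory_fts JOIN memory_cells AS c ON c.id = memory_fts.rowid"
        " WHERE memory_fts MATCH ? ORDER BY c.salience DESC",
        (query,),
    ).fetchall()
```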
---
### 5. 🌐 Supermemory.ai (Cloud)
| Aspect | Details |
|--------|---------|
| **What** | Full backup of all memory files |
| **Writes** | Supermemory Backup job (automated) |
| **When** | Daily 2:00 AM |
| **Stored** | Supermemory.ai cloud service |
| **Survives** | ✅ Disk failure ✅ Workspace loss |
| **Retention** | Cloud provider dependent |
| **Format** | API-uploaded documents |

**Written by:** Python script via cron job

---
### 6. 📋 Workspace Context (Session Bridge)
| Aspect | Details |
|--------|---------|
| **What** | Current conversation, in-progress work, items finished today |
| **Writes** | Real-time during session |
| **When** | Continuously updated |
| **Stored** | `workspace-context.md` |
| **Survives** | ✅ Session crash ✅ Channel switch |
| **Retention** | Cleared nightly (~11 PM) |
| **Format** | Structured markdown |

**Special:** Persists across channel switches and session crashes. Cleared nightly.

---
## Retention Summary
| Layer | Retention | Cleared When | Backup |
|-------|-----------|--------------|--------|
| Session RAM | Minutes-hours | Compaction | ❌ |
| Workspace Context | ~24 hours | 11 PM nightly | ❌ |
| Daily Notes | 30-90 days | Manual archive | Supermemory |
| MEMORY.md | Permanent | Manual edit | Supermemory |
| SQLite DB | Permanent | Manual delete | ❌ (local only) |
| Supermemory | Cloud provider dependent | Per provider policy | N/A (is backup) |
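
Daily notes are the only layer with a soft retention window (30-90 days, archived manually). A small helper like the one below could list the notes that have aged past that window and are due for review. The function name `notes_due_for_review` and the 90-day default are assumptions for this sketch; it only lists files and never deletes anything.

```python
from datetime import date, timedelta
from pathlib import Path
from typing import List


def notes_due_for_review(memory_dir: Path, today: date, max_age_days: int = 90) -> List[Path]:
    """Return daily notes older than max_age_days, oldest first."""
    cutoff = today - timedelta(days=max_age_days)
    due = []
    for path in sorted(memory_dir.glob("*.md")):
        try:
            note_date = date.fromisoformat(path.stem)
        except ValueError:
            continue  # skip files whose name is not a date
        if note_date < cutoff:
            due.append(path)
    return due
```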
---
## Write Triggers
```
Every Message
    ├─► Session RAM (immediate)
    └─► If important ──► workspace-context.md

Pre-compaction ────────► memory/YYYY-MM-DD.md
Periodic review ───────► MEMORY.md
Daily 2 AM ────────────► Supermemory.ai
Daily 3 AM ────────────► SQLite Database
```
---
## Access Patterns
### My (Main Agent) Access:
| Source | When | Purpose |
|--------|------|---------|
| MEMORY.md | Every session startup | Core identity, user prefs, important facts |
| USER.md | Every session startup | Who Corey is |
| SOUL.md | Every session startup | How I should behave |
| workspace-context.md | Every session startup | Current conversation state |
| memory/*.md | During heartbeats | Recent context |
| SQLite DB | On demand | Structured queries ("what tasks pending?") |
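
The four files read at every session startup could be loaded with something like the sketch below. The function name `load_startup_context` and the choice to treat a missing file as empty text are assumptions; the file names come from the table above.

```python
from pathlib import Path
from typing import Dict

# Files the table lists as read at every session startup.
STARTUP_FILES = ["MEMORY.md", "USER.md", "SOUL.md", "workspace-context.md"]


def load_startup_context(workspace: Path) -> Dict[str, str]:
    """Read each startup file from the workspace root, keyed by file name."""
    context = {}
    for name in STARTUP_FILES:
        path = workspace / name
        # A missing file becomes an empty string so startup does not fail,
        # for example right after workspace-context.md is cleared.
        context[name] = path.read_text(encoding="utf-8") if path.exists() else ""
    return context
```

Returning a dictionary keyed by file name keeps the sources distinguishable, so the caller can decide how much of each file to place in the session context.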
### Memory Worker Access:
| Source | When | Purpose |
|--------|------|---------|
| IDENTITY.md | Daily 3 AM | Who it is |
| SOUL.md | Daily 3 AM | Its mission |
| HEARTBEAT.md | Daily 3 AM | What to do (the script) |
| memory/YYYY-MM-DD.md | Daily 3 AM | What to extract |
| SQLite DB | Daily 3 AM | Where to write |
---
## Failure Recovery
### Scenario: Session Crash
- ✅ **Survives:** Files (MEMORY.md, daily notes, workspace-context)
- ❌ **Lost:** Session RAM (compaction would have cleared it anyway)
- 🔄 **Recovery:** Read files on restart, reconstruct context
### Scenario: Gateway Restart
- ✅ **Survives:** All files, SQLite DB
- ❌ **Lost:** Session state, cron job state (must recreate jobs)
- 🔄 **Recovery:** Restart the gateway, recreate the cron jobs, and verify they are running
### Scenario: Disk Failure
- ✅ **Survives:** Supermemory.ai (cloud backup)
- ❌ **Lost:** Local files, SQLite DB
- 🔄 **Recovery:** Restore from Supermemory, recreate DB (re-extract from notes)
---
## Key Insights
1. **Text > Brain:** Files persist; my session doesn't.
2. **Daily notes = raw, MEMORY.md = curated:** Separate signal from noise.
3. **Worker = automated structuring:** I don't have to organize everything manually.
4. **Hybrid = best of both:** Human-readable and machine-queryable.
5. **Multiple backups:** Local files + cloud (Supermemory) + structured DB.
---
*Generated: 2026-02-16*
*System Version: Multi-agent with SQLite extraction*

---
## UPDATE: 2026-03-03 — Supermonkey Memory Vector System
### What's New
**Added Real-Time Session Monitoring:**
- **session_monitor.py** runs every 2 minutes
- Reads OpenClaw's own session transcripts (`.jsonl`)
- Auto-captures conversations every 15 messages
- Stores in SQLite with vector embeddings

**Replaced Supermemory:**
- Old: Supermemory.ai cloud API (flaky, rate-limited)
- New: Local SQLite + sqlite-vector + Ollama embeddings
- Result: Works offline, faster searches, no rate limits

**Database Schema Changes:**
| New Table | Purpose |
|-----------|---------|
| `memory_embeddings` | 768-dim vectors (nomic-embed-text) |
| `session_tracking` | Tracks session checkpoints |
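
To illustrate what storing and searching embeddings in SQLite involves, here is a minimal brute-force sketch using only the standard library. It is a stand-in, not the sqlite-vector API, which this document does not show. The column layout of `memory_embeddings` and the helper names are assumptions, and the usage would pass 768-dimensional vectors from nomic-embed-text rather than toy ones.

```python
import math
import sqlite3
from array import array

# Assumed layout; the document names the table but not its columns.
CREATE_EMBEDDINGS = (
    "CREATE TABLE IF NOT EXISTS memory_embeddings "
    "(id INTEGER PRIMARY KEY, content TEXT, embedding BLOB)"
)


def pack_vector(vec):
    """Serialize a list of floats as a float32 blob."""
    return array("f", vec).tobytes()


def unpack_vector(blob):
    """Inverse of pack_vector."""
    out = array("f")
    out.frombytes(blob)
    return list(out)


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def store_embedding(conn, content, vec):
    """Insert one text snippet with its embedding and return the row id."""
    cur = conn.execute(
        "INSERT INTO memory_embeddings (content, embedding) VALUES (?, ?)",
        (content, pack_vector(vec)),
    )
    return cur.lastrowid


def nearest(conn, query_vec, k=3):
    """Scan every row and return the k best (score, id, content) tuples."""
    rows = conn.execute("SELECT id, content, embedding FROM memory_embeddings")
    scored = [
        (cosine(query_vec, unpack_vector(blob)), row_id, content)
        for row_id, content, blob in rows
    ]
    scored.sort(reverse=True)
    return scored[:k]
```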

**New Components:**
| Component | Purpose |
|-----------|---------|
| `memory_vector.py` | SQLite-vector wrapper |
| `session_monitor.py` | Auto-capture via transcript reading |
| `tools/search_memories.py` | CLI semantic search |

**Current Stats:**
- Total embeddings: 1,623
- Auto session snapshots: 27
- Tracked sessions: 245

**The Innovation:**
We discovered OpenClaw stores session transcripts in `.openclaw/agents/main/sessions/*.jsonl`. Instead of waiting for message hooks (which don't exist), we read these files directly:
1. Parse JSONL for `role: "user"` messages
2. Track line position with checkpoints
3. At 15 messages: summarize → embed → store
4. Survives restarts, no API limits
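
The steps above can be sketched as follows. This is not the real `session_monitor.py`: the function names are hypothetical, each JSONL record is assumed to carry a top-level `role` and `content` key, and the summarize → embed → store stage is represented by a caller-supplied `snapshot` callback. The checkpoint only advances when a snapshot is taken, so persisting that single line number (for example in `session_tracking`) is enough to resume after a restart.

```python
import json
from pathlib import Path

CAPTURE_EVERY = 15  # user messages per snapshot, as described above


def read_new_user_messages(transcript: Path, lines_done: int):
    """Return (user_messages, lines_consumed) for lines after the checkpoint."""
    messages = []
    consumed = lines_done
    with transcript.open(encoding="utf-8") as fh:
        for index, line in enumerate(fh):
            if index < lines_done:
                continue
            if not line.endswith("\n"):
                break  # last line is still being written; retry on the next run
            consumed = index + 1
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines
            if isinstance(record, dict) and record.get("role") == "user":
                messages.append(record.get("content", ""))
    return messages, consumed


def monitor_step(transcript: Path, checkpoint: int, snapshot) -> int:
    """Run one polling pass and return the checkpoint to persist."""
    messages, consumed = read_new_user_messages(transcript, checkpoint)
    if len(messages) < CAPTURE_EVERY:
        return checkpoint  # not enough yet; re-read from the same place later
    snapshot(messages)  # summarize -> embed -> store would happen here
    return consumed
```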

**Full Documentation:** `memory/projects/supermonkey-memory-system.md`

---
*Updated: 2026-03-03*
*System Version: Supermonkey — Local vectors, no cloud dependency*