# Supermonkey Memory System

**Status:** ✅ Production
**Created:** 2026-03-02
**Location:** `~/.openclaw/memory.db`


---

## Overview

Local semantic memory search using SQLite + Ollama embeddings. Replaces the flaky Supermemory cloud API.

**Why "Supermonkey"?**
- Works offline (like a monkey with a typewriter)
- No cloud dependency
- Just keeps going


---

## Architecture

### File-Based Pipeline (Daily)

```
Memory Files (markdown)
↓
memory_embedding_worker.py
↓
Ollama (nomic-embed-text) → 768-dim vectors
↓
SQLite + sqlite-vector extension
↓
Cosine similarity search
```

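The pipeline above can be sketched in plain Python. This is a minimal illustration, not the code in `memory_embedding_worker.py` or `memory_vector.py`: it assumes Ollama is listening on its default local port and is reached through its `/api/embeddings` endpoint, that vectors are stored as little-endian float32 BLOBs, and it computes cosine similarity in Python instead of through the sqlite-vector extension. The helper names (`embed`, `to_blob`, `from_blob`, `top_k`) are hypothetical.

```python
import json
import math
import struct
import urllib.request

# Assumption: Ollama's default local address; adjust if the server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/embeddings"
MODEL = "nomic-embed-text"


def embed(text):
    """Ask a locally running Ollama server for an embedding (768 floats for this model)."""
    payload = json.dumps({"model": MODEL, "prompt": text}).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())["embedding"]


def to_blob(vector):
    """Pack a vector as little-endian float32 bytes (assumed storage format)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def from_blob(blob):
    """Inverse of to_blob: unpack float32 bytes back into a list of floats."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, rows, k=5):
    """Rank (source_path, embedding_blob) rows against the query, best first."""
    scored = [
        (cosine_similarity(query_vector, from_blob(blob)), path)
        for path, blob in rows
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

With Ollama running, a search would look roughly like `top_k(embed("home assistant automation"), rows)`, where `rows` comes from a `SELECT source_path, embedding FROM memory_embeddings` query. A brute-force scan like this is only reasonable at small scale; the real system delegates the search to sqlite-vector.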
### Real-Time Session Pipeline (Live)

```
Discord/Chat Messages
↓
OpenClaw Session Transcript (.jsonl)
↓
session_monitor.py (cron every 2 min)
↓
Count messages → At 15: summarize → embed → store
↓
Ollama (nomic-embed-text)
↓
SQLite + sqlite-vector
```

**The Innovation:** Read OpenClaw's own session transcripts to auto-capture conversations without manual tracking or hooks!

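The counting step of the live pipeline might look like the sketch below. It is an illustration under assumptions, not the code in `session_monitor.py`: it treats every non-empty line of the `.jsonl` transcript as one message, and the function name `check_session` and its return shape are hypothetical. The two counters mirror the `last_message_index` and `messages_since_snapshot` columns of `session_tracking`; the summarize, embed, and store steps are left out.

```python
import json
from pathlib import Path

# From the pipeline diagram: take a snapshot once 15 new messages have accumulated.
SNAPSHOT_THRESHOLD = 15


def check_session(transcript_path, last_message_index, messages_since_snapshot):
    """Poll one transcript and decide whether a snapshot is due.

    Assumption: each non-empty line of the .jsonl file is one JSON message.
    Returns (new_last_message_index, new_messages_since_snapshot, batch), where
    batch is the list of messages to summarize, or [] if below the threshold.
    """
    text = Path(transcript_path).read_text(encoding="utf-8")
    lines = [line for line in text.splitlines() if line.strip()]

    # Messages that arrived since the last poll.
    new_count = max(len(lines) - last_message_index, 0)
    pending = messages_since_snapshot + new_count

    batch = []
    if pending >= SNAPSHOT_THRESHOLD:
        # Everything since the previous snapshot goes into this one.
        batch = [json.loads(line) for line in lines[-pending:]]
        pending = 0

    return len(lines), pending, batch
```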

---

## Components

| File | Purpose |
|------|---------|
| `memory_vector.py` | Core SQLite-vector wrapper |
| `memory_embedding_worker.py` | Daily memory file processor |
| `session_monitor.py` | Real-time transcript capture |
| `session_snapshotter.py` | Manual session capture |
| `search_memories.py` | CLI search tool |
| `bulk_memory_loader.py` | One-time historical import |


---

## Quick Commands

```powershell
# Search memories
python tools/search_memories.py "home assistant automation"

# Check stats
python -c "import sqlite3; db=sqlite3.connect(r'C:\Users\admin\.openclaw\memory.db'); c=db.cursor(); c.execute('SELECT COUNT(*) FROM memory_embeddings'); print('Total:', c.fetchone()[0]); c.execute('SELECT COUNT(*) FROM memory_embeddings WHERE source_type=\'auto_session\''); print('Auto snapshots:', c.fetchone()[0]); db.close()"

# Run daily worker manually
python tools/memory_embedding_worker.py --date 2026-03-03

# Run session monitor manually
python tools/session_monitor.py
```

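The stats one-liner above is hard to read and only reports one source type. A possible alternative is a small helper that counts rows per `source_type` in a single pass. The function name `embedding_stats` is hypothetical and not part of the tools listed in this document; it only assumes the `memory_embeddings` table and `source_type` column described in the schema. It could be called as `embedding_stats(Path.home() / ".openclaw" / "memory.db")`.

```python
import sqlite3


def embedding_stats(db_path):
    """Return (total_rows, {source_type: row_count}) for memory_embeddings."""
    db = sqlite3.connect(db_path)
    try:
        total = db.execute("SELECT COUNT(*) FROM memory_embeddings").fetchone()[0]
        by_type = dict(
            db.execute(
                "SELECT source_type, COUNT(*) FROM memory_embeddings "
                "GROUP BY source_type"
            ).fetchall()
        )
    finally:
        # Close promptly so the database is not left locked for other tools.
        db.close()
    return total, by_type
```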

---

## Current Stats

| Metric | Value |
|--------|-------|
| Total embeddings | ~1,623 |
| Daily notes processed | 818 |
| Project files | 332 |
| MEMORY.md sections | 33 |
| Manual session snapshots | 2 |
| **Auto session snapshots** | **27** |
| Tracked sessions | 245 |
| Active sessions | 243 |
| Database size | ~5 MB |


---

## Database Schema

### memory_embeddings

| Column | Type | Description |
|--------|------|-------------|
| id | INTEGER | Primary key |
| source_type | TEXT | daily, memory_md, project, auto_session |
| source_path | TEXT | File path + section |
| content_text | TEXT | First 500 chars |
| embedding | BLOB | 768-dim vector |
| created_at | TIMESTAMP | Auto-set |

### session_tracking

| Column | Type | Description |
|--------|------|-------------|
| session_id | TEXT | OpenClaw UUID |
| transcript_path | TEXT | Path to .jsonl |
| last_message_index | INTEGER | Checkpoint |
| messages_since_snapshot | INTEGER | Counter |
| is_active | BOOLEAN | Active? |

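The two tables above could be created with DDL along these lines. This is a sketch reconstructed from the column lists, not the project's actual schema: the `NOT NULL` constraint, the defaults, and the choice of `session_id` as primary key are assumptions, and the sqlite-vector extension setup is omitted. The helpers `open_memory_db` and `store_memory` are hypothetical; `store_memory` truncates the text to 500 characters to match the `content_text` description.

```python
import sqlite3

# Column names and types follow the tables above; constraints and defaults are assumed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS memory_embeddings (
    id           INTEGER PRIMARY KEY,
    source_type  TEXT NOT NULL,
    source_path  TEXT,
    content_text TEXT,
    embedding    BLOB,
    created_at   TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS session_tracking (
    session_id              TEXT PRIMARY KEY,
    transcript_path         TEXT,
    last_message_index      INTEGER DEFAULT 0,
    messages_since_snapshot INTEGER DEFAULT 0,
    is_active               BOOLEAN DEFAULT 1
);
"""


def open_memory_db(path=":memory:"):
    """Open (or create) a database and make sure both tables exist."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def store_memory(db, source_type, source_path, text, embedding_blob):
    """Insert one memory row, keeping only the first 500 characters of text."""
    cursor = db.execute(
        "INSERT INTO memory_embeddings "
        "(source_type, source_path, content_text, embedding) VALUES (?, ?, ?, ?)",
        (source_type, source_path, text[:500], embedding_blob),
    )
    db.commit()
    return cursor.lastrowid
```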

---

## Cron Schedule

| Job | Schedule | Purpose |
|-----|----------|---------|
| Memory Embeddings Daily | 3:00 AM | Process yesterday's memory files |
| Session Monitor | Every 2 min | Auto-snapshot live conversations |


---

## Troubleshooting

**Ollama not running:**
```powershell
ollama serve
```

**Database locked:**
Close DB Browser for SQLite (or any other program holding `memory.db` open), then retry.

**Unicode errors in cron:**
Already addressed: all emojis in the scripts were replaced with ASCII-safe markers.


---

## Future Enhancements

- [ ] Keyword filtering alongside vector search
- [ ] Date range queries
- [ ] Source type filtering
- [ ] Embedding quality scoring


---

**Credit:** Corey's genius idea to read session.json files 💡
**System:** Operational and self-managing