vault backup: 2026-03-12 08:35:13
Projects/Memory System/Supermonkey Memory System.md
# Supermonkey Memory System

**Status:** ✅ Production
**Created:** 2026-03-02
**Location:** `~/.openclaw/memory.db`

---

## Overview

Local semantic memory search using SQLite + Ollama embeddings. Replaces the flaky Supermemory cloud API.

**Why "Supermonkey"?**
- Works offline (like a monkey with a typewriter)
- No cloud dependency
- Just keeps going

---

## Architecture

### File-Based Pipeline (Daily)
```
Memory Files (markdown)
↓
memory_embedding_worker.py
↓
Ollama (nomic-embed-text) → 768-dim vectors
↓
SQLite + sqlite-vector extension
↓
Cosine similarity search
```
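
The embedding and search steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code in `memory_embedding_worker.py`: it assumes Ollama is listening on its default local port (11434) and serves the `/api/embeddings` endpoint, and it computes cosine similarity in plain Python instead of going through the sqlite-vector extension. The names `embed_text`, `cosine_similarity`, and `top_matches` are hypothetical.

```python
import json
import math
import urllib.request

# Assumption: Ollama's default local address; adjust if yours differs.
OLLAMA_URL = "http://localhost:11434/api/embeddings"


def embed_text(text, model="nomic-embed-text"):
    """Ask a local Ollama server for the embedding of `text`."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())["embedding"]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(query_vector, stored, limit=5):
    """Rank (label, vector) pairs by similarity to the query, best first."""
    scored = [(cosine_similarity(query_vector, vec), label) for label, vec in stored]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]
```

With Ollama running, `top_matches(embed_text("home assistant automation"), stored)` would return the closest entries, where `stored` is a list of `(label, vector)` pairs loaded from the database.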

### Real-Time Session Pipeline (Live)
```
Discord/Chat Messages
↓
OpenClaw Session Transcript (.jsonl)
↓
session_monitor.py (cron every 2 min)
↓
Count messages → At 15: summarize → embed → store
↓
Ollama (nomic-embed-text)
↓
SQLite + sqlite-vector
```

**The Innovation:** Read OpenClaw's own session transcripts to auto-capture conversations without manual tracking or hooks!
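
The counting step in the diagram can be sketched as follows. This is an illustration under assumptions, not the logic of `session_monitor.py`: it assumes every non-empty JSON line of the `.jsonl` transcript counts as one message, and the names `SNAPSHOT_THRESHOLD`, `count_messages`, and `advance_checkpoint` are hypothetical. The two counters mirror the `last_message_index` and `messages_since_snapshot` columns of `session_tracking`; when the third return value is true, the caller would summarize, embed, and store.

```python
import json

SNAPSHOT_THRESHOLD = 15  # from the pipeline above: snapshot at 15 messages


def count_messages(transcript_path):
    """Count the JSON records in a .jsonl transcript, skipping blank or partial lines."""
    total = 0
    with open(transcript_path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                json.loads(line)
            except json.JSONDecodeError:
                continue  # e.g. a line that is still being written
            total += 1
    return total


def advance_checkpoint(total_messages, last_message_index, messages_since_snapshot):
    """Return (new_index, new_counter, should_snapshot) for one monitor run."""
    new_messages = max(0, total_messages - last_message_index)
    counter = messages_since_snapshot + new_messages
    if counter >= SNAPSHOT_THRESHOLD:
        # Threshold reached: reset the counter and signal a snapshot.
        return total_messages, 0, True
    return total_messages, counter, False
```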

---

## Components

| File | Purpose |
|------|---------|
| `memory_vector.py` | Core SQLite-vector wrapper |
| `memory_embedding_worker.py` | Daily memory file processor |
| `session_monitor.py` | Real-time transcript capture |
| `session_snapshotter.py` | Manual session capture |
| `search_memories.py` | CLI search tool |
| `bulk_memory_loader.py` | One-time historical import |

---

## Quick Commands

```powershell
# Search memories
python tools/search_memories.py "home assistant automation"

# Check stats
python -c "import sqlite3; db=sqlite3.connect(r'C:\Users\admin\.openclaw\memory.db'); c=db.cursor(); c.execute('SELECT COUNT(*) FROM memory_embeddings'); print('Total:', c.fetchone()[0]); c.execute('SELECT COUNT(*) FROM memory_embeddings WHERE source_type=\'auto_session\''); print('Auto snapshots:', c.fetchone()[0]); db.close()"

# Run daily worker manually
python tools/memory_embedding_worker.py --date 2026-03-03

# Run session monitor manually
python tools/session_monitor.py
```
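
The stats one-liner is hard to read and easy to misquote. The same two queries could be written as a small script along these lines. The table and column names come from the schema in this note; the function name `memory_stats` and the default path built from the home directory are assumptions. Calling `memory_stats()` and printing the two values would reproduce the one-liner's output.

```python
import sqlite3
from pathlib import Path

# Assumption: the database lives at the location given at the top of this note.
DEFAULT_DB = Path.home() / ".openclaw" / "memory.db"


def memory_stats(db_path=DEFAULT_DB):
    """Return the total embedding count and the number of auto session snapshots."""
    connection = sqlite3.connect(str(db_path))
    try:
        total = connection.execute(
            "SELECT COUNT(*) FROM memory_embeddings"
        ).fetchone()[0]
        auto = connection.execute(
            "SELECT COUNT(*) FROM memory_embeddings WHERE source_type = ?",
            ("auto_session",),
        ).fetchone()[0]
    finally:
        connection.close()
    return {"total": total, "auto_snapshots": auto}
```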

---

## Current Stats

| Metric | Value |
|--------|-------|
| Total embeddings | ~1,623 |
| Daily notes processed | 818 |
| Project files | 332 |
| MEMORY.md sections | 33 |
| Manual session snapshots | 2 |
| **Auto session snapshots** | **27** |
| Tracked sessions | 245 |
| Active sessions | 243 |
| Database size | ~5 MB |

---

## Database Schema

### memory_embeddings
| Column | Type | Description |
|--------|------|-------------|
| id | INTEGER | Primary key |
| source_type | TEXT | daily, memory_md, project, auto_session |
| source_path | TEXT | File path + section |
| content_text | TEXT | First 500 chars |
| embedding | BLOB | 768-dim vector |
| created_at | TIMESTAMP | Auto-set |

### session_tracking
| Column | Type | Description |
|--------|------|-------------|
| session_id | TEXT | OpenClaw UUID |
| transcript_path | TEXT | Path to .jsonl |
| last_message_index | INTEGER | Checkpoint |
| messages_since_snapshot | INTEGER | Counter |
| is_active | BOOLEAN | Active? |
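
The two tables above could be created with DDL along these lines. This is a sketch reconstructed from the column lists, not the statements the scripts actually run: the constraints (`NOT NULL`, the defaults, `session_id` as primary key) are assumptions, and the embedding is packed as little-endian float32 values, which may differ from the layout the sqlite-vector extension expects. The helper names are hypothetical.

```python
import sqlite3
import struct

EMBEDDING_DIM = 768  # nomic-embed-text vector length, per the table above

SCHEMA = """
CREATE TABLE IF NOT EXISTS memory_embeddings (
    id           INTEGER PRIMARY KEY,
    source_type  TEXT NOT NULL,      -- daily, memory_md, project, auto_session
    source_path  TEXT,               -- file path + section
    content_text TEXT,               -- first 500 chars
    embedding    BLOB,               -- 768-dim vector
    created_at   TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS session_tracking (
    session_id              TEXT PRIMARY KEY,   -- OpenClaw UUID
    transcript_path         TEXT,               -- path to .jsonl
    last_message_index      INTEGER DEFAULT 0,  -- checkpoint
    messages_since_snapshot INTEGER DEFAULT 0,  -- counter
    is_active               BOOLEAN DEFAULT 1
);
"""


def create_schema(connection):
    """Create both tables if they do not exist yet."""
    connection.executescript(SCHEMA)


def pack_vector(values):
    """Serialize floats to a float32 BLOB (assumed storage layout)."""
    return struct.pack(f"<{len(values)}f", *values)


def unpack_vector(blob):
    """Inverse of pack_vector: 4 bytes per float32 value."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def store_memory(connection, source_type, source_path, text, vector):
    """Insert one row, truncating the stored text to 500 characters."""
    connection.execute(
        "INSERT INTO memory_embeddings (source_type, source_path, content_text, embedding)"
        " VALUES (?, ?, ?, ?)",
        (source_type, source_path, text[:500], pack_vector(vector)),
    )
```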

---

## Cron Schedule

| Job | Schedule | Purpose |
|-----|----------|---------|
| Memory Embeddings Daily | 3:00 AM | Process yesterday's memory files |
| Session Monitor | Every 2 min | Auto-snapshot live conversations |
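
Because the daily job processes yesterday's files, the date it hands to the worker can be derived from the current date. A small sketch, assuming the job uses the same `--date YYYY-MM-DD` form as the manual command under Quick Commands; the function name `daily_worker_command` is hypothetical, and the returned list is in the form `subprocess.run` accepts.

```python
import sys
from datetime import date, timedelta


def daily_worker_command(today=None):
    """Build the worker command line for the day before `today`."""
    today = today or date.today()
    yesterday = today - timedelta(days=1)
    return [
        sys.executable,
        "tools/memory_embedding_worker.py",
        "--date",
        yesterday.isoformat(),
    ]
```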

---

## Troubleshooting

**Ollama not running:**
```powershell
ollama serve
```

**Database locked:**
Close DB Browser for SQLite if it has the database open.

**Unicode errors in cron:**
All emojis have been replaced with ASCII-safe markers.
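
One way to keep output ASCII-safe is to map known emojis to text markers and escape anything else that is not ASCII. This is a sketch of that idea, not the code used in the scripts: the marker strings, the `MARKERS` table, and the name `ascii_safe` are assumptions.

```python
# Hypothetical emoji-to-marker table; extend it with whatever the scripts print.
MARKERS = {
    "\u2705": "[OK]",        # check mark button
    "\u274c": "[FAIL]",      # cross mark
    "\U0001f4a1": "[IDEA]",  # light bulb
}


def ascii_safe(text):
    """Replace known emojis with markers and backslash-escape other non-ASCII characters."""
    for emoji, marker in MARKERS.items():
        text = text.replace(emoji, marker)
    return text.encode("ascii", errors="backslashreplace").decode("ascii")
```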

---

## Future Enhancements

- [ ] Keyword filtering alongside vector search
- [ ] Date range queries
- [ ] Source type filtering
- [ ] Embedding quality scoring

---

**Credit:** Corey's genius idea to read the session transcript (`.jsonl`) files 💡
**System:** Operational and self-managing