Add vault content - WORK folder, Tasks, Projects, Summaries, Templates

AlexAI
2026-02-24 09:46:07 -06:00
parent bc3284ad69
commit c402fb1161
41 changed files with 4299 additions and 0 deletions

@@ -0,0 +1,148 @@
# 7 Best Apify Alternatives (2026)
**Source:** https://www.gumloop.com/blog/apify-alternatives
**Summarized:** 2026-02-23
---
## TL;DR
Apify is powerful but built for developers. These 7 alternatives offer web scraping with varying levels of no-code friendliness, AI integration, and pricing.
---
## What to Look For in an Apify Alternative
- **Built-in web scraping** — Native scraping vs HTTP parsing from scratch
- **Tech stack integration** — Google Sheets, Slack, Notion, CRM; MCP servers = bonus
- **LLM integration** — Pass scraped data through GPT/Claude/Gemini for enrichment
- **Security & scale** — RBAC, audit logs, SOC 2 if enterprise
- **Custom code support** — Python/JS for advanced scenarios
- **Templates/AI assistants** — Pre-built templates or AI that builds workflows for you
---
## The 7 Alternatives
### 1. Gumloop ⭐ (Author's Pick)
| | |
|---|---|
| **Best for** | AI agents + workflows that scrape, analyze, and act |
| **Pricing** | Free (2K credits) → $37/mo (Solo) → $244/mo (Team) |
**Why it wins:**
- "Gummie" AI assistant builds workflows via natural language
- Built-in web scraping + any LLM integration (no extra API keys needed)
- MCP server support for connecting to any tool
- Create autonomous AI agents that handle scraping tasks independently
**Gotcha:** Not scraping-specific; template library limited (but Gummie makes templates obsolete)
---
### 2. Octoparse
| | |
|---|---|
| **Best for** | No-code scraping with 500+ preset templates |
| **Pricing** | Free (10 tasks, local only) → $83/mo (Standard) → $299/mo (Pro) |
**Why it wins:**
- Visual crawler builder for non-technical users
- 500+ templates: Google Maps, LinkedIn, Amazon scrapers
- IP rotation + CAPTCHA solving built-in
- Run locally or cloud
**Gotcha:** Workflows are rigid; struggles with complex multi-step flows; no AI agent features
---
### 3. n8n
| | |
|---|---|
| **Best for** | Technical teams wanting self-hosted, open-source automation |
| **Pricing** | Free (self-hosted) → $24/mo (cloud) |
**Why it wins:**
- Open source, self-hostable = full data control
- Visual workflow builder with web scraping nodes
- Connect to any API/service
**Gotcha:** Requires technical setup; not scraping-specific
---
### 4. Relay.app
| | |
|---|---|
| **Best for** | Workflow automation with human-in-the-loop steps |
| **Pricing** | Free tier available |
**Why it wins:**
- Combines automation with human approval steps
- Good for workflows that need review before action
---
### 5. Thunderbit
| | |
|---|---|
| **Best for** | AI-powered web data extraction |
| **Pricing** | Varies |
**Why it wins:**
- AI-first approach to scraping
- Handles dynamic content well
---
### 6. Browse AI
| | |
|---|---|
| **Best for** | No-code web monitoring and data extraction |
| **Pricing** | Free tier available |
**Why it wins:**
- Record actions → replay automatically
- Monitor sites for changes
- Pre-built robots for common tasks
---
### 7. Claude (Direct)
| | |
|---|---|
| **Best for** | One-off scraping/analysis with AI |
| **Pricing** | Subscription-based |
**Why it wins:**
- Can scrape and analyze web content directly
- No setup required for simple tasks
- Great for ad-hoc research
---
## Quick Comparison
| Tool | No-Code | AI Built-In | Self-Host | Price |
|------|---------|-------------|-----------|-------|
| Gumloop | ✅ | ✅ | ❌ | $0-244/mo |
| Octoparse | ✅ | ❌ | ❌ | $0-299/mo |
| n8n | Partial | ❌ | ✅ | $0-24/mo |
| Browse AI | ✅ | ❌ | ❌ | Freemium |
| Claude | N/A | ✅ | ❌ | Subscription |
---
## Bottom Line
- **Gumloop** = Best for AI-powered scraping + automation in one platform
- **Octoparse** = Best no-code scraping with templates
- **n8n** = Best for devs who want open-source + self-host
- **Claude** = Best for one-off AI scraping without setup

@@ -0,0 +1,90 @@
---
title: Detecting and Preventing Distillation Attacks
category: Summary
type: Security/AI
source_url: https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks
source: Anthropic News
date: 2026-02-23
tags: [anthropic, ai, security, distillation, deepseek, moonshot, minimax]
---
# Detecting and Preventing Distillation Attacks
**URL:** https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks
**Source:** Anthropic News
**Date Summarized:** 2026-02-23
---
## tl;dr
Anthropic identified three AI labs (DeepSeek, Moonshot, MiniMax) running industrial-scale campaigns to extract Claude's capabilities through "distillation" — generating over 16 million exchanges via 24,000+ fraudulent accounts to train their own models on Claude's outputs.
---
## What is Distillation?
**Definition:** Training a smaller/less capable model on outputs from a stronger one.
**Legitimate Use:** Frontier labs distill their own models to create smaller, cheaper versions for customers.
**Illicit Use:** Competitors extract powerful capabilities from other labs at a fraction of the cost and time.
---
## Why It Matters
### National Security Risks
- Illicitly distilled models **lack safeguards**
- Protections against bioweapons, cyber attacks, etc. are stripped out
- Dangerous capabilities proliferate without protections
### Authoritarian Use
- Foreign labs can feed distilled models into military/intelligence/surveillance
- Enables offensive cyber operations, disinformation, mass surveillance
- Open-sourced distilled models spread beyond any government's control
---
## Export Control Implications
- Distillation attacks **undermine export controls**
- Allows foreign labs (including CCP-controlled) to close competitive gaps
- Rapid "advancements" by these labs are actually **extracted capabilities**, not innovation
- Restricted chip access limits both:
  - Direct model training
  - Scale of illicit distillation campaigns
---
## What Anthropic Found
| Detail | Data |
|--------|------|
| **Labs involved** | DeepSeek, Moonshot, MiniMax |
| **Exchange volume** | 16+ million interactions |
| **Fraudulent accounts** | ~24,000 accounts |
| **Violation** | Terms of service + regional access restrictions |
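
As a rough sense of scale, the two figures in the table imply an average load per account. The sketch below treats 16 million and 24,000 as exact values, which is an assumption: the source gives them only as "16+ million" and "~24,000", so the result is an approximation.

```python
# Figures from the table above, treated as exact for illustration only.
exchanges = 16_000_000  # reported as "16+ million"
accounts = 24_000       # reported as "~24,000"

# Average number of exchanges attributed to each fraudulent account.
avg_per_account = exchanges / accounts
print(f"about {avg_per_account:.0f} exchanges per account on average")
```

Several hundred exchanges per account is consistent with the summary's framing of spreading extraction across many accounts rather than a few heavy users.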
---
## The Threat
- Campaigns growing in **intensity and sophistication**
- Window to act is **narrow**
- Threat extends **beyond any single company or region**
- Requires **coordinated action** by industry, policymakers, global AI community
---
## Key Takeaways
1. Distillation is a **dual-use technique**: legitimate for efficiency, dangerous when weaponized
2. **Scale matters**: 16M+ exchanges show industrial-level extraction, not casual use
3. **Safeguards evaporate**: distilled models lose critical safety protections
4. **Export controls undermined**: distillation bypasses chip restrictions through data theft
5. **National security threat**: authoritarian actors gain frontier AI capabilities
---
*Source: https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks*

@@ -0,0 +1,66 @@
# 4 Common Home Assistant Mistakes That Silently Break Your Automations
**Source:** https://www.xda-developers.com/home-assistant-mistakes-that-can-break-your-automations/
**Summarized:** 2026-02-23
---
## TL;DR
Four common mistakes that break Home Assistant automations: conflicting conditions, unavailable entities, ignoring DST changes, and wrong automation modes. Most are fixable with better documentation, better tooling, and a clear understanding of Home Assistant's automation modes.
---
## 1. Conflicting Automation Conditions
**Problem:** Multiple automations try to control the same device at the same time, causing:
- Failed triggers when another automation is using the device
- Endless flip-flopping where two automations fight over device state
**Solutions:**
- Document your automations thoroughly
- Avoid overly complex multi-device setups
- Switch to **Node-RED** for visual troubleshooting (canvas view vs YAML hunting)
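
One way to make the "document your automations" advice concrete is to keep a small inventory of which automation controls which entity, then flag any entity with more than one controller. The sketch below is a standalone illustration, not a Home Assistant API; the automation and entity names are hypothetical.

```python
from collections import defaultdict

# Hypothetical inventory: automation name -> entities it controls.
automations = {
    "motion_hall_light": ["light.hallway"],
    "night_mode": ["light.hallway", "lock.front_door"],
    "morning_routine": ["light.kitchen"],
}


def find_shared_targets(automations):
    """Return {entity: [automation names]} for entities with 2+ controllers."""
    controllers = defaultdict(list)
    for name, entities in automations.items():
        for entity in entities:
            controllers[entity].append(name)
    return {e: names for e, names in controllers.items() if len(names) > 1}


conflicts = find_shared_targets(automations)
print(conflicts)
```

Any entity that shows up here is a candidate for the flip-flopping described above and deserves a closer look at its triggers and conditions.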
---
## 2. Entities Becoming Unavailable
**Problem:** Cheap wireless devices drop connection; battery-powered sensors die → automations fail because the device isn't reachable.
**Solutions:**
- Invest in reliable devices (not cheap knockoffs prone to disconnects)
- Keep battery-powered sensors charged
- Use a central bridge for devices with different protocols (reduces lag-induced missed triggers)
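
A simple guard against this failure is to check entity states before relying on them. The sketch below assumes a plain dictionary snapshot of entity states with made-up entity names; `unavailable` and `unknown` are the state strings Home Assistant reports for entities it cannot read.

```python
# State strings that mean an entity cannot be acted on right now.
UNUSABLE_STATES = {"unavailable", "unknown"}


def unusable_entities(states):
    """Return the sorted entity IDs whose current state cannot be acted on."""
    return sorted(e for e, s in states.items() if s in UNUSABLE_STATES)


# Hypothetical snapshot of entity states.
snapshot = {
    "binary_sensor.hall_motion": "off",
    "sensor.porch_temperature": "unavailable",
    "light.kitchen": "on",
    "sensor.door_battery": "unknown",
}
print(unusable_entities(snapshot))
```

A list like this could feed a notification so that a dead battery is noticed before an automation silently stops firing.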
---
## 3. Forgetting DST Changes in Time-Based Automations
**Problem:** Daylight Saving Time shifts your triggers by an hour → automations fire at wrong times.
**Solutions:**
- Set reminders to update automations before DST changes
- Use **HACS blueprints** that alert you when clocks shift
- Better: Use **sun position triggers** instead of hard-coded times (adaptive lighting approach)
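
To illustrate one way an hour-long shift can arise, the sketch below pins a trigger to a fixed UTC time and converts it to local time on either side of a clock change. The fixed offsets for US Central time and the chosen dates (around the 8 March 2026 change) are assumptions for illustration only.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for US Central time (an assumption for this example).
CST = timezone(timedelta(hours=-6), "CST")  # standard time
CDT = timezone(timedelta(hours=-5), "CDT")  # daylight time

# A trigger pinned to 13:00 UTC every day.
before = datetime(2026, 3, 7, 13, 0, tzinfo=timezone.utc)  # day before the change
after = datetime(2026, 3, 9, 13, 0, tzinfo=timezone.utc)   # day after the change

# The same UTC instant lands on a different wall-clock hour once DST starts.
local_before = before.astimezone(CST)
local_after = after.astimezone(CDT)
print(local_before.strftime("%H:%M %Z"), "->", local_after.strftime("%H:%M %Z"))
```

A sun-based trigger avoids this class of problem because it follows actual sunrise and sunset rather than a wall-clock value.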
---
## 4. Choosing the Wrong Automation Mode
**Problem:** The default mode is `single`, which logs a warning and ignores new triggers while the automation is still running. This breaks motion sensors, timers, and anything else that fires in rapid succession.
**Modes Explained:**
| Mode | Behavior | Best For |
|------|----------|----------|
| `single` (default) | Ignores new triggers while running | Simple toggles |
| `restart` | Stops the current run, then starts a new one | Motion sensors, rapid re-triggers |
| `queued` | Queues new runs and executes them one at a time, in order | Tasks where order matters |
| `parallel` | Starts a new, independent run alongside any that are still running | Complex workflows with independent actions |
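
The difference between the four modes is easiest to see with overlapping triggers. The following toy simulation is a sketch of the behavior described in the table, not Home Assistant's implementation; the `simulate` function and its time units are hypothetical.

```python
def simulate(mode, trigger_times, duration):
    """Return (start, end) windows of action runs for a toy automation.

    trigger_times must be sorted; every run lasts `duration` time units
    unless `restart` cuts it short.
    """
    runs = []
    for t in trigger_times:
        busy = bool(runs) and runs[-1][1] > t  # is the latest run still active?
        if mode == "single":
            if not busy:
                runs.append((t, t + duration))  # otherwise the trigger is dropped
        elif mode == "restart":
            if busy:
                runs[-1] = (runs[-1][0], t)  # stop the active run at time t
            runs.append((t, t + duration))
        elif mode == "queued":
            start = runs[-1][1] if busy else t  # wait for the previous run to end
            runs.append((start, start + duration))
        elif mode == "parallel":
            runs.append((t, t + duration))  # runs are allowed to overlap
        else:
            raise ValueError(f"unknown mode: {mode}")
    return runs


# Three triggers arriving while a 5-unit action is still in progress.
for mode in ("single", "restart", "queued", "parallel"):
    print(mode, simulate(mode, [0, 2, 4], duration=5))
```

With these inputs, `single` produces one run and drops two triggers, which is exactly the silent failure a motion-sensor light would show.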
---
## Key Takeaway
Home Assistant automations are powerful but fragile. Documentation, reliable hardware, and understanding automation modes prevent most silent failures.

@@ -0,0 +1,88 @@
---
title: Homelab MCP Server
category: Summary
type: Infrastructure/MCP
source_url: https://lobehub.com/mcp/theonlytruebigmac-homelab-mcp
github: https://github.com/theonlytruebigmac/homelab-mcp
date: 2026-02-23
tags: [mcp, homelab, infrastructure, ai, self-hosted]
---
# Homelab MCP Server
**URL:** https://lobehub.com/mcp/theonlytruebigmac-homelab-mcp
**GitHub:** https://github.com/theonlytruebigmac/homelab-mcp
**Date Summarized:** 2026-02-23
## tl;dr
A unified MCP (Model Context Protocol) server that connects AI agents to your self-hosted homelab infrastructure through 30+ consolidated tools.
---
## What it is
- **MCP Server** for homelab infrastructure
- Connects AI assistants (Claude, Gemini, ChatGPT, Cursor) to your self-hosted services
- Also exposes a full REST API for automation tools like n8n
---
## Key Features
- **30 consolidated MCP tools** — action-based compound tools for efficient context windows
- **MCP Resources** — real-time data feeds (clients, devices, queues, health)
- **MCP Prompts** — pre-built templates for troubleshooting, security audits, health checks
- **REST API** — every tool exposed as REST endpoint with Swagger docs
- **Conditional registration** — only enabled services register their tools
- **Docker-first deployment** — single `docker compose up`
- **Audit logging** — every tool call traced and logged
---
## Supported Services (9 total)
| Service | Category | Tools | Capabilities |
|---------|----------|-------|--------------|
| **Unifi** | Networking | 9 | Clients, devices, firewall, VLANs, security, guest access |
| **Proxmox** | Virtualization | 3 | VMs, containers, snapshots, storage, power mgmt |
| **Plex** | Media Server | 2 | Playback, library search, scans, stream control |
| **Radarr/Sonarr** | Media Mgmt | 4 | Movie/TV search, add content, calendar, queue |
| **SABnzbd** | Downloads | 2 | Queue management, speed limits, history |
| **Portainer** | Docker | 4 | Containers, stacks, volumes, logs |
| **OPNsense** | Firewall | 2 | Interfaces, DHCP, gateway, firmware |
| **Home Assistant** | IoT/Smart Home | 3 | Entities, automations, scenes, service calls |
| **Traefik** | Reverse Proxy | 1 | Router inspection, backends, health |
---
## Tech Stack
- **Python 3.11+**
- **Docker/Docker Compose**
- **MCP 2.0+**
- **MIT License**
---
## Quick Install
```bash
git clone https://github.com/theonlytruebigmac/homelab-mcp.git
cd homelab-mcp
cp .env.example .env
# Edit .env with your service credentials
docker compose up
```
---
## Notes
- Each service has an `*_ENABLED` flag — set to `false` to disable
- Supports both MCP protocol and REST API
- Designed for AI agents to directly manage homelab infrastructure
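
The conditional registration idea can be sketched in a few lines. This is an illustration only, not the project's code: the service keys, tool names, and the choice to treat a missing flag as disabled are all assumptions.

```python
# Hypothetical service names and tool lists; the real server defines its own.
SERVICE_TOOLS = {
    "PLEX": ["plex_playback", "plex_library"],
    "PROXMOX": ["proxmox_vms", "proxmox_storage", "proxmox_power"],
    "TRAEFIK": ["traefik_routers"],
}


def enabled_tools(env):
    """Collect tools only for services whose <NAME>_ENABLED flag is 'true'."""
    tools = []
    for service, names in SERVICE_TOOLS.items():
        # Assumption: a missing flag counts as disabled.
        flag = env.get(f"{service}_ENABLED", "false").strip().lower()
        if flag == "true":
            tools.extend(names)
    return tools


example_env = {"PLEX_ENABLED": "true", "PROXMOX_ENABLED": "false"}
print(enabled_tools(example_env))
```

Passing `os.environ` instead of `example_env` would apply the same logic to the values loaded from a `.env` file.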
---
*Summarized from lobehub.com/mcp/theonlytruebigmac-homelab-mcp*

@@ -0,0 +1,106 @@
# Khoj AI - Self-Hostable AI Research App
**Source:** https://www.makeuseof.com/started-using-self-hostable-app-for-research-should-have-sooner/
**Summarized:** 2026-02-23
---
## TL;DR
Khoj AI is a middle ground between ChatGPT (too minimal) and NotebookLM (too heavy). Self-hostable, supports custom agents, automations, and your own models via Ollama. Think of it as "NotebookLM + Claude had a baby."
---
## What is Khoj AI?
A research assistant that combines web search, document analysis, and LLM chat. There are two ways to use it:
- **Cloud:** Free tier with Gemini Flash 3 and basic models
- **Self-hosted:** Docker + bring your own model (Ollama supported)
---
## Key Features
### 1. Built-in Agents
Pre-configured personas:
- Khoj (default)
- Technical Lead
- Teacher
- Legal Expert
Switch agents per conversation for role-specific responses.
### 2. Slash Commands
| Command | Function |
|---------|----------|
| `/notes` | Pull info only from your uploaded documents |
| `/code` | Launch built-in Python interpreter (can generate graphs via Matplotlib) |
| `/web` | Web search integration |
### 3. Custom Agents
Create your own:
1. Add files to knowledge base
2. Choose model
3. Set input/output modes
4. Done
### 4. Automations
Schedule recurring tasks:
- Daily stock market summaries at 9 AM
- RSS feed fetching at set times
- Results delivered to email automatically
No code required.
---
## Self-Hosting Setup
**Requirements:** Docker + decent hardware (local LLMs need beefy machines)
```bash
mkdir ~/.khoj && cd ~/.khoj
wget https://raw.githubusercontent.com/khoj-ai/khoj/master/docker-compose.yml
nano docker-compose.yml # Set admin email/password, add API keys
docker-compose up
```
**Access:** http://localhost:42110
**Model options:**
- Use third-party providers (OpenAI, Anthropic, Gemini) with API keys
- Use local models via Ollama
---
## Why Choose Khoj Over NotebookLM?
| Khoj | NotebookLM |
|------|------------|
| Self-hostable | Cloud only |
| Custom agents | Fixed structure |
| Automations | Manual queries |
| Bring your own model | Google models only |
| Middle ground complexity | Heavy, structured |
---
## Use Cases
- **Students:** Research, understanding topics (not copy-pasting assignments)
- **Work:** Document analysis, research workflows
- **Personal projects:** Custom agents for specific domains
---
## Caveats
- LLMs can hallucinate — always verify important info (legal, medical)
- Local models need strong hardware
- Accuracy depends on model choice
---
## Bottom Line
Khoj fills the gap between minimal chat interfaces and heavy research tools. Self-hosting gives you full stack ownership—own, don't rent.

@@ -0,0 +1,58 @@
# I Replaced My Entire Note-Taking System with a Tool That Syncs Without an Account
**Source:** https://www.makeuseof.com/replaced-entire-note-taking-system-with-tool-that-syncs-without-account/
**Summarized:** 2026-02-23
---
## TL;DR
The author ditched subscription-based note apps for a free, open-source combo: **Obsidian** for writing + **Syncthing** for syncing. Result: full data ownership, no monthly fees, seamless cross-device sync without any cloud middleman.
---
## The Problem
Most note apps (Notion, Evernote, Apple Notes) lock data in proprietary formats on their servers. Two devices? Pay a subscription. Your data, their rules.
---
## The Solution: Obsidian + Syncthing
| Tool | Role | Why It Works |
|------|------|--------------|
| **Obsidian** | Note-taking | Local-first, Markdown files (.md), plain text = future-proof |
| **Syncthing** | Sync | P2P file sync, encrypted, no account needed |
**Key Benefits:**
- Own your data — Notes are just files in a folder
- No subscriptions — Both tools free and open-source
- Cross-platform — Windows, macOS, Linux, Android, iOS
- Encrypted sync — Direct device-to-device, no server sees content
- Conflict handling — Creates `.sync-conflict` files instead of silent overwrites
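
Because conflicts surface as extra files rather than errors, it can help to scan the vault for them periodically. The sketch below assumes Syncthing's usual naming, where a conflict copy contains `.sync-conflict-` followed by a date, time, and device ID; the demo folder and file names are made up.

```python
import tempfile
from pathlib import Path


def find_sync_conflicts(vault):
    """Return paths (relative to the vault) of Syncthing conflict copies."""
    root = Path(vault)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*.sync-conflict-*")
        if p.is_file()
    )


# Demo on a throwaway folder standing in for a vault.
with tempfile.TemporaryDirectory() as tmp:
    vault = Path(tmp)
    (vault / "Projects").mkdir()
    (vault / "Projects" / "plan.md").write_text("original")
    conflict_name = "plan.sync-conflict-20260223-101500-ABCDEFG.md"
    (vault / "Projects" / conflict_name).write_text("edit from another device")
    found = find_sync_conflicts(vault)

print(found)
```

Each reported file can then be compared with its original and merged or deleted by hand.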
---
## Setup Highlights
1. **Obsidian vault** = folder of Markdown files
2. **Syncthing** folder type: Send & Receive
3. **File versioning** enabled (keeps 5-10 backups)
4. **Ignore patterns** for `.obsidian/cache` and `workspace*` (prevents UI conflicts)
5. **Device pairing** via ID exchange — works identically desktop & Android
**Android:** Use Syncthing-Fork (Play Store/F-Droid) with better battery optimization.
---
## Pro Tips
- Syncthing runs continuously → vault always up-to-date
- Bidirectional links + graph view in Obsidian = powerful knowledge mapping
- Plugins/themes sync too (`.obsidian` folder minus cache)
---
## Bottom Line
If you're tired of paying to access your own notes, this combo offers "unfairly good" value once the initial setup clicks into place.

@@ -0,0 +1,146 @@
---
title: Obsidian Dataview
category: Summary
type: Tool/Plugin
source: https://github.com/blacksmithgu/obsidian-dataview
date: 2026-02-23
tags: [obsidian, dataview, plugin, query, database]
---
# Obsidian Dataview
**URL:** https://github.com/blacksmithgu/obsidian-dataview
**Description:** A data index and query language over Markdown files for Obsidian
**Date Summarized:** 2026-02-24
---
## tl;dr
Treat your Obsidian vault as a **database**: filter, sort, and extract data from Markdown pages using YAML frontmatter and inline fields.
---
## What It Does
Dataview generates data from your vault by pulling information from:
1. **Markdown Frontmatter** — YAML at the top of documents:
```yaml
---
alias: "document"
last-reviewed: 2021-08-17
rating: 8
status: active
---
```
2. **Inline Fields** — Key:: Value syntax in documents:
```markdown
Basic Field:: Value
**Bold Field**:: Nice!
You can also write [field:: inline fields]
(field2:: hidden field)
```
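
To show what the full-line `Key:: Value` form amounts to as data, here is a deliberately simplified parser. It is not Dataview's parser: the regular expression is an assumption that covers only plain and bold keys on their own line, and it ignores the bracketed and parenthesized forms.

```python
import re

# Simplified pattern for full-line `Key:: Value` fields (optionally bold keys).
# Bracketed [key:: value] and (key:: value) forms are intentionally not handled.
LINE_FIELD = re.compile(r"^\s*\**([^:*\[\(]+?)\**::\s*(.+?)\s*$")


def parse_line_fields(text):
    """Return a dict of the full-line inline fields found in `text`."""
    fields = {}
    for line in text.splitlines():
        match = LINE_FIELD.match(line)
        if match:
            fields[match.group(1).strip()] = match.group(2)
    return fields


note = """Basic Field:: Value
**Bold Field**:: Nice!
Just a sentence with no field."""
parsed = parse_line_fields(note)
print(parsed)
```

The resulting dictionary is the kind of per-page metadata that queries can then filter and sort on.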
---
## Query Modes
### 1. Dataview Query Language (DQL)
Pipeline-based, SQL-like expressions:
```dataview
TABLE file.name AS "File", rating AS "Rating"
FROM #book
SORT rating DESC
```
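
For readers more familiar with general-purpose code, the query above maps onto a filter, a sort, and a projection. The Python sketch below mirrors that pipeline over made-up page records; it only illustrates the logic and is not how Dataview evaluates queries.

```python
# Made-up pages standing in for indexed notes.
pages = [
    {"name": "Dune", "tags": ["book"], "rating": 9},
    {"name": "Groceries", "tags": ["list"], "rating": None},
    {"name": "Neuromancer", "tags": ["book"], "rating": 8},
    {"name": "Snow Crash", "tags": ["book"], "rating": 10},
]

# FROM #book: keep only pages carrying the tag.
books = [p for p in pages if "book" in p["tags"]]
# SORT rating DESC: highest rating first.
books.sort(key=lambda p: p["rating"], reverse=True)
# TABLE file.name, rating: project the two columns.
table = [(p["name"], p["rating"]) for p in books]

for row in table:
    print(row)
```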
### 2. Inline Expressions
Embed directly in markdown:
`= this.file.name` → shows filename
### 3. DataviewJS
JavaScript for complex logic:
```dataviewjs
for (let group of dv.pages("#book")
    .where(p => p["time-read"].year == 2021)
    .groupBy(p => p.genre)) {
  dv.header(3, group.key);
  dv.table(["Name", "Rating"], group.rows
    .sort(k => k.rating, 'desc')
    .map(k => [k.file.link, k.rating]))
}
```
---
## Example Use Cases
**Table of games with metadata:**
```dataview
TABLE time-played, length, rating
FROM "games"
SORT rating DESC
```
**List by tags:**
```dataview
LIST FROM #game/moba or #game/crpg
```
**Tasks from active projects:**
```dataview
TASK FROM #projects/active
```
**Books read in 2021, grouped by genre:**
```dataviewjs
for (let group of dv.pages("#book")
    .where(p => p["time-read"].year == 2021)
    .groupBy(p => p.genre)) {
  dv.header(3, group.key);
  dv.table(["Name", "Time Read", "Rating"],
    group.rows.sort(k => k.rating, 'desc')
      .map(k => [k.file.link, k["time-read"], k.rating]))
}
```
---
## Key Features
- ✅ Query any Markdown files in your vault
- ✅ Filter by tags, folders, metadata
- ✅ Sort by any field
- ✅ Group results
- ✅ Render as tables, lists, or tasks
- ✅ JavaScript API for complex logic
- ✅ Inline fields for hidden metadata
- ✅ Automatic data index updates
---
## Potential Uses for Corey's Vault
1. **Home Assistant Automations Table** — Query all automations by status
2. **Projects Dashboard** — Active vs completed projects
3. **Daily Notes Query** — Recent entries, completed tasks
4. **Research Summaries** — All `/summarize` outputs by date
5. **Cron Jobs Status** — Active vs disabled jobs
---
## Installation
Available in Obsidian Community Plugins:
1. Settings → Community Plugins → Browse
2. Search "Dataview"
3. Install & Enable
---
**Full Docs:** https://blacksmithgu.github.io/obsidian-dataview/
---
*Source: https://github.com/blacksmithgu/obsidian-dataview*

@@ -0,0 +1,105 @@
# OpenClaw Multi-Agent Workflows - 4 Levels Explained
**Source:** https://www.reddit.com/r/openclaw/comments/1r2euvp/this_is_how_ive_learned_to_create_multiagent/
**Summarized:** 2026-02-23
---
## TL;DR
OpenClaw has **4 levels of multi-agent support** built-in, from simple persistent agents to full A2A Protocol orchestration. No Docker required for levels 1-3—they all run in a single gateway process.
---
## Level 1: Multiple Persistent Agents (Built-in)
Define isolated agents in config, each with their own workspace, system prompt, model, and tools:
```yaml
agents:
  list:
    - id: researcher
      default: true
      workspace: ~/.openclaw/workspace-research
    - id: coder
      workspace: ~/.openclaw/workspace-code
bindings:
  - agentId: researcher
    match: { channel: telegram, accountId: research-bot }
  - agentId: coder
    match: { channel: discord, guildId: "123456" }
```
Each agent has **full isolation**: separate session history, model config, tool permissions.
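
The bindings can be read as a routing table from incoming messages to agents. The sketch below illustrates that reading; it is not OpenClaw's implementation, and the first-match rule and the fallback to the `default` agent are assumptions.

```python
# Mirrors the agents and bindings from the config above.
AGENTS = [
    {"id": "researcher", "default": True},
    {"id": "coder"},
]
BINDINGS = [
    {"agentId": "researcher", "match": {"channel": "telegram", "accountId": "research-bot"}},
    {"agentId": "coder", "match": {"channel": "discord", "guildId": "123456"}},
]


def route(message):
    """Pick the first binding whose match keys all equal the message's values."""
    for binding in BINDINGS:
        if all(message.get(key) == value for key, value in binding["match"].items()):
            return binding["agentId"]
    # Assumption: with no matching binding, use the agent flagged as default.
    return next(agent["id"] for agent in AGENTS if agent.get("default"))


print(route({"channel": "discord", "guildId": "123456"}))
print(route({"channel": "slack"}))
```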
---
## Level 2: Agent-to-Agent Communication (Built-in)
Enable `tools.agentToAgent` for agents to talk via `sessions_send`:
```yaml
tools:
  agentToAgent:
    enabled: true
    allow: ["researcher", "coder", "writer"]
```
- Ping-pong conversations (up to 5 turns by default)
- `sessions_spawn` for background sub-agents that report back
- Closest to "orchestrator delegates to specialist" pattern
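
The turn cap on ping-pong conversations can be pictured as a bounded loop between two agents. The toy model below is a sketch with hypothetical names (`ping_pong`, `researcher`, `coder`); it does not call `sessions_send` and only mirrors the 5-turn default mentioned above.

```python
def ping_pong(agent_a, agent_b, opening, max_turns=5):
    """Alternate replies between two agents, stopping after max_turns replies."""
    transcript = [("a", opening)]
    speakers = [("b", agent_b), ("a", agent_a)]  # b answers the opening first
    message = opening
    for turn in range(max_turns):
        name, agent = speakers[turn % 2]
        message = agent(message)
        if message is None:  # an agent can end the exchange early
            break
        transcript.append((name, message))
    return transcript


# Stand-in agents that always reply, so the cap is what ends the exchange.
def researcher(message):
    return "noted"


def coder(message):
    return "done"


log = ping_pong(researcher, coder, "please review")
print(len(log) - 1, "replies after the opening message")
```

Without a cap like this, two agents that always answer each other would never stop.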
---
## Level 3: Cross-Agent Delegation (3-Level Hierarchy)
Work around the single-level spawning limit by chaining `sessions_send` and `sessions_spawn`:
```
Orchestrator (main agent)
├─ sessions_send → Specialist A (sibling main agent)
│    ├─ sessions_spawn → subagent A1
│    └─ sessions_spawn → subagent A2
└─ sessions_send → Specialist B (sibling main agent)
     ├─ sessions_spawn → subagent B1
     └─ sessions_spawn → subagent B2
```
Config uses `subagents.allowAgents` for cross-agent spawning.
---
## Level 4: True Multi-Agent Orchestration (A2A Protocol)
For advanced use cases with intelligent routing, review, retries, synthesis:
- **a2a-adapter**: Wraps OpenClaw agents as A2A Protocol servers
- Mix-and-match: OpenClaw + CrewAI + LangChain + n8n
- Can run as local Python processes or remote agents
Example:
```python
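from a2a_adapter import load_a2a_agent, serve_agent

# Note: `await` needs an async context (e.g. inside `async def main()`), and
# `agent_card` must be defined beforehand; the source snippet omits both.
adapter = await load_a2a_agent({
    "adapter": "openclaw",
    "agent_id": "researcher",
    "thinking": "low",
    "async_mode": True,
})
serve_agent(agent_card=agent_card, adapter=adapter, port=9001)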
```
---
## Key Takeaways
| Level | Complexity | Setup |
|-------|-----------|-------|
| 1 | Low | Config only |
| 2 | Low-Medium | Config + enable tool |
| 3 | Medium | Config + cross-agent permissions |
| 4 | High | A2A Protocol + external orchestrator |
**Bottom line:** OpenClaw's built-in multi-agent (levels 1-3) requires only `~/.openclaw/config.yaml` changes—no additional infrastructure needed.