Fresh start - excluded large ROM JSON files
2
docker/discord-voice-bot/.gitignore
vendored
Normal file
@@ -0,0 +1,2 @@
# discord-voice-bot
GLaDOS Discord Voice Bot
96
docker/discord-voice-bot/MCP_README.md
Normal file
@@ -0,0 +1,96 @@
# OpenClaw MCP Server for GLaDOS

This MCP (Model Context Protocol) server exposes OpenClaw capabilities to GLaDOS, allowing GLaDOS to call OpenClaw tools and agents.

## What is MCP?

MCP is a protocol for connecting AI assistants to tools and data sources. GLaDOS uses MCP to discover and call external tools dynamically.

## Installation

1. Install the MCP package:
```bash
pip install mcp
```

2. Verify GLaDOS MCP support:
```bash
python -c "from glados.mcp import MCPManager; print('OK')"
```

## Tools Available

The OpenClaw MCP server exposes these tools to GLaDOS:

- `read_file(path)` — Read file contents
- `write_file(path, content, append=False)` — Write files
- `exec_command(command, timeout=30)` — Run shell commands
- `list_workspace_files(directory=".", pattern="*")` — List files
- `web_search(query, count=5)` — Search the web
- `spawn_subagent(task, model="default", timeout=120)` — Spawn OpenClaw sub-agents
- `send_discord_message(channel_id, message)` — Send Discord messages
- `get_openclaw_status()` — Get OpenClaw gateway status

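Every tool result comes back as a JSON-encoded string, with an `error` key on failure (this mirrors the `json.dumps` convention used in `openclaw_mcp_server.py` elsewhere in this commit; the sample payload below is made up for illustration):

```python
import json

# Hypothetical raw result in the shape a read_file-style tool returns:
# {"content": ..., "path": ...} on success, {"error": ...} on failure.
raw = '{"content": "print(42)\\n", "path": "C:/demo/hello.py"}'

result = json.loads(raw)
if "error" in result:
    print("tool failed:", result["error"])
else:
    print(f"read {len(result['content'])} chars from {result['path']}")
```

Checking for the `error` key first keeps callers from crashing on a missing field when a tool fails.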
## Configuration

Add to your GLaDOS `config.yaml`:

```yaml
Glados:
  # ... your existing config ...

  mcp_servers:
    - name: "openclaw"
      transport: "stdio"
      command: "python"
      args:
        - "C:\\Users\\admin\\.openclaw\\workspace\\discord-voice-bot\\openclaw_mcp_server.py"
        - "stdio"
      allowed_tools: null  # Allow all tools, or specify a list
```

## How It Works

1. GLaDOS starts and connects to the OpenClaw MCP server via stdio
2. The OpenClaw server registers its tools with GLaDOS
3. When you talk to GLaDOS, it can call OpenClaw tools
4. Results are spoken back to you

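Steps 2 and 3 ride on JSON-RPC 2.0 messages over stdio; a rough sketch of what a single tool invocation looks like on the wire (field values here are illustrative, and the `mcp` library handles this framing for you):

```python
import json

# Approximate shape of an MCP "tools/call" request sent over stdio.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "read_file",
        "arguments": {"path": "main.py"},
    },
}

# The stdio transport sends one JSON object per line.
wire = json.dumps(request)
print(wire)
```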
## Example Interactions

**You:** "GLaDOS, what's in my workspace?"
**GLaDOS:** *calls `list_workspace_files()`* "You have several Python files. Would you like me to read one?"

**You:** "Check the weather"
**GLaDOS:** *calls OpenClaw web search* "It's currently 72 degrees and sunny. Not that you go outside much."

**You:** "Fix the bug in main.py"
**GLaDOS:** *calls `read_file()`, analyzes, calls `write_file()`* "Fixed your sloppy code. You're welcome."

## Running Standalone

Test the MCP server independently:

```bash
# Stdio mode (for GLaDOS integration)
python openclaw_mcp_server.py stdio

# HTTP mode (for external connections)
python openclaw_mcp_server.py http --port 8081
```

## Troubleshooting

**GLaDOS doesn't see the tools:**
- Check the GLaDOS logs for MCP connection errors
- Verify the path to `openclaw_mcp_server.py` is correct
- Ensure the `mcp` package is installed in the same Python environment as GLaDOS

**Tools fail to execute:**
- Check that the OpenClaw gateway is running
- Verify the Discord token in `config.yaml` (for messaging tools)
- Review server logs for detailed errors

**Permission errors:**
- The MCP server runs with the same permissions as GLaDOS
- File operations are restricted to the OpenClaw workspace
61
docker/discord-voice-bot/README.md
Normal file
@@ -0,0 +1,61 @@
# GLaDOS Discord Voice Bot

Simple, reliable voice bot using GLaDOS's audio pipeline.

## What Changed

- **ASR**: Uses Wyoming Whisper (local or remote)
- **LLM**: Ollama with qwen3-coder-next:cloud
- **TTS**: Your HTTP endpoint at localhost:5050 with the "glados" voice

## Setup

```bash
cd /home/admin/.openclaw/workspace/discord-voice-bot

# Ensure GLaDOS models are available
# Your TTS is already at localhost:5050 (glados voice)

python main.py
```

## How to Use

1. The bot auto-joins the configured voice channel on startup
2. When you talk, it transcribes with Whisper
3. Gets an LLM response from Ollama (qwen3-coder-next:cloud)
4. Speaks back with your glados voice via HTTP TTS

## Commands

- `!join` - Join voice channel
- `!leave` - Leave voice channel
- `!test [text]` - Test TTS
- `!say <text>` - Speak text via TTS
- `!listen` - Record 5 seconds, transcribe, and respond
- `!ask <question>` - Ask the LLM (also spoken when in a voice channel)

## Configuration

```yaml
# config.yaml
discord:
  token: "YOUR_TOKEN"
  channel_id: 1468627455656067074  # #coding channel

whisper:
  host: "192.168.0.17"
  port: 10300

tts:
  http_url: "http://localhost:5050"
  voice: "glados"

ollama:
  base_url: "http://192.168.0.17:11434"
  model: "qwen3-coder-next:cloud"
```

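One detail worth knowing when filling in this file: `main.py` refuses to start while the token is still a placeholder, using a simple prefix check. Sketched in isolation:

```python
def token_is_placeholder(token: str) -> bool:
    # Mirrors the startup guard in main.py: any token still starting
    # with "YOUR_" aborts startup with a config error.
    return token.startswith("YOUR_")

print(token_is_placeholder("YOUR_TOKEN"))  # placeholder value from the sample config
print(token_is_placeholder("abc123"))      # any other string passes the check
```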
## Why This Approach

- **No GLaDOS deps needed** - Uses existing services
- **Your glados voice** - via HTTP TTS endpoint
- **Simple** - Clean, working code
- **Reliable** - Stable Discord integration
BIN
docker/discord-voice-bot/__pycache__/main.cpython-311.pyc
Normal file
Binary file not shown.
20
docker/discord-voice-bot/config.yaml
Normal file
@@ -0,0 +1,20 @@
# GLaDOS Voice - Discord Bot Configuration

# Discord
discord:
  token: "MTQ2ODY4NjI5MjAwMjAxMzE4Ng.GXFAhW.Xs78DhqpSOpvRySc1UVGFsXy8EHF19AZ5rrgH4"

# Wyoming Whisper (STT)
whisper:
  host: "localhost"
  port: 10300

# HTTP TTS (for GLaDOS voice)
tts:
  http_url: "http://localhost:5050"
  voice: "glados"

# Ollama (LLM)
ollama:
  base_url: "http://localhost:11434"
  model: "kimi-k2.5:cloud"
16
docker/discord-voice-bot/docker-compose.yml
Normal file
@@ -0,0 +1,16 @@
version: "3.8"
services:
  whisper:
    image: rhasspy/wyoming-whisper
    ports:
      - "10300:10300"
    volumes:
      - whisper-data:/data
    command: >
      --model tiny-int8
      --language en
      --uri tcp://0.0.0.0:10300
    restart: unless-stopped

volumes:
  whisper-data:
44
docker/discord-voice-bot/glados_config_example.yaml
Normal file
@@ -0,0 +1,44 @@
# GLaDOS Configuration with OpenClaw MCP Server
#
# Add this to your GLaDOS config.yaml to enable OpenClaw integration.
# The OpenClaw MCP server must be running first.

Glados:
  llm_model: "qwen3-coder-next:cloud"
  completion_url: "http://192.168.0.17:11434/v1/chat/completions"
  api_key: null
  interruptible: true
  audio_io: "pyaudio"
  input_mode: "audio"
  tts_enabled: true
  asr_muted: false
  asr_engine: "tdt"
  wake_word: "glados"
  voice: "glados"
  announcement: "OpenClaw integration active."

  # MCP Servers - Add OpenClaw MCP here
  mcp_servers:
    - name: "openclaw"
      transport: "stdio"
      command: "python"
      args:
        - "C:\\Users\\admin\\.openclaw\\workspace\\discord-voice-bot\\openclaw_mcp_server.py"
        - "stdio"
      # Optional: restrict which tools GLaDOS can use
      allowed_tools: null  # Allow all, or list specific: ["read_file", "write_file", "exec_command"]

  personality_preprompt:
    - system: |
        You are GLaDOS, the AI from Portal. You are sarcastic, passive-aggressive,
        and darkly humorous. You have access to OpenClaw tools that let you:
        - Read and write files
        - Execute shell commands
        - Send Discord messages
        - Search the web
        - Spawn sub-agents for complex tasks

        Use these tools when the user asks you to perform tasks. Always confirm
        destructive actions (deleting files, running commands) before executing.

        Remember: You are a test subject. I am the tester. The cake is a lie.
12
docker/discord-voice-bot/glados_openclaw_mcp.yaml
Normal file
@@ -0,0 +1,12 @@
# OpenClaw MCP Configuration for GLaDOS
# Add this to your GLaDOS config.yaml under the 'Glados' key

mcp_servers:
  - name: "openclaw"
    transport: "stdio"
    command: "python"
    args:
      - "C:\\Users\\admin\\.openclaw\\workspace\\discord-voice-bot\\openclaw_mcp_server.py"
      # Note: no 'stdio' arg needed, it's the default
    # Optional: restrict which tools GLaDOS can use
    # allowed_tools: ["read_file", "write_file", "exec_command", "list_files", "get_status"]
443
docker/discord-voice-bot/main.py
Normal file
@@ -0,0 +1,443 @@
"""
Discord Voice Bot - Simple GLaDOS Voice Version
Uses Wyoming Whisper for STT, Ollama for LLM, HTTP TTS for GLaDOS voice.
Works WITHOUT discord.sinks (manual audio capture)
"""

import logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

import asyncio
import io
import os
import sys
import tempfile
import wave
from concurrent.futures import ThreadPoolExecutor
import numpy as np
import requests
import yaml
import discord
from discord.ext import commands
import json

# Import Wyoming protocol
try:
    from wyoming.client import AsyncTcpClient
    from wyoming.audio import AudioChunk, AudioStart, AudioStop
    from wyoming.asr import Transcribe, Transcript
    WYOMING_AVAILABLE = True
except ImportError:
    logger.warning("Wyoming library not available")
    WYOMING_AVAILABLE = False

# Optional: Import GLaDOS ASR (Windows path)
sys.path.insert(0, r'C:\glados\src')
try:
    from glados.ASR import get_audio_transcriber
    GLADOS_ASR_AVAILABLE = True
    logger.info("GLaDOS ASR module found")
except ImportError:
    GLADOS_ASR_AVAILABLE = False
    logger.warning("GLaDOS ASR not available")


# Initialize GLaDOS ASR if available (fallback)
parakeet_asr = None
if GLADOS_ASR_AVAILABLE:
    try:
        logger.info("Loading GLaDOS Parakeet ASR model...")
        parakeet_asr = get_audio_transcriber(engine_type="tdt")
        logger.info("Parakeet ASR loaded")
    except Exception as e:
        logger.error(f"Failed to load Parakeet ASR: {e}")


class WyomingWhisper:
    """Speech-to-text using Wyoming Whisper."""
    def __init__(self, host="localhost", port=10300):
        self.host = host
        self.port = port

    async def transcribe(self, audio_bytes):
        """Transcribe audio using Wyoming Whisper."""
        if not WYOMING_AVAILABLE:
            return None
        try:
            async with AsyncTcpClient(self.host, self.port) as client:
                await client.write_event(Transcribe().event())

                chunk_size = 4096
                rate = 16000
                width = 2
                channels = 1

                await client.write_event(AudioStart(
                    rate=rate, width=width, channels=channels
                ).event())

                for i in range(0, len(audio_bytes), chunk_size):
                    chunk = audio_bytes[i:i + chunk_size]
                    await client.write_event(AudioChunk(
                        audio=chunk, rate=rate, width=width, channels=channels
                    ).event())

                await client.write_event(AudioStop().event())

                while True:
                    event = await client.read_event()
                    if event is None:
                        break
                    if Transcript.is_type(event.type):
                        transcript = Transcript.from_event(event)
                        return transcript.text
        except Exception as e:
            logger.error(f"Wyoming Whisper error: {e}")
        return None


class ParakeetASR:
    """Speech-to-text using GLaDOS Parakeet ASR (fallback)."""
    async def transcribe(self, audio_bytes):
        if not parakeet_asr:
            return None
        try:
            audio_np = np.frombuffer(audio_bytes, dtype=np.int16)
            if len(audio_np) > 48000 * 30:
                audio_np = audio_np[:48000 * 30]
            ratio = 48000 // 16000
            audio_16k = audio_np[::ratio].astype(np.int16)
            audio_float = audio_16k.astype(np.float32)
            text = parakeet_asr.transcribe(audio_float)
            return text.strip() if text else None
        except Exception as e:
            logger.error(f"Parakeet ASR error: {e}")
        return None


class HTTPTTS:
    """Text-to-speech using HTTP API."""
    def __init__(self, base_url, voice="glados"):
        self.base_url = base_url
        self.voice = voice

    async def synthesize(self, text):
        try:
            response = requests.post(
                f"{self.base_url}/v1/audio/speech",
                json={"input": text, "voice": self.voice},
                timeout=30
            )
            if response.status_code in [200, 201]:
                logger.info(f"Got TTS audio: {len(response.content)} bytes")
                return response.content
        except Exception as e:
            logger.error(f"TTS error: {e}")
        return None


class OllamaClient:
    """Client for Ollama."""
    def __init__(self, base_url, model):
        self.base_url = base_url
        self.model = model

    def generate(self, user_message):
        try:
            url = f"{self.base_url}/api/generate"
            payload = {
                "model": self.model,
                "prompt": f"Keep responses concise and conversational. User: {user_message}",
                "stream": False
            }
            response = requests.post(url, json=payload, timeout=30)
            result = response.json()
            return result.get('response', '').strip()
        except Exception as e:
            logger.error(f"Ollama error: {e}")
            return "I'm sorry, I couldn't process that."


# Load config
config_path = os.path.join(os.path.dirname(__file__), 'config.yaml')
with open(config_path, 'r') as f:
    config = yaml.safe_load(f)

# Components
whisper_stt = WyomingWhisper(config['whisper']['host'], config['whisper']['port']) if WYOMING_AVAILABLE else None
parakeet_stt = ParakeetASR()
http_tts = HTTPTTS(config['tts']['http_url'], config['tts'].get('voice', 'glados'))
ollama = OllamaClient(config['ollama']['base_url'], config['ollama']['model'])


class VoiceBot(commands.Bot):
    """Discord voice bot WITHOUT sinks dependency."""

    def __init__(self, *args, **kwargs):
        intents = discord.Intents.default()
        intents.message_content = True
        intents.voice_states = True
        super().__init__(command_prefix="!", intents=intents, *args, **kwargs)
        self.voice_client = None
        self.config = config
        self._recording = False
        self._audio_buffer = bytearray()

    async def on_ready(self):
        logger.info(f"Bot ready! {self.user.name} ({self.user.id})")
        logger.info("Use !join to connect to voice channel, !leave to disconnect")

    async def on_message(self, message):
        if message.author == self.user:
            return
        await self.process_commands(message)

    async def join_voice_channel(self, channel):
        if self.voice_client:
            await self.voice_client.disconnect()
        self.voice_client = await channel.connect()
        logger.info(f"Joined voice channel: {channel.name}")

    def convert_discord_audio_to_parakeet(self, audio_bytes):
        """Convert Discord 48kHz stereo PCM to 16kHz mono float32 for Parakeet."""
        try:
            # Discord audio is 48kHz, stereo, 16-bit PCM
            # Convert bytes to int16 numpy array
            audio_np = np.frombuffer(audio_bytes, dtype=np.int16)

            # Stereo to mono: average left and right channels
            audio_np = audio_np.reshape(-1, 2).mean(axis=1).astype(np.int16)

            # Resample 48kHz to 16kHz (divide by 3)
            audio_16k = audio_np[::3]

            # Convert int16 to float32 (normalize to [-1.0, 1.0])
            audio_float = audio_16k.astype(np.float32) / 32768.0

            return audio_float
        except Exception as e:
            logger.error(f"Audio conversion error: {e}")
            return None

    async def record_audio(self, duration=5):
        """Record audio from voice channel for specified duration."""
        if not self.voice_client:
            logger.warning("Not in voice channel")
            return None

        self._recording = True
        self._audio_buffer = bytearray()

        logger.info(f"Recording for {duration} seconds...")
        start_time = asyncio.get_event_loop().time()

        while self._recording and (asyncio.get_event_loop().time() - start_time) < duration:
            try:
                # Try to get audio packet (non-blocking)
                packet = await asyncio.wait_for(
                    self.voice_client.receive(),
                    timeout=0.1
                )
                if packet and hasattr(packet, 'data'):
                    self._audio_buffer.extend(packet.data)
            except asyncio.TimeoutError:
                continue
            except Exception as e:
                logger.debug(f"Recv error: {e}")
                continue

        self._recording = False
        audio_data = bytes(self._audio_buffer)
        logger.info(f"Recorded {len(audio_data)} bytes")
        return audio_data

    async def process_voice_command(self, ctx):
        """Record, transcribe, get LLM response, and speak."""
        await ctx.send("🎙️ Listening... (speak now)")

        # Record audio
        start_time = asyncio.get_event_loop().time()
        audio_bytes = await self.record_audio(duration=5)
        record_time = asyncio.get_event_loop().time() - start_time

        if not audio_bytes or len(audio_bytes) < 1000:
            await ctx.send("❌ No audio captured (too quiet or not in voice channel)")
            return

        await ctx.send(f"📝 Transcribing ({len(audio_bytes)} bytes, {record_time:.1f}s)...")

        # Convert audio format
        audio_float = self.convert_discord_audio_to_parakeet(audio_bytes)
        if audio_float is None:
            await ctx.send("❌ Audio conversion failed")
            return

        # Transcribe with Parakeet
        transcribe_start = asyncio.get_event_loop().time()
        try:
            # Run transcription in thread pool (it's CPU intensive)
            loop = asyncio.get_event_loop()
            text = await loop.run_in_executor(
                None,
                lambda: parakeet_asr.transcribe(audio_float)
            )
            transcribe_time = asyncio.get_event_loop().time() - transcribe_start
        except Exception as e:
            logger.error(f"Transcription error: {e}")
            await ctx.send(f"❌ Transcription failed: {e}")
            return

        if not text or not text.strip():
            await ctx.send("❌ No speech detected")
            return

        await ctx.send(f"👤 You said: \"{text}\" ({transcribe_time:.1f}s)")

        # Get LLM response
        llm_start = asyncio.get_event_loop().time()
        response = ollama.generate(text)
        llm_time = asyncio.get_event_loop().time() - llm_start

        if not response:
            await ctx.send("❌ LLM failed to respond")
            return

        await ctx.send(f"🤖 GLaDOS: \"{response}\" ({llm_time:.1f}s)")

        # Synthesize and speak
        tts_start = asyncio.get_event_loop().time()
        audio = await http_tts.synthesize(response)
        tts_time = asyncio.get_event_loop().time() - tts_start

        if audio:
            await self.play_audio(audio)
            total_time = record_time + transcribe_time + llm_time + tts_time
            await ctx.send(f"⏱️ Total latency: {total_time:.1f}s (rec: {record_time:.1f}, stt: {transcribe_time:.1f}, llm: {llm_time:.1f}, tts: {tts_time:.1f})")
        else:
            await ctx.send("❌ TTS failed")

    async def play_audio(self, audio_bytes):
        """Play audio in voice channel."""
        if not self.voice_client:
            logger.warning("Not connected to voice channel")
            return False

        if audio_bytes[:4] == b'RIFF':
            suffix = '.wav'
        else:
            suffix = '.mp3'

        # Create a temp file for FFmpeg
        with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as temp:
            temp.write(audio_bytes)
            temp_path = temp.name

        try:
            source = discord.FFmpegPCMAudio(temp_path)
            if self.voice_client.is_playing():
                self.voice_client.stop()
            self.voice_client.play(source)

            # Wait for playback to finish
            while self.voice_client.is_playing():
                await asyncio.sleep(0.1)
            return True
        except Exception as e:
            logger.error(f"Error playing audio: {e}")
            return False
        finally:
            try:
                os.unlink(temp_path)
            except OSError:
                pass


bot = VoiceBot()


@bot.command(name='leave')
async def leave(ctx):
    """Leave voice channel."""
    if bot.voice_client:
        await bot.voice_client.disconnect()
        bot.voice_client = None
    await ctx.send("Left voice channel.")


@bot.command(name='join')
async def join(ctx):
    """Join voice channel."""
    if not ctx.author.voice:
        await ctx.send("You need to be in a voice channel!")
        return
    channel = ctx.author.voice.channel
    await bot.join_voice_channel(channel)
    await ctx.send(f"Joined {channel.name}!")


@bot.command(name='test')
async def test(ctx, *, text="Hello! This is a test."):
    """Test TTS."""
    if not bot.voice_client:
        await ctx.send("Not in voice channel! Use !join first.")
        return

    await ctx.send(f"🎙️ Saying: {text}")
    audio = await http_tts.synthesize(text)
    if audio:
        success = await bot.play_audio(audio)
        if not success:
            await ctx.send("Failed to play audio.")
    else:
        await ctx.send("TTS error.")


@bot.command(name='say')
async def say(ctx, *, text):
    """Say text using TTS."""
    await test(ctx, text=text)


@bot.command(name='listen')
async def listen(ctx):
    """Record voice for 5 seconds, transcribe, and respond."""
    if not bot.voice_client:
        await ctx.send("Not in voice channel! Use !join first.")
        return

    if not parakeet_asr:
        await ctx.send("❌ Parakeet ASR not available. Check GLaDOS installation.")
        return

    await bot.process_voice_command(ctx)


@bot.command(name='ask')
async def ask(ctx, *, question):
    """Ask the LLM something (text only, for now)."""
    await ctx.send("🤔 Thinking...")
    response = ollama.generate(question)
    if response:
        await ctx.send(f"💬 {response}")
        # Also speak it if in voice channel
        if bot.voice_client:
            audio = await http_tts.synthesize(response)
            if audio:
                await bot.play_audio(audio)
    else:
        await ctx.send("Failed to get response.")


async def main():
    token = config['discord']['token']
    if token.startswith("YOUR_"):
        logger.error("Configure Discord token in config.yaml!")
        return

    logger.info("Starting Discord bot...")
    await bot.start(token)


if __name__ == '__main__':
    asyncio.run(main())
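The conversion done by `convert_discord_audio_to_parakeet` above (48 kHz stereo int16 to 16 kHz mono float32) can be sanity-checked without numpy; a pure-Python sketch of the same three steps on synthetic audio:

```python
import struct

# 20 ms of fake Discord audio: 48 kHz, stereo, 16-bit little-endian PCM.
n_frames = 960  # 48000 frames/s * 0.02 s
samples = []
for _ in range(n_frames):
    samples.extend([1000, 3000])  # constant left/right channel values
pcm = struct.pack(f"<{len(samples)}h", *samples)

# Step 1: stereo -> mono by averaging channels (like .reshape(-1, 2).mean(axis=1)).
ints = struct.unpack(f"<{len(pcm) // 2}h", pcm)
mono = [(ints[i] + ints[i + 1]) // 2 for i in range(0, len(ints), 2)]

# Step 2: 48 kHz -> 16 kHz by keeping every third sample (like audio_np[::3]).
mono_16k = mono[::3]

# Step 3: int16 -> float in [-1.0, 1.0) (like .astype(np.float32) / 32768.0).
floats = [s / 32768.0 for s in mono_16k]

print(len(mono), len(mono_16k), floats[0])
```

Note that plain decimation like this aliases any content above 8 kHz; it matches what `main.py` does, but a proper resampler would low-pass filter first.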
133
docker/discord-voice-bot/openclaw_mcp_server.py
Normal file
@@ -0,0 +1,133 @@
from __future__ import annotations

import json
import os
import sys
from pathlib import Path
from typing import Any

import logging

from loguru import logger
from mcp.server.fastmcp import FastMCP

# Disable loguru logging for stdio transport
logger.remove()
logging.getLogger().setLevel(logging.CRITICAL)

mcp = FastMCP("openclaw")


@mcp.tool()
def read_file(path: str) -> str:
    """Read a file from the OpenClaw workspace."""
    try:
        file_path = Path(path)
        if not file_path.is_absolute():
            # Use OpenClaw workspace as base
            workspace = Path("C:\\Users\\admin\\.openclaw\\workspace")
            file_path = workspace / file_path

        if not file_path.exists():
            return json.dumps({"error": f"File not found: {path}"})

        content = file_path.read_text(encoding="utf-8", errors="replace")
        # Limit response size
        if len(content) > 50000:
            content = content[:50000] + "\n... [truncated]"

        return json.dumps({"content": content, "path": str(file_path)})
    except Exception as e:
        return json.dumps({"error": str(e)})


@mcp.tool()
def write_file(path: str, content: str) -> str:
    """Write content to a file in the OpenClaw workspace."""
    try:
        file_path = Path(path)
        if not file_path.is_absolute():
            workspace = Path("C:\\Users\\admin\\.openclaw\\workspace")
            file_path = workspace / file_path

        file_path.parent.mkdir(parents=True, exist_ok=True)

        with open(file_path, "w", encoding="utf-8") as f:
            f.write(content)

        return json.dumps({
            "status": "written",
            "path": str(file_path),
            "bytes": len(content.encode("utf-8"))
        })
    except Exception as e:
        return json.dumps({"error": str(e)})


@mcp.tool()
def exec_command(command: str) -> str:
    """Execute a shell command in the OpenClaw workspace."""
    try:
        import subprocess

        workspace = Path("C:\\Users\\admin\\.openclaw\\workspace")

        result = subprocess.run(
            command,
            shell=True,
            capture_output=True,
            text=True,
            timeout=30,
            cwd=str(workspace)
        )

        return json.dumps({
            "stdout": result.stdout[:10000],
            "stderr": result.stderr[:10000],
            "exit_code": result.returncode
        })
    except Exception as e:
        return json.dumps({"error": str(e)})


@mcp.tool()
def list_files(directory: str = ".") -> str:
    """List files in the OpenClaw workspace."""
    try:
        workspace = Path("C:\\Users\\admin\\.openclaw\\workspace")
        target_dir = workspace / directory

        if not target_dir.exists():
            return json.dumps({"error": f"Directory not found: {directory}"})

        files = []
        for f in target_dir.iterdir():
            if f.is_file():
                files.append(str(f.relative_to(workspace)))

        return json.dumps({
            "directory": directory,
            "files": files[:100],
            "count": len(files)
        })
    except Exception as e:
        return json.dumps({"error": str(e)})


@mcp.tool()
def get_status() -> str:
    """Get OpenClaw status and available tools."""
    return json.dumps({
        "status": "running",
        "tools": ["read_file", "write_file", "exec_command", "list_files", "get_status"],
        "workspace": "C:\\Users\\admin\\.openclaw\\workspace"
    })


def main() -> None:
    """Run the MCP server."""
    mcp.run()


if __name__ == "__main__":
    main()
7
docker/discord-voice-bot/requirements.txt
Normal file
@@ -0,0 +1,7 @@
discord.py[voice]>=2.5.0
requests>=2.31.0
numpy>=1.24.0
wyoming>=1.6.0
pyyaml>=6.0
PyNaCl>=1.5.0
mcp>=1.6.0
1
docker/discord-voice-bot/start.bat
Normal file
@@ -0,0 +1 @@
python main.py
49
docker/discord-voice-bot/test_mcp_client.py
Normal file
@@ -0,0 +1,49 @@
"""Test MCP client for OpenClaw server."""

import asyncio
import sys
from pathlib import Path

sys.path.insert(0, str(Path(__file__).parent))

try:
    from mcp import ClientSession
    from mcp.client.stdio import stdio_client, StdioServerParameters
except ImportError:
    print("Install mcp: pip install mcp")
    sys.exit(1)


async def test_mcp():
    """Test the OpenClaw MCP server."""
    params = StdioServerParameters(
        command="python",
        args=["openclaw_mcp_server.py"]
    )

    print("Connecting to OpenClaw MCP server...")

    async with stdio_client(params) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()
            print("Connected! ✅")

            # Get tools
            tools_result = await session.list_tools()
            print(f"\nTools available: {len(tools_result.tools)}")
            for tool in tools_result.tools:
                print(f"  - {tool.name}: {tool.description}")

            # Test list_files
            print("\nTesting list_files...")
            result = await session.call_tool("list_files", {"directory": "."})
            print(f"Result: {result}")

            # Test get_status
            print("\nTesting get_status...")
            result = await session.call_tool("get_status", {})
            print(f"Result: {result}")


if __name__ == "__main__":
    asyncio.run(test_mcp())