Major refactor: Multi-user Chrome MCP extension with remote server architecture

nasir@endelospay.com
2025-08-21 20:09:57 +05:00
parent d97cad1736
commit 5d869f6a7c
125 changed files with 16249 additions and 11906 deletions

@@ -0,0 +1,338 @@
# Multi-User Chrome Extension to LiveKit Agent Integration
This document explains how the system automatically spawns LiveKit agents for each Chrome extension user connection, creating a seamless multi-user voice automation experience.
## Overview
The system now automatically creates a dedicated LiveKit agent for each user who installs and connects the Chrome extension. Each user gets:
- **Unique Random User ID** - Generated by the Chrome extension and kept consistent across all components
- **Dedicated LiveKit Agent** - Automatically started for each user with the same user ID
- **Isolated Voice Room** - Each user gets their own LiveKit room (`mcp-chrome-user-{userId}`)
- **Session-Based Routing** - Voice commands are routed to the correct user's Chrome extension
- **Complete User ID Consistency** - The same user ID flows through Chrome → Server → Agent → Back to Chrome
## Architecture Flow
```
Chrome Extension (User 1) → Random User ID → LiveKit Agent (Room: mcp-chrome-user-{userId})
Chrome Extension (User 2) → Random User ID → LiveKit Agent (Room: mcp-chrome-user-{userId})
Chrome Extension (User 3) → Random User ID → LiveKit Agent (Room: mcp-chrome-user-{userId})
                    ↓
        Remote Server (Session Manager)
                    ↓
    Connection Router & LiveKit Agent Manager
```
## How It Works
### 1. Chrome Extension Connection
When a user installs and connects the Chrome extension:
```javascript
// Chrome extension generates unique user ID
const userId = `user_${Date.now()}_${Math.random().toString(36).substring(2, 15)}`;
// Chrome extension connects to ws://localhost:3001/chrome with user ID
const connectionInfo = {
  type: 'connection_info',
  userId: userId, // Chrome extension provides its own user ID
  userAgent: navigator.userAgent,
  timestamp: Date.now(),
  extensionId: chrome.runtime.id,
};
// Remote server receives and uses the Chrome extension's user ID
// Session created with user-provided ID: session_user_1703123456_abc123
```
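On the receiving side, the handshake above amounts to parsing the `connection_info` message and deriving a session ID from the supplied user ID. The Python sketch below illustrates that step only. The helper name `session_from_connection_info`, the validation pattern, and the returned dictionary shape are assumptions made for illustration; the actual remote server is written in TypeScript and may validate differently.

```python
import json
import re

# Assumed shape of extension-generated IDs: "user_<timestamp>_<base36 suffix>".
USER_ID_PATTERN = re.compile(r"^user_\d+_[a-z0-9]+$")


def session_from_connection_info(raw_message: str) -> dict:
    """Parse a connection_info message and derive its session ID (hypothetical helper)."""
    message = json.loads(raw_message)
    if message.get("type") != "connection_info":
        raise ValueError("expected a connection_info message")

    user_id = message.get("userId", "")
    if not USER_ID_PATTERN.match(user_id):
        # Reject IDs that do not look like extension-generated ones.
        raise ValueError(f"malformed user ID: {user_id!r}")

    # The logs in this document show session IDs of the form "session_<userId>".
    return {"userId": user_id, "sessionId": f"session_{user_id}"}
```

Because the user ID is supplied by the client, checking its shape before using it in session and room names is a cheap safeguard.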
### 2. Manual LiveKit Agent Management
LiveKit agents are no longer started automatically. They should be started manually when needed:
```typescript
import { spawn } from 'child_process';

// LiveKit Agent Manager can spawn agent process with user ID when requested
// (runs inside the manager class, so `this.liveKitConfig` is available)
const roomName = `mcp-chrome-user-${userId}`;
const agentProcess = spawn('python', ['livekit_agent.py', 'start'], {
  env: {
    ...process.env,
    CHROME_USER_ID: userId, // Pass user ID to LiveKit agent
    LIVEKIT_URL: this.liveKitConfig.livekit_url,
    LIVEKIT_API_KEY: this.liveKitConfig.api_key,
    LIVEKIT_API_SECRET: this.liveKitConfig.api_secret,
    MCP_SERVER_URL: 'http://localhost:3001/mcp',
  },
});
```
### 3. User-Specific Voice Room
Each user gets their own LiveKit room:
```
User 1 → Room: mcp-chrome-user-user_1703123456_abc123
User 2 → Room: mcp-chrome-user-user_1703123457_def456
User 3 → Room: mcp-chrome-user-user_1703123458_ghi789
```
### 4. Session-Based Command Routing with User ID
Voice commands are routed to the correct Chrome extension using user ID:
```python
# LiveKit agent includes user ID in MCP requests
# (defined as a method on the agent class, so `self.mcp_client` is available)
async def search_google(self, context: RunContext, query: str):
    # MCP client automatically includes user ID in headers
    result = await self.mcp_client._search_google_mcp(query)
    return result
```
```typescript
// Remote server routes commands based on user ID
const result = await this.sendToExtensions(message, sessionId, userId);
// Connection router finds the correct Chrome extension by user ID
```
```
LiveKit Agent (User 1) → [User ID: user_123_abc] → Remote Server → Chrome Extension (User 1)
LiveKit Agent (User 2) → [User ID: user_456_def] → Remote Server → Chrome Extension (User 2)
LiveKit Agent (User 3) → [User ID: user_789_ghi] → Remote Server → Chrome Extension (User 3)
```
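The routing step can be pictured as a lookup table from user ID to that user's extension connection. The following is a minimal Python sketch of the idea, not the server's TypeScript implementation. `ConnectionRouter` and its methods are hypothetical names, and a plain callable stands in for the WebSocket send function.

```python
from typing import Any, Callable, Dict

Send = Callable[[dict], Any]  # stand-in for a WebSocket send function


class ConnectionRouter:
    """Maps each user ID to the send function of that user's extension (hypothetical)."""

    def __init__(self) -> None:
        self._connections: Dict[str, Send] = {}

    def register(self, user_id: str, send: Send) -> None:
        self._connections[user_id] = send

    def unregister(self, user_id: str) -> None:
        self._connections.pop(user_id, None)

    def route(self, user_id: str, message: dict) -> Any:
        send = self._connections.get(user_id)
        if send is None:
            # Fail loudly rather than falling back to another user's browser.
            raise LookupError(f"no Chrome extension connected for user {user_id}")
        return send(message)
```

Raising on an unknown user ID, instead of broadcasting or picking any available connection, is the design choice that keeps one user's voice command from reaching another user's browser.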
## Key Components
### LiveKitAgentManager
**Location**: `app/remote-server/src/server/livekit-agent-manager.ts`
**Features**:
- Automatic agent spawning on Chrome connection
- Process management and monitoring
- Agent cleanup on disconnection
- Room name generation based on user ID
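
The process-tracking and cleanup responsibilities listed above can be sketched as a small registry that keeps at most one agent per user. This Python sketch is illustrative only: `AgentRegistry`, `AgentRecord`, and the injected `launch` callback are hypothetical, and the real manager at the location above is TypeScript and spawns actual processes.

```python
from dataclasses import dataclass
from typing import Callable, Dict

StopFn = Callable[[], None]


@dataclass
class AgentRecord:
    user_id: str
    room_name: str
    status: str  # "running" or "stopped" in this sketch
    stop: StopFn  # callback that terminates the underlying process


class AgentRegistry:
    """Tracks at most one agent per user (hypothetical, in-memory only)."""

    def __init__(self, launch: Callable[[str, str], StopFn]) -> None:
        # launch(user_id, room_name) starts an agent and returns a stop callback.
        self._launch = launch
        self._agents: Dict[str, AgentRecord] = {}

    def start(self, user_id: str) -> AgentRecord:
        existing = self._agents.get(user_id)
        if existing is not None:
            return existing  # idempotent: never start a second agent for a user
        room_name = f"mcp-chrome-user-{user_id}"
        stop = self._launch(user_id, room_name)
        record = AgentRecord(user_id, room_name, "running", stop)
        self._agents[user_id] = record
        return record

    def stop(self, user_id: str) -> bool:
        record = self._agents.pop(user_id, None)
        if record is None:
            return False  # nothing to clean up
        record.stop()
        record.status = "stopped"
        return True
```

Making `start` idempotent is one way to avoid the "multiple agents for the same user" situation when an extension reconnects quickly.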
### Enhanced ChromeTools
**Location**: `app/remote-server/src/server/chrome-tools.ts`
**Features**:
- Integrated LiveKit agent management
- Automatic agent startup in `registerExtension()`
- Automatic agent shutdown in `unregisterExtension()`
- Session-based routing with LiveKit context
### Enhanced LiveKit Agent
**Location**: `agent-livekit/livekit_agent.py`
**Features**:
- Room name parsing to extract Chrome user ID
- Chrome user session creation
- User-specific console logging
- Command line room name support
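
Room name parsing reduces to stripping the `mcp-chrome-user-` prefix. A minimal sketch, assuming the agent may also fall back to the `CHROME_USER_ID` environment variable that the spawning process sets; the function name `resolve_chrome_user_id` and the precedence order are assumptions, not taken from `livekit_agent.py`.

```python
import os
from typing import Mapping, Optional

ROOM_PREFIX = "mcp-chrome-user-"


def resolve_chrome_user_id(
    room_name: str, env: Optional[Mapping[str, str]] = None
) -> Optional[str]:
    """Return the Chrome user ID for this agent, or None if it cannot be determined."""
    env = os.environ if env is None else env

    # Prefer the ID embedded in the room name.
    if room_name.startswith(ROOM_PREFIX):
        user_id = room_name[len(ROOM_PREFIX):]
        if user_id:
            return user_id

    # Assumed fallback: the variable passed in by the spawning process.
    return env.get("CHROME_USER_ID") or None
```

For example, `resolve_chrome_user_id("mcp-chrome-user-user_1703123456_abc123", {})` returns `"user_1703123456_abc123"`.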
## Console Logging
### When Chrome Extension Connects:
```
🔗 Chrome extension connected - User: user_1703123456_abc123, Session: session_user_1703123456_abc123
🚀 Starting LiveKit agent for user: user_1703123456_abc123
✅ LiveKit agent started successfully for user user_1703123456_abc123
```
### When LiveKit Agent Starts:
```
============================================================
🔗 NEW USER SESSION CONNECTED
============================================================
👤 User ID: user_1703123456_abc123
🆔 Session ID: session_user_1703123456_abc123
🏠 Room Name: mcp-chrome-user-user_1703123456_abc123
🎭 Participant: chrome_user_user_1703123456_abc123
⏰ Connected At: 1703123456.789
📊 Total Active Sessions: 1
============================================================
🔗 Detected Chrome user ID from room name: user_1703123456_abc123
✅ LiveKit agent connected to Chrome user: user_1703123456_abc123
```
### When User Issues Voice Commands:
```
🌐 [Session: session_user_1703123456_abc123] Navigation to: https://google.com
✅ [Session: session_user_1703123456_abc123] Navigation completed
🔍 [Session: session_user_1703123456_abc123] Google search: 'python programming'
✅ [Session: session_user_1703123456_abc123] Search completed
```
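One way to get the `[Session: ...]` prefix on every line without repeating it at each call site is Python's standard `logging.LoggerAdapter`. This is a sketch under assumptions: the agent may build its log lines differently, and `SessionLogAdapter`, `session_logger`, and the logger name `livekit_agent` are illustrative.

```python
import logging
from typing import Optional


class SessionLogAdapter(logging.LoggerAdapter):
    """Prefixes every message with the session ID stored in `extra`."""

    def process(self, msg, kwargs):
        return f"[Session: {self.extra['session_id']}] {msg}", kwargs


def session_logger(
    session_id: str, base: Optional[logging.Logger] = None
) -> logging.LoggerAdapter:
    """Build a logger bound to one user session (hypothetical helper)."""
    if base is None:
        base = logging.getLogger("livekit_agent")
    return SessionLogAdapter(base, {"session_id": session_id})
```

A call such as `session_logger("session_user_1703123456_abc123").info("Navigation completed")` then carries the session prefix automatically.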
### When Chrome Extension Disconnects:
```
🔌 Chrome extension disconnected
🛑 Stopping LiveKit agent for user: user_1703123456_abc123
✅ LiveKit agent stopped for user user_1703123456_abc123
```
## Setup Instructions
### 1. Start Remote Server
```bash
cd app/remote-server
npm start
```
### 2. Install Chrome Extensions (Multiple Users)
Each user:
1. Open Chrome → Extensions → Developer mode ON
2. Click "Load unpacked"
3. Select: `app/chrome-extension/.output/chrome-mv3/`
4. The extension connects automatically and generates its unique user ID
### 3. Configure Cherry Studio (Each User)
Each user adds the following to their Cherry Studio MCP configuration:
```json
{
  "mcpServers": {
    "chrome-mcp-remote-server": {
      "type": "streamableHttp",
      "url": "http://localhost:3001/mcp"
    }
  }
}
```
### 4. Join LiveKit Rooms (Each User)
Each user joins their specific room:
- User 1: `mcp-chrome-user-user_1703123456_abc123`
- User 2: `mcp-chrome-user-user_1703123457_def456`
- User 3: `mcp-chrome-user-user_1703123458_ghi789`
## Testing
### Test Multi-User Integration:
```bash
cd app/remote-server
node test-multi-user-livekit.js
```
This test:
1. Simulates multiple Chrome extension connections
2. Verifies unique user ID generation
3. Checks LiveKit agent spawning
4. Tests session isolation
5. Validates room naming
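
The isolation checks in steps 2, 4, and 5 boil down to uniqueness and naming assertions over the session info each simulated user receives. A minimal Python sketch of those checks follows; it does not reproduce `test-multi-user-livekit.js`, and the `roomName` field and the `check_isolation` helper are assumptions for illustration.

```python
from typing import Dict, List

ROOM_PREFIX = "mcp-chrome-user-"


def check_isolation(sessions: List[Dict[str, str]]) -> Dict[str, bool]:
    """Report whether user IDs, session IDs, and rooms are unique and well-formed."""

    def all_unique(values: List[str]) -> bool:
        return len(set(values)) == len(values)

    return {
        "Session isolation": all_unique([s["sessionId"] for s in sessions]),
        "User ID isolation": all_unique([s["userId"] for s in sessions]),
        "Room isolation": all_unique([s["roomName"] for s in sessions]),
        # Each room must follow the mcp-chrome-user-{userId} convention.
        "Room naming": all(s["roomName"] == ROOM_PREFIX + s["userId"] for s in sessions),
    }
```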
### Expected Test Output:
```
👤 User 1: Chrome extension connected
📋 User 1: Received session info: { userId: "user_...", sessionId: "session_..." }
🚀 User 1: LiveKit agent should be starting for room: mcp-chrome-user-user_...
👤 User 2: Chrome extension connected
📋 User 2: Received session info: { userId: "user_...", sessionId: "session_..." }
🚀 User 2: LiveKit agent should be starting for room: mcp-chrome-user-user_...
✅ Session isolation: PASS
✅ User ID isolation: PASS
✅ Room isolation: PASS
```
## Benefits
### 1. **Zero Configuration**
- Users just install Chrome extension
- LiveKit agents start automatically
- No manual room setup required
### 2. **Complete Isolation**
- Each user has dedicated agent
- Separate voice rooms
- Independent command processing
### 3. **Scalable Architecture**
- No fixed cap on concurrent users; each user adds one agent process, so capacity is bounded by host resources
- Automatic resource management
- Process cleanup on disconnect
### 4. **Session Persistence**
- User sessions tracked across connections
- Automatic reconnection handling
- State management per user
## Monitoring
### Agent Statistics:
```javascript
// Get LiveKit agent stats
const stats = chromeTools.getLiveKitAgentStats();
console.log(stats);
// Output: { totalAgents: 3, runningAgents: 3, startingAgents: 0, ... }
```
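The counters in that output can be derived by tallying agent statuses. A small Python sketch of the aggregation, assuming status strings `"running"` and `"starting"` (inferred from the field names, not confirmed by the source); fields elided by `...` in the output above are deliberately left out.

```python
from collections import Counter
from typing import Dict, Iterable


def summarize_agents(statuses: Iterable[str]) -> Dict[str, int]:
    """Tally agent statuses into the counters shown above (hypothetical helper)."""
    counts = Counter(statuses)
    return {
        "totalAgents": sum(counts.values()),
        "runningAgents": counts["running"],   # Counter returns 0 for missing keys
        "startingAgents": counts["starting"],
    }
```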
### Active Agents:
```javascript
// Get all active agents
const agents = chromeTools.getAllActiveLiveKitAgents();
agents.forEach((agent) => {
  console.log(`User: ${agent.userId}, Room: ${agent.roomName}, Status: ${agent.status}`);
});
```
## Troubleshooting
### Common Issues:
1. **LiveKit Agent Not Starting**
   - Check Python environment in `agent-livekit/`
   - Verify LiveKit server is running
   - Check agent process logs
2. **Multiple Agents for Same User**
   - Check user ID generation uniqueness
   - Verify session cleanup on disconnect
3. **Voice Commands Not Working**
   - Verify user is in correct LiveKit room
   - Check session routing in logs
   - Confirm Chrome extension connection
### Debug Commands:
```bash
# Check agent processes
ps aux | grep livekit_agent
# Monitor remote server logs
cd app/remote-server && npm start
# Test Chrome connection
node test-multi-user-livekit.js
```
The system now provides a complete multi-user voice automation experience where each Chrome extension user automatically gets their own dedicated LiveKit agent! 🎉