Major refactor: Multi-user Chrome MCP extension with remote server architecture

This commit is contained in:
nasir@endelospay.com
2025-08-21 20:09:57 +05:00
parent d97cad1736
commit 5d869f6a7c
125 changed files with 16249 additions and 11906 deletions


@@ -0,0 +1,338 @@
# Multi-User Chrome Extension to LiveKit Agent Integration
This document explains how the system automatically spawns LiveKit agents for each Chrome extension user connection, creating a seamless multi-user voice automation experience.
## Overview
The system now automatically creates a dedicated LiveKit agent for each user who installs and connects the Chrome extension. Each user gets:
- **Unique Random User ID** - Generated by Chrome extension and consistent across all components
- **Dedicated LiveKit Agent** - Automatically started for each user with the same user ID
- **Isolated Voice Room** - Each user gets their own LiveKit room (`mcp-chrome-user-{userId}`)
- **Session-Based Routing** - Voice commands routed to correct user's Chrome extension
- **Complete User ID Consistency** - Same user ID flows through Chrome → Server → Agent → Back to Chrome
## Architecture Flow
```
Chrome Extension (User 1) → Random User ID → LiveKit Agent (Room: mcp-chrome-user-{userId})
Chrome Extension (User 2) → Random User ID → LiveKit Agent (Room: mcp-chrome-user-{userId})
Chrome Extension (User 3) → Random User ID → LiveKit Agent (Room: mcp-chrome-user-{userId})
Remote Server (Session Manager)
Connection Router & LiveKit Agent Manager
```
## How It Works
### 1. Chrome Extension Connection
When a user installs and connects the Chrome extension:
```javascript
// Chrome extension generates unique user ID
const userId = `user_${Date.now()}_${Math.random().toString(36).substring(2, 15)}`;
// Chrome extension connects to ws://localhost:3001/chrome with user ID
const connectionInfo = {
  type: 'connection_info',
  userId: userId, // Chrome extension provides its own user ID
  userAgent: navigator.userAgent,
  timestamp: Date.now(),
  extensionId: chrome.runtime.id,
};
// Remote server receives and uses the Chrome extension's user ID
// Session created with user-provided ID: session_user_1703123456_abc123
```
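On the receiving side, the handling described in the comments above could look roughly like the following sketch. It is written in Python for brevity, and the function name `session_id_for`, the validation pattern, and the error handling are assumptions for illustration, not the project's code. The only detail taken from this document is that the session ID is the user ID prefixed with `session_`.

```python
import re

# Assumed shape of extension-generated IDs: user_<timestamp>_<base36 suffix>.
USER_ID_PATTERN = re.compile(r"^user_\d+_[a-z0-9]*$")


def session_id_for(connection_info: dict) -> str:
    """Validate a connection_info message and derive its session ID."""
    if connection_info.get("type") != "connection_info":
        raise ValueError("expected a connection_info message")
    user_id = connection_info.get("userId", "")
    if not USER_ID_PATTERN.match(user_id):
        raise ValueError(f"malformed user ID: {user_id!r}")
    # Session IDs in this document are the user ID with a "session_" prefix.
    return f"session_{user_id}"
```

Rejecting malformed IDs early matters here because the ID is chosen by the client and later reused as part of a room name.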
### 2. Manual LiveKit Agent Management
LiveKit agents are no longer started automatically. They should be started manually when needed:
```typescript
// LiveKit Agent Manager can spawn agent process with user ID when requested
const roomName = `mcp-chrome-user-${userId}`;
const agentProcess = spawn('python', ['livekit_agent.py', 'start'], {
  env: {
    ...process.env,
    CHROME_USER_ID: userId, // Pass user ID to LiveKit agent
    LIVEKIT_URL: this.liveKitConfig.livekit_url,
    LIVEKIT_API_KEY: this.liveKitConfig.api_key,
    LIVEKIT_API_SECRET: this.liveKitConfig.api_secret,
    MCP_SERVER_URL: 'http://localhost:3001/mcp',
  },
});
```
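On the agent side, the spawned process has to read these variables back. The sketch below shows one way that might be done. The function name `agent_settings_from_env` and the returned dictionary layout are hypothetical; only the variable names, the room prefix, and the default MCP URL come from this document.

```python
import os


def agent_settings_from_env(env=None) -> dict:
    """Collect the settings a spawned agent needs from its environment."""
    env = os.environ if env is None else env
    user_id = env.get("CHROME_USER_ID")
    if not user_id:
        # Without a user ID the agent cannot know which room to join.
        raise RuntimeError("CHROME_USER_ID is not set; cannot pick a room")
    return {
        "user_id": user_id,
        "room_name": f"mcp-chrome-user-{user_id}",
        "livekit_url": env.get("LIVEKIT_URL"),
        "mcp_server_url": env.get("MCP_SERVER_URL", "http://localhost:3001/mcp"),
    }
```

Accepting an optional `env` mapping keeps the function easy to exercise in tests without touching the real process environment.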
### 3. User-Specific Voice Room
Each user gets their own LiveKit room:
```
User 1 → Room: mcp-chrome-user-user_1703123456_abc123
User 2 → Room: mcp-chrome-user-user_1703123457_def456
User 3 → Room: mcp-chrome-user-user_1703123458_ghi789
```
### 4. Session-Based Command Routing with User ID
Voice commands are routed to the correct Chrome extension using user ID:
```python
# LiveKit agent includes user ID in MCP requests
async def search_google(self, context: RunContext, query: str):
    # MCP client automatically includes user ID in headers
    result = await self.mcp_client._search_google_mcp(query)
    return result
```
```typescript
// Remote server routes commands based on user ID
const result = await this.sendToExtensions(message, sessionId, userId);
// Connection router finds the correct Chrome extension by user ID
```
```
LiveKit Agent (User 1) → [User ID: user_123_abc] → Remote Server → Chrome Extension (User 1)
LiveKit Agent (User 2) → [User ID: user_456_def] → Remote Server → Chrome Extension (User 2)
LiveKit Agent (User 3) → [User ID: user_789_ghi] → Remote Server → Chrome Extension (User 3)
```
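The header-based routing shown above can be sketched end to end as follows. This document does not name the header that carries the user ID, so `X-Chrome-User-Id` is a placeholder, and both helper functions are hypothetical illustrations, not the project's API.

```python
USER_ID_HEADER = "X-Chrome-User-Id"  # hypothetical header name


def with_user_id(headers: dict, user_id: str) -> dict:
    """Agent side: return a copy of the headers that carries the user ID."""
    return {**headers, USER_ID_HEADER: user_id}


def pick_extension(headers: dict, extensions_by_user: dict):
    """Server side: return the extension connection registered for this user."""
    user_id = headers.get(USER_ID_HEADER)
    if user_id is None:
        raise LookupError("request carries no user ID")
    try:
        return extensions_by_user[user_id]
    except KeyError:
        raise LookupError(f"no Chrome extension connected for {user_id}") from None
```

Failing loudly when the user is unknown, instead of falling back to some other connection, is what keeps one user's voice command from driving another user's browser.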
## Key Components
### LiveKitAgentManager
**Location**: `app/remote-server/src/server/livekit-agent-manager.ts`
**Features**:
- Automatic agent spawning on Chrome connection
- Process management and monitoring
- Agent cleanup on disconnection
- Room name generation based on user ID
### Enhanced ChromeTools
**Location**: `app/remote-server/src/server/chrome-tools.ts`
**Features**:
- Integrated LiveKit agent management
- Automatic agent startup in `registerExtension()`
- Automatic agent shutdown in `unregisterExtension()`
- Session-based routing with LiveKit context
### Enhanced LiveKit Agent
**Location**: `agent-livekit/livekit_agent.py`
**Features**:
- Room name parsing to extract Chrome user ID
- Chrome user session creation
- User-specific console logging
- Command line room name support
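
The room name parsing listed above is the inverse of room name generation. A minimal sketch, assuming the `mcp-chrome-user-` prefix shown throughout this document (the function names are hypothetical and are not taken from `livekit_agent.py`):

```python
from typing import Optional

ROOM_PREFIX = "mcp-chrome-user-"


def room_name_for(user_id: str) -> str:
    """Build the per-user room name from a Chrome user ID."""
    return f"{ROOM_PREFIX}{user_id}"


def user_id_from_room(room_name: str) -> Optional[str]:
    """Recover the Chrome user ID from a room name, or None if it does not match."""
    if not room_name.startswith(ROOM_PREFIX):
        return None
    user_id = room_name[len(ROOM_PREFIX):]
    # An empty remainder means the room name was only the prefix.
    return user_id or None
```

Because user IDs themselves start with `user_`, a full room name contains `user-user_`; stripping only the fixed prefix avoids accidentally cutting into the ID.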
## Console Logging
### When Chrome Extension Connects:
```
🔗 Chrome extension connected - User: user_1703123456_abc123, Session: session_user_1703123456_abc123
🚀 Starting LiveKit agent for user: user_1703123456_abc123
✅ LiveKit agent started successfully for user user_1703123456_abc123
```
### When LiveKit Agent Starts:
```
============================================================
🔗 NEW USER SESSION CONNECTED
============================================================
👤 User ID: user_1703123456_abc123
🆔 Session ID: session_user_1703123456_abc123
🏠 Room Name: mcp-chrome-user-user_1703123456_abc123
🎭 Participant: chrome_user_user_1703123456_abc123
⏰ Connected At: 1703123456.789
📊 Total Active Sessions: 1
============================================================
🔗 Detected Chrome user ID from room name: user_1703123456_abc123
✅ LiveKit agent connected to Chrome user: user_1703123456_abc123
```
### When User Issues Voice Commands:
```
🌐 [Session: session_user_1703123456_abc123] Navigation to: https://google.com
✅ [Session: session_user_1703123456_abc123] Navigation completed
🔍 [Session: session_user_1703123456_abc123] Google search: 'python programming'
✅ [Session: session_user_1703123456_abc123] Search completed
```
### When Chrome Extension Disconnects:
```
🔌 Chrome extension disconnected
🛑 Stopping LiveKit agent for user: user_1703123456_abc123
✅ LiveKit agent stopped for user user_1703123456_abc123
```
## Setup Instructions
### 1. Start Remote Server
```bash
cd app/remote-server
npm start
```
### 2. Install Chrome Extensions (Multiple Users)
Each user:
1. Open Chrome → Extensions → Developer mode ON
2. Click "Load unpacked"
3. Select: `app/chrome-extension/.output/chrome-mv3/`
4. Extension automatically connects and gets unique user ID
### 3. Configure Cherry Studio (Each User)
Each user adds to their Cherry Studio:
```json
{
  "mcpServers": {
    "chrome-mcp-remote-server": {
      "type": "streamableHttp",
      "url": "http://localhost:3001/mcp"
    }
  }
}
```
### 4. Join LiveKit Rooms (Each User)
Each user joins their specific room:
- User 1: `mcp-chrome-user-user_1703123456_abc123`
- User 2: `mcp-chrome-user-user_1703123457_def456`
- User 3: `mcp-chrome-user-user_1703123458_ghi789`
## Testing
### Test Multi-User Integration:
```bash
cd app/remote-server
node test-multi-user-livekit.js
```
This test:
1. Simulates multiple Chrome extension connections
2. Verifies unique user ID generation
3. Checks LiveKit agent spawning
4. Tests session isolation
5. Validates room naming
### Expected Test Output:
```
👤 User 1: Chrome extension connected
📋 User 1: Received session info: { userId: "user_...", sessionId: "session_..." }
🚀 User 1: LiveKit agent should be starting for room: mcp-chrome-user-user_...
👤 User 2: Chrome extension connected
📋 User 2: Received session info: { userId: "user_...", sessionId: "session_..." }
🚀 User 2: LiveKit agent should be starting for room: mcp-chrome-user-user_...
✅ Session isolation: PASS
✅ User ID isolation: PASS
✅ Room isolation: PASS
```
## Benefits
### 1. **Zero Configuration**
- Users just install Chrome extension
- LiveKit agents start automatically
- No manual room setup required
### 2. **Complete Isolation**
- Each user has dedicated agent
- Separate voice rooms
- Independent command processing
### 3. **Scalable Architecture**
- Supports many concurrent users, each served by its own agent process
- Automatic resource management
- Process cleanup on disconnect
### 4. **Session Persistence**
- User sessions tracked across connections
- Automatic reconnection handling
- State management per user
## Monitoring
### Agent Statistics:
```javascript
// Get LiveKit agent stats
const stats = chromeTools.getLiveKitAgentStats();
console.log(stats);
// Output: { totalAgents: 3, runningAgents: 3, startingAgents: 0, ... }
```
### Active Agents:
```javascript
// Get all active agents
const agents = chromeTools.getAllActiveLiveKitAgents();
agents.forEach((agent) => {
  console.log(`User: ${agent.userId}, Room: ${agent.roomName}, Status: ${agent.status}`);
});
```
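For readers wondering how the counters in the statistics output relate to the agent list, the following sketch derives one from the other. The status strings `"running"` and `"starting"` are assumptions inferred from the field names `runningAgents` and `startingAgents`; the real manager may track more states.

```python
from collections import Counter


def summarize_agents(agents: list) -> dict:
    """Aggregate per-agent status values into summary counters."""
    counts = Counter(agent["status"] for agent in agents)
    return {
        "totalAgents": len(agents),
        "runningAgents": counts["running"],
        "startingAgents": counts["starting"],
    }
```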
## Troubleshooting
### Common Issues:
1. **LiveKit Agent Not Starting**
   - Check Python environment in `agent-livekit/`
   - Verify LiveKit server is running
   - Check agent process logs
2. **Multiple Agents for Same User**
   - Check user ID generation uniqueness
   - Verify session cleanup on disconnect
3. **Voice Commands Not Working**
   - Verify user is in correct LiveKit room
   - Check session routing in logs
   - Confirm Chrome extension connection
### Debug Commands:
```bash
# Check agent processes
ps aux | grep livekit_agent
# Monitor remote server logs
cd app/remote-server && npm start
# Test Chrome connection
node test-multi-user-livekit.js
```
The system now provides a complete multi-user voice automation experience where each Chrome extension user automatically gets their own dedicated LiveKit agent! 🎉


@@ -0,0 +1,222 @@
# Multi-User Session Management
This document explains how the Chrome MCP extension and LiveKit agent handle multiple users with session-based isolation.
## Overview
The system now supports multiple users connecting simultaneously to the same MCP server with proper session isolation. Each connection gets a unique session ID and user ID, ensuring that commands from different users don't interfere with each other.
## Key Features
### 1. Automatic Session ID Generation
- **No Authentication Required**: Users don't need to authenticate
- **Random Session IDs**: Each connection gets a unique session ID
- **User Isolation**: Each user's commands are routed to their specific Chrome extension
### 2. Session Management Components
#### SessionManager (`app/remote-server/src/server/session-manager.ts`)
- Tracks all active connections and sessions
- Manages user-to-session mappings
- Handles session cleanup and expiration
- Provides session statistics
#### ConnectionRouter (`app/remote-server/src/server/connection-router.ts`)
- Routes messages to the correct Chrome extension based on session ID
- Implements load balancing for general requests
- Supports different routing strategies (newest, oldest, most active)
#### ChromeTools (Enhanced)
- Integrates with SessionManager and ConnectionRouter
- Provides session-aware tool calling
- Supports multi-user command routing
## How It Works
### 1. Connection Flow
```
1. Chrome Extension connects to ws://localhost:3001/chrome
2. Server generates random user ID: user_{timestamp}_{random}
3. Server creates session with unique session ID
4. Extension sends connection_info message
5. Server responds with session_info containing:
   - userId
   - sessionId
   - connectionId
6. Extension stores session info for future requests
```
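Steps 2 to 5 of the flow above can be sketched as two small helpers. This is an illustration under assumptions: the random suffix uses `secrets.token_hex` instead of whatever the server actually uses, the `type` field on the response is a guess, and both function names are hypothetical. The ID formats (`user_{timestamp}_{random}` and `session_` plus the user ID) follow the examples in this document.

```python
import secrets
import time


def new_user_id(now=None) -> str:
    """Generate an ID of the form user_{timestamp}_{random}."""
    timestamp = int(time.time() if now is None else now)
    return f"user_{timestamp}_{secrets.token_hex(4)}"


def build_session_info(user_id: str, connection_id: str) -> dict:
    """Build the session_info payload sent back to the extension."""
    return {
        "type": "session_info",  # assumed message discriminator
        "userId": user_id,
        "sessionId": f"session_{user_id}",
        "connectionId": connection_id,
    }
```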
### 2. Message Routing
```
1. MCP client sends tool request to server
2. Server determines target session (by session ID, user ID, or load balancing)
3. ConnectionRouter finds appropriate Chrome extension connection
4. Message is sent to specific extension instance
5. Response is routed back through the same session
```
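Step 2 of the routing flow (session ID first, then user ID, then load balancing) might be implemented along these lines. The `Connection` fields, the function name `resolve_target`, and the strategy strings are assumptions made for this sketch; only the precedence order and the three strategies (newest, oldest, most active) come from this document.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    connection_id: str
    user_id: str
    session_id: str
    connected_at: float
    request_count: int = 0


def resolve_target(connections, session_id=None, user_id=None, strategy="newest"):
    """Pick the connection that should receive a tool request."""
    if session_id is not None:
        # An explicit session never falls back to another user's connection.
        return next((c for c in connections if c.session_id == session_id), None)
    if user_id is not None:
        return next((c for c in connections if c.user_id == user_id), None)
    if not connections:
        return None
    if strategy == "newest":
        return max(connections, key=lambda c: c.connected_at)
    if strategy == "oldest":
        return min(connections, key=lambda c: c.connected_at)
    if strategy == "most_active":
        return max(connections, key=lambda c: c.request_count)
    raise ValueError(f"unknown routing strategy: {strategy}")
```

Returning `None` for an unmatched session or user ID, instead of load balancing, is a deliberate choice in this sketch: a request addressed to a specific user should fail visibly when that user is not connected.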
### 3. Session Isolation
Each user session is completely isolated:
- **Separate Chrome Extension Instance**: Each user connects their own extension
- **Independent Command Queues**: Commands don't interfere between users
- **Session-Specific State**: Each session maintains its own state
- **Resource Isolation**: No shared resources between sessions
## Configuration
### Chrome Extension
No configuration needed - sessions are created automatically on connection.
### Remote Server
The server automatically handles multi-user sessions with these defaults:
- Session cleanup interval: 60 seconds
- Stale connection threshold: 5 minutes
- Maximum inactive time: 1 hour
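
How these three defaults interact is not spelled out here, so the sketch below is one plausible reading: a sweep runs every cleanup interval, marks sessions idle for five minutes as stale, and removes sessions idle for an hour. The function names and the `"active"`/`"stale"`/`"expired"` labels are hypothetical.

```python
CLEANUP_INTERVAL_SECONDS = 60        # how often a sweep would be scheduled
STALE_CONNECTION_SECONDS = 5 * 60    # idle time before a connection is stale
MAX_INACTIVE_SECONDS = 60 * 60       # idle time before a session is removed


def classify_session(last_activity: float, now: float) -> str:
    """Label a session by how long it has been idle."""
    idle = now - last_activity
    if idle >= MAX_INACTIVE_SECONDS:
        return "expired"
    if idle >= STALE_CONNECTION_SECONDS:
        return "stale"
    return "active"


def sweep(last_activity_by_session: dict, now: float) -> dict:
    """Drop expired sessions in place; return the state of the remaining ones."""
    states = {}
    # Copy the items so entries can be deleted while iterating.
    for session_id, last_activity in list(last_activity_by_session.items()):
        state = classify_session(last_activity, now)
        if state == "expired":
            del last_activity_by_session[session_id]
        else:
            states[session_id] = state
    return states
```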
### LiveKit Agent
Enhanced with multi-user support:
```yaml
# agent-livekit/livekit_config.yaml
livekit:
room:
user_room_prefix: 'mcp-chrome-user-'
agent:
session:
max_inactive_time: 3600 # seconds
cleanup_interval: 300 # seconds
max_concurrent_sessions: 50
```
## Usage Examples
### 1. Multiple Users with Chrome Extensions
Each user installs the Chrome extension and connects:
```javascript
// User 1's extension connects
// Gets: userId: "user_1703123456_abc123", sessionId: "session_user_1703123456_abc123"
// User 2's extension connects
// Gets: userId: "user_1703123457_def456", sessionId: "session_user_1703123457_def456"
```
### 2. Cherry Studio Configuration
Each user configures Cherry Studio with the same server URL:
```json
{
  "mcpServers": {
    "chrome-mcp-remote-server": {
      "type": "streamableHttp",
      "url": "http://localhost:3001/mcp",
      "description": "Remote Chrome MCP Server - Multi-User Support"
    }
  }
}
```
### 3. LiveKit Agent Sessions
Each user gets their own LiveKit room:
```python
# User 1 joins room: "mcp-chrome-user-user_1703123456_abc123"
# User 2 joins room: "mcp-chrome-user-user_1703123457_def456"
```
## Testing
Run the multi-user test script:
```bash
cd app/remote-server
node test-multi-user.js
```
This test:
1. Creates multiple simulated connections
2. Verifies unique session IDs
3. Tests message routing
4. Validates session isolation
## Monitoring
### Session Statistics
Get current session stats via the server API:
```javascript
// In ChromeTools
const stats = chromeTools.getSessionStats();
console.log(stats);
// Output:
// {
// totalUsers: 3,
// totalSessions: 3,
// totalConnections: 3,
// activeConnections: 3,
// pendingRequests: 0
// }
```
### Routing Statistics
```javascript
const routingStats = chromeTools.getRoutingStats();
console.log(routingStats);
```
### Connection Monitoring
The server logs all connection events:
- New connections with session info
- Message routing decisions
- Session cleanup events
- Connection state changes
## Troubleshooting
### Common Issues
1. **Sessions Not Isolated**
   - Check that each Chrome extension instance is running in a separate browser profile
   - Verify unique session IDs in server logs
2. **Commands Going to Wrong User**
   - Check session ID routing in ConnectionRouter
   - Verify message contains correct session context
3. **Session Cleanup Issues**
   - Monitor session cleanup logs
   - Adjust cleanup intervals if needed
### Debug Logging
Enable detailed logging in the remote server:
```javascript
// In server logs, look for:
// "🟢 [Chrome Extension] Connection registered"
// "📤 [Chrome Tools] Routed to connection"
// "🔧 [Chrome Tools] Calling tool with routing context"
```
## Architecture Benefits
1. **Scalability**: Supports many concurrent users
2. **Isolation**: Complete separation between user sessions
3. **Reliability**: Automatic cleanup and error recovery
4. **Simplicity**: No authentication complexity
5. **Flexibility**: Multiple routing strategies available
## Future Enhancements
- Persistent sessions across reconnections
- User preference storage
- Advanced load balancing algorithms
- Session sharing capabilities
- Performance metrics and analytics


@@ -50,6 +50,7 @@ Navigate to a URL with optional viewport control.
- `url` (string, required): URL to navigate to
- `newWindow` (boolean, optional): Create new window (default: false)
- `backgroundPage` (boolean, optional): Open URL in background page using full-size window that gets minimized. Creates window with proper dimensions first, then minimizes for background operation while maintaining web automation compatibility (default: false)
- `width` (number, optional): Viewport width in pixels (default: 1280)
- `height` (number, optional): Viewport height in pixels (default: 720)
@@ -64,6 +65,15 @@ Navigate to a URL with optional viewport control.
}
```
**Background Page Example**:
```json
{
  "url": "https://example.com",
  "backgroundPage": true
}
```
### `chrome_close_tabs`
Close specific tabs or windows.