# MCP Chrome Remote Server
A remote server implementation for the MCP Chrome Bridge that allows external applications to control Chrome through **direct WebSocket connections**.
## 🚀 New Direct Connection Architecture
This server now supports **direct connections** from Chrome extensions, eliminating the need for native messaging hosts as intermediaries:
- **Cherry Studio** → **Remote Server** (via Streamable HTTP)
- **Chrome Extension** → **Remote Server** (via WebSocket)
- **No Native Server Required** for Chrome extension communication
### Benefits
- ✅ Eliminates 10-second timeout errors
- ✅ Faster response times
- ✅ Simplified architecture
- ✅ Better reliability
- ✅ Easier debugging
## Features
- **Remote Control**: Control Chrome browser remotely via WebSocket API
- **MCP Protocol**: Implements Model Context Protocol for tool-based interactions
- **HTTP Streaming**: Full support for MCP Streamable HTTP and SSE (Server-Sent Events)
- **Real-time Communication**: WebSocket-based real-time communication with Chrome extensions
- **RESTful Health Checks**: HTTP endpoints for monitoring server health
- **Extensible Architecture**: Easy to add new Chrome automation tools
- **Session Management**: Robust session handling for streaming connections
## Quick Start
### 1. Install Dependencies (from project root)
```bash
# Install all workspace dependencies
pnpm install
```
### 2. Build the Server
```bash
# From project root
npm run build:remote
# Or from this directory
npm run build
```
### 3. Start the Server
```bash
# From project root (recommended)
npm run start:server
# Or from this directory
npm run start:server
```
By default the server listens on port `3001` and is reachable locally at `http://localhost:3001`.
### 4. Verify Server is Running
You should see output like:
```
🚀 MCP Remote Server started successfully!
📡 Server running at: http://0.0.0.0:3001
🔌 WebSocket endpoint: ws://0.0.0.0:3001/ws/mcp
🔌 Chrome extension endpoint: ws://0.0.0.0:3001/chrome
🌊 Streaming HTTP endpoint: http://0.0.0.0:3001/mcp
📡 SSE endpoint: http://0.0.0.0:3001/sse
```
### 5. Test the Connection
```bash
# Test WebSocket connection
node test-client.js
# Test streaming HTTP connection
node test-tools-list.js
# Test SSE connection
node test-sse-client.js
# Test simple health check
node test-health.js
```
## Available Scripts
- `npm run start:server` - Build and start the production server
- `npm run start` - Start the server (requires pre-built dist/)
- `npm run dev` - Start development server with auto-reload
- `npm run build` - Build TypeScript to JavaScript
- `npm run test` - Run tests
- `npm run lint` - Run ESLint
- `npm run format` - Format code with Prettier
## Environment Variables
- `PORT` - Server port (default: 3001)
- `HOST` - Server host (default: 0.0.0.0)
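The defaults above could be resolved along these lines. This is a minimal sketch, not the server's actual startup code; `resolveServerConfig` is a hypothetical helper name.

```javascript
// Hypothetical helper: read PORT and HOST, falling back to the documented defaults.
function resolveServerConfig(env = process.env) {
  const port = Number(env.PORT ?? 3001);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  return { port, host: env.HOST ?? '0.0.0.0' };
}

// Example: override the port, keep the default host.
const config = resolveServerConfig({ PORT: '8080' });
console.log(`Would listen on http://${config.host}:${config.port}`);
```

Validating the port up front gives a clear error message instead of a failed bind later.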
## API Endpoints
### HTTP Endpoints
- `GET /health` - Health check endpoint
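A quick way to probe the health endpoint from a script is sketched below. The response body format is not documented here, so the sketch only looks at the HTTP status; `checkHealth` is a hypothetical helper and assumes a runtime with a global `fetch` (Node 18+).

```javascript
// Hypothetical helper: resolves to true when GET /health answers with a 2xx status.
async function checkHealth(baseUrl = 'http://localhost:3001', fetchImpl = fetch) {
  try {
    const response = await fetchImpl(new URL('/health', baseUrl));
    return response.ok;
  } catch {
    // Connection refused, DNS failure, and similar network errors.
    return false;
  }
}

// Usage:
// checkHealth().then((healthy) => console.log(healthy ? 'server is up' : 'server is not reachable'));
```

The `fetchImpl` parameter exists only so the helper can be exercised with a stub instead of a live server.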
### Streaming HTTP Endpoints (MCP Protocol)
- `POST /mcp` - Send MCP messages (initialization, tool calls, etc.)
- `GET /mcp` - Establish SSE stream for receiving responses (requires MCP-Session-ID header)
- `DELETE /mcp` - Terminate MCP session (requires MCP-Session-ID header)
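The session-scoped requests above differ only in method and headers. A small sketch of how a client might build the `fetch` options follows; `sessionRequest` is a hypothetical helper and the session ID shown is a placeholder.

```javascript
// Hypothetical helper: build fetch() options for GET /mcp or DELETE /mcp.
function sessionRequest(method, sessionId) {
  if (!sessionId) {
    throw new Error('A session ID from the initialize response is required');
  }
  const headers = { 'MCP-Session-ID': sessionId };
  // The GET request opens an SSE stream, so ask for that content type.
  if (method === 'GET') headers.Accept = 'text/event-stream';
  return { method, headers };
}

// Placeholder session ID for illustration only.
const openStream = sessionRequest('GET', 'example-session-id');
const closeSession = sessionRequest('DELETE', 'example-session-id');
console.log(openStream, closeSession);
// e.g. fetch('http://localhost:3001/mcp', closeSession)
```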
### SSE Endpoints
- `GET /sse` - Server-Sent Events endpoint for MCP communication
- `POST /messages` - Send messages to SSE session (requires X-Session-ID header)
### WebSocket Endpoints
- `WS /ws/mcp` - MCP protocol WebSocket endpoint for Chrome control
- `WS /chrome` - Chrome extension WebSocket endpoint
## Available Tools
The server provides the following Chrome automation tools:
1. **navigate_to_url** - Navigate to a specific URL
2. **get_page_content** - Get page text content
3. **click_element** - Click on elements using CSS selectors
4. **fill_input** - Fill input fields with text
5. **take_screenshot** - Capture page screenshots
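Each tool is invoked through a JSON-RPC `tools/call` request. The sketch below builds such requests; `buildRequest` and `buildToolCall` are hypothetical helpers. Only the `url` argument of `navigate_to_url` appears in this README, so argument names for the other tools should be taken from the schemas returned by `tools/list` rather than guessed.

```javascript
let nextId = 1;

// Build a JSON-RPC 2.0 request with a fresh id.
function buildRequest(method, params = {}) {
  return { jsonrpc: '2.0', id: nextId++, method, params };
}

// Build a tools/call request for one of the tools listed above.
function buildToolCall(name, args = {}) {
  return buildRequest('tools/call', { name, arguments: args });
}

const listTools = buildRequest('tools/list');
const navigate = buildToolCall('navigate_to_url', { url: 'https://example.com' });
console.log(JSON.stringify([listTools, navigate], null, 2));
```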
## Usage Examples
### Streamable HTTP Connection (Recommended)
```javascript
import fetch from 'node-fetch';

const SERVER_URL = 'http://localhost:3001';
const MCP_URL = `${SERVER_URL}/mcp`;

// Step 1: Initialize session
const initResponse = await fetch(MCP_URL, {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Accept: 'application/json, text/event-stream',
  },
  body: JSON.stringify({
    jsonrpc: '2.0',
    id: 1,
    method: 'initialize',
    params: {
      protocolVersion: '2024-11-05',
      capabilities: { tools: {} },
      clientInfo: { name: 'my-client', version: '1.0.0' },
    },
  }),
});
const sessionId = initResponse.headers.get('mcp-session-id');

// Step 2: Call tools
const toolResponse = await fetch(MCP_URL, {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Accept: 'application/json, text/event-stream',
    'MCP-Session-ID': sessionId,
  },
  body: JSON.stringify({
    jsonrpc: '2.0',
    id: 2,
    method: 'tools/call',
    params: {
      name: 'navigate_to_url',
      arguments: { url: 'https://example.com' },
    },
  }),
});
const result = await toolResponse.text(); // SSE format
```
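Because the tool response arrives as SSE text, the JSON-RPC payload has to be extracted from the `data:` lines. A minimal parsing sketch follows; `parseSseMessages` is a hypothetical helper, and the sample string is illustrative rather than captured server output.

```javascript
// Hypothetical helper: extract JSON payloads from an SSE-formatted body.
function parseSseMessages(body) {
  const messages = [];
  // Events are separated by a blank line.
  for (const event of body.split(/\r?\n\r?\n/)) {
    const data = event
      .split(/\r?\n/)
      .filter((line) => line.startsWith('data:'))
      .map((line) => line.slice(5).replace(/^ /, '')) // drop "data:" and one optional space
      .join('\n');
    if (data) messages.push(JSON.parse(data));
  }
  return messages;
}

// Illustrative input only.
const sample = 'event: message\ndata: {"jsonrpc":"2.0","id":2,"result":{}}\n\n';
console.log(parseSseMessages(sample));
```

Multi-line `data:` fields are joined with newlines before parsing, as the SSE format specifies.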
### WebSocket Connection
```javascript
const ws = new WebSocket('ws://localhost:3001/ws/mcp');

// Wait for the connection to open before sending anything
ws.addEventListener('open', () => {
  // Navigate to a URL
  ws.send(
    JSON.stringify({
      method: 'tools/call',
      params: {
        name: 'navigate_to_url',
        arguments: { url: 'https://example.com' },
      },
    }),
  );

  // Get page content
  ws.send(
    JSON.stringify({
      method: 'tools/call',
      params: {
        name: 'get_page_content',
        arguments: {},
      },
    }),
  );
});
```
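Replies on the WebSocket can be handled with a `message` listener. The reply format is not documented in this README, so the sketch below assumes JSON text carrying either a `result` or an `error` field; `describeReply` is a hypothetical helper.

```javascript
// Assumption: each reply is JSON text with either `result` or `error`.
function describeReply(raw) {
  let reply;
  try {
    reply = JSON.parse(raw);
  } catch {
    return { ok: false, detail: 'reply was not valid JSON' };
  }
  if (reply === null || typeof reply !== 'object') {
    return { ok: false, detail: 'reply was not a JSON object' };
  }
  if (reply.error) {
    return { ok: false, detail: reply.error.message ?? String(reply.error) };
  }
  return { ok: true, detail: reply.result };
}

// With a WebSocket `ws` connected to /ws/mcp:
// ws.addEventListener('message', (event) => console.log(describeReply(event.data)));
console.log(describeReply('{"result":{"success":true}}'));
```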
## Streaming Capabilities
The MCP Remote Server now supports multiple connection types:
### 1. **Streamable HTTP (Recommended)**
- Full MCP protocol compliance
- Session-based communication
- Server-Sent Events for real-time responses
- Stateful connections with session management
- Compatible with MCP clients like Cherry Studio
### 2. **Server-Sent Events (SSE)**
- Real-time streaming communication
- Lightweight alternative to WebSockets
- HTTP-based with automatic reconnection
### 3. **WebSocket (Legacy)**
- Real-time bidirectional communication
- Backward compatibility with existing clients
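A client that supports more than one connection type might pick its endpoint as sketched below. The paths are the ones listed under API Endpoints; the transport keys and `endpointFor` are hypothetical names chosen for illustration.

```javascript
// Transport key -> URL scheme and path (keys are illustrative labels).
const TRANSPORTS = new Map([
  ['streamable-http', { scheme: 'http', path: '/mcp' }],
  ['sse', { scheme: 'http', path: '/sse' }],
  ['websocket', { scheme: 'ws', path: '/ws/mcp' }],
]);

// Hypothetical helper: build the endpoint URL for a given transport.
function endpointFor(transport, host = 'localhost', port = 3001) {
  const entry = TRANSPORTS.get(transport);
  if (!entry) throw new Error(`Unknown transport: ${transport}`);
  return `${entry.scheme}://${host}:${port}${entry.path}`;
}

for (const name of TRANSPORTS.keys()) {
  console.log(`${name}: ${endpointFor(name)}`);
}
```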
## Architecture
```
┌─────────────────┐     HTTP/SSE     ┌──────────────────┐      WebSocket      ┌──────────────────┐
│   MCP Client    │ ◄──────────────► │  Remote Server   │ ◄─────────────────► │ Chrome Extension │
│ (External App)  │    WebSocket     │  (This Server)   │                     │                  │
└─────────────────┘                  └──────────────────┘                     └──────────────────┘
```
## Development
### Project Structure
```
src/
├── index.ts                    # Main server entry point
├── server/
│   ├── mcp-remote-server.ts    # MCP protocol implementation
│   └── chrome-tools.ts         # Chrome automation tools
└── types/                      # TypeScript type definitions
```
### Adding New Tools
1. Add the tool definition in `mcp-remote-server.ts`
2. Implement the tool logic in `chrome-tools.ts`
3. Update the Chrome extension to handle new actions
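The three steps above might look roughly like this. Everything here is illustrative: `scroll_page`, `createScrollPageHandler`, and `sendToExtension` are hypothetical names, and the actual structure of `mcp-remote-server.ts` and `chrome-tools.ts` may differ. The definition follows the `name` / `description` / `inputSchema` shape that MCP uses for tool listings.

```javascript
// Step 1 (hypothetical): a tool definition in the MCP tool-listing shape.
const scrollPageTool = {
  name: 'scroll_page',
  description: 'Scroll the active tab by a number of pixels',
  inputSchema: {
    type: 'object',
    properties: { deltaY: { type: 'number' } },
    required: ['deltaY'],
  },
};

// Step 2 (hypothetical): tool logic that forwards an action to the extension.
// `sendToExtension` stands in for whatever transport the server actually uses.
function createScrollPageHandler(sendToExtension) {
  return async function scrollPage(args) {
    if (typeof args?.deltaY !== 'number') {
      throw new Error('deltaY must be a number');
    }
    // Step 3: the extension must recognise this action name.
    return sendToExtension({ action: scrollPageTool.name, params: { deltaY: args.deltaY } });
  };
}

// Usage with a stub transport that echoes the message back.
const scrollPage = createScrollPageHandler(async (message) => ({ received: message }));
scrollPage({ deltaY: 400 }).then((reply) => console.log(reply));
```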
## Troubleshooting
### Common Issues
1. **Server won't start**: Check if port 3001 (or the port set via `PORT`) is available
2. **Chrome extension not connecting**: Ensure the extension is installed and enabled
3. **WebSocket connection fails**: Check firewall settings and CORS configuration
### Logs
The server uses structured logging with Pino. Check console output for detailed error messages and debugging information.
## License
MIT License - see LICENSE file for details.