first commit

nasir@endelospay.com
2025-08-12 02:54:17 +05:00
commit d97cad1736
225 changed files with 137626 additions and 0 deletions

308
docs/ARCHITECTURE.md Normal file

@@ -0,0 +1,308 @@
# Chrome MCP Server Architecture 🏗️
This document provides a detailed technical overview of the Chrome MCP Server architecture, design decisions, and implementation details.
## 📋 Table of Contents
- [Overview](#overview)
- [System Architecture](#system-architecture)
- [Component Details](#component-details)
- [Data Flow](#data-flow)
- [AI Integration](#ai-integration)
- [Performance Optimizations](#performance-optimizations)
- [Extension Points](#extension-points)
## 🎯 Overview
Chrome MCP Server is a sophisticated browser automation platform that bridges AI assistants with Chrome browser capabilities through the Model Context Protocol (MCP). The architecture is designed for:
- **High Performance**: SIMD-optimized AI operations and efficient native messaging
- **Extensibility**: Modular tool system for easy feature additions
- **Reliability**: Robust error handling and graceful degradation
- **Security**: Sandboxed execution and permission-based access control
## 🏗️ System Architecture
```mermaid
graph TB
subgraph "AI Assistant Layer"
A[Claude Desktop]
B[Custom MCP Client]
C[Other AI Tools]
end
subgraph "MCP Protocol Layer"
D[HTTP/SSE Transport]
E[MCP Server Instance]
F[Tool Registry]
end
subgraph "Native Server Layer"
G[Fastify HTTP Server]
H[Native Messaging Host]
I[Session Management]
end
subgraph "Chrome Extension Layer"
J[Background Script]
K[Content Scripts]
L[Popup Interface]
M[Offscreen Documents]
end
subgraph "Browser APIs Layer"
N[Chrome APIs]
O[Web APIs]
P[Native Messaging]
end
subgraph "AI Processing Layer"
Q[Semantic Engine]
R[Vector Database]
S[SIMD Math Engine]
T[Web Workers]
end
A --> D
B --> D
C --> D
D --> E
E --> F
F --> G
G --> H
H --> P
P --> J
J --> K
J --> L
J --> M
J --> N
J --> O
J --> Q
Q --> R
Q --> S
Q --> T
```
## 🔧 Component Details
### 1. Native Server (`app/native-server/`)
**Purpose**: MCP protocol implementation and native messaging bridge
**Key Components**:
- **Fastify HTTP Server**: Handles MCP protocol over HTTP/SSE
- **Native Messaging Host**: Communicates with Chrome extension
- **Session Management**: Manages multiple MCP client sessions
- **Tool Registry**: Routes tool calls to Chrome extension
**Technologies**:
- TypeScript + Fastify
- MCP SDK (@modelcontextprotocol/sdk)
- Native messaging protocol
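The native messaging protocol listed above frames every message as a 32-bit length prefix followed by UTF-8 encoded JSON. The sketch below shows that framing in isolation. The function names `encodeNativeMessage` and `decodeNativeMessage` are illustrative and not taken from the project, and little-endian byte order is an assumption (Chrome uses the host's native byte order).

```typescript
// Minimal sketch of Chrome native messaging framing (hypothetical helper names).

/** Serialize a message into one length-prefixed frame. */
function encodeNativeMessage(message: unknown): Uint8Array {
  const body = new TextEncoder().encode(JSON.stringify(message));
  const frame = new Uint8Array(4 + body.length);
  // Assumption: little-endian host, which covers x86 and most ARM machines.
  new DataView(frame.buffer).setUint32(0, body.length, true);
  frame.set(body, 4);
  return frame;
}

/** Parse one frame from the start of `buffer`; returns null while the frame is incomplete. */
function decodeNativeMessage(
  buffer: Uint8Array,
): { message: unknown; bytesRead: number } | null {
  if (buffer.length < 4) return null;
  const view = new DataView(buffer.buffer, buffer.byteOffset, buffer.byteLength);
  const length = view.getUint32(0, true);
  if (buffer.length < 4 + length) return null;
  const json = new TextDecoder().decode(buffer.subarray(4, 4 + length));
  return { message: JSON.parse(json), bytesRead: 4 + length };
}
```

Returning `null` for a partial frame lets a caller keep appending stdin chunks to a buffer and retry, which is one common way to handle messages that arrive split across reads.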
### 2. Chrome Extension (`app/chrome-extension/`)
**Purpose**: Browser automation and AI-powered content analysis
**Key Components**:
- **Background Script**: Main orchestrator and tool executor
- **Content Scripts**: Page interaction and content extraction
- **Popup Interface**: User configuration and status display
- **Offscreen Documents**: AI model processing in isolated context
**Technologies**:
- WXT Framework + Vue 3
- Chrome Extension APIs
- WebAssembly + SIMD
- Transformers.js
### 3. Shared Packages (`packages/`)
#### 3.1 Shared Types (`packages/shared/`)
- Tool schemas and type definitions
- Common interfaces and utilities
- MCP protocol types
#### 3.2 WASM SIMD (`packages/wasm-simd/`)
- Rust-based SIMD-optimized math functions
- WebAssembly compilation with wasm-pack
- 4-8x performance improvement for vector operations
## 🔄 Data Flow
### Tool Execution Flow
```
┌─────────────┐ ┌──────────────┐ ┌─────────────────┐ ┌──────────────┐
│ AI Assistant│ │ Native Server│ │ Chrome Extension│ │ Browser APIs │
└─────┬───────┘ └──────┬───────┘ └─────────┬───────┘ └──────┬───────┘
│ │ │ │
│ 1. Tool Call │ │ │
├──────────────────►│ │ │
│ │ 2. Native Message │ │
│ ├─────────────────────►│ │
│ │ │ 3. Execute Tool │
│ │ ├──────────────────►│
│ │ │ 4. API Response │
│ │ │◄──────────────────┤
│ │ 5. Tool Result │ │
│ │◄─────────────────────┤ │
│ 6. MCP Response │ │ │
│◄──────────────────┤ │ │
```
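Steps 2 and 5 of the flow above imply that the native server must match each tool result to the call that produced it. One way to do that is a map of pending requests keyed by a request ID, sketched below. The message shapes (`ToolRequest`, `ToolResponse`), the class name `ToolCallBridge`, and the 30-second default timeout are assumptions for illustration, not the project's actual protocol.

```typescript
// Hypothetical message shapes; the project's real wire format may differ.
interface ToolRequest { requestId: string; tool: string; args: unknown }
interface ToolResponse { requestId: string; result?: unknown; error?: string }

interface PendingEntry {
  resolve: (value: unknown) => void;
  reject: (reason: Error) => void;
  timer: ReturnType<typeof setTimeout>;
}

class ToolCallBridge {
  private readonly pending = new Map<string, PendingEntry>();
  private readonly send: (request: ToolRequest) => void;
  private readonly timeoutMs: number;
  private nextId = 0;

  constructor(send: (request: ToolRequest) => void, timeoutMs = 30000) {
    this.send = send;
    this.timeoutMs = timeoutMs;
  }

  /** Forward a tool call and resolve when the matching response arrives. */
  call(tool: string, args: unknown): Promise<unknown> {
    const requestId = String(++this.nextId);
    return new Promise((resolve, reject) => {
      const timer = setTimeout(() => {
        this.pending.delete(requestId);
        reject(new Error(`Tool ${tool} timed out`));
      }, this.timeoutMs);
      // Register before sending so a synchronous reply is still matched.
      this.pending.set(requestId, { resolve, reject, timer });
      this.send({ requestId, tool, args });
    });
  }

  /** Settle the pending call that matches an incoming response. */
  handleResponse(response: ToolResponse): void {
    const entry = this.pending.get(response.requestId);
    if (!entry) return; // unknown or already timed-out request
    clearTimeout(entry.timer);
    this.pending.delete(response.requestId);
    if (response.error !== undefined) entry.reject(new Error(response.error));
    else entry.resolve(response.result);
  }
}
```

The timeout keeps a crashed or disconnected extension from leaving MCP clients waiting indefinitely.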
### AI Processing Flow
```
┌─────────────┐ ┌──────────────┐ ┌─────────────────┐ ┌──────────────┐
│ Content │ │ Text Chunker │ │ Semantic Engine │ │ Vector DB │
│ Extraction │ │ │ │ │ │ │
└─────┬───────┘ └──────┬───────┘ └─────────┬───────┘ └──────┬───────┘
│ │ │ │
│ 1. Raw Content │ │ │
├──────────────────►│ │ │
│ │ 2. Text Chunks │ │
│ ├─────────────────────►│ │
│ │ │ 3. Embeddings │
│ │ ├──────────────────►│
│ │ │ │
│ │ 4. Search Query │ │
│ ├─────────────────────►│ │
│ │ │ 5. Query Vector │
│ │ ├──────────────────►│
│ │ │ 6. Similar Docs │
│ │ │◄──────────────────┤
│ │ 7. Search Results │ │
│ │◄─────────────────────┤ │
```
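Step 2 of the flow above depends on a text chunker. The sketch below is a simple character-based chunker with overlap, so that sentences cut at a chunk boundary still appear whole in one of the neighbouring chunks. The function name `chunkText` and the default sizes are assumptions. The project's chunker may split on sentences or tokens instead.

```typescript
/**
 * Split text into chunks of at most `maxChars` characters.
 * Consecutive chunks share `overlap` characters.
 * Hypothetical helper for illustration only.
 */
function chunkText(text: string, maxChars = 500, overlap = 50): string[] {
  if (maxChars <= 0 || overlap < 0 || overlap >= maxChars) {
    throw new RangeError("require maxChars > 0 and 0 <= overlap < maxChars");
  }
  const chunks: string[] = [];
  const step = maxChars - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + maxChars));
    // Stop once the current chunk reaches the end of the text.
    if (start + maxChars >= text.length) break;
  }
  return chunks;
}
```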
## 🧠 AI Integration
### Semantic Similarity Engine
**Architecture**:
- **Model Support**: BGE-small-en-v1.5, E5-small-v2, Universal Sentence Encoder
- **Execution Context**: Web Workers for non-blocking processing
- **Optimization**: SIMD acceleration for vector operations
- **Caching**: LRU cache for embeddings and tokenization
**Performance Optimizations**:
```typescript
// SIMD-accelerated cosine similarity
const similarity = await simdMath.cosineSimilarity(vecA, vecB);
// Batch processing for efficiency
const similarities = await simdMath.batchSimilarity(vectors, query, dimension);
// Memory-efficient matrix operations
const matrix = await simdMath.similarityMatrix(vectorsA, vectorsB, dimension);
```
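For reference, a plain scalar version of the first two calls is sketched below. It is useful as a fallback or as a baseline when testing the SIMD path. The flat row-major layout of `vectors` (one vector every `dimension` floats) is an assumption inferred from the `dimension` argument, and these functions are not the project's `simdMath` implementation.

```typescript
/** Scalar cosine similarity; returns 0 when either vector has zero length. */
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new RangeError("vectors must have equal length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

/**
 * Compare `query` against every vector packed into `vectors`.
 * Assumption: `vectors` is a flat buffer holding one vector per `dimension` floats.
 */
function batchSimilarity(
  vectors: Float32Array,
  query: Float32Array,
  dimension: number,
): Float32Array {
  if (dimension <= 0 || vectors.length % dimension !== 0) {
    throw new RangeError("vectors length must be a multiple of dimension");
  }
  const count = vectors.length / dimension;
  const result = new Float32Array(count);
  for (let row = 0; row < count; row++) {
    // subarray creates a view, so no data is copied per row.
    const view = vectors.subarray(row * dimension, (row + 1) * dimension);
    result[row] = cosineSimilarity(view, query);
  }
  return result;
}
```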
### Vector Database (hnswlib-wasm)
**Features**:
- **Algorithm**: Hierarchical Navigable Small World (HNSW)
- **Implementation**: WebAssembly for near-native performance
- **Persistence**: IndexedDB storage with automatic cleanup
- **Scalability**: Handles 10,000+ documents efficiently
**Configuration**:
```typescript
const config: VectorDatabaseConfig = {
  dimension: 384, // Model embedding dimension
  maxElements: 10000, // Maximum documents
  efConstruction: 200, // Build-time accuracy
  M: 16, // Connectivity parameter
  efSearch: 100, // Search-time accuracy
  enableAutoCleanup: true, // Automatic old data removal
  maxRetentionDays: 30, // Data retention period
};
```
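The two cleanup settings above suggest a retention pass over indexed documents. The sketch below shows how `enableAutoCleanup` and `maxRetentionDays` could select documents for removal. The interface is reconstructed from the field names in the configuration, and `IndexedDocument` and `selectExpired` are hypothetical names that do not come from the project.

```typescript
// Reconstructed from the configuration fields shown above (an assumption).
interface VectorDatabaseConfig {
  dimension: number;
  maxElements: number;
  efConstruction: number;
  M: number;
  efSearch: number;
  enableAutoCleanup: boolean;
  maxRetentionDays: number;
}

// Hypothetical bookkeeping record: document id plus indexing time in epoch milliseconds.
interface IndexedDocument {
  id: number;
  indexedAt: number;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

/** Return the ids of documents older than the retention period. */
function selectExpired(
  docs: IndexedDocument[],
  config: VectorDatabaseConfig,
  now: number = Date.now(),
): number[] {
  if (!config.enableAutoCleanup) return [];
  const cutoff = now - config.maxRetentionDays * MS_PER_DAY;
  return docs.filter((doc) => doc.indexedAt < cutoff).map((doc) => doc.id);
}
```

Passing `now` as a parameter keeps the function deterministic, which makes the retention rule easy to unit test.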
## ⚡ Performance Optimizations
### 1. SIMD Acceleration
**Rust Implementation**:
```rust
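use wide::f32x4;
fn cosine_similarity_simd(&self, vec_a: &[f32], vec_b: &[f32]) -> f32 {
    // Both vectors must have the same dimension
    assert_eq!(vec_a.len(), vec_b.len());
    let len = vec_a.len();
    let simd_lanes = 4;
    let simd_len = len - (len % simd_lanes);
    let mut dot_sum_simd = f32x4::ZERO;
    let mut norm_a_sum_simd = f32x4::ZERO;
    let mut norm_b_sum_simd = f32x4::ZERO;
    // Process four lanes per iteration
    for i in (0..simd_len).step_by(simd_lanes) {
        let a_chunk = f32x4::new(vec_a[i..i + 4].try_into().unwrap());
        let b_chunk = f32x4::new(vec_b[i..i + 4].try_into().unwrap());
        dot_sum_simd = a_chunk.mul_add(b_chunk, dot_sum_simd);
        norm_a_sum_simd = a_chunk.mul_add(a_chunk, norm_a_sum_simd);
        norm_b_sum_simd = b_chunk.mul_add(b_chunk, norm_b_sum_simd);
    }
    // Horizontal sums of the SIMD accumulators
    let mut dot_product = dot_sum_simd.reduce_add();
    let mut norm_a_sq = norm_a_sum_simd.reduce_add();
    let mut norm_b_sq = norm_b_sum_simd.reduce_add();
    // Scalar tail: the last `len % 4` elements the SIMD loop did not cover
    for i in simd_len..len {
        dot_product += vec_a[i] * vec_b[i];
        norm_a_sq += vec_a[i] * vec_a[i];
        norm_b_sq += vec_b[i] * vec_b[i];
    }
    // Calculate final similarity; return 0.0 for zero vectors instead of NaN
    let denom = norm_a_sq.sqrt() * norm_b_sq.sqrt();
    if denom == 0.0 { 0.0 } else { dot_product / denom }
}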
```
### 2. Memory Management
**Strategies**:
- **Object Pooling**: Reuse Float32Array buffers
- **Lazy Loading**: Load AI models on-demand
- **Cache Management**: LRU eviction for embeddings
- **Garbage Collection**: Explicit cleanup of large objects
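The LRU eviction strategy listed above can be sketched with a `Map`, which iterates keys in insertion order. Re-inserting a key on every read keeps the least recently used entry at the front. The class name `LruCache` is illustrative and does not refer to the project's cache implementation.

```typescript
/** Minimal LRU cache built on Map insertion order (illustrative sketch). */
class LruCache<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly capacity: number;

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity <= 0) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Move the key to the back so it counts as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) {
      this.entries.delete(key);
    } else if (this.entries.size >= this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
    this.entries.set(key, value);
  }

  get size(): number {
    return this.entries.size;
  }
}
```

For embeddings, the key could be the input text (or a hash of it) and the value a `Float32Array`, so repeated queries skip model inference.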
### 3. Concurrent Processing
**Web Workers**:
- **AI Processing**: Separate worker for model inference
- **Content Indexing**: Background indexing of tab content
- **Network Capture**: Parallel request processing
## 🔧 Extension Points
### Adding New Tools
1. **Define Schema** in `packages/shared/src/tools.ts`
2. **Implement Tool** extending `BaseBrowserToolExecutor`
3. **Register Tool** in tool index
4. **Add Tests** for functionality
### Custom AI Models
1. **Model Integration** in `SemanticSimilarityEngine`
2. **Worker Support** for processing
3. **Configuration** in model presets
4. **Performance Testing** with benchmarks
### Protocol Extensions
1. **MCP Extensions** for custom capabilities
2. **Transport Layers** for different communication methods
3. **Authentication** for secure connections
4. **Monitoring** for performance metrics
This architecture enables Chrome MCP Server to deliver high-performance browser automation with advanced AI capabilities while maintaining security and extensibility.

307
docs/ARCHITECTURE_zh.md Normal file

@@ -0,0 +1,307 @@
# Chrome MCP Server 架构设计 🏗️
本文档提供 Chrome MCP Server 架构、设计决策和实现细节的详细技术概述。
## 📋 目录
- [概述](#概述)
- [系统架构](#系统架构)
- [组件详情](#组件详情)
- [数据流](#数据流)
- [AI 集成](#ai-集成)
- [性能优化](#性能优化)
- [扩展点](#扩展点)
## 🎯 概述
Chrome MCP Server 是一个复杂的浏览器自动化平台,通过模型上下文协议 (MCP) 将 AI 助手与 Chrome 浏览器功能连接起来。架构设计目标:
- **高性能**：SIMD 优化的 AI 操作和高效的原生消息传递
- **可扩展性**:模块化工具系统,便于添加新功能
- **可靠性**:强大的错误处理和优雅降级
- **安全性**:沙盒执行和基于权限的访问控制
## 🏗️ 系统架构
```mermaid
graph TB
subgraph "AI 助手层"
A[Claude Desktop]
B[自定义 MCP 客户端]
C[其他 AI 工具]
end
subgraph "MCP 协议层"
D[HTTP/SSE 传输]
E[MCP 服务器实例]
F[工具注册表]
end
subgraph "原生服务器层"
G[Fastify HTTP 服务器]
H[原生消息主机]
I[会话管理]
end
subgraph "Chrome 扩展层"
J[后台脚本]
K[内容脚本]
L[弹窗界面]
M[离屏文档]
end
subgraph "浏览器 APIs 层"
N[Chrome APIs]
O[Web APIs]
P[原生消息]
end
subgraph "AI 处理层"
Q[语义引擎]
R[向量数据库]
S[SIMD 数学引擎]
T[Web Workers]
end
A --> D
B --> D
C --> D
D --> E
E --> F
F --> G
G --> H
H --> P
P --> J
J --> K
J --> L
J --> M
J --> N
J --> O
J --> Q
Q --> R
Q --> S
Q --> T
```
## 🔧 组件详情
### 1. 原生服务器 (`app/native-server/`)
**目的**：MCP 协议实现和原生消息桥接
**核心组件**
- **Fastify HTTP 服务器**:处理基于 HTTP/SSE 的 MCP 协议
- **原生消息主机**:与 Chrome 扩展通信
- **会话管理**:管理多个 MCP 客户端会话
- **工具注册表**:将工具调用路由到 Chrome 扩展
**技术栈**
- TypeScript + Fastify
- MCP SDK (@modelcontextprotocol/sdk)
- 原生消息协议
### 2. Chrome 扩展 (`app/chrome-extension/`)
**目的**:浏览器自动化和 AI 驱动的内容分析
**核心组件**
- **后台脚本**:主要协调器和工具执行器
- **内容脚本**:页面交互和内容提取
- **弹窗界面**:用户配置和状态显示
- **离屏文档**:在隔离环境中进行 AI 模型处理
**技术栈**
- WXT 框架 + Vue 3
- Chrome 扩展 APIs
- WebAssembly + SIMD
- Transformers.js
### 3. 共享包 (`packages/`)
#### 3.1 共享类型 (`packages/shared/`)
- 工具模式和类型定义
- 通用接口和工具
- MCP 协议类型
#### 3.2 WASM SIMD (`packages/wasm-simd/`)
- 基于 Rust 的 SIMD 优化数学函数
- 使用 wasm-pack 编译 WebAssembly
- 向量运算性能提升 4-8 倍
## 🔄 数据流
### 工具执行流程
```
┌─────────────┐ ┌──────────────┐ ┌─────────────────┐ ┌──────────────┐
│ AI 助手 │ │ 原生服务器 │ │ Chrome 扩展 │ │ 浏览器 APIs │
└─────┬───────┘ └──────┬───────┘ └─────────┬───────┘ └──────┬───────┘
│ │ │ │
│ 1. 工具调用 │ │ │
├──────────────────►│ │ │
│ │ 2. 原生消息 │ │
│ ├─────────────────────►│ │
│ │ │ 3. 执行工具 │
│ │ ├──────────────────►│
│ │ │ 4. API 响应 │
│ │ │◄──────────────────┤
│ │ 5. 工具结果 │ │
│ │◄─────────────────────┤ │
│ 6. MCP 响应 │ │ │
│◄──────────────────┤ │ │
```
### AI 处理流程
```
┌─────────────┐ ┌──────────────┐ ┌─────────────────┐ ┌──────────────┐
│ 内容提取 │ │ 文本分块器 │ │ 语义引擎 │ │ 向量数据库 │
└─────┬───────┘ └──────┬───────┘ └─────────┬───────┘ └──────┬───────┘
│ │ │ │
│ 1. 原始内容 │ │ │
├──────────────────►│ │ │
│ │ 2. 文本块 │ │
│ ├─────────────────────►│ │
│ │ │ 3. 嵌入向量 │
│ │ ├──────────────────►│
│ │ │ │
│ │ 4. 搜索查询 │ │
│ ├─────────────────────►│ │
│ │ │ 5. 查询向量 │
│ │ ├──────────────────►│
│ │ │ 6. 相似文档 │
│ │ │◄──────────────────┤
│ │ 7. 搜索结果 │ │
│ │◄─────────────────────┤ │
```
## 🧠 AI 集成
### 语义相似度引擎
**架构**
- **模型支持**：BGE-small-en-v1.5、E5-small-v2、Universal Sentence Encoder
- **执行环境**：Web Workers 用于非阻塞处理
- **优化**:向量运算的 SIMD 加速
- **缓存**:嵌入和分词的 LRU 缓存
**性能优化**
```typescript
// SIMD 加速的余弦相似度
const similarity = await simdMath.cosineSimilarity(vecA, vecB);
// 批处理提高效率
const similarities = await simdMath.batchSimilarity(vectors, query, dimension);
// 内存高效的矩阵运算
const matrix = await simdMath.similarityMatrix(vectorsA, vectorsB, dimension);
```
### 向量数据库 (hnswlib-wasm)
**特性**
- **算法**:分层导航小世界 (HNSW)
- **实现**：WebAssembly 实现接近原生性能
- **持久化**：IndexedDB 存储,自动清理
- **可扩展性**:高效处理 10,000+ 文档
**配置**
```typescript
const config: VectorDatabaseConfig = {
dimension: 384, // 模型嵌入维度
maxElements: 10000, // 最大文档数
efConstruction: 200, // 构建时精度
M: 16, // 连接参数
efSearch: 100, // 搜索时精度
enableAutoCleanup: true, // 自动清理旧数据
maxRetentionDays: 30, // 数据保留期
};
```
## ⚡ 性能优化
### 1. SIMD 加速
**Rust 实现**
```rust
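use wide::f32x4;
fn cosine_similarity_simd(&self, vec_a: &[f32], vec_b: &[f32]) -> f32 {
    // Both vectors must have the same dimension
    assert_eq!(vec_a.len(), vec_b.len());
    let len = vec_a.len();
    let simd_lanes = 4;
    let simd_len = len - (len % simd_lanes);
    let mut dot_sum_simd = f32x4::ZERO;
    let mut norm_a_sum_simd = f32x4::ZERO;
    let mut norm_b_sum_simd = f32x4::ZERO;
    // Process four lanes per iteration
    for i in (0..simd_len).step_by(simd_lanes) {
        let a_chunk = f32x4::new(vec_a[i..i + 4].try_into().unwrap());
        let b_chunk = f32x4::new(vec_b[i..i + 4].try_into().unwrap());
        dot_sum_simd = a_chunk.mul_add(b_chunk, dot_sum_simd);
        norm_a_sum_simd = a_chunk.mul_add(a_chunk, norm_a_sum_simd);
        norm_b_sum_simd = b_chunk.mul_add(b_chunk, norm_b_sum_simd);
    }
    // Horizontal sums of the SIMD accumulators
    let mut dot_product = dot_sum_simd.reduce_add();
    let mut norm_a_sq = norm_a_sum_simd.reduce_add();
    let mut norm_b_sq = norm_b_sum_simd.reduce_add();
    // Scalar tail: the last `len % 4` elements the SIMD loop did not cover
    for i in simd_len..len {
        dot_product += vec_a[i] * vec_b[i];
        norm_a_sq += vec_a[i] * vec_a[i];
        norm_b_sq += vec_b[i] * vec_b[i];
    }
    // Calculate final similarity; return 0.0 for zero vectors instead of NaN
    let denom = norm_a_sq.sqrt() * norm_b_sq.sqrt();
    if denom == 0.0 { 0.0 } else { dot_product / denom }
}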
```
### 2. 内存管理
**策略**
- **对象池**:重用 Float32Array 缓冲区
- **延迟加载**:按需加载 AI 模型
- **缓存管理**:嵌入的 LRU 淘汰
- **垃圾回收**:显式清理大对象
### 3. 并发处理
**Web Workers**
- **AI 处理**:模型推理的独立 worker
- **内容索引**:后台标签页内容索引
- **网络捕获**:并行请求处理
## 🔧 扩展点
### 添加新工具
1. **定义模式**：`packages/shared/src/tools.ts`
2. **实现工具** 继承 `BaseBrowserToolExecutor`
3. **注册工具** 在工具索引中
4. **添加测试** 用于功能测试
### 自定义 AI 模型
1. **模型集成**：`SemanticSimilarityEngine`
2. **Worker 支持** 用于处理
3. **配置** 在模型预设中
4. **性能测试** 使用基准测试
### 协议扩展
1. **MCP 扩展** 用于自定义功能
2. **传输层** 用于不同通信方法
3. **身份验证** 用于安全连接
4. **监控** 用于性能指标
此架构使 Chrome MCP Server 能够在保持安全性和可扩展性的同时,提供高性能的浏览器自动化和先进的 AI 功能。

120
docs/CHANGELOG.md Normal file

@@ -0,0 +1,120 @@
# Changelog
All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [v0.0.5]
### Improved
- **Image Compression**: Compress base64 images when using screenshot tool
- **Interactive Elements Detection Optimization**: Enhanced interactive elements detection tool with expanded search scope, now supports finding interactive div elements
## [v0.0.4]
### Added
- **STDIO Connection Support**: Added support for connecting to the MCP server over standard input/output (stdio)
- **Console Output Capture Tool**: New `chrome_console` tool for capturing browser console output
## [v0.0.3]
### Added
- **Inject script tool**: For injecting content scripts into a web page
- **Send command to inject script tool**: For sending commands to the injected script
## [v0.0.2]
### Added
- **Conditional Semantic Engine Initialization**: Smart cache-based initialization that only loads models when cached versions are available
- **Enhanced Model Cache Management**: Comprehensive cache management system with automatic cleanup and size limits
- **Windows Platform Compatibility**: Full support for Windows Chrome Native Messaging with registry-based manifest detection
- **Cache Statistics and Manual Management**: User interface for viewing cache stats and manual cache cleanup
- **Concurrent Initialization Protection**: Prevents duplicate initialization attempts across components
### Improved
- **Startup Performance**: Dramatically reduced startup time when no model cache exists (from ~3s to ~0.5s)
- **Memory Usage**: Optimized memory consumption through on-demand model loading
- **Cache Expiration Logic**: Intelligent cache expiration (14 days) with automatic cleanup
- **Error Handling**: Enhanced error handling for model initialization failures
- **Component Coordination**: Simplified initialization flow between semantic engine and content indexer
### Fixed
- **Windows Native Host Issues**: Resolved Node.js environment conflicts with multiple NVM installations
- **Race Condition Prevention**: Eliminated concurrent initialization attempts that could cause conflicts
- **Cache Size Management**: Automatic cleanup when cache exceeds 500MB limit
- **Model Download Optimization**: Prevents unnecessary model downloads during plugin startup
### Technical Improvements
- **ModelCacheManager**: Added `isModelCached()` and `hasAnyValidCache()` methods for cache detection
- **SemanticSimilarityEngine**: Added cache checking functions and conditional initialization logic
- **Background Script**: Implemented smart initialization based on cache availability
- **VectorSearchTool**: Simplified to passive initialization model
- **ContentIndexer**: Enhanced with semantic engine readiness checks
### Documentation
- Added comprehensive conditional initialization documentation
- Updated cache management system documentation
- Created troubleshooting guides for Windows platform issues
## [v0.0.1]
### Added
- **Core Browser Tools**: Complete set of browser automation tools for web interaction
  - **Click Tool**: Intelligent element clicking with coordinate and selector support
  - **Fill Tool**: Form filling with text input and selection capabilities
  - **Screenshot Tool**: Full page and element-specific screenshot capture
  - **Navigation Tools**: URL navigation and page interaction utilities
  - **Keyboard Tool**: Keyboard input simulation and hotkey support
- **Vector Search Engine**: Advanced semantic search capabilities
  - **Content Indexing**: Automatic indexing of browser tab content
  - **Semantic Similarity**: AI-powered text similarity matching
  - **Vector Database**: Efficient storage and retrieval of embeddings
  - **Multi-language Support**: Comprehensive multilingual text processing
- **Native Host Integration**: Seamless communication with external applications
  - **Chrome Native Messaging**: Bidirectional communication channel
  - **Cross-platform Support**: Windows, macOS, and Linux compatibility
  - **Message Protocol**: Structured messaging system for tool execution
- **AI Model Integration**: State-of-the-art language models for semantic processing
  - **Transformer Models**: Support for multiple pre-trained models
  - **ONNX Runtime**: Optimized model inference with WebAssembly
  - **Model Management**: Dynamic model loading and switching
  - **Performance Optimization**: SIMD acceleration and memory pooling
- **User Interface**: Intuitive popup interface for extension management
  - **Model Selection**: Easy switching between different AI models
  - **Status Monitoring**: Real-time initialization and download progress
  - **Settings Management**: User preferences and configuration options
  - **Cache Management**: Visual cache statistics and cleanup controls
### Technical Foundation
- **Extension Architecture**: Robust Chrome extension with background scripts and content injection
- **Worker-based Processing**: Offscreen document for heavy computational tasks
- **Memory Management**: LRU caching and efficient resource utilization
- **Error Handling**: Comprehensive error reporting and recovery mechanisms
- **TypeScript Implementation**: Full type safety and modern JavaScript features
### Initial Features
- Multi-tab content analysis and search
- Real-time semantic similarity computation
- Automated web page interaction
- Cross-platform native messaging
- Extensible tool framework for future enhancements

265
docs/CONTRIBUTING.md Normal file

@@ -0,0 +1,265 @@
# Contributing Guide 🤝
Thank you for your interest in contributing to Chrome MCP Server! This document provides guidelines and information for contributors.
## 🎯 How to Contribute
We welcome contributions in many forms:
- 🐛 Bug reports and fixes
- ✨ New features and tools
- 📚 Documentation improvements
- 🧪 Tests and performance optimizations
- 🌐 Translations and internationalization
- 💡 Ideas and suggestions
## 🚀 Getting Started
### Prerequisites
- **Node.js 18.19.0+** and **pnpm or npm** (latest version)
- **Chrome/Chromium** browser for testing
- **Git** for version control
- **Rust** (for WASM development, optional)
- **TypeScript** knowledge
### Development Setup
1. **Fork and clone the repository**
```bash
git clone https://github.com/YOUR_USERNAME/chrome-mcp-server.git
cd chrome-mcp-server
```
2. **Install dependencies**
```bash
pnpm install
```
3. **Start the project**
```bash
npm run dev
```
4. **Load the extension in Chrome**
- Open `chrome://extensions/`
- Enable "Developer mode"
- Click "Load unpacked" and select `your/extension/dist`
## 🏗️ Project Structure
```
chrome-mcp-server/
├── app/
│ ├── chrome-extension/ # Chrome extension (WXT + Vue 3)
│ │ ├── entrypoints/ # Background scripts, popup, content scripts
│ │ ├── utils/ # AI models, vector database, utilities
│ │ └── workers/ # Web Workers for AI processing
│ └── native-server/ # Native messaging server (Fastify + TypeScript)
│ ├── src/mcp/ # MCP protocol implementation
│ └── src/server/ # HTTP server and native messaging
├── packages/
│ ├── shared/ # Shared types and utilities
│ └── wasm-simd/ # SIMD-optimized WebAssembly math functions
└── docs/ # Documentation
```
## 🛠️ Development Workflow
### Adding New Tools
1. **Define the tool schema in `packages/shared/src/tools.ts`**:
```typescript
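{
  name: 'your_new_tool',
  description: 'Description of what your tool does',
  inputSchema: {
    type: 'object',
    properties: {
      // Define parameters; every name listed in `required` must appear here
      param1: { type: 'string', description: 'Placeholder parameter' },
    },
    required: ['param1'],
  },
}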
```
2. **Implement the tool in `app/chrome-extension/entrypoints/background/tools/browser/`**:
```typescript
class YourNewTool extends BaseBrowserToolExecutor {
  name = TOOL_NAMES.BROWSER.YOUR_NEW_TOOL;
  async execute(args: YourToolParams): Promise<ToolResult> {
    // Implementation
  }
}
```
3. **Export the tool in `app/chrome-extension/entrypoints/background/tools/browser/index.ts`**
4. **Add tests in the appropriate test directory**
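The four steps above can be sketched end to end as a self-contained example. Everything here is a stand-in: `ToolExecutor` replaces the project's `BaseBrowserToolExecutor`, and `ToolResult`, `EchoTool`, and `ToolRegistry` are hypothetical shapes chosen for illustration, so check the real base class before copying any of it.

```typescript
// Stand-in types; the project's real ToolResult and base class may differ.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError: boolean;
}

interface ToolExecutor {
  readonly name: string;
  execute(args: unknown): Promise<ToolResult>;
}

/** Example tool: validates its input and echoes a message back. */
class EchoTool implements ToolExecutor {
  readonly name = "echo_tool";

  async execute(args: unknown): Promise<ToolResult> {
    const message = (args as { message?: unknown } | null)?.message;
    if (typeof message !== "string") {
      return { content: [{ type: "text", text: "message must be a string" }], isError: true };
    }
    return { content: [{ type: "text", text: message }], isError: false };
  }
}

/** Routes tool calls by name and converts thrown errors into error results. */
class ToolRegistry {
  private readonly tools = new Map<string, ToolExecutor>();

  register(tool: ToolExecutor): void {
    if (this.tools.has(tool.name)) throw new Error(`Duplicate tool: ${tool.name}`);
    this.tools.set(tool.name, tool);
  }

  async call(name: string, args: unknown): Promise<ToolResult> {
    const tool = this.tools.get(name);
    if (!tool) {
      return { content: [{ type: "text", text: `Unknown tool: ${name}` }], isError: true };
    }
    try {
      return await tool.execute(args);
    } catch (err) {
      const text = err instanceof Error ? err.message : String(err);
      return { content: [{ type: "text", text }], isError: true };
    }
  }
}
```

Returning an error result instead of throwing keeps failures inside the tool response, in line with the "handle errors gracefully" guideline.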
### Code Style Guidelines
- **TypeScript**: Use strict TypeScript with proper typing
- **ESLint**: Follow the configured ESLint rules (`pnpm lint`)
- **Prettier**: Format code with Prettier (`pnpm format`)
- **Naming**: Use descriptive names and follow existing patterns
- **Comments**: Add JSDoc comments for public APIs
- **Error Handling**: Always handle errors gracefully
## 📝 Pull Request Process
1. **Create a feature branch**
```bash
git checkout -b feature/your-feature-name
```
2. **Make your changes**
- Follow the code style guidelines
- Add tests for new functionality
- Update documentation if needed
3. **Test your changes**
- Ensure all existing tests pass
- Test the Chrome extension manually
- Verify MCP protocol compatibility
4. **Commit your changes**
```bash
git add .
git commit -m "feat: add your feature description"
```
We use [Conventional Commits](https://www.conventionalcommits.org/):
- `feat:` for new features
- `fix:` for bug fixes
- `docs:` for documentation changes
- `test:` for adding tests
- `refactor:` for code refactoring
5. **Push and create a Pull Request**
```bash
git push origin feature/your-feature-name
```
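The commit prefixes listed in step 4 can be checked automatically, for example in a commit hook. The sketch below accepts only the five types named in this guide, with an optional scope and `!` marker. The Conventional Commits specification allows other types as well, so the allowed list and the function name `isConventionalCommit` are assumptions to adapt.

```typescript
// Only the types named in this guide; extend the list if the project accepts more.
const ALLOWED_TYPES = ["feat", "fix", "docs", "test", "refactor"] as const;

// type, optional (scope), optional "!", then ": " and a non-empty description
const COMMIT_PATTERN = new RegExp(
  `^(${ALLOWED_TYPES.join("|")})(\\([\\w.-]+\\))?!?: \\S.*$`,
);

/** Check the first line of a commit message. */
function isConventionalCommit(subject: string): boolean {
  return COMMIT_PATTERN.test(subject);
}
```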
## 🐛 Bug Reports
When reporting bugs, please include:
- **Environment**: OS, Chrome version, Node.js version
- **Steps to reproduce**: Clear, step-by-step instructions
- **Expected behavior**: What should happen
- **Actual behavior**: What actually happens
- **Screenshots/logs**: If applicable
- **MCP client**: Which MCP client you're using (Claude Desktop, etc.)
## 💡 Feature Requests
For feature requests, please provide:
- **Use case**: Why is this feature needed?
- **Proposed solution**: How should it work?
- **Alternatives**: Any alternative solutions considered?
- **Additional context**: Screenshots, examples, etc.
## 🔧 Development Tips
### Using WASM SIMD
If you're contributing to the WASM SIMD package:
```bash
cd packages/wasm-simd
# Install Rust and wasm-pack if not already installed
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
cargo install wasm-pack
# Build WASM package
pnpm build
# The built files will be copied to app/chrome-extension/workers/
```
### Debugging Chrome Extension
- Use Chrome DevTools for debugging extension popup and background scripts
- Check `chrome://extensions/` for extension errors
- Use `console.log` statements for debugging
- Monitor the native messaging connection in the background script
### Testing MCP Protocol
- Use MCP Inspector for protocol debugging
- Test with different MCP clients (Claude Desktop, custom clients)
- Verify tool schemas and responses match MCP specifications
## 📚 Resources
- [Model Context Protocol Specification](https://modelcontextprotocol.io/)
- [Chrome Extension Development](https://developer.chrome.com/docs/extensions/)
- [WXT Framework Documentation](https://wxt.dev/)
- [TypeScript Handbook](https://www.typescriptlang.org/docs/)
## 🤝 Community
- **GitHub Issues**: For bug reports and feature requests
- **GitHub Discussions**: For questions and general discussion
- **Pull Requests**: For code contributions
## 📄 License
By contributing to Chrome MCP Server, you agree that your contributions will be licensed under the MIT License.
## 🎯 Contributor Guidelines
### New Contributors
If you're contributing to an open source project for the first time:
1. **Start small**: Look for issues labeled "good first issue"
2. **Read the code**: Familiarize yourself with the project structure and coding style
3. **Ask questions**: Ask questions in GitHub Discussions
4. **Learn the tools**: Get familiar with Git, GitHub, TypeScript, and other tools
### Experienced Contributors
- **Architecture improvements**: Propose system-level improvements
- **Performance optimization**: Identify and fix performance bottlenecks
- **New features**: Design and implement complex new features
- **Mentor newcomers**: Help new contributors get started
### Documentation Contributions
- **API documentation**: Improve tool documentation and examples
- **Tutorials**: Create usage guides and best practices
- **Translations**: Help translate documentation to other languages
- **Video content**: Create demo videos and tutorials
### Testing Contributions
- **Unit tests**: Write tests for new features
- **Integration tests**: Test interactions between components
- **Performance tests**: Benchmark testing and performance regression detection
- **User testing**: Functional testing in real-world scenarios
## 🏆 Contributor Recognition
We value every contribution, no matter how big or small. Contributors will be recognized in the following ways:
- **README acknowledgments**: Contributors listed in the project README
- **Release notes**: Contributors thanked in version release notes
- **Contributor badges**: Contributor badges on GitHub profiles
- **Community recognition**: Special thanks in community discussions
Thank you for considering contributing to Chrome MCP Server! Your participation makes this project better.

265
docs/CONTRIBUTING_zh.md Normal file

@@ -0,0 +1,265 @@
# 贡献指南 🤝
感谢您对 Chrome MCP Server 项目的贡献兴趣!本文档为贡献者提供指南和信息。
## 🎯 如何贡献
我们欢迎多种形式的贡献:
- 🐛 错误报告和修复
- ✨ 新功能和工具
- 📚 文档改进
- 🧪 测试和性能优化
- 🌐 翻译和国际化
- 💡 想法和建议
## 🚀 开始贡献
### 环境要求
- **Node.js 18+** 和 **pnpm**(最新版本)
- **Chrome/Chromium** 浏览器用于测试
- **Git** 版本控制
- **Rust**(用于 WASM 开发,可选)
- **TypeScript** 知识
### 开发环境设置
1. **Fork 并克隆仓库**
```bash
git clone https://github.com/YOUR_USERNAME/chrome-mcp-server.git
cd chrome-mcp-server
```
2. **安装依赖**
```bash
pnpm install
```
3. **启动项目**
```bash
npm run dev
```
4. **在 Chrome 中加载扩展**
- 打开 `chrome://extensions/`
- 启用"开发者模式"
- 点击"加载已解压的扩展程序",选择 `your/extension/dist`
## 🏗️ 项目结构
```
chrome-mcp-server/
├── app/
│ ├── chrome-extension/ # Chrome 扩展 (WXT + Vue 3)
│ │ ├── entrypoints/ # 后台脚本、弹窗、内容脚本
│ │ ├── utils/ # AI 模型、向量数据库、工具
│ │ └── workers/ # 用于 AI 处理的 Web Workers
│ └── native-server/ # 原生消息服务器 (Fastify + TypeScript)
│ ├── src/mcp/ # MCP 协议实现
│ └── src/server/ # HTTP 服务器和原生消息
├── packages/
│ ├── shared/ # 共享类型和工具
│ └── wasm-simd/ # SIMD 优化的 WebAssembly 数学函数
└── docs/ # 文档
```
## 🛠️ 开发工作流
### 添加新工具
1. **在 `packages/shared/src/tools.ts` 中定义工具模式**
```typescript
{
name: 'your_new_tool',
description: '描述您的工具功能',
inputSchema: {
type: 'object',
properties: {
// 定义参数
},
required: ['param1']
}
}
```
2. **在 `app/chrome-extension/entrypoints/background/tools/browser/` 中实现工具**
```typescript
class YourNewTool extends BaseBrowserToolExecutor {
name = TOOL_NAMES.BROWSER.YOUR_NEW_TOOL;
async execute(args: YourToolParams): Promise<ToolResult> {
// 实现
}
}
```
3. **在 `app/chrome-extension/entrypoints/background/tools/browser/index.ts` 中导出工具**
4. **在相应的测试目录中添加测试**
### 代码风格指南
- **TypeScript**:使用严格的 TypeScript 和适当的类型
- **ESLint**：遵循配置的 ESLint 规则（`pnpm lint`）
- **Prettier**：使用 Prettier 格式化代码（`pnpm format`）
- **命名**:使用描述性名称并遵循现有模式
- **注释**:为公共 API 添加 JSDoc 注释
- **错误处理**:始终优雅地处理错误
## 📝 Pull Request 流程
1. **创建功能分支**
```bash
git checkout -b feature/your-feature-name
```
2. **进行更改**
- 遵循代码风格指南
- 为新功能添加测试
- 如需要,更新文档
3. **测试您的更改**
- 确保所有现有测试通过
- 手动测试 Chrome 扩展
- 验证 MCP 协议兼容性
4. **提交您的更改**
```bash
git add .
git commit -m "feat: 添加您的功能描述"
```
我们使用 [约定式提交](https://www.conventionalcommits.org/)
- `feat:` 用于新功能
- `fix:` 用于错误修复
- `docs:` 用于文档更改
- `test:` 用于添加测试
- `refactor:` 用于代码重构
5. **推送并创建 Pull Request**
```bash
git push origin feature/your-feature-name
```
## 🐛 错误报告
报告错误时,请包含:
- **环境**：操作系统、Chrome 版本、Node.js 版本
- **重现步骤**:清晰的分步说明
- **预期行为**:应该发生什么
- **实际行为**:实际发生了什么
- **截图/日志**:如果适用
- **MCP 客户端**：您使用的 MCP 客户端（Claude Desktop 等）
## 💡 功能请求
对于功能请求,请提供:
- **用例**:为什么需要这个功能?
- **建议解决方案**:它应该如何工作?
- **替代方案**:考虑过的任何替代解决方案?
- **附加上下文**:截图、示例等
## 🔧 开发技巧
### 使用 WASM SIMD
如果您要为 WASM SIMD 包做贡献:
```bash
cd packages/wasm-simd
# 如果尚未安装,请安装 Rust 和 wasm-pack
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
cargo install wasm-pack
# 构建 WASM 包
pnpm build
# 构建的文件将复制到 app/chrome-extension/workers/
```
### 调试 Chrome 扩展
- 使用 Chrome DevTools 调试扩展弹窗和后台脚本
- 检查 `chrome://extensions/` 查看扩展错误
- 使用 `console.log` 语句进行调试
- 在后台脚本中监控原生消息连接
### 测试 MCP 协议
- 使用 MCP Inspector 进行协议调试
- 使用不同的 MCP 客户端测试（Claude Desktop、自定义客户端）
- 验证工具模式和响应符合 MCP 规范
## 📚 资源
- [模型上下文协议规范](https://modelcontextprotocol.io/)
- [Chrome 扩展开发](https://developer.chrome.com/docs/extensions/)
- [WXT 框架文档](https://wxt.dev/)
- [TypeScript 手册](https://www.typescriptlang.org/docs/)
## 🤝 社区
- **GitHub Issues**:用于错误报告和功能请求
- **GitHub Discussions**:用于问题和一般讨论
- **Pull Requests**:用于代码贡献
## 📄 许可证
通过为 Chrome MCP Server 做贡献,您同意您的贡献将在 MIT 许可证下获得许可。
## 🎯 贡献者指南
### 新手贡献者
如果您是第一次为开源项目做贡献:
1. **从小处开始**:寻找标有 "good first issue" 的问题
2. **阅读代码**:熟悉项目结构和编码风格
3. **提问**:在 GitHub Discussions 中提出问题
4. **学习工具**:了解 Git、GitHub、TypeScript 等工具
### 经验丰富的贡献者
- **架构改进**:提出系统级改进建议
- **性能优化**:识别和修复性能瓶颈
- **新功能**:设计和实现复杂的新功能
- **指导新手**:帮助新贡献者入门
### 文档贡献
- **API 文档**:改进工具文档和示例
- **教程**:创建使用指南和最佳实践
- **翻译**:帮助翻译文档到其他语言
- **视频内容**:创建演示视频和教程
### 测试贡献
- **单元测试**:为新功能编写测试
- **集成测试**:测试组件间的交互
- **性能测试**:基准测试和性能回归检测
- **用户测试**:真实场景下的功能测试
## 🏆 贡献者认可
我们重视每一个贡献,无论大小。贡献者将在以下方式获得认可:
- **README 致谢**:在项目 README 中列出贡献者
- **发布说明**:在版本发布说明中感谢贡献者
- **贡献者徽章**：GitHub 个人资料上的贡献者徽章
- **社区认可**:在社区讨论中的特别感谢
感谢您考虑为 Chrome MCP Server 做贡献!您的参与使这个项目变得更好。

529
docs/TOOLS.md Normal file

@@ -0,0 +1,529 @@
# Chrome MCP Server API Reference 📚
Complete reference for all available tools and their parameters.
## 📋 Table of Contents
- [Browser Management](#browser-management)
- [Screenshots & Visual](#screenshots--visual)
- [Network Monitoring](#network-monitoring)
- [Content Analysis](#content-analysis)
- [Interaction](#interaction)
- [Data Management](#data-management)
- [Response Format](#response-format)
## 📊 Browser Management
### `get_windows_and_tabs`
List all currently open browser windows and tabs.
**Parameters**: None
**Response**:
```json
{
  "windowCount": 2,
  "tabCount": 5,
  "windows": [
    {
      "windowId": 123,
      "tabs": [
        {
          "tabId": 456,
          "url": "https://example.com",
          "title": "Example Page",
          "active": true
        }
      ]
    }
  ]
}
```
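A client will usually want to locate a tab in this response before calling other tools. The sketch below types the response and finds tabs by hostname. The interface names and the helper `findTabsByHost` are assumptions derived from the example payload, not exported project types.

```typescript
// Shapes inferred from the example response above (an assumption).
interface TabInfo {
  tabId: number;
  url: string;
  title: string;
  active: boolean;
}

interface WindowInfo {
  windowId: number;
  tabs: TabInfo[];
}

interface WindowsAndTabs {
  windowCount: number;
  tabCount: number;
  windows: WindowInfo[];
}

/** Collect every tab whose URL hostname equals `host`. */
function findTabsByHost(snapshot: WindowsAndTabs, host: string): TabInfo[] {
  const matches: TabInfo[] = [];
  for (const win of snapshot.windows) {
    for (const tab of win.tabs) {
      try {
        if (new URL(tab.url).hostname === host) matches.push(tab);
      } catch (_err) {
        // Skip tabs whose URL cannot be parsed (for example an empty string).
      }
    }
  }
  return matches;
}
```

The returned `tabId` values can then be passed to tools such as `chrome_close_tabs` or `chrome_go_back_or_forward`.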
### `chrome_navigate`
Navigate to a URL with optional viewport control.
**Parameters**:
- `url` (string, required): URL to navigate to
- `newWindow` (boolean, optional): Create new window (default: false)
- `width` (number, optional): Viewport width in pixels (default: 1280)
- `height` (number, optional): Viewport height in pixels (default: 720)
**Example**:
```json
{
"url": "https://example.com",
"newWindow": true,
"width": 1920,
"height": 1080
}
```
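On the wire, an MCP client sends these parameters as the `arguments` of a JSON-RPC 2.0 `tools/call` request. The sketch below builds that envelope by hand to show where the example payload goes. In practice an MCP SDK client normally constructs it, and `buildToolCall` and the ID counter are hypothetical names for illustration.

```typescript
// JSON-RPC 2.0 envelope for an MCP tool call.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextRequestId = 1;

/** Wrap a tool name and its arguments in a tools/call request (hypothetical helper). */
function buildToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Usage: the chrome_navigate example from above as a complete request.
const navigateRequest = buildToolCall("chrome_navigate", {
  url: "https://example.com",
  newWindow: true,
  width: 1920,
  height: 1080,
});
```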
### `chrome_close_tabs`
Close specific tabs or windows.
**Parameters**:
- `tabIds` (array, optional): Array of tab IDs to close
- `windowIds` (array, optional): Array of window IDs to close
**Example**:
```json
{
"tabIds": [123, 456],
"windowIds": [789]
}
```
### `chrome_go_back_or_forward`
Navigate browser history.
**Parameters**:
- `direction` (string, required): "back" or "forward"
- `tabId` (number, optional): Specific tab ID (default: active tab)
**Example**:
```json
{
"direction": "back",
"tabId": 123
}
```
## 📸 Screenshots & Visual
### `chrome_screenshot`
Capture a screenshot of the full page or of a single element selected by CSS selector, optionally returning base64 data.
**Parameters**:
- `name` (string, optional): Screenshot filename
- `selector` (string, optional): CSS selector for element screenshot
- `width` (number, optional): Width in pixels (default: 800)
- `height` (number, optional): Height in pixels (default: 600)
- `storeBase64` (boolean, optional): Return base64 data (default: false)
- `fullPage` (boolean, optional): Capture full page (default: true)
**Example**:
```json
{
"selector": ".main-content",
"fullPage": true,
"storeBase64": true,
"width": 1920,
"height": 1080
}
```
**Response**:
```json
{
"success": true,
"base64": "...",
"dimensions": {
"width": 1920,
"height": 1080
}
}
```
## 🌐 Network Monitoring
### `chrome_network_capture_start`
Start capturing network requests using the webRequest API.
**Parameters**:
- `url` (string, optional): URL to navigate to and capture
- `maxCaptureTime` (number, optional): Maximum capture time in ms (default: 30000)
- `inactivityTimeout` (number, optional): Stop after inactivity in ms (default: 3000)
- `includeStatic` (boolean, optional): Include static resources (default: false)
**Example**:
```json
{
"url": "https://api.example.com",
"maxCaptureTime": 60000,
"includeStatic": false
}
```
### `chrome_network_capture_stop`
Stop network capture and return collected data.
**Parameters**: None
**Response**:
```json
{
"success": true,
"capturedRequests": [
{
"url": "https://api.example.com/data",
"method": "GET",
"status": 200,
"requestHeaders": {...},
"responseHeaders": {...},
"responseTime": 150
}
],
"summary": {
"totalRequests": 15,
"captureTime": 5000
}
}
```
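One way to make use of the captured data is to reduce it to a short report. The sketch below is only an illustration under the assumption that the response has the shape shown above; `summarizeCapture` is a hypothetical helper, and treating status codes of 400 and above as failures is a choice made here, not something the server defines.

```javascript
// Hypothetical helper: summarize the `capturedRequests` array from the
// response above. Assumes each entry has `url`, `status`, `responseTime`.
function summarizeCapture(capture) {
  const requests = capture.capturedRequests ?? [];
  // Assumption: a status of 400 or above counts as a failed request.
  const failed = requests.filter((r) => r.status >= 400);
  const slowest = requests.reduce(
    (worst, r) => (worst === null || r.responseTime > worst.responseTime ? r : worst),
    null,
  );
  return {
    total: requests.length,
    failed: failed.map((r) => r.url),
    slowestUrl: slowest ? slowest.url : null,
  };
}

// Sample data modelled on the example response.
const sampleCapture = {
  success: true,
  capturedRequests: [
    { url: 'https://api.example.com/data', method: 'GET', status: 200, responseTime: 150 },
    { url: 'https://api.example.com/missing', method: 'GET', status: 404, responseTime: 40 },
  ],
  summary: { totalRequests: 2, captureTime: 5000 },
};

const report = summarizeCapture(sampleCapture);
console.log(report);
```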
### `chrome_network_debugger_start`
Start capturing with the Chrome Debugger API (includes response bodies).
**Parameters**:
- `url` (string, optional): URL to navigate to and capture
### `chrome_network_debugger_stop`
Stop debugger capture and return data with response bodies.
### `chrome_network_request`
Send custom HTTP requests.
**Parameters**:
- `url` (string, required): Request URL
- `method` (string, optional): HTTP method (default: "GET")
- `headers` (object, optional): Request headers
- `body` (string, optional): Request body
**Example**:
```json
{
"url": "https://api.example.com/data",
"method": "POST",
"headers": {
"Content-Type": "application/json"
},
"body": "{\"key\": \"value\"}"
}
```
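Because `body` is documented as a string, a JSON payload has to be serialized before it is sent. A minimal sketch of that step follows; `buildJsonRequest` is a hypothetical helper name, and defaulting to `POST` is an assumption made for this example only.

```javascript
// Hypothetical helper: build arguments for `chrome_network_request`.
// `body` is documented as a string, so the payload is serialized here.
function buildJsonRequest(url, payload, method = 'POST') {
  return {
    url,
    method,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload),
  };
}

const requestArgs = buildJsonRequest('https://api.example.com/data', { key: 'value' });
console.log(requestArgs.body);
```

Serializing in one place avoids hand-escaping quotes, as the raw JSON example above has to do.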
## 🔍 Content Analysis
### `search_tabs_content`
AI-powered semantic search across browser tabs.
**Parameters**:
- `query` (string, required): Search query
**Example**:
```json
{
"query": "machine learning tutorials"
}
```
**Response**:
```json
{
"success": true,
"totalTabsSearched": 10,
"matchedTabsCount": 3,
"vectorSearchEnabled": true,
"indexStats": {
"totalDocuments": 150,
"totalTabs": 10,
"semanticEngineReady": true
},
"matchedTabs": [
{
"tabId": 123,
"url": "https://example.com/ml-tutorial",
"title": "Machine Learning Tutorial",
"semanticScore": 0.85,
"matchedSnippets": ["Introduction to machine learning..."],
"chunkSource": "content"
}
]
}
```
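Callers will often want only the strongest matches. The following sketch filters and sorts `matchedTabs` by `semanticScore`; the helper name `topMatches` and the cut-off of 0.5 are assumptions chosen for illustration, not values defined by the server.

```javascript
// Hypothetical helper: keep matches at or above a score threshold,
// best first. The 0.5 default is an arbitrary illustrative cut-off.
function topMatches(result, minScore = 0.5) {
  return (result.matchedTabs ?? [])
    .filter((tab) => tab.semanticScore >= minScore)
    .sort((a, b) => b.semanticScore - a.semanticScore)
    .map(({ tabId, title, semanticScore }) => ({ tabId, title, semanticScore }));
}

// Sample data modelled on the example response.
const sampleResult = {
  success: true,
  matchedTabs: [
    { tabId: 7, url: 'https://example.com/other', title: 'Other', semanticScore: 0.3 },
    {
      tabId: 123,
      url: 'https://example.com/ml-tutorial',
      title: 'Machine Learning Tutorial',
      semanticScore: 0.85,
    },
  ],
};

const best = topMatches(sampleResult);
console.log(best);
```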
### `chrome_get_web_content`
Extract HTML or text content from web pages.
**Parameters**:
- `format` (string, optional): "html" or "text" (default: "text")
- `selector` (string, optional): CSS selector for specific elements
- `tabId` (number, optional): Specific tab ID (default: active tab)
**Example**:
```json
{
"format": "text",
"selector": ".article-content"
}
```
### `chrome_get_interactive_elements`
Find clickable and interactive elements on the page.
**Parameters**:
- `tabId` (number, optional): Specific tab ID (default: active tab)
**Response**:
```json
{
"elements": [
{
"selector": "#submit-button",
"type": "button",
"text": "Submit",
"visible": true,
"clickable": true
}
]
}
```
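The `selector` values in this response can feed directly into `chrome_click_element`. The sketch below shows one possible way to pick an element by its visible text; `findClickable` is a hypothetical helper, and case-insensitive substring matching is an assumption of this example.

```javascript
// Hypothetical helper: find the first visible, clickable element whose
// text contains `text`, and return arguments for `chrome_click_element`.
function findClickable(response, text) {
  const wanted = text.toLowerCase();
  const match = response.elements.find(
    (el) => el.visible && el.clickable && (el.text ?? '').toLowerCase().includes(wanted),
  );
  return match ? { selector: match.selector } : null;
}

// Sample data modelled on the example response.
const sampleElements = {
  elements: [
    { selector: '#hidden-link', type: 'link', text: 'Submit later', visible: false, clickable: true },
    { selector: '#submit-button', type: 'button', text: 'Submit', visible: true, clickable: true },
  ],
};

const clickArgs = findClickable(sampleElements, 'submit');
console.log(clickArgs);
```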
## 🎯 Interaction
### `chrome_click_element`
Click elements using CSS selectors.
**Parameters**:
- `selector` (string, required): CSS selector for target element
- `tabId` (number, optional): Specific tab ID (default: active tab)
**Example**:
```json
{
"selector": "#submit-button"
}
```
### `chrome_fill_or_select`
Fill form fields or select options.
**Parameters**:
- `selector` (string, required): CSS selector for target element
- `value` (string, required): Value to fill or select
- `tabId` (number, optional): Specific tab ID (default: active tab)
**Example**:
```json
{
"selector": "#email-input",
"value": "user@example.com"
}
```
### `chrome_keyboard`
Simulate keyboard input and shortcuts.
**Parameters**:
- `keys` (string, required): Key combination (e.g., "Ctrl+C", "Enter")
- `selector` (string, optional): Target element selector
- `delay` (number, optional): Delay between keystrokes in ms (default: 0)
**Example**:
```json
{
"keys": "Ctrl+A",
"selector": "#text-input",
"delay": 100
}
```
## 📚 Data Management
### `chrome_history`
Search browser history with filters.
**Parameters**:
- `text` (string, optional): Search text in URL/title
- `startTime` (string, optional): Start date (ISO format)
- `endTime` (string, optional): End date (ISO format)
- `maxResults` (number, optional): Maximum results (default: 100)
- `excludeCurrentTabs` (boolean, optional): Exclude current tabs (default: true)
**Example**:
```json
{
"text": "github",
"startTime": "2024-01-01",
"maxResults": 50
}
```
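A relative range such as "the last seven days" can be turned into `startTime` and `endTime` values with a few lines of code. This is a sketch only: `historyArgsForLastDays` is a hypothetical helper, and it assumes the tool accepts full ISO 8601 timestamps (the example above shows a date-only string).

```javascript
// Hypothetical helper: build `chrome_history` arguments for the last N days.
// Assumption: full ISO 8601 timestamps are accepted for startTime/endTime.
function historyArgsForLastDays(days, text, now = new Date()) {
  const msPerDay = 24 * 60 * 60 * 1000;
  const start = new Date(now.getTime() - days * msPerDay);
  return {
    text,
    startTime: start.toISOString(),
    endTime: now.toISOString(),
    maxResults: 50,
  };
}

// A fixed "now" keeps the example reproducible.
const historyArgs = historyArgsForLastDays(7, 'github', new Date('2024-01-08T00:00:00.000Z'));
console.log(historyArgs.startTime);
```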
### `chrome_bookmark_search`
Search bookmarks by keywords.
**Parameters**:
- `query` (string, optional): Search keywords
- `maxResults` (number, optional): Maximum results (default: 100)
- `folderPath` (string, optional): Search within specific folder
**Example**:
```json
{
"query": "documentation",
"maxResults": 20,
"folderPath": "Work/Resources"
}
```
### `chrome_bookmark_add`
Add new bookmarks with folder support.
**Parameters**:
- `url` (string, optional): URL to bookmark (default: current tab)
- `title` (string, optional): Bookmark title (default: page title)
- `parentId` (string, optional): Parent folder ID or path
- `createFolder` (boolean, optional): Create folder if not exists (default: false)
**Example**:
```json
{
"url": "https://example.com",
"title": "Example Site",
"parentId": "Work/Resources",
"createFolder": true
}
```
### `chrome_bookmark_delete`
Delete bookmarks by ID or URL.
**Parameters**:
- `bookmarkId` (string, optional): Bookmark ID to delete
- `url` (string, optional): URL to find and delete
**Example**:
```json
{
"url": "https://example.com"
}
```
## 📋 Response Format
All tools return responses in the following format:
```json
{
"content": [
{
"type": "text",
"text": "JSON string containing the actual response data"
}
],
"isError": false
}
```
For errors:
```json
{
"content": [
{
"type": "text",
"text": "Error message describing what went wrong"
}
],
"isError": true
}
```
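Since the payload arrives as a JSON string inside `content`, client code usually needs a small unwrapping step. The sketch below is one possible approach; `unwrapToolResult` is a hypothetical helper, and falling back to the raw text when the payload is not valid JSON is an assumption of this example.

```javascript
// Hypothetical helper: unwrap the response envelope described above.
function unwrapToolResult(result) {
  const text = result.content?.find((item) => item.type === 'text')?.text ?? '';
  if (result.isError) {
    // Error responses carry a plain message in the text field.
    throw new Error(text || 'Tool call failed');
  }
  try {
    return JSON.parse(text);
  } catch {
    // Assumption: return the raw text if the payload is not JSON.
    return text;
  }
}

const data = unwrapToolResult({
  content: [{ type: 'text', text: '{"success":true}' }],
  isError: false,
});
console.log(data.success);
```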
## 🔧 Usage Examples
### Complete Workflow Example
```javascript
// `callTool` stands for your MCP client's tool-invocation function.

// 1. Navigate to a page
await callTool('chrome_navigate', {
  url: 'https://example.com',
});

// 2. Take a screenshot
const screenshot = await callTool('chrome_screenshot', {
  fullPage: true,
  storeBase64: true,
});

// 3. Start network monitoring
await callTool('chrome_network_capture_start', {
  maxCaptureTime: 30000,
});

// 4. Interact with the page
await callTool('chrome_click_element', {
  selector: '#load-data-button',
});

// 5. Search content semantically
const searchResults = await callTool('search_tabs_content', {
  query: 'user data analysis',
});

// 6. Stop network capture
const networkData = await callTool('chrome_network_capture_stop');

// 7. Save bookmark (the URL defaults to the current tab)
await callTool('chrome_bookmark_add', {
  title: 'Data Analysis Page',
  parentId: 'Work/Analytics',
});
```
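In a longer workflow it can help to make sure the capture is stopped even when an intermediate step fails. The sketch below illustrates that pattern with a stub `callTool` that only records calls and deliberately fails on the click; both the stub and the `withNetworkCapture` wrapper are hypothetical and stand in for a real MCP client.

```javascript
// Stub standing in for a real MCP client: it records each call and
// simulates a failing click so the error path can be shown.
const calls = [];
async function callTool(name, args = {}) {
  calls.push(name);
  if (name === 'chrome_click_element') throw new Error('element not found');
  return { content: [{ type: 'text', text: '{}' }], isError: false };
}

// Hypothetical wrapper: start a capture, run the steps, and always stop
// the capture afterwards, reporting any step error alongside the data.
async function withNetworkCapture(startArgs, steps) {
  await callTool('chrome_network_capture_start', startArgs);
  let stepError = null;
  try {
    await steps();
  } catch (err) {
    stepError = err;
  }
  const networkData = await callTool('chrome_network_capture_stop');
  return { networkData, stepError };
}

const outcome = withNetworkCapture({ maxCaptureTime: 30000 }, async () => {
  await callTool('chrome_click_element', { selector: '#load-data-button' });
});
outcome.then(({ stepError }) => console.log(calls, stepError && stepError.message));
```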
This API provides comprehensive browser automation capabilities with AI-enhanced content analysis and semantic search features.

529
docs/TOOLS_zh.md Normal file

@@ -0,0 +1,529 @@
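# Chrome MCP Server API Reference 📚
Complete reference for all available tools and their parameters.
## 📋 Table of Contents
- [Browser Management](#browser-management)
- [Screenshots & Visual](#screenshots--visual)
- [Network Monitoring](#network-monitoring)
- [Content Analysis](#content-analysis)
- [Interaction](#interaction)
- [Data Management](#data-management)
- [Response Format](#response-format)
## 📊 Browser Management
### `get_windows_and_tabs`
List all currently open browser windows and tabs.
**Parameters**: None
**Response**: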
```json
{
"windowCount": 2,
"tabCount": 5,
"windows": [
{
"windowId": 123,
"tabs": [
{
"tabId": 456,
"url": "https://example.com",
"title": "Example Page",
"active": true
}
]
}
]
}
```
### `chrome_navigate`
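Navigate to a URL with optional viewport control.
**Parameters**:
- `url` (string, required): URL to navigate to
- `newWindow` (boolean, optional): Create new window (default: false)
- `width` (number, optional): Viewport width in pixels (default: 1280)
- `height` (number, optional): Viewport height in pixels (default: 720)
**Example**: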
```json
{
"url": "https://example.com",
"newWindow": true,
"width": 1920,
"height": 1080
}
```
### `chrome_close_tabs`
Close specific tabs or windows.
**Parameters**:
- `tabIds` (array, optional): Array of tab IDs to close
- `windowIds` (array, optional): Array of window IDs to close
**Example**:
```json
{
"tabIds": [123, 456],
"windowIds": [789]
}
```
### `chrome_go_back_or_forward`
Navigate browser history.
**Parameters**:
- `direction` (string, required): "back" or "forward"
- `tabId` (number, optional): Specific tab ID (default: active tab)
**Example**:
```json
{
"direction": "back",
"tabId": 123
}
```
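## 📸 Screenshots & Visual
### `chrome_screenshot`
Capture a screenshot of the full page or of a single element selected by CSS selector, optionally returning base64 data.
**Parameters**:
- `name` (string, optional): Screenshot filename
- `selector` (string, optional): CSS selector for element screenshot
- `width` (number, optional): Width in pixels (default: 800)
- `height` (number, optional): Height in pixels (default: 600)
- `storeBase64` (boolean, optional): Return base64 data (default: false)
- `fullPage` (boolean, optional): Capture full page (default: true)
**Example**: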
```json
{
"selector": ".main-content",
"fullPage": true,
"storeBase64": true,
"width": 1920,
"height": 1080
}
```
**Response**:
```json
{
"success": true,
"base64": "...",
"dimensions": {
"width": 1920,
"height": 1080
}
}
```
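## 🌐 Network Monitoring
### `chrome_network_capture_start`
Start capturing network requests using the webRequest API.
**Parameters**:
- `url` (string, optional): URL to navigate to and capture
- `maxCaptureTime` (number, optional): Maximum capture time in ms (default: 30000)
- `inactivityTimeout` (number, optional): Stop after inactivity in ms (default: 3000)
- `includeStatic` (boolean, optional): Include static resources (default: false)
**Example**: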
```json
{
"url": "https://api.example.com",
"maxCaptureTime": 60000,
"includeStatic": false
}
```
### `chrome_network_capture_stop`
Stop network capture and return collected data.
**Parameters**: None
**Response**:
```json
{
"success": true,
"capturedRequests": [
{
"url": "https://api.example.com/data",
"method": "GET",
"status": 200,
"requestHeaders": {...},
"responseHeaders": {...},
"responseTime": 150
}
],
"summary": {
"totalRequests": 15,
"captureTime": 5000
}
}
```
### `chrome_network_debugger_start`
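Start capturing with the Chrome Debugger API (includes response bodies).
**Parameters**:
- `url` (string, optional): URL to navigate to and capture
### `chrome_network_debugger_stop`
Stop debugger capture and return data with response bodies.
### `chrome_network_request`
Send custom HTTP requests.
**Parameters**:
- `url` (string, required): Request URL
- `method` (string, optional): HTTP method (default: "GET")
- `headers` (object, optional): Request headers
- `body` (string, optional): Request body
**Example**: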
```json
{
"url": "https://api.example.com/data",
"method": "POST",
"headers": {
"Content-Type": "application/json"
},
"body": "{\"key\": \"value\"}"
}
```
## 🔍 Content Analysis
### `search_tabs_content`
AI-powered semantic search across browser tabs.
**Parameters**:
- `query` (string, required): Search query
**Example**:
```json
{
"query": "machine learning tutorials"
}
```
**Response**:
```json
{
"success": true,
"totalTabsSearched": 10,
"matchedTabsCount": 3,
"vectorSearchEnabled": true,
"indexStats": {
"totalDocuments": 150,
"totalTabs": 10,
"semanticEngineReady": true
},
"matchedTabs": [
{
"tabId": 123,
"url": "https://example.com/ml-tutorial",
"title": "Machine Learning Tutorial",
"semanticScore": 0.85,
"matchedSnippets": ["Introduction to machine learning..."],
"chunkSource": "content"
}
]
}
```
### `chrome_get_web_content`
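Extract HTML or text content from web pages.
**Parameters**:
- `format` (string, optional): "html" or "text" (default: "text")
- `selector` (string, optional): CSS selector for specific elements
- `tabId` (number, optional): Specific tab ID (default: active tab)
**Example**: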
```json
{
"format": "text",
"selector": ".article-content"
}
```
### `chrome_get_interactive_elements`
Find clickable and interactive elements on the page.
**Parameters**:
- `tabId` (number, optional): Specific tab ID (default: active tab)
**Response**:
```json
{
"elements": [
{
"selector": "#submit-button",
"type": "button",
"text": "Submit",
"visible": true,
"clickable": true
}
]
}
```
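## 🎯 Interaction
### `chrome_click_element`
Click elements using CSS selectors.
**Parameters**:
- `selector` (string, required): CSS selector for target element
- `tabId` (number, optional): Specific tab ID (default: active tab)
**Example**: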
```json
{
"selector": "#submit-button"
}
```
### `chrome_fill_or_select`
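Fill form fields or select options.
**Parameters**:
- `selector` (string, required): CSS selector for target element
- `value` (string, required): Value to fill or select
- `tabId` (number, optional): Specific tab ID (default: active tab)
**Example**: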
```json
{
"selector": "#email-input",
"value": "user@example.com"
}
```
### `chrome_keyboard`
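Simulate keyboard input and shortcuts.
**Parameters**:
- `keys` (string, required): Key combination (e.g., "Ctrl+C", "Enter")
- `selector` (string, optional): Target element selector
- `delay` (number, optional): Delay between keystrokes in ms (default: 0)
**Example**: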
```json
{
"keys": "Ctrl+A",
"selector": "#text-input",
"delay": 100
}
```
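## 📚 Data Management
### `chrome_history`
Search browser history with filters.
**Parameters**:
- `text` (string, optional): Search text in URL/title
- `startTime` (string, optional): Start date (ISO format)
- `endTime` (string, optional): End date (ISO format)
- `maxResults` (number, optional): Maximum results (default: 100)
- `excludeCurrentTabs` (boolean, optional): Exclude current tabs (default: true)
**Example**: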
```json
{
"text": "github",
"startTime": "2024-01-01",
"maxResults": 50
}
```
### `chrome_bookmark_search`
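Search bookmarks by keywords.
**Parameters**:
- `query` (string, optional): Search keywords
- `maxResults` (number, optional): Maximum results (default: 100)
- `folderPath` (string, optional): Search within specific folder
**Example**: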
```json
{
"query": "documentation",
"maxResults": 20,
"folderPath": "Work/Resources"
}
```
### `chrome_bookmark_add`
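Add new bookmarks with folder support.
**Parameters**:
- `url` (string, optional): URL to bookmark (default: current tab)
- `title` (string, optional): Bookmark title (default: page title)
- `parentId` (string, optional): Parent folder ID or path
- `createFolder` (boolean, optional): Create folder if not exists (default: false)
**Example**: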
```json
{
"url": "https://example.com",
"title": "Example Site",
"parentId": "Work/Resources",
"createFolder": true
}
```
### `chrome_bookmark_delete`
Delete bookmarks by ID or URL.
**Parameters**:
- `bookmarkId` (string, optional): Bookmark ID to delete
- `url` (string, optional): URL to find and delete
**Example**:
```json
{
"url": "https://example.com"
}
```
## 📋 Response Format
All tools return responses in the following format:
```json
{
"content": [
{
"type": "text",
"text": "JSON string containing the actual response data"
}
],
"isError": false
}
```
For errors:
```json
{
"content": [
{
"type": "text",
"text": "Error message describing what went wrong"
}
],
"isError": true
}
```
## 🔧 Usage Examples
### Complete Workflow Example
```javascript
// `callTool` stands for your MCP client's tool-invocation function.

// 1. Navigate to a page
await callTool('chrome_navigate', {
  url: 'https://example.com',
});

// 2. Take a screenshot
const screenshot = await callTool('chrome_screenshot', {
  fullPage: true,
  storeBase64: true,
});

// 3. Start network monitoring
await callTool('chrome_network_capture_start', {
  maxCaptureTime: 30000,
});

// 4. Interact with the page
await callTool('chrome_click_element', {
  selector: '#load-data-button',
});

// 5. Search content semantically
const searchResults = await callTool('search_tabs_content', {
  query: 'user data analysis',
});

// 6. Stop network capture
const networkData = await callTool('chrome_network_capture_stop');

// 7. Save bookmark (the URL defaults to the current tab)
await callTool('chrome_bookmark_add', {
  title: 'Data Analysis Page',
  parentId: 'Work/Analytics',
});
```
This API provides comprehensive browser automation capabilities with AI-enhanced content analysis and semantic search features.

29
docs/TROUBLESHOOTING.md Normal file

@@ -0,0 +1,29 @@
# 🚀 Installation and Connection Issues
### If Connection Fails After Clicking the Connect Button on the Extension
1. **Check that mcp-chrome-bridge installed successfully**, and make sure it is installed globally
```bash
mcp-chrome-bridge -v
```
<img width="612" alt="Screenshot 2025-06-11 15 09 57" src="https://github.com/user-attachments/assets/59458532-e6e1-457c-8c82-3756a5dbb28e" />
2. **Check if the manifest file is in the correct directory**
Windows path: C:\Users\xxx\AppData\Roaming\Google\Chrome\NativeMessagingHosts
Mac path: /Users/xxx/Library/Application\ Support/Google/Chrome/NativeMessagingHosts
If the npm package is installed correctly, a file named `com.chromemcp.nativehost.json` should be generated in this directory
3. **Check if there are logs in the npm package installation directory**
The log location depends on your installation path. If you are unsure of it, open the manifest file from step 2; its `path` field shows the installation directory. For example, if the installation path is as follows, check the log contents:
C:\Users\admin\AppData\Local\nvm\v20.19.2\node_modules\mcp-chrome-bridge\dist\logs
<img width="804" alt="Screenshot 2025-06-11 15 09 41" src="https://github.com/user-attachments/assets/ce7b7c94-7c84-409a-8210-c9317823aae1" />
4. **Check that you have execute permission**
The script location depends on your installation path. If you are unsure of it, open the manifest file from step 2; its `path` field shows the installation directory. For example, on a Mac the script may be at:
`xxx/node_modules/mcp-chrome-bridge/dist/run_host.sh`
Check that this script has execute permission
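
For step 2, the expected manifest location can be derived from the platform and the user's home directory. The sketch below simply joins the directories listed above with the manifest file name; `manifestPath` is a hypothetical helper, not something shipped with mcp-chrome-bridge, and it covers only the two platforms this guide mentions.

```javascript
// Hypothetical helper: build the expected manifest path from the
// directories listed in step 2 of this guide.
function manifestPath(platform, homeDir) {
  const file = 'com.chromemcp.nativehost.json';
  if (platform === 'win32') {
    return `${homeDir}\\AppData\\Roaming\\Google\\Chrome\\NativeMessagingHosts\\${file}`;
  }
  if (platform === 'darwin') {
    return `${homeDir}/Library/Application Support/Google/Chrome/NativeMessagingHosts/${file}`;
  }
  return null; // Other platforms are not covered by this guide.
}

console.log(manifestPath('darwin', '/Users/xxx'));
```

In a Node.js script, `process.platform` and `os.homedir()` can supply the two arguments.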


@@ -0,0 +1,64 @@
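## 🚀 Installation and Connection Issues
### Common Issues
#### Connection Succeeds but the Service Fails to Start
Startup failures are almost always caused by **permission issues**, or by the startup script being unable to find **node** because node was installed through a package or version manager. The core troubleshooting steps are:
1. After installing the npm package globally, locate the manifest file `com.chromemcp.nativehost.json`. It contains a **path** field that points to a startup script:
1.1 **Check that mcp-chrome-bridge installed successfully**, and make sure it is installed **globally**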
```bash
mcp-chrome-bridge -v
```
<img width="612" alt="Screenshot 2025-06-11 15 09 57" src="https://github.com/user-attachments/assets/59458532-e6e1-457c-8c82-3756a5dbb28e" />
1.2 **Check that the manifest file is in the correct directory**
Windows path: C:\Users\xxx\AppData\Roaming\Google\Chrome\NativeMessagingHosts
Mac path: /Users/xxx/Library/Application\ Support/Google/Chrome/NativeMessagingHosts
If the npm package installed correctly, a file named `com.chromemcp.nativehost.json` is generated in this directory:
```json
{
"name": "com.chromemcp.nativehost",
"description": "Node.js Host for Browser Bridge Extension",
"path": "/Users/xxx/Library/pnpm/global/5/.pnpm/mcp-chrome-bridge@1.0.23/node_modules/mcp-chrome-bridge/dist/run_host.sh",
"type": "stdio",
"allowed_origins": [
"chrome-extension://hbdgbgagpkpjffpklnamcljpakneikee/"
]
}
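```
> If the manifest file is missing, try running `mcp-chrome-bridge register` from the command line
2. Chrome follows the script path given in the manifest file above and executes that script. A `logs` folder containing log files is also created under /Users/xxx/Library/pnpm/global/5/.pnpm/mcp-chrome-bridge@1.0.23/node_modules/mcp-chrome-bridge/dist/ (on Windows, check the directory referenced by your own manifest file).
The exact location depends on your installation path. If you are unsure of it, open the manifest file mentioned above; its `path` field shows the installation directory. For example, if the installation path is as follows, check the log contents:
C:\Users\admin\AppData\Local\nvm\v20.19.2\node_modules\mcp-chrome-bridge\dist\logs
<img width="804" alt="Screenshot 2025-06-11 15 09 41" src="https://github.com/user-attachments/assets/ce7b7c94-7c84-409a-8210-c9317823aae1" />
3. Failures generally have one of two causes:
3.1. `run_host.sh` (`run_host.bat` on Windows) lacks execute permission. You can grant the permission yourself; see https://github.com/hangwin/mcp-chrome/issues/22#issuecomment-2990636930. The script path is shown in the manifest file above.
3.2. The script cannot find node. If several versions of node are installed on your machine, the script cannot tell which one the npm package was installed under, and different node version managers can also prevent it from locating node.
See https://github.com/hangwin/mcp-chrome/issues/29#issuecomment-3003513940 (this is currently being improved).
3.3. If neither of the causes above applies, check the logs in the log directory and open an issue.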
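
The manifest checks in step 1 can be sketched as a small validation function. This is an illustration only: `checkManifest` is a hypothetical helper, and the rules it applies are inferred from the example manifest and the script names mentioned above rather than taken from the package itself.

```javascript
// Hypothetical helper: list problems found in a parsed manifest object.
// The expected values are inferred from the example manifest above.
function checkManifest(manifest) {
  const problems = [];
  if (manifest.name !== 'com.chromemcp.nativehost') problems.push('unexpected name');
  if (manifest.type !== 'stdio') problems.push('type should be "stdio"');
  if (typeof manifest.path !== 'string' || manifest.path.length === 0) {
    problems.push('missing path');
  } else if (!/run_host\.(sh|bat)$/.test(manifest.path)) {
    problems.push('path does not point at run_host.sh or run_host.bat');
  }
  if (!Array.isArray(manifest.allowed_origins) || manifest.allowed_origins.length === 0) {
    problems.push('no allowed_origins');
  }
  return problems;
}

// Sample manifest modelled on the example above, with the path left out.
const broken = {
  name: 'com.chromemcp.nativehost',
  type: 'stdio',
  allowed_origins: ['chrome-extension://hbdgbgagpkpjffpklnamcljpakneikee/'],
};
console.log(checkManifest(broken));
```

An empty result only means the manifest looks structurally plausible; the permission and node lookup causes in step 3 still need to be checked separately.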
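#### Tool Execution Timeout
The session may time out during a long-lived connection; reconnecting resolves it.
#### Quality of Results
Results vary across agents and models, so you will need to experiment yourself. I recommend a capable agent such as Augment or Claude Code.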


@@ -0,0 +1,71 @@
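# Windows Installation Guide 🔧
Detailed installation and configuration steps for Chrome MCP Server on Windows.
## 📋 Installation
1. **Download the latest Chrome extension from GitHub**
Download link: https://github.com/hangwin/mcp-chrome/releases
2. **Install mcp-chrome-bridge globally**
Make sure node is installed on your machine; if it is not, install it first.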
```bash
# Make sure you install the latest version of the npm package (1.0.14 at the time of writing); older versions may cause problems
npm install -g mcp-chrome-bridge
```
3. **Load the Chrome extension**
- Open Chrome and go to `chrome://extensions/`
- Enable "Developer mode"
- Click "Load unpacked" and select `your/downloaded/extension/folder`
- Click the extension icon to open it, then click Connect to see the MCP configuration
<img width="475" alt="Screenshot 2025-06-09 15 52 06" src="https://github.com/user-attachments/assets/241e57b8-c55f-41a4-9188-0367293dc5bc" />
4. **Use it in CherryStudio**
Set the type to streamableHttp and the URL to http://127.0.0.1:12306/mcp
<img width="675" alt="Screenshot 2025-06-11 15 00 29" src="https://github.com/user-attachments/assets/6631e9e4-57f9-477e-b708-6a285cc0d881" />
Open the tool list; if the tools are listed, the server is ready to use
<img width="672" alt="Screenshot 2025-06-11 15 14 55" src="https://github.com/user-attachments/assets/d08b7e51-3466-4ab7-87fa-3f1d7be9d112" />
```json
{
"mcpServers": {
"streamable-mcp-server": {
"type": "streamable-http",
"url": "http://127.0.0.1:12306/mcp"
}
}
}
```
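## 🚀 Installation and Connection Issues
### If the Connection Fails After Clicking the Connect Button in the Extension
1. **Check that mcp-chrome-bridge installed successfully**, and make sure it is installed globally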
```bash
mcp-chrome-bridge -v
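```
<img width="612" alt="Screenshot 2025-06-11 15 09 57" src="https://github.com/user-attachments/assets/59458532-e6e1-457c-8c82-3756a5dbb28e" />
2. **Check that the manifest file is in the correct directory**
Path: C:\Users\xxx\AppData\Roaming\Google\Chrome\NativeMessagingHosts
3. **Check whether there are logs in the npm package installation directory**
The location depends on your installation path. If you are unsure of it, open the manifest file from step 2; its `path` field shows the installation directory. For example, if the installation path is as follows, check the log contents: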
C:\Users\admin\AppData\Local\nvm\v20.19.2\node_modules\mcp-chrome-bridge\dist\logs
<img width="804" alt="Screenshot 2025-06-11 15 09 41" src="https://github.com/user-attachments/assets/ce7b7c94-7c84-409a-8210-c9317823aae1" />