# Real-Time Form Discovery System

## Overview

The LiveKit agent now features a **REAL-TIME ONLY** form discovery system that **NEVER uses cached selectors**. Every form field discovery is performed live using MCP tools, ensuring the most current and accurate form element detection.

## Key Principles

### 🚫 NO CACHE POLICY

- **Zero cached selectors** - every request gets fresh selectors
- **Real-time discovery only** - uses MCP tools on every call
- **No hardcoded selectors** - all elements discovered dynamically
- **Fresh page analysis** - adapts to dynamic content changes

### 🔄 Real-Time MCP Tools

- **chrome_get_interactive_elements** - Gets current form elements
- **chrome_get_content_web_form** - Analyzes form structure
- **chrome_get_web_content** - Content analysis for field discovery
- **Live selector testing** - Validates selectors before use

## How Real-Time Discovery Works

### 1. Voice Command Processing

When a user says: `"fill email with john@example.com"`

```python
# NO cache lookup - goes straight to real-time discovery
field_name = "email"
value = "john@example.com"

# Step 1: Real-time MCP discovery
discovery_result = await client._discover_form_fields_dynamically(field_name, value)

# Step 2: Enhanced detection with retry (if needed)
enhanced_result = await client._enhanced_field_detection_with_retry(field_name, value)

# Step 3: Direct MCP element search (final fallback)
direct_result = await client._direct_mcp_element_search(field_name, value)
```

### 2. Real-Time Discovery Process

#### Strategy 1: Interactive Elements Discovery

```python
# Get ALL current interactive elements
interactive_result = await client._call_mcp_tool("chrome_get_interactive_elements", {
    "types": ["input", "textarea", "select"]
})

# Match field name to current elements
# (`elements` is the element list parsed from interactive_result)
for element in elements:
    if client._is_field_match(element, field_name):
        selector = client._extract_best_selector(element)
        # Try to fill immediately with fresh selector
```

#### Strategy 2: Form Content Analysis

```python
# Get current form structure
form_result = await client._call_mcp_tool("chrome_get_content_web_form", {})

# Parse form content for field patterns
# (`form_content` is the form text taken from form_result)
selector = client._parse_form_content_for_field(form_content, field_name)
# Test and use selector immediately
```

#### Strategy 3: Direct Element Search

```python
# Exhaustive search through ALL elements
all_elements = await client._call_mcp_tool("chrome_get_interactive_elements", {})

# Very flexible matching for any possible match
for element in all_elements:
    if client._is_very_flexible_match(element, field_name):
        # Generate and test selector immediately
        pass
```

### 3. Real-Time Selector Generation

The system generates selectors in real-time based on current element attributes:

```python
def _extract_best_selector(self, element):
    attrs = element.get("attributes", {})

    # Priority order for reliability
    if attrs.get("id"):
        return f"#{attrs['id']}"
    # Check the more specific type + name pattern before the name-only
    # pattern; in the opposite order this branch could never be reached
    if attrs.get("type") and attrs.get("name"):
        return f"input[type='{attrs['type']}'][name='{attrs['name']}']"
    if attrs.get("name"):
        return f"input[name='{attrs['name']}']"
    # ... more patterns
```

## API Reference

### Real-Time Functions

#### `fill_field_by_name(field_name: str, value: str) -> str`

**NOW REAL-TIME ONLY** - No cache, fresh discovery every call.

#### `fill_field_realtime_only(field_name: str, value: str) -> str`

**Guaranteed real-time** - Explicit real-time discovery function.

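As an illustration of how an explicit real-time function such as `fill_field_realtime_only` might chain the discovery steps described under "How Real-Time Discovery Works", here is a minimal sketch. It assumes each strategy is an async callable that returns a dict with a `success` key. That result shape, the `run_fallback_chain` helper, and the stub strategies are hypothetical and are not taken from the agent's code.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical strategy signature: (field_name, value) -> result dict.
Strategy = Callable[[str, str], Awaitable[dict]]


async def run_fallback_chain(
    strategies: list[tuple[str, Strategy]], field_name: str, value: str
) -> dict:
    """Try each strategy in order and stop at the first success."""
    attempts = []
    for name, strategy in strategies:
        try:
            result = await strategy(field_name, value)
        except Exception as exc:  # a failing strategy must not abort the chain
            result = {"success": False, "error": str(exc)}
        attempts.append(name)
        if result.get("success"):
            return {**result, "strategy": name, "attempts": attempts}
    return {"success": False, "attempts": attempts}


# Stub strategies standing in for the real MCP-backed methods (assumed).
async def interactive_elements(field_name: str, value: str) -> dict:
    return {"success": False}


async def form_content(field_name: str, value: str) -> dict:
    return {"success": True, "selector": f"input[name='{field_name}']"}


async def direct_search(field_name: str, value: str) -> dict:
    raise RuntimeError("not reached when an earlier strategy succeeds")


def demo() -> dict:
    chain = [
        ("interactive_elements", interactive_elements),
        ("form_content", form_content),
        ("direct_search", direct_search),
    ]
    return asyncio.run(run_fallback_chain(chain, "email", "john@example.com"))
```

Recording the list of attempted strategies in the result is one way to support the "no silent failures" goal: the caller can always report what was tried, whether or not a match was found.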
#### `get_realtime_form_fields() -> str`

**Live form discovery** - Gets current form fields using only MCP tools.

#### `_discover_form_fields_dynamically(field_name: str, value: str) -> dict`

**Pure real-time discovery** - Uses chrome_get_interactive_elements and chrome_get_content_web_form.

#### `_direct_mcp_element_search(field_name: str, value: str) -> dict`

**Exhaustive real-time search** - Final fallback using comprehensive MCP element search.

### Real-Time Matching Algorithms

#### `_is_field_match(element: dict, field_name: str) -> bool`

Standard real-time field matching using current element attributes.

#### `_is_very_flexible_match(element: dict, field_name: str) -> bool`

Very flexible real-time matching for challenging cases.

#### `_generate_common_selectors(field_name: str) -> list`

Generates common CSS selectors based on field name patterns.

## Usage Examples

### Voice Commands (All Real-Time)

```
User: "fill email with john@example.com"
Agent: [Uses chrome_get_interactive_elements] ✓ Filled 'email' field using real-time discovery

User: "enter password secret123"
Agent: [Uses chrome_get_content_web_form] ✓ Filled 'password' field using form content analysis

User: "type hello in search box"
Agent: [Uses direct MCP search] ✓ Filled 'search' field using exhaustive element search
```

### Programmatic Usage

```python
# All these functions use ONLY real-time discovery
result = await client.fill_field_by_name("email", "user@example.com")
result = await client.fill_field_realtime_only("search", "python")
result = await client._discover_form_fields_dynamically("username", "john_doe")
```

## Real-Time Discovery Strategies

### 1. Interactive Elements Strategy

- Uses `chrome_get_interactive_elements` to get current form elements
- Matches field names to element attributes in real-time
- Tests selectors immediately before use

### 2. Form Content Strategy

- Uses `chrome_get_content_web_form` for form-specific analysis
- Parses current form structure for field patterns
- Generates selectors based on live content

### 3. Direct Search Strategy

- Exhaustive search through ALL current page elements
- Very flexible matching criteria
- Tests multiple selector patterns

### 4. Common Selector Strategy

- Generates intelligent selectors based on field name
- Tests each selector against current page
- Uses type-specific patterns for common fields

## Benefits of Real-Time Discovery

### 🎯 Accuracy

- **Always current** - reflects actual page state
- **No stale selectors** - eliminates cached selector failures
- **Dynamic adaptation** - handles page changes automatically

### 🔄 Reliability

- **Fresh discovery** - every request gets new selectors
- **Multiple strategies** - comprehensive fallback methods
- **Live validation** - selectors tested before use

### 🌐 Compatibility

- **Works on any site** - no pre-configuration needed
- **Handles dynamic content** - adapts to JavaScript-generated forms
- **Cross-platform** - works with any web technology

### 🛠️ Maintainability

- **Zero maintenance** - no selector databases to update
- **Self-adapting** - automatically handles site changes
- **Future-proof** - works with new web technologies

## Testing Real-Time Discovery

Run the real-time test suite:

```bash
python test_realtime_form_discovery.py
```

This tests:

- Real-time discovery on Google search
- Form field discovery on GitHub
- Direct MCP element search
- Very flexible matching algorithms
- Cross-website compatibility

## Performance Considerations

### Real-Time vs Speed

- **Slightly slower** than cached selectors (by design)
- **More reliable** than cached approaches
- **Eliminates cache invalidation** issues
- **Prevents stale selector errors**

### Optimization Strategies

- **Parallel discovery** - multiple strategies run concurrently
- **Early termination** - stops on first successful match
- **Intelligent prioritization** - most likely selectors first

## Error Handling

### Graceful Degradation

1. **Interactive elements** → **Form content** → **Direct search** → **Common selectors**
2. **Detailed logging** of each attempt
3. **Clear error messages** about what was tried
4. **No silent failures** - always reports what happened

### Retry Mechanism

- **Multiple attempts** with increasing flexibility
- **Different strategies** on each retry
- **Configurable retry count** (default: 3)
- **Delay between retries** to handle loading

## Future Enhancements

### Advanced Real-Time Features

- **Visual element detection** using screenshots
- **Machine learning** field recognition
- **Context-aware** field relationships
- **Performance optimization** for faster discovery

### Real-Time Analytics

- **Discovery success rates** by strategy
- **Performance metrics** for each method
- **Field matching accuracy** tracking
- **Site compatibility** reporting

## Migration from Cached System

### Automatic Migration

- **No code changes** required for existing voice commands
- **Backward compatibility** maintained
- **Enhanced reliability** with real-time discovery
- **Same API** with improved implementation

### Benefits of Migration

- **Eliminates cache issues** - no more stale selectors
- **Improves accuracy** - always uses current page state
- **Reduces maintenance** - no cache management needed
- **Increases reliability** - works on dynamic sites

The real-time discovery system ensures that the LiveKit agent always works with the most current page state, providing maximum reliability and compatibility across all websites.
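
The retry mechanism listed under Error Handling (multiple attempts, increasing flexibility, a configurable count defaulting to 3, and a delay between retries) could look roughly like the sketch below. The helper name `retry_with_increasing_flexibility`, the integer flexibility level, the `success` key, and the fake attempt function are all assumptions made for illustration; the agent's actual retry code may differ.

```python
import asyncio
from typing import Awaitable, Callable


async def retry_with_increasing_flexibility(
    attempt: Callable[[int], Awaitable[dict]],
    max_retries: int = 3,
    delay_seconds: float = 0.5,
) -> dict:
    """Call attempt(level) with level 0, 1, 2, ... until one succeeds."""
    result: dict = {"success": False}
    for level in range(max_retries):
        result = await attempt(level)
        if result.get("success"):
            return {**result, "attempts": level + 1}
        if level < max_retries - 1:
            # Pause before the next try so a slow page can finish loading.
            await asyncio.sleep(delay_seconds)
    return {**result, "success": False, "attempts": max_retries}


async def fake_attempt(level: int) -> dict:
    # Hypothetical matcher: only the most flexible level (2) finds the field.
    return {"success": level >= 2, "level": level}


def demo_retry() -> dict:
    return asyncio.run(
        retry_with_increasing_flexibility(fake_attempt, delay_seconds=0.0)
    )
```

Passing the retry index as a flexibility level lets each retry use a looser match than the one before, which matches the "different strategies on each retry" idea without hardcoding any particular strategy.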