broswer-automation/agent-livekit/DEBUGGING_GUIDE.md
nasir@endelospay.com d97cad1736 first commit
2025-08-12 02:54:17 +05:00


Browser Automation Debugging Guide

This guide explains how to use the enhanced debugging features to troubleshoot browser automation issues in the LiveKit Chrome Agent.

Overview

The enhanced debugging system adds detailed logging and diagnostic tools for cases where a browser action (such as "click login button") is not executed even though its selectors were found correctly.

Enhanced Features

1. Enhanced Selector Logging

The system now provides detailed logging for every step of selector discovery and execution:

  • 🔍 SELECTOR SEARCH: Shows what element is being searched for
  • 📊 Found Elements: Lists all interactive elements found on the page
  • 🎯 Matching Elements: Shows which elements match the search criteria
  • 🚀 EXECUTING CLICK: Indicates when an action is being attempted
  • SUCCESS / FAILURE: Clearly indicates the result of each action
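
As a rough illustration of the stages listed above, the sketch below matches a description against a list of interactive elements and logs each stage. The element dictionaries follow the shape shown in the sample log output in this guide. The helper names `build_selector` and `find_matches` and the simple word-overlap rule are assumptions for illustration, not the agent's actual matching logic.

```python
import logging

logger = logging.getLogger("selector_sketch")


def build_selector(element):
    """Build a simple CSS selector from id, class, or href (illustrative only)."""
    tag = element.get("tag", "*")
    attrs = element.get("attributes", {})
    if "id" in attrs:
        return f"#{attrs['id']}"
    if "class" in attrs:
        return tag + "".join(f".{c}" for c in attrs["class"].split())
    if "href" in attrs:
        return f'{tag}[href="{attrs["href"]}"]'
    return tag


def find_matches(description, elements):
    """Return (selector, reason) pairs for elements whose text shares a word with the description."""
    logger.info("🔍 SELECTOR SEARCH: Looking for clickable element matching %r", description)
    logger.info("📊 Found %d interactive elements on page", len(elements))
    wanted = set(description.lower().split())
    matches = []
    for element in elements:
        text = element.get("text", "").lower()
        # Hypothetical rule: any shared word counts as a match.
        if wanted & set(text.split()):
            matches.append((build_selector(element), f"text_content={text}"))
    for i, (selector, reason) in enumerate(matches):
        logger.info("   🎯 Match %d: selector=%r, reason=%r", i, selector, reason)
    return matches
```

With this rule, "login button" matches an element whose text is "Login" but not one labelled "Sign In". A real matcher would presumably also handle synonyms like that.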

2. Browser Connection Validation

Use validate_browser_connection() to check:

  • MCP server connectivity
  • Browser responsiveness
  • Page accessibility
  • Current URL and page title

3. Step-by-Step Command Debugging

Use debug_voice_command() to analyze:

  • How commands are parsed
  • Which selectors are generated
  • Why actions succeed or fail
  • Detailed execution flow

Using the Debugging Tools

In LiveKit Agent

When connected to the LiveKit agent, you can use these voice commands:

"debug voice command 'click login button'"
"validate browser connection"
"test selectors 'button.login, #login-btn, .signin'"
"capture browser state"
"get debug summary"

Standalone Testing

Run the test scripts to diagnose issues:

# Test enhanced logging features
python test_enhanced_logging.py

# Test specific login button scenario
python test_login_button_click.py

# Run comprehensive diagnostics
python debug_browser_actions.py

Common Issues and Solutions

Issue 1: "Selectors found but action not executed"

Symptoms:

  • Logs show selectors are discovered
  • No actual click happens in browser
  • No error messages

Debugging Steps:

  1. Run validate_browser_connection() to check connectivity
  2. Use debug_voice_command() to see execution details
  3. Check MCP server logs for errors
  4. Verify browser extension is active

Solution:

  • Ensure MCP server is properly connected to browser
  • Check browser console for JavaScript errors
  • Restart browser extension if needed

Issue 2: "No matching elements found"

Symptoms:

  • Logs show "No elements matched description"
  • Interactive elements are found but don't match

Debugging Steps:

  1. Use capture_browser_state() to see page state
  2. Use test_selectors() with common patterns
  3. Check if page has finished loading

Solution:

  • Try more specific or alternative descriptions
  • Wait for page to fully load
  • Use CSS selectors directly if needed
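
The first two suggestions can be combined into a small retry helper. The sketch below tries several alternative descriptions and pauses between rounds so a slow page has time to finish loading. `click_with_fallbacks` and its `click` callable are hypothetical and not part of the agent. Any async function that takes a description and returns a truthy value on success would fit.

```python
import asyncio


async def click_with_fallbacks(click, descriptions, attempts=3, delay=0.5):
    """Try each description in turn, pausing between rounds.

    `click` is any async callable that takes a description and returns
    True on success. Returns the description that worked, or None.
    """
    for attempt in range(attempts):
        for description in descriptions:
            if await click(description):
                return description
        if attempt < attempts - 1:
            # Give the page time to finish loading before the next round.
            await asyncio.sleep(delay)
    return None
```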

Issue 3: "Browser not responsive"

Symptoms:

  • Connection validation fails
  • No response from browser

Debugging Steps:

  1. Check if browser is running
  2. Verify MCP server is running on correct port
  3. Check browser extension status

Solution:

  • Restart browser and MCP server
  • Reinstall browser extension
  • Check firewall/network settings
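
A quick way to rule out port and firewall problems is to test whether anything accepts TCP connections on the MCP server's address. The sketch below uses only the Python standard library. `port_is_open` is a hypothetical helper, and the host and port must come from your own MCP server configuration, since this guide does not specify them.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A `False` result means the server is not listening on that port or something is blocking the connection. A `True` result only shows that the port accepts connections, not that the browser extension is attached.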

Enhanced Logging Output

The enhanced logging provides detailed information at each step:

🔍 SELECTOR SEARCH: Looking for clickable element matching 'login button'
📋 Step 1: Getting interactive elements from page
📊 Found 15 interactive elements on page
🔍 Element 0: {"tag": "button", "text": "Sign In", "attributes": {"class": "btn-primary"}}
🔍 Element 1: {"tag": "a", "text": "Login", "attributes": {"href": "/login"}}
✅ Found 2 matching elements:
   🎯 Match 0: selector='button.btn-primary', reason='text_content=sign in'
   🎯 Match 1: selector='a[href="/login"]', reason='text_content=login'
🚀 EXECUTING CLICK: Using selector 'button.btn-primary' (reason: text_content=sign in)
✅ CLICK SUCCESS: Clicked on 'login button' using selector: button.btn-primary
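
When scanning long logs, it can help to pull the match lines out programmatically. The sketch below assumes the `Match N: selector='...', reason='...'` layout shown in the sample above. `parse_matches` is a hypothetical helper, and the real log format may differ in detail.

```python
import re

# Matches lines such as:  🎯 Match 0: selector='button.btn-primary', reason='text_content=sign in'
MATCH_LINE = re.compile(r"Match (\d+): selector='(.*)', reason='(.*)'")


def parse_matches(log_text):
    """Extract (index, selector, reason) tuples from 'Match N' log lines."""
    return [
        (int(m.group(1)), m.group(2), m.group(3))
        for m in MATCH_LINE.finditer(log_text)
    ]
```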

Debug Tools Reference

SelectorDebugger Methods

  • debug_voice_command(command): Debug a voice command end-to-end
  • test_common_selectors(selector_list): Test multiple selectors
  • get_debug_summary(): Get summary of all debug sessions
  • export_debug_log(filename): Export debug history to file

BrowserStateMonitor Methods

  • capture_state(): Capture current browser state
  • detect_issues(state): Analyze state for potential issues

MCPChromeClient Enhanced Methods

  • validate_browser_connection(): Check browser connectivity
  • _smart_click_mcp(): Enhanced click with detailed logging
  • execute_voice_command(): Enhanced voice command processing

Best Practices

  1. Always validate connection first when troubleshooting
  2. Use debug_voice_command() for step-by-step analysis
  3. Check browser state if actions aren't working
  4. Test selectors individually to find working patterns
  5. Export debug logs for detailed analysis
  6. Monitor logs in real-time during testing

Log Files

The system creates several log files for analysis:

  • enhanced_logging_test.log: Main test output
  • login_button_test.log: Specific login button tests
  • browser_debug.log: Browser diagnostics
  • debug_log_YYYYMMDD_HHMMSS.json: Exported debug sessions
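
The timestamped name in the last entry can be reproduced with `datetime.strftime`, for example to locate or generate export files from your own scripts. `debug_log_filename` is a hypothetical helper, and the sketch assumes the timestamp is local time, which this guide does not state.

```python
from datetime import datetime


def debug_log_filename(now=None):
    """Build a name of the form debug_log_YYYYMMDD_HHMMSS.json."""
    now = now or datetime.now()
    return now.strftime("debug_log_%Y%m%d_%H%M%S.json")
```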

Troubleshooting Workflow

  1. Validate Connection

    validation = await client.validate_browser_connection()
    
  2. Debug Command

    debug_result = await debugger.debug_voice_command("click login button")
    
  3. Capture State

    state = await monitor.capture_state()
    issues = monitor.detect_issues(state)
    
  4. Test Selectors

    results = await debugger.test_common_selectors(["button.login", "#login-btn"])
    
  5. Analyze and Fix

    • Review debug output
    • Identify failure points
    • Apply appropriate solutions

Getting Help

If issues persist after following this guide:

  1. Export debug logs using export_debug_log()
  2. Check browser console for JavaScript errors
  3. Verify MCP server configuration
  4. Test with simple selectors first
  5. Review the enhanced logging output for clues

The enhanced debugging system exposes each stage of the browser automation process, from selector discovery to action execution, so that a failure can be traced to a specific step.