Complete Guide: Building an AI Knowledge Companion with Browser-Use, MCP, and Ollama for Advanced Web Automation and Information Processing

Beyond Research: Building a Modern AI Knowledge Companion
A Comprehensive Guide to Browser-Use, MCP, and AI-Powered Information Processing
1. Introduction to AI-Powered Knowledge Systems
In today's information landscape, the ability to efficiently gather, process, and synthesize knowledge has become essential. This guide transforms the concept of a basic research assistant into a comprehensive AI Knowledge Companion system—a versatile tool that not only conducts research but acts as your digital extension in navigating the vast information ecosystem.
What is Browser-Use? Browser-Use is a programmable interface that lets AI systems interact with web browsers just as humans do: visiting websites, clicking links, filling forms, and extracting information. Unlike simple web scraping, Browser-Use provides true browser automation that can handle modern, JavaScript-heavy websites and the complex, multi-step user interactions that defeat traditional scrapers.
What is MCP (Model Context Protocol)? The Model Context Protocol is a standardized framework that facilitates secure communication between AI models and external tools or data sources. MCP defines how information is exchanged, permissions are granted, and results are returned, creating a universal "language" for AI systems to safely and effectively interface with the digital world.
2. Understanding the Core Technologies
Browser-Use: AI's Window to the Web
Browser-Use fundamentally transforms how AI interacts with the internet by:
- Providing visual context: Unlike API-based approaches, Browser-Use allows the AI to "see" what a human would see
- Enabling stateful navigation: Maintaining session information across multiple pages
- Handling dynamic content: Processing JavaScript-rendered pages that traditional scrapers cannot access
- Supporting authentication: Logging into services when needed
Implementation principle: Browser-Use creates a controlled browser instance that executes commands from your AI system through a dedicated interface, while feeding back visual and structural information about the pages it visits.
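As a minimal sketch of that command/feedback loop, using the BrowserSession interface this guide adopts in section 4 (the method names are this guide's convention and may differ from the library's actual API):

```python
# A minimal sketch of the command/feedback loop: the AI issues a command,
# then receives structural information about the page back.
from browser_use import BrowserSession

browser = BrowserSession(headless=True)
try:
    # The AI system issues a navigation command...
    browser.navigate("https://example.com")

    # ...and receives the page state back, which it can use
    # to plan its next action.
    observation = {
        "title": browser.get_page_title(),
        "content": browser.get_page_content(),
    }
    print(observation["title"])
finally:
    browser.close()
```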
MCP: The Universal AI Connector
MCP serves as a standardized protocol for AI-to-tool communication, addressing several key challenges:
- Security: Defining clear permission boundaries and data access controls
- Interoperability: Creating a common language for diverse tools to connect to AI systems
- Context management: Efficiently transferring relevant information between systems
- Versioning and compatibility: Ensuring tools and AI models can evolve independently
Key concept: MCP treats external tools as "contexts" that an AI model can access, defining both how the AI can request information and how the external systems should respond.
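To make the idea concrete, here is an illustrative request/response exchange written as Python dictionaries. The field names follow this guide's "context" vocabulary and are simplified for exposition; they are not taken verbatim from the MCP specification (which is JSON-RPC based):

```python
# Illustrative only: a "context" request and its response, in this
# guide's simplified vocabulary rather than the official wire format.
request = {
    "context": "arxiv",
    "parameters": {"query": "transformer architectures", "max_results": 3},
    "permissions": ["read"],
}

response = {
    "status": "ok",
    "results": [
        {"title": "Attention Is All You Need", "id": "1706.03762"},
    ],
}
```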
3. Project Architecture: Building Your Knowledge Companion
System Overview
Our Knowledge Companion consists of five core components:
- User Interface: Accepts queries and displays results
- Orchestration Engine: Coordinates all system components
- LLM Core: Processes language, plans actions, and generates reports
- Browser-Use Module: Handles web navigation and extraction
- MCP Integration Layer: Connects to external knowledge sources
Component Interaction Flow
1. The user submits a query through the interface
2. The orchestration engine passes the query to the LLM core
3. The LLM plans a research strategy and generates actions
4. Actions are executed through Browser-Use or MCP connections
5. Retrieved information returns to the LLM for synthesis
6. The final report is presented to the user
Design philosophy: This modular architecture allows each component to evolve independently while maintaining clear communication channels between them.
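One way to enforce this independence is to pin down each component's interface explicitly. The sketch below uses typing.Protocol, with method names chosen to match the implementations developed in sections 5-7; the exact signatures are this guide's design choice, not a requirement of Browser-Use or MCP:

```python
from typing import Protocol


class LLMCore(Protocol):
    """What the orchestrator needs from the LLM runtime."""
    async def complete(self, prompt: str) -> str: ...


class BrowserModule(Protocol):
    """What the orchestrator needs from the web-navigation layer."""
    def search(self, query: str) -> list: ...
    def visit_page(self, url: str) -> dict: ...


class MCPLayer(Protocol):
    """What the orchestrator needs from the MCP integration layer."""
    def query_context(self, context_name: str, parameters: dict) -> dict: ...
```

Any implementation satisfying these protocols can be swapped in without touching the orchestration engine.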
4. Setting Up Your Development Environment
Hardware and Software Requirements
For optimal performance, we recommend:
- CPU: 4+ cores (8+ preferred)
- RAM: 16GB minimum (32GB recommended)
- Storage: 20GB free space (SSD preferred)
- GPU: Optional but beneficial for larger models
- Operating System: Linux, macOS, or Windows 10/11
Installation Process
1. Python Environment Setup:

```bash
# Create a virtual environment
python -m venv ai-companion
source ai-companion/bin/activate  # On Windows: ai-companion\Scripts\activate

# Install core dependencies
pip install browser-use ollama mcp-client pydantic fastapi uvicorn
```
2. Ollama Configuration:

```bash
# Download Ollama from https://ollama.com
# Then pull the Llama 3.2 model
ollama pull llama3.2

# Test the model
ollama run llama3.2 "Hello, world!"
```
3. Browser-Use Setup:

```python
# Test browser-use functionality
from browser_use import BrowserSession

browser = BrowserSession()
browser.navigate("https://www.example.com")
content = browser.get_page_content()
print(content)
browser.close()
```
4. MCP Configuration:

```python
# Configure MCP client
from mcp_client import MCPClient

mcp = MCPClient(
    server_url="https://your-mcp-server.com",
    api_key="your_api_key",
    default_timeout=30
)

# Test connection
status = mcp.check_connection()
print(f"MCP Connection: {status}")
```
Important concept: The separation between the LLM runtime (Ollama) and your application code creates a clean architecture that can adapt to different models and execution environments.
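The later examples pass around an llm_client exposing a single complete() coroutine. As a reference point, here is a minimal sketch of such a client talking to a local Ollama server; the LlamaClient name and complete() signature are this guide's convention, while /api/generate is Ollama's documented generate endpoint:

```python
import aiohttp


class LlamaClient:
    """Minimal async client for a local Ollama server."""

    def __init__(self, model="llama3.2", host="http://localhost:11434"):
        self.model = model
        self.host = host

    async def complete(self, prompt: str) -> str:
        # stream=False asks Ollama to return the whole completion at once
        payload = {"model": self.model, "prompt": prompt, "stream": False}
        async with aiohttp.ClientSession() as session:
            async with session.post(f"{self.host}/api/generate", json=payload) as resp:
                resp.raise_for_status()
                data = await resp.json()
                # Ollama returns the full completion in the "response" field
                return data["response"]
```

Because the client only speaks HTTP, pointing it at a different model or a remote Ollama host is a one-argument change.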
5. Implementing Browser-Use Intelligence
Understanding Browser Automation Principles
When implementing Browser-Use, it's essential to understand that we're creating an AI system that can:
- Form intentions: Decide what information to seek
- Execute navigation: Move through websites purposefully
- Extract information: Identify and collect relevant data
- Process results: Transform raw web content into structured knowledge
Creating a Robust Browser-Use Module
```python
from browser_use import BrowserSession


class IntelligentBrowser:
    def __init__(self, headless=True):
        """Initialize browser session with configurable visibility."""
        self.browser = BrowserSession(headless=headless)
        self.history = []

    def search(self, query, search_engine="google"):
        """Perform a search using specified engine."""
        if search_engine == "google":
            self.browser.navigate("https://www.google.com")
            search_box = self.browser.find_element('input[name="q"]')
            self.browser.input_text(search_box, query)
            self.browser.press_enter()
            self.history.append({"action": "search", "query": query})
            return self.get_search_results()

    def get_search_results(self):
        """Extract search results from the current page."""
        results = []
        elements = self.browser.find_elements("div.g")
        for element in elements:
            title_elem = self.browser.find_element_within(element, "h3")
            link_elem = self.browser.find_element_within(element, "a")
            snippet_elem = self.browser.find_element_within(element, "div.VwiC3b")
            if title_elem and link_elem and snippet_elem:
                title = self.browser.get_text(title_elem)
                link = self.browser.get_attribute(link_elem, "href")
                snippet = self.browser.get_text(snippet_elem)
                results.append({
                    "title": title,
                    "url": link,
                    "snippet": snippet
                })
        return results

    def visit_page(self, url):
        """Navigate to a specific URL and extract content."""
        self.browser.navigate(url)
        self.history.append({"action": "visit", "url": url})

        # Wait for page to load completely
        self.browser.wait_for_page_load()

        # Extract main content, avoiding navigation elements
        content = self.extract_main_content()
        return {
            "url": url,
            "title": self.browser.get_page_title(),
            "content": content
        }

    def extract_main_content(self):
        """Intelligently extract the main content from the current page."""
        # Try common content selectors
        content_selectors = [
            "article", "main", ".content", "#content",
            "[role='main']", ".post-content"
        ]
        for selector in content_selectors:
            element = self.browser.find_element(selector)
            if element:
                return self.browser.get_text(element)

        # Fallback: use heuristics to find the largest text block
        paragraphs = self.browser.find_elements("p")
        if paragraphs:
            paragraph_texts = [self.browser.get_text(p) for p in paragraphs]
            # Filter out very short paragraphs
            substantial_paragraphs = [p for p in paragraph_texts if len(p) > 100]
            if substantial_paragraphs:
                return "\n\n".join(substantial_paragraphs)

        # Last resort: get body text
        return self.browser.get_body_text()

    def close(self):
        """Close the browser session."""
        self.browser.close()
```
Key insight: The above implementation demonstrates how Browser-Use goes beyond simple scraping by making contextual decisions about what content is relevant, handling different site structures, and maintaining state across navigation.
6. Implementing the MCP Integration Layer
Understanding the Model Context Protocol
MCP enables standardized communication between your AI system and external tools through a structured protocol. Instead of custom code for each integration, MCP provides a unified framework for:
- Tool registration: Defining what tools are available
- Request formatting: Structuring how the AI requests information
- Response handling: Processing and validating tool outputs
- Error management: Handling failures in a consistent way
Building an MCP Client
```python
class KnowledgeSourceManager:
    def __init__(self, mcp_client):
        """Initialize with an MCP client."""
        self.mcp = mcp_client
        self.available_sources = self._discover_sources()

    def _discover_sources(self):
        """Query the MCP server for available knowledge sources."""
        try:
            sources = self.mcp.list_contexts()
            return {
                source["name"]: {
                    "description": source["description"],
                    "capabilities": source["capabilities"],
                    "parameters": source["parameters"]
                }
                for source in sources
            }
        except Exception as e:
            print(f"Error discovering sources: {e}")
            return {}

    def query_source(self, source_name, query_params):
        """Query a specific knowledge source through MCP."""
        if source_name not in self.available_sources:
            raise ValueError(f"Unknown source: {source_name}")
        try:
            response = self.mcp.query_context(
                context_name=source_name,
                parameters=query_params
            )
            return response
        except Exception as e:
            print(f"Error querying {source_name}: {e}")
            return {"error": str(e)}

    def search_arxiv(self, query, max_results=5, categories=None):
        """Specialized method for arXiv searches."""
        params = {
            "query": query,
            "max_results": max_results
        }
        if categories:
            params["categories"] = categories
        return self.query_source("arxiv", params)

    def search_wikipedia(self, query, depth=1):
        """Specialized method for Wikipedia searches."""
        params = {
            "query": query,
            "depth": depth  # How many links to follow
        }
        return self.query_source("wikipedia", params)

    def get_source_capabilities(self, source_name):
        """Get detailed information about a knowledge source."""
        if source_name in self.available_sources:
            return self.available_sources[source_name]
        return None
```
MCP concept in practice: This implementation shows how MCP creates a uniform interface to diverse knowledge sources. The AI doesn't need to know the specifics of the arXiv API or Wikipedia's structure—it just makes standardized requests through the MCP protocol.
7. The Orchestration Engine: Coordinating Your AI System
Understanding Orchestration
The orchestration engine is the "brain" of your Knowledge Companion, responsible for:
- Query analysis: Understanding what the user is asking
- Planning: Determining which tools to use and in what sequence
- Execution: Calling the appropriate components
- Integration: Combining information from multiple sources
- Presentation: Formatting the final output for the user
Implementing the Orchestrator
```python
import json


class KnowledgeOrchestrator:
    def __init__(self, llm_client, browser, knowledge_manager):
        """Initialize with core components."""
        self.llm = llm_client
        self.browser = browser
        self.knowledge_manager = knowledge_manager

    async def process_query(self, user_query):
        """Process a user query from start to finish."""
        # Step 1: Analyze the query to determine approach
        analysis = await self._analyze_query(user_query)

        # Step 2: Execute the research plan
        research_results = await self._execute_research_plan(analysis, user_query)

        # Step 3: Synthesize the findings into a coherent response
        final_report = await self._synthesize_report(user_query, research_results)

        return final_report

    async def _analyze_query(self, query):
        """Use the LLM to analyze the query and create a research plan."""
        prompt = f"""
        Analyze the following research query and determine the best approach:

        QUERY: {query}

        Please determine:
        1. What type of information is being requested?
        2. Which knowledge sources would be most relevant (web search, arXiv, Wikipedia, etc.)?
        3. What specific search terms should be used for each source?
        4. What is the priority order for consulting these sources?
        5. Are there any specialized domains or technical knowledge required?

        Return your analysis as a structured JSON object.
        """
        response = await self.llm.complete(prompt)
        return json.loads(response)

    async def _execute_research_plan(self, analysis, original_query):
        """Execute the research plan across multiple sources."""
        results = []

        # Execute based on the priority order determined in the analysis
        for source in analysis["priority_order"]:
            if source == "web_search":
                search_terms = analysis["search_terms"]["web_search"]
                web_results = await self._perform_web_research(search_terms)
                results.append({
                    "source": "web_search",
                    "data": web_results
                })
            elif source == "arxiv":
                if "arxiv" in analysis["search_terms"]:
                    arxiv_query = analysis["search_terms"]["arxiv"]
                    categories = analysis.get("arxiv_categories", None)
                    arxiv_results = self.knowledge_manager.search_arxiv(
                        arxiv_query, categories=categories
                    )
                    results.append({
                        "source": "arxiv",
                        "data": arxiv_results
                    })
            elif source == "wikipedia":
                if "wikipedia" in analysis["search_terms"]:
                    wiki_query = analysis["search_terms"]["wikipedia"]
                    wiki_results = self.knowledge_manager.search_wikipedia(wiki_query)
                    results.append({
                        "source": "wikipedia",
                        "data": wiki_results
                    })

        # If needed, perform follow-up research based on initial findings
        if analysis.get("requires_followup", False):
            followup_results = await self._perform_followup_research(results, original_query)
            results.extend(followup_results)

        return results

    async def _perform_web_research(self, search_terms):
        """Conduct web research using Browser-Use."""
        web_results = []
        for term in search_terms:
            # Search and get results
            search_results = self.browser.search(term)

            # Visit the top 3 results and extract content
            for result in search_results[:3]:
                page_data = self.browser.visit_page(result["url"])
                web_results.append({
                    "search_term": term,
                    "page_data": page_data
                })
        return web_results

    async def _perform_followup_research(self, initial_results, original_query):
        """Conduct follow-up research based on initial findings."""
        # Generate follow-up questions using the LLM
        prompt = f"""
        Based on these initial research results and the original query:

        ORIGINAL QUERY: {original_query}

        INITIAL FINDINGS: {json.dumps(initial_results, indent=2)}

        Generate 3 follow-up questions that would help complete the research.
        Format as a JSON list of questions.
        """
        followup_response = await self.llm.complete(prompt)
        followup_questions = json.loads(followup_response)

        followup_results = []
        for question in followup_questions:
            # Recursively analyze and research each follow-up question
            followup_analysis = await self._analyze_query(question)
            question_results = await self._execute_research_plan(followup_analysis, question)
            followup_results.append({
                "followup_question": question,
                "results": question_results
            })
        return followup_results

    async def _synthesize_report(self, original_query, research_results):
        """Synthesize research results into a comprehensive report."""
        prompt = f"""
        Create a comprehensive research report based on the following information:

        ORIGINAL QUERY: {original_query}

        RESEARCH FINDINGS: {json.dumps(research_results, indent=2)}

        Your report should:
        1. Start with an executive summary
        2. Organize information logically by topic and source
        3. Highlight key findings and insights
        4. Note any contradictions or gaps in the research
        5. Include relevant citations to original sources
        6. End with conclusions and potential next steps

        Format the report in Markdown for readability.
        """
        report = await self.llm.complete(prompt)
        return report
```
Orchestration insight: The orchestrator demonstrates how to implement a multi-step research process that intelligently combines Browser-Use and MCP. Note how the system uses the LLM at multiple stages—for planning, for generating follow-up questions, and for synthesizing the final report.
8. Creating an Effective User Interface
Console-based Interface
For a simple but effective console interface:
```python
import asyncio

from rich.console import Console
from rich.markdown import Markdown

console = Console()


class KnowledgeCompanionCLI:
    def __init__(self, orchestrator):
        self.orchestrator = orchestrator

    async def start(self):
        console.print("[bold blue]AI Knowledge Companion[/bold blue]")
        console.print("Ask me anything, and I'll research it for you.")
        console.print("Type 'exit' to quit.\n")

        while True:
            query = console.input("[bold green]Query:[/bold green] ")
            if query.lower() in ('exit', 'quit'):
                break

            console.print("\n[italic]Researching your query...[/italic]")
            try:
                with console.status("[bold green]Thinking..."):
                    report = await self.orchestrator.process_query(query)
                console.print("\n[bold]Research Results:[/bold]\n")
                console.print(Markdown(report))
            except Exception as e:
                console.print(f"[bold red]Error:[/bold red] {str(e)}")

        console.print("[bold blue]Thank you for using AI Knowledge Companion![/bold blue]")


# Usage
async def main():
    # Setup components (simplified)
    llm_client = LlamaClient()
    browser = IntelligentBrowser()
    mcp_client = MCPClient(server_url="https://mcp.example.com")
    knowledge_manager = KnowledgeSourceManager(mcp_client)
    orchestrator = KnowledgeOrchestrator(llm_client, browser, knowledge_manager)

    # Start the CLI
    cli = KnowledgeCompanionCLI(orchestrator)
    await cli.start()

    # Cleanup
    browser.close()


if __name__ == "__main__":
    asyncio.run(main())
```
Web-based Interface (FastAPI)
For a more versatile web interface:
```python
import time
import uuid
from typing import List, Optional

from fastapi import BackgroundTasks, FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="AI Knowledge Companion API")


# Data models
class Query(BaseModel):
    text: str
    max_sources: Optional[int] = 5
    preferred_sources: Optional[List[str]] = None


class ResearchStatus(BaseModel):
    query_id: str
    status: str
    progress: float
    message: Optional[str] = None


class ResearchReport(BaseModel):
    query_id: str
    query_text: str
    report: str
    sources: List[dict]
    execution_time: float


# In-memory storage for demo purposes
research_tasks = {}


@app.post("/research/start", response_model=ResearchStatus)
async def start_research(query: Query, background_tasks: BackgroundTasks):
    """Start a new research task."""
    query_id = str(uuid.uuid4())

    # Store initial status
    research_tasks[query_id] = {
        "status": "starting",
        "progress": 0.0,
        "query": query.text,
        "report": None
    }

    # Launch research in background
    background_tasks.add_task(
        perform_research,
        query_id,
        query.text,
        query.max_sources,
        query.preferred_sources
    )

    return ResearchStatus(
        query_id=query_id,
        status="starting",
        progress=0.0,
        message="Research task initiated"
    )


@app.get("/research/{query_id}/status", response_model=ResearchStatus)
async def get_research_status(query_id: str):
    """Get the status of a research task."""
    if query_id not in research_tasks:
        raise HTTPException(status_code=404, detail="Research task not found")

    task = research_tasks[query_id]
    return ResearchStatus(
        query_id=query_id,
        status=task["status"],
        progress=task["progress"],
        message=task.get("message")
    )


@app.get("/research/{query_id}/report", response_model=ResearchReport)
async def get_research_report(query_id: str):
    """Get the final report of a completed research task."""
    if query_id not in research_tasks:
        raise HTTPException(status_code=404, detail="Research task not found")

    task = research_tasks[query_id]
    if task["status"] != "completed":
        raise HTTPException(status_code=400, detail="Research not yet completed")

    return ResearchReport(
        query_id=query_id,
        query_text=task["query"],
        report=task["report"],
        sources=task["sources"],
        execution_time=task["execution_time"]
    )


async def perform_research(query_id, query_text, max_sources, preferred_sources):
    """Background task to perform the actual research."""
    try:
        # Update status
        research_tasks[query_id]["status"] = "researching"
        research_tasks[query_id]["progress"] = 0.1

        # Create components (simplified)
        llm_client = LlamaClient()
        browser = IntelligentBrowser()
        mcp_client = MCPClient(server_url="https://mcp.example.com")
        knowledge_manager = KnowledgeSourceManager(mcp_client)
        orchestrator = KnowledgeOrchestrator(llm_client, browser, knowledge_manager)

        # Update progress periodically
        research_tasks[query_id]["progress"] = 0.3

        # Perform the research
        start_time = time.time()
        report = await orchestrator.process_query(query_text)
        end_time = time.time()

        # Store the results (assumes the orchestrator tracks the sources
        # it consulted via a get_used_sources() helper)
        research_tasks[query_id].update({
            "status": "completed",
            "progress": 1.0,
            "report": report,
            "sources": orchestrator.get_used_sources(),
            "execution_time": end_time - start_time
        })

        # Cleanup
        browser.close()
    except Exception as e:
        research_tasks[query_id].update({
            "status": "error",
            "message": str(e)
        })


# Run with: uvicorn app:app --reload
```
UI design principle: Both interfaces demonstrate the importance of providing feedback during long-running research operations. The web interface adds asynchronous operation, allowing users to start research and check back later for results.
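To see the asynchronous flow end to end, a client can start a task and poll for completion. A minimal sketch using the requests library (installed separately), assuming the API above is running on localhost:8000:

```python
import time

import requests

BASE = "http://localhost:8000"

# Start a research task and capture its id
resp = requests.post(f"{BASE}/research/start", json={"text": "What is MCP?"})
query_id = resp.json()["query_id"]

# Poll until the task finishes or fails
while True:
    status = requests.get(f"{BASE}/research/{query_id}/status").json()
    if status["status"] in ("completed", "error"):
        break
    print(f"Progress: {status['progress']:.0%}")
    time.sleep(2)

# Fetch the finished report
if status["status"] == "completed":
    report = requests.get(f"{BASE}/research/{query_id}/report").json()
    print(report["report"])
```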
9. Advanced Features and Optimizations
Caching for Performance
Implement a caching layer to store frequently accessed information:
```python
import hashlib
import json
from datetime import timedelta

import aioredis


class KnowledgeCache:
    def __init__(self, redis_url="redis://localhost"):
        """Initialize the caching system."""
        self.redis = None
        self.redis_url = redis_url

    async def connect(self):
        """Connect to Redis."""
        self.redis = await aioredis.create_redis_pool(self.redis_url)

    async def close(self):
        """Close Redis connection."""
        if self.redis:
            self.redis.close()
            await self.redis.wait_closed()

    async def get_cached_result(self, query, source):
        """Try to get a cached result for a query from a specific source."""
        if not self.redis:
            return None
        cache_key = self._make_cache_key(query, source)
        cached_data = await self.redis.get(cache_key)
        if cached_data:
            return json.loads(cached_data)
        return None

    async def cache_result(self, query, source, result, ttl=timedelta(hours=24)):
        """Cache a result with an expiration time."""
        if not self.redis:
            return
        cache_key = self._make_cache_key(query, source)
        await self.redis.set(
            cache_key,
            json.dumps(result),
            expire=int(ttl.total_seconds())
        )

    def _make_cache_key(self, query, source):
        """Create a deterministic cache key."""
        data = f"{query}:{source}"
        return f"knowledge_cache:{hashlib.md5(data.encode()).hexdigest()}"
```
Parallel Processing
Optimize research by executing multiple sources in parallel:
```python
async def _execute_research_plan(self, analysis, original_query):
    """Execute the research plan with parallel processing.

    Note: assumes async wrappers (_perform_arxiv_research,
    _perform_wikipedia_research) around the knowledge manager calls.
    """
    tasks = []

    # Create tasks for each source in the research plan
    for source in analysis["priority_order"]:
        if source == "web_search" and "web_search" in analysis["search_terms"]:
            search_terms = analysis["search_terms"]["web_search"]
            task = asyncio.create_task(
                self._perform_web_research(search_terms)
            )
            tasks.append(("web_search", task))
        elif source == "arxiv" and "arxiv" in analysis["search_terms"]:
            arxiv_query = analysis["search_terms"]["arxiv"]
            categories = analysis.get("arxiv_categories", None)
            task = asyncio.create_task(
                self._perform_arxiv_research(arxiv_query, categories)
            )
            tasks.append(("arxiv", task))
        elif source == "wikipedia" and "wikipedia" in analysis["search_terms"]:
            wiki_query = analysis["search_terms"]["wikipedia"]
            task = asyncio.create_task(
                self._perform_wikipedia_research(wiki_query)
            )
            tasks.append(("wikipedia", task))

    # Wait for all tasks to complete
    results = []
    for source_name, task in tasks:
        try:
            data = await task
            results.append({
                "source": source_name,
                "data": data
            })
        except Exception as e:
            print(f"Error researching {source_name}: {e}")
            results.append({
                "source": source_name,
                "error": str(e)
            })

    # If needed, perform follow-up research based on initial findings
    if analysis.get("requires_followup", False):
        followup_results = await self._perform_followup_research(results, original_query)
        results.extend(followup_results)

    return results
```
Smart Throttling
Prevent overloading external services:
```python
import asyncio
import time
import urllib.parse


class RateLimiter:
    def __init__(self):
        """Initialize rate limiters for different domains."""
        self.limiters = {}

    def register_domain(self, domain, requests_per_minute):
        """Register rate limits for a domain."""
        self.limiters[domain] = {
            "rate": requests_per_minute,
            "tokens": requests_per_minute,
            "last_update": time.time(),
            "lock": asyncio.Lock()
        }

    async def acquire(self, url):
        """Acquire permission to make a request to a URL."""
        domain = self._extract_domain(url)
        if domain not in self.limiters:
            # Default conservative limit
            self.register_domain(domain, 10)

        limiter = self.limiters[domain]
        async with limiter["lock"]:
            # Refill tokens based on time elapsed
            now = time.time()
            time_passed = now - limiter["last_update"]
            new_tokens = time_passed * (limiter["rate"] / 60.0)
            limiter["tokens"] = min(limiter["rate"], limiter["tokens"] + new_tokens)
            limiter["last_update"] = now

            if limiter["tokens"] < 1:
                # Calculate wait time until a token is available
                wait_time = (1 - limiter["tokens"]) * (60.0 / limiter["rate"])
                await asyncio.sleep(wait_time)
                limiter["tokens"] = 1
                limiter["last_update"] = time.time()

            # Consume a token
            limiter["tokens"] -= 1

    def _extract_domain(self, url):
        """Extract the domain from a URL."""
        parsed = urllib.parse.urlparse(url)
        return parsed.netloc
```
Advanced technique: These optimizations show how to balance speed and resource usage. Caching prevents redundant research, parallel processing maximizes throughput, and rate limiting ensures respectful use of external services.
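These pieces compose naturally. The sketch below layers the KnowledgeCache and RateLimiter defined above around an arbitrary lookup; fetch_fn is a hypothetical async callable standing in for any real source query:

```python
# A sketch of how the optimizations compose: check the cache, throttle,
# then fetch and cache. fetch_fn is a hypothetical async callable.
async def cached_throttled_query(cache, rate_limiter, url, query, source, fetch_fn):
    # 1. Serve from cache when possible (avoids redundant research)
    cached = await cache.get_cached_result(query, source)
    if cached is not None:
        return cached

    # 2. Respect per-domain rate limits before hitting the network
    await rate_limiter.acquire(url)

    # 3. Fetch, then cache for subsequent queries
    result = await fetch_fn(query)
    await cache.cache_result(query, source, result)
    return result
```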
10. Security and Ethical Considerations
Securing Your Knowledge Companion
Implement these security measures to protect your system and users:
- Input validation: Sanitize all user inputs to prevent injection attacks
- Rate limiting: Prevent abuse by limiting requests per user
- Authentication: Require user authentication for sensitive operations
- Secure storage: Encrypt sensitive data and API keys
- Audit logging: Track all system actions for review
Example implementation of authentication middleware:
```python
# Assumes the app, Query, ResearchStatus, and BackgroundTasks definitions
# from the FastAPI example in section 8.
import jwt
from fastapi import Depends, HTTPException, status
from fastapi.security import OAuth2PasswordBearer

# Setup
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token")
SECRET_KEY = "your-secret-key"  # Store securely in environment variables
ALGORITHM = "HS256"


# User authentication
async def get_current_user(token: str = Depends(oauth2_scheme)):
    credentials_exception = HTTPException(
        status_code=status.HTTP_401_UNAUTHORIZED,
        detail="Invalid authentication credentials",
        headers={"WWW-Authenticate": "Bearer"},
    )
    try:
        payload = jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
        username: str = payload.get("sub")
        if username is None:
            raise credentials_exception
    except jwt.PyJWTError:
        raise credentials_exception

    # Get user from database (simplified)
    user = get_user(username)
    if user is None:
        raise credentials_exception
    return user


# Example protected endpoint
@app.post("/research/start", response_model=ResearchStatus)
async def start_research(
    query: Query,
    background_tasks: BackgroundTasks,
    current_user = Depends(get_current_user)
):
    # Check user permissions
    if not user_can_research(current_user):
        raise HTTPException(
            status_code=status.HTTP_403_FORBIDDEN,
            detail="Not enough permissions"
        )
    # Continue with research...
```
Ethical Web Scraping
Follow these guidelines for responsible web navigation:
- Respect robots.txt: Check for permission before crawling
- Identify your bot: Set proper User-Agent strings
- Rate limiting: Don't overload websites with requests
- Cache results: Minimize duplicate requests
- Honor copyright: Respect terms of service and licensing
Implementation example:
```python
import time
import urllib.parse
from urllib import robotparser

import aiohttp


class EthicalBrowser(IntelligentBrowser):
    def __init__(self, headless=True, user_agent=None):
        """Initialize with ethical browsing capabilities."""
        super().__init__(headless)

        # Set an honest user agent
        if user_agent is None:
            user_agent = "KnowledgeCompanionBot/1.0 (+https://yourwebsite.com/bot.html)"
        self.browser.set_user_agent(user_agent)

        # Initialize robots.txt cache
        self.robots_cache = {}
        self.rate_limiter = RateLimiter()

    async def visit_page(self, url):
        """Ethically visit a page with proper checks."""
        # Check robots.txt first
        if not await self._can_access(url):
            return {
                "url": url,
                "error": "Access disallowed by robots.txt",
                "content": None
            }

        # Apply rate limiting
        await self.rate_limiter.acquire(url)

        # Now perform the visit (the parent implementation is synchronous)
        return super().visit_page(url)

    async def _can_access(self, url):
        """Check if a URL can be accessed according to robots.txt."""
        domain = self._extract_domain(url)

        # Check cache first
        if domain in self.robots_cache:
            parser = self.robots_cache[domain]["parser"]
            last_checked = self.robots_cache[domain]["time"]
            # Refresh cache if older than 1 day
            if time.time() - last_checked > 86400:
                parser = await self._fetch_robots_txt(domain)
        else:
            # Fetch and parse robots.txt
            parser = await self._fetch_robots_txt(domain)

        # Check if our user agent can access the URL
        user_agent = self.browser.get_user_agent()
        path = urllib.parse.urlparse(url).path
        return parser.can_fetch(user_agent, path)

    async def _fetch_robots_txt(self, domain):
        """Fetch and parse robots.txt for a domain."""
        robots_url = f"https://{domain}/robots.txt"

        # Use a simple GET request, not the browser
        async with aiohttp.ClientSession() as session:
            try:
                async with session.get(robots_url) as response:
                    if response.status == 200:
                        content = await response.text()
                        parser = robotparser.RobotFileParser()
                        parser.parse(content.splitlines())
                    else:
                        # No robots.txt or can't access it - create a permissive parser
                        parser = robotparser.RobotFileParser()
                        parser.allow_all = True
            except Exception:
                # Error accessing robots.txt - create a permissive parser
                parser = robotparser.RobotFileParser()
                parser.allow_all = True

        # Cache the result
        self.robots_cache[domain] = {
            "parser": parser,
            "time": time.time()
        }
        return parser

    def _extract_domain(self, url):
        """Extract domain from URL."""
        parsed = urllib.parse.urlparse(url)
        return parsed.netloc
```
Ethical principle: This implementation demonstrates the technical aspects of ethical web scraping by checking robots.txt files, using honest user agents, and implementing rate limiting to be a good citizen of the web.
11. Testing and Quality Assurance
Unit Testing Core Components
Example of testing the Browser-Use module:
```python
import unittest
from unittest.mock import MagicMock, patch


class TestIntelligentBrowser(unittest.TestCase):
    @patch('browser_use.BrowserSession')
    def setUp(self, MockBrowserSession):
        self.mock_browser = MockBrowserSession.return_value
        self.intelligent_browser = IntelligentBrowser(headless=True)

    def test_search(self):
        # Set up mocks
        self.mock_browser.find_element.return_value = "search_box"
        self.intelligent_browser.get_search_results = MagicMock(return_value=[
            {"title": "Test Result", "url": "https://example.com", "snippet": "Example snippet"}
        ])

        # Execute search
        results = self.intelligent_browser.search("test query")

        # Verify
        self.mock_browser.navigate.assert_called_with("https://www.google.com")
        self.mock_browser.input_text.assert_called_with("search_box", "test query")
        self.mock_browser.press_enter.assert_called_once()
        self.assertEqual(len(results), 1)
        self.assertEqual(results[0]["title"], "Test Result")

    def test_extract_main_content(self):
        # Set up mocks for different scenarios
        self.mock_browser.find_element.side_effect = [
            None,          # No article
            None,          # No main
            "content_div"  # Found .content
        ]
        self.mock_browser.get_text.return_value = "Extracted content"

        # Execute
        content = self.intelligent_browser.extract_main_content()

        # Verify
        self.assertEqual(content, "Extracted content")
        self.assertEqual(self.mock_browser.find_element.call_count, 3)


class TestKnowledgeManager(unittest.TestCase):
    def setUp(self):
        self.mock_mcp = MagicMock()
        self.knowledge_manager = KnowledgeSourceManager(self.mock_mcp)

    def test_query_source(self):
        # Set up
        self.knowledge_manager.available_sources = {
            "test_source": {"description": "Test", "capabilities": [], "parameters": {}}
        }
        self.mock_mcp.query_context.return_value = {"result": "test_data"}

        # Execute
        result = self.knowledge_manager.query_source("test_source", {"param": "value"})

        # Verify
        self.mock_mcp.query_context.assert_called_with(
            context_name="test_source",
            parameters={"param": "value"}
        )
        self.assertEqual(result, {"result": "test_data"})

    def test_query_unknown_source(self):
        # Verify exception for unknown source
        with self.assertRaises(ValueError):
            self.knowledge_manager.query_source("unknown_source", {})
```
Integration Testing
Test how components work together:
```python
import asyncio
import json
import unittest
from unittest.mock import AsyncMock, MagicMock


class TestOrchestration(unittest.IsolatedAsyncioTestCase):
    async def asyncSetUp(self):
        # Create mocks for all components. The orchestrator awaits
        # llm.complete(), so it needs an AsyncMock rather than a MagicMock.
        self.mock_llm = MagicMock()
        self.mock_llm.complete = AsyncMock()
        self.mock_browser = MagicMock()
        self.mock_knowledge_manager = MagicMock()

        # Setup orchestrator with mocks
        self.orchestrator = KnowledgeOrchestrator(
            self.mock_llm,
            self.mock_browser,
            self.mock_knowledge_manager
        )

        # Setup common test data
        self.mock_llm.complete.return_value = json.dumps({
            "priority_order": ["web_search", "arxiv"],
            "search_terms": {
                "web_search": ["test query"],
                "arxiv": "test query physics"
            },
            "requires_followup": False
        })

    async def test_full_query_processing(self):
        # Mock the research methods
        self.orchestrator._perform_web_research = MagicMock(
            return_value=asyncio.Future()
        )
        self.orchestrator._perform_web_research.return_value.set_result([
            {"title": "Web Result", "url": "https://example.com"}
        ])
        self.mock_knowledge_manager.search_arxiv.return_value = {
            "papers": [{"title": "ArXiv Paper", "abstract": "Test abstract"}]
        }

        # Mock report synthesis
        final_report = "# Research Report\n\nThis is a test report."
        self.mock_llm.complete.side_effect = [
            # First call - query analysis
            json.dumps({
                "priority_order": ["web_search", "arxiv"],
                "search_terms": {
                    "web_search": ["test query"],
                    "arxiv": "test query physics"
                },
                "requires_followup": False
            }),
            # Second call - report synthesis
            final_report
        ]

        # Execute full query processing
        result = await self.orchestrator.process_query("test query")

        # Verify
        self.assertEqual(result, final_report)
        self.assertEqual(self.mock_llm.complete.call_count, 2)
        self.orchestrator._perform_web_research.assert_called_once()
        self.mock_knowledge_manager.search_arxiv.assert_called_once()
```
End-to-End Testing
Test the entire system with real inputs:
```python
class TestEndToEnd(unittest.IsolatedAsyncioTestCase):
    async def asyncSetUp(self):
        # Create real components with test configuration
        self.llm_client = LlamaClient(model="llama3.2-test")
        self.browser = IntelligentBrowser(headless=True)
        self.mcp_client = MCPClient(
            server_url="https://test-mcp.example.com",
            api_key="test_key"
        )
        self.knowledge_manager = KnowledgeSourceManager(self.mcp_client)

        # Set up the orchestrator with real components
        self.orchestrator = KnowledgeOrchestrator(
            self.llm_client,
            self.browser,
            self.knowledge_manager
        )

        # Create the CLI interface
        self.cli = KnowledgeCompanionCLI(self.orchestrator)

    async def asyncTearDown(self):
        # Clean up resources
        self.browser.close()

    @unittest.skip("End-to-end test requires real services")
    async def test_basic_query(self):
        """Test end-to-end with a simple query."""
        # This test is skipped by default as it requires real services
        query = "What is the capital of France?"
        report = await self.orchestrator.process_query(query)

        # Verify basic expectations about the report
        self.assertIn("Paris", report)
        self.assertIn("France", report)
        self.assertGreater(len(report), 100)  # Ensure substantial content
```
Testing principle: The test suite demonstrates how to test at multiple levels—unit tests for individual functions, integration tests for component interactions, and end-to-end tests for the complete system. Note the use of mocks to isolate components during testing.
12. Deployment Strategies
Docker Containerization
Create a Dockerfile for your Knowledge Companion:
```dockerfile
# Use a Python base image
FROM python:3.9-slim

# Install system dependencies including Chrome/Chromium
# (curl is needed later for ChromeDriver discovery and the health check)
RUN apt-get update && apt-get install -y \
    curl \
    wget \
    gnupg \
    unzip \
    && wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
    && echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list \
    && apt-get update \
    && apt-get install -y google-chrome-stable \
    && rm -rf /var/lib/apt/lists/*

# Install ChromeDriver
RUN CHROMEDRIVER_VERSION=`curl -sS chromedriver.storage.googleapis.com/LATEST_RELEASE` && \
    wget -N chromedriver.storage.googleapis.com/$CHROMEDRIVER_VERSION/chromedriver_linux64.zip -P ~/ && \
    unzip ~/chromedriver_linux64.zip -d ~/ && \
    rm ~/chromedriver_linux64.zip && \
    mv -f ~/chromedriver /usr/local/bin/chromedriver && \
    chmod +x /usr/local/bin/chromedriver

# Set up workdir and install Python dependencies
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Download and configure Ollama (for local LLM support)
RUN wget -O ollama https://ollama.ai/download/ollama-linux-amd64 && \
    chmod +x ollama && \
    mv ollama /usr/local/bin/

# Copy application code
COPY . .

# Create a non-root user
RUN useradd -m appuser
USER appuser

# Set environment variables
ENV PYTHONUNBUFFERED=1
ENV BROWSER_USE_HEADLESS=true
ENV OLLAMA_HOST=host.docker.internal

# Expose port for the API
EXPOSE 8000

# Health check
HEALTHCHECK --interval=30s --timeout=30s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1

# Command to run the application
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```
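Note that the HEALTHCHECK above probes /health, an endpoint the FastAPI app from section 8 does not yet define. A minimal addition to satisfy it might look like this (an assumed addition, not part of the earlier code):

```python
@app.get("/health")
async def health_check():
    """Liveness probe for the Docker HEALTHCHECK."""
    return {"status": "ok"}
```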
Using Docker Compose for Multi-Container Deployment
Create a docker-compose.yml file:
```yaml
version: '3.8'

services:
  # API Service
  api:
    build: .
    ports:
      - "8000:8000"
    environment:
      - REDIS_URL=redis://redis:6379
      - MCP_SERVER_URL=http://mcp:5000
    depends_on:
      - redis
      - mcp
      - ollama
    volumes:
      - ./app:/app
    networks:
      - knowledge-network

  # Redis for caching
  redis:
    image: redis:6.2-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis-data:/data
    command: redis-server --appendonly yes
    networks:
      - knowledge-network

  # MCP Server
  mcp:
    build: ./mcp-server
    ports:
      - "5000:5000"
    environment:
      - PYTHONUNBUFFERED=1
    volumes:
      - ./mcp-server:/app
    networks:
      - knowledge-network

  # Ollama for local LLM support
  ollama:
    image: ollama/ollama:latest
    volumes:
      - ollama-data:/root/.ollama
    ports:
      - "11434:11434"
    networks:
      - knowledge-network

networks:
  knowledge-network:
    driver: bridge

volumes:
  redis-data:
  ollama-data:
```
Serverless Deployment
For AWS Lambda deployment, create a serverless.yml configuration:
```yaml
service: knowledge-companion

provider:
  name: aws
  runtime: python3.9
  region: us-east-1
  memorySize: 2048
  timeout: 30
  environment:
    MCP_SERVER_URL: ${env:MCP_SERVER_URL}
    REDIS_URL: ${env:REDIS_URL}

functions:
  api:
    handler: serverless_handler.handler
    events:
      - http:
          path: /
          method: ANY
      - http:
          path: /{proxy+}
          method: ANY
    layers:
      - !Ref PythonDependenciesLambdaLayer

layers:
  pythonDependencies:
    path: layer
    compatibleRuntimes:
      - python3.9

custom:
  pythonRequirements:
    dockerizePip: true
    slim: true
    layer: true

plugins:
  - serverless-python-requirements
  - serverless-offline
```
Deployment concept: These examples show multiple deployment strategies—containerized for easy distribution, Docker Compose for multi-service orchestration, and serverless for scalable API deployment. The actual choice depends on your specific requirements and infrastructure constraints.
13. Practical Applications and Use Cases
Academic Research Assistant
Customize your Knowledge Companion for academic research:
```python
class AcademicResearchOrchestrator(KnowledgeOrchestrator):
    """Specialized orchestrator for academic research."""

    async def _analyze_query(self, query):
        """Academic-focused query analysis."""
        prompt = f"""
        Analyze the following academic research query and determine the best approach:

        QUERY: {query}

        Please determine:
        1. Which academic fields are most relevant to this query?
        2. What specific academic databases should be consulted? (arXiv, PubMed, etc.)
        3. What are the key search terms for academic literature?
        4. Are there specific authors or institutions known for work in this area?
        5. What time period is most relevant for this research?

        Return your analysis as a structured JSON object.
        """
        response = await self.llm.complete(prompt)
        return json.loads(response)

    async def _perform_literature_review(self, analysis):
        """Conduct a comprehensive literature review."""
        sources = []

        # Search across multiple academic databases
        if "arxiv" in analysis.get("databases", []):
            arxiv_results = await self._search_arxiv(
                analysis["search_terms"],
                analysis.get("categories", [])
            )
            sources.append(("arxiv", arxiv_results))

        if "pubmed" in analysis.get("databases", []):
            pubmed_results = await self._search_pubmed(
                analysis["search_terms"],
                analysis.get("date_range", {})
            )
            sources.append(("pubmed", pubmed_results))

        # Find and analyze citation networks
        key_papers = self._identify_key_papers(sources)
        citation_network = await self._analyze_citations(key_papers)

        return {
            "primary_sources": sources,
            "key_papers": key_papers,
            "citation_network": citation_network
        }

    async def _generate_academic_report(self, query, research_results):
        """Generate an academic-style report with proper citations."""
        prompt = f"""
        Create a comprehensive academic literature review based on the following research:

        QUERY: {query}

        RESEARCH FINDINGS: {json.dumps(research_results, indent=2)}

        Your review should:
        1. Begin with an abstract summarizing the findings
        2. Include a methodology section explaining the search strategy
        3. Present findings organized by themes or chronologically
        4. Discuss conflicts and agreements in the literature
        5. Identify research gaps
        6. Include a properly formatted bibliography (APA style)

        Format the review in Markdown.
        """
        review = await self.llm.complete(prompt)
        return review
```
Business Intelligence Tool
Adapt your Knowledge Companion for business intelligence:
```python
class BusinessIntelligenceOrchestrator(KnowledgeOrchestrator):
    """Specialized orchestrator for business intelligence."""

    async def _analyze_query(self, query):
        """Business-focused query analysis."""
        prompt = f"""
        Analyze the following business intelligence query and determine the best approach:

        QUERY: {query}

        Please determine:
        1. What industry sectors are relevant to this query?
        2. What specific companies should be researched?
        3. What types of business data would be most valuable (financial, strategic, competitive)?
        4. What time frame is relevant for this analysis?
        5. What business news sources would be most appropriate?

        Return your analysis as a structured JSON object.
        """
        response = await self.llm.complete(prompt)
        return json.loads(response)

    async def _gather_company_data(self, companies):
        """Gather data about specific companies."""
        results = []
        for company in companies:
            # Financial data
            financial_data = await self._fetch_financial_data(company)

            # News and press releases
            news = await self._fetch_company_news(company)

            # Social media sentiment
            sentiment = await self._analyze_social_sentiment(company)

            results.append({
                "company": company,
                "financial": financial_data,
                "news": news,
                "sentiment": sentiment
            })
        return results

    async def _perform_competitive_analysis(self, primary_company, competitors):
        """Analyze competitive positioning."""
        company_data = await self._gather_company_data([primary_company] + competitors)

        # Extract the primary company data
        primary_data = next(item for item in company_data if item["company"] == primary_company)

        # Extract competitor data
        competitor_data = [item for item in company_data if item["company"] != primary_company]

        # Perform SWOT analysis
        swot = await self._generate_swot_analysis(primary_data, competitor_data)

        return {
            "primary_company": primary_data,
            "competitors": competitor_data,
            "swot_analysis": swot
        }

    async def _generate_business_report(self, query, intelligence_data):
        """Generate a business intelligence report."""
        prompt = f"""
        Create a comprehensive business intelligence report based on the following data:

        QUERY: {query}

        INTELLIGENCE DATA: {json.dumps(intelligence_data, indent=2)}

        Your report should:
        1. Begin with an executive summary
        2. Include market overview and trends
        3. Provide detailed company analyses
        4. Present competitive comparisons with data visualizations (described in text)
        5. Offer strategic recommendations
        6. Include reference sources

        Format the report in Markdown with sections for easy navigation.
        """
        report = await self.llm.complete(prompt)
        return report
```
Personal Learning Assistant
Create a personal learning companion:
```python
class LearningPathOrchestrator(KnowledgeOrchestrator):
    """Specialized orchestrator for creating personalized learning paths."""

    async def create_learning_path(self, subject, user_level="beginner", goals=None):
        """Create a personalized learning path for a subject."""
        analysis = await self._analyze_learning_needs(subject, user_level, goals)
        resources = await self._discover_learning_resources(analysis)
        curriculum = await self._design_curriculum(analysis, resources)
        return {
            "subject": subject,
            "user_level": user_level,
            "goals": goals,
            "curriculum": curriculum
        }

    async def _analyze_learning_needs(self, subject, user_level, goals):
        """Analyze the learning requirements for a subject."""
        prompt = f"""
        Analyze the following learning request and determine the optimal approach:

        SUBJECT: {subject}
        USER LEVEL: {user_level}
        LEARNING GOALS: {goals if goals else 'General proficiency'}

        Please determine:
        1. What are the foundational concepts needed for this subject?
        2. What is an appropriate learning sequence?
        3. What types of resources would be most beneficial (textbooks, videos, interactive exercises)?
        4. How should progress be measured?
        5. What are common stumbling blocks for learners at this level?

        Return your analysis as a structured JSON object.
        """
        response = await self.llm.complete(prompt)
        return json.loads(response)

    async def _discover_learning_resources(self, analysis):
        """Find optimal learning resources based on analysis."""
        resources = {
            "textbooks": [],
            "online_courses": [],
            "videos": [],
            "practice_resources": [],
            "communities": []
        }

        # Search for textbooks and books
        book_results = await self._search_books(analysis["subject"], analysis["key_concepts"])
        resources["textbooks"] = book_results[:5]  # Top 5 relevant books

        # Search for online courses
        course_results = await self._search_online_courses(
            analysis["subject"], analysis["user_level"]
        )
        resources["online_courses"] = course_results[:5]

        # Find educational videos
        video_results = await self._search_educational_videos(
            analysis["subject"], analysis["key_concepts"]
        )
        resources["videos"] = video_results[:8]

        # Find practice resources
        practice_results = await self._search_practice_resources(analysis["subject"])
        resources["practice_resources"] = practice_results[:5]

        # Find learning communities
        community_results = await self._search_learning_communities(analysis["subject"])
        resources["communities"] = community_results[:3]

        return resources

    async def _design_curriculum(self, analysis, resources):
        """Design a structured curriculum based on analysis and resources."""
        prompt = f"""
        Create a comprehensive learning curriculum based on the following:

        SUBJECT ANALYSIS: {json.dumps(analysis, indent=2)}

        AVAILABLE RESOURCES: {json.dumps(resources, indent=2)}

        Your curriculum should:
        1. Be organized in modules with clear learning objectives for each
        2. Include a mix of theory and practice
        3. Estimate time commitments for each module
        4. Recommend specific resources for each module
        5. Include checkpoints to assess understanding
        6. Provide a progression from foundational to advanced concepts

        Format the curriculum in Markdown with clear sections.
        """
        curriculum = await self.llm.complete(prompt)
        return curriculum
```
Application principle: These specialized orchestrators show how the same core technology can be adapted to different domains by modifying the analysis, research strategies, and report generation based on domain-specific requirements.
14. Conclusion: The Future of AI Knowledge Systems
The AI Knowledge Companion we've built represents just the beginning of a new era in information processing. As we look to the future, several trends are emerging:
- Deeper reasoning capabilities: Future systems will go beyond information retrieval to perform logical reasoning, connecting disparate pieces of information to generate novel insights.
- Multimodal understanding: The next generation of knowledge systems will process not just text but images, audio, and video seamlessly, extracting information from all media types.
- Collective intelligence: Knowledge companions will facilitate collaboration between multiple human experts and AI systems, creating hybrid intelligence networks.
- Continuous learning: Systems will remember past interactions and continuously refine their understanding based on user feedback and new information.
- Specialized domain expertise: We'll see the rise of highly specialized companions focused on specific fields like medicine, law, or engineering.
The technologies we've explored—Browser-Use and MCP—are foundational pieces of this future, enabling AI systems to navigate the internet and connect to diverse knowledge sources in standardized ways. As these technologies mature, the boundary between AI assistants and knowledge workers will continue to blur, creating powerful partnerships that enhance human capabilities.
By building your own AI Knowledge Companion, you're not just creating a research tool—you're participating in the development of systems that will fundamentally transform how humans interact with the vast landscape of global knowledge.
Additional Resources
- Browser-Use Documentation
- Model Context Protocol Specification
- Ollama GitHub Repository
- Ethical Web Scraping Guidelines
- FastAPI Documentation
- Redis Documentation
- Docker and Docker Compose Documentation
- ArXiv API Documentation
Happy knowledge exploration!