@AndrewAltimit
Last active October 24, 2025 18:30

Revisions

  1. AndrewAltimit revised this gist Aug 16, 2025. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion !README.md
    @@ -11,7 +11,7 @@ A Model Context Protocol (MCP) server that integrates Google's Gemini AI for cod

See the [template repository](https://github.com/AndrewAltimit/template-repo) for a complete example, including Gemini CLI automated PR reviews: [Example PR](https://github.com/AndrewAltimit/template-repo/pull/54), [Script](https://github.com/AndrewAltimit/template-repo/blob/main/automation/review/gemini-pr-review.py).

    ![mcp-demo](https://raw.githubusercontent.com/AndrewAltimit/template-repo/refs/heads/main/docs/template-repo.webp)
    ![mcp-demo](https://raw.githubusercontent.com/AndrewAltimit/template-repo/refs/heads/main/docs/mcp/architecture/demo.gif)

    ## Features

  2. AndrewAltimit revised this gist Aug 12, 2025. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions !README.md
    @@ -9,9 +9,9 @@ A Model Context Protocol (MCP) server that integrates Google's Gemini AI for cod

    ## Usage

See the [template repository](https://github.com/AndrewAltimit/template-repo) for a complete example, including Gemini CLI automated PR reviews: [Example PR](https://github.com/AndrewAltimit/template-repo/pull/9), [Script](https://github.com/AndrewAltimit/template-repo/blob/main/scripts/gemini-pr-review.py).
See the [template repository](https://github.com/AndrewAltimit/template-repo) for a complete example, including Gemini CLI automated PR reviews: [Example PR](https://github.com/AndrewAltimit/template-repo/pull/54), [Script](https://github.com/AndrewAltimit/template-repo/blob/main/automation/review/gemini-pr-review.py).

    ![mcp-demo](https://gist.github.com/user-attachments/assets/a5646586-5b12-4d1f-bcfc-28ed84275c1f)
    ![mcp-demo](https://raw.githubusercontent.com/AndrewAltimit/template-repo/refs/heads/main/docs/template-repo.webp)

    ## Features

  3. AndrewAltimit revised this gist Jul 28, 2025. 1 changed file with 0 additions and 6 deletions.
    6 changes: 0 additions & 6 deletions !README.md
    @@ -306,9 +306,3 @@ If you see this error, you're trying to run the server inside Docker. Exit the c
    3. **Rate Limiting**: Respect rate limits to avoid API quota issues
    4. **Error Handling**: Always handle potential timeout or API errors
    5. **Comparison Mode**: Use comparison mode to get diverse perspectives

    ## Acknowledgments

    - Built for the MCP (Model Context Protocol) ecosystem
    - Integrates with Google's Gemini AI
    - Designed for Claude Desktop integration
  4. AndrewAltimit revised this gist Jul 28, 2025. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions !README.md
    @@ -49,7 +49,7 @@ The Gemini CLI requires Docker access to function properly, which means it canno

    ## Running the Server

    ### stdio Mode (Recommended for Claude Desktop)
    ### stdio Mode (Recommended)

    ```bash
    # Run in stdio mode
    @@ -59,7 +59,7 @@ python gemini_mcp_server.py --mode stdio
    python gemini_mcp_server.py --mode stdio --project-root /path/to/project
    ```

    ### HTTP Mode (For testing and API access)
    ### HTTP Mode

    ```bash
    # Run in HTTP mode on port 8006
  5. AndrewAltimit revised this gist Jul 28, 2025. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion !README.md
    @@ -1,6 +1,6 @@
    # Gemini AI Integration MCP Server

    A Model Context Protocol (MCP) server that integrates Google's Gemini AI for code review, technical consultation, and AI-assisted development workflows. This server provides seamless integration with Claude Desktop and other MCP-compatible clients.
    A Model Context Protocol (MCP) server that integrates Google's Gemini AI for code review, technical consultation, and AI-assisted development workflows. This server provides seamless integration with Claude Code and other MCP-compatible clients.

    <div align="center">
    <img src="https://gist.github.com/user-attachments/assets/507ce5cd-30cd-4408-bb96-77508e7e4ac6" />
  6. AndrewAltimit renamed this gist Jul 28, 2025. 1 changed file with 0 additions and 0 deletions.
    File renamed without changes.
  7. AndrewAltimit revised this gist Jul 28, 2025. 11 changed files with 1488 additions and 1309 deletions.
    38 changes: 38 additions & 0 deletions .env.example
    @@ -0,0 +1,38 @@
    # Gemini MCP Server Configuration
    # Copy this file to .env and update with your settings

    # Enable/disable Gemini integration
    GEMINI_ENABLED=true

    # Auto-consultation on uncertainty detection
    GEMINI_AUTO_CONSULT=false

    # Gemini CLI command (if not in PATH)
    GEMINI_CLI_COMMAND=gemini

    # Request timeout in seconds
    GEMINI_TIMEOUT=300

    # Rate limit delay between requests (seconds)
    GEMINI_RATE_LIMIT=2

    # Maximum context length (characters)
    GEMINI_MAX_CONTEXT=4000

    # Log consultations
    GEMINI_LOG_CONSULTATIONS=true

    # Gemini model to use
    GEMINI_MODEL=gemini-2.5-pro

    # Sandbox mode for testing (no actual API calls)
    GEMINI_SANDBOX=false

    # Debug mode (verbose logging)
    GEMINI_DEBUG=false

    # Include conversation history in consultations
    GEMINI_INCLUDE_HISTORY=true

    # Maximum number of history entries to keep
    GEMINI_MAX_HISTORY=10
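These variables arrive as plain strings from the environment; a minimal sketch of how a server might parse them into typed settings (the `load_gemini_config` helper and its boolean parsing are illustrative assumptions, not part of the actual server code):

```python
import os


def load_gemini_config() -> dict:
    """Parse GEMINI_* environment variables into typed settings.

    Variable names and defaults mirror .env.example above; this helper
    itself is a sketch, not the server's actual loader.
    """
    def as_bool(name: str, default: bool) -> bool:
        # Treat common truthy spellings as True; everything else as False
        return os.environ.get(name, str(default)).strip().lower() in ("1", "true", "yes")

    return {
        "enabled": as_bool("GEMINI_ENABLED", True),
        "auto_consult": as_bool("GEMINI_AUTO_CONSULT", False),
        "cli_command": os.environ.get("GEMINI_CLI_COMMAND", "gemini"),
        "timeout": int(os.environ.get("GEMINI_TIMEOUT", "300")),
        "rate_limit_delay": float(os.environ.get("GEMINI_RATE_LIMIT", "2")),
        "max_context_length": int(os.environ.get("GEMINI_MAX_CONTEXT", "4000")),
        "model": os.environ.get("GEMINI_MODEL", "gemini-2.5-pro"),
        "max_history_entries": int(os.environ.get("GEMINI_MAX_HISTORY", "10")),
    }
```

Numeric values are converted explicitly so a typo such as `GEMINI_TIMEOUT=5m` fails loudly at startup rather than deep inside a consultation.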
    568 changes: 205 additions & 363 deletions README.md
    @@ -1,6 +1,6 @@
    # Gemini CLI Integration for Claude Code MCP Server
    # Gemini AI Integration MCP Server

    A complete setup guide for integrating Google's Gemini CLI with Claude Code through an MCP (Model Context Protocol) server. This provides automatic second opinion consultation when Claude expresses uncertainty or encounters complex technical decisions.
    A Model Context Protocol (MCP) server that integrates Google's Gemini AI for code review, technical consultation, and AI-assisted development workflows. This server provides seamless integration with Claude Desktop and other MCP-compatible clients.

    <div align="center">
    <img src="https://gist.github.com/user-attachments/assets/507ce5cd-30cd-4408-bb96-77508e7e4ac6" />
    @@ -13,460 +13,302 @@ See the [template repository](https://github.com/AndrewAltimit/template-repo) fo

    ![mcp-demo](https://gist.github.com/user-attachments/assets/a5646586-5b12-4d1f-bcfc-28ed84275c1f)

    ## Quick Start
    ## Features

    ### 1. Install Gemini CLI (Host-based)
    ```bash
    # Switch to Node.js 22.16.0
    nvm use 22.16.0
    - **AI Consultation**: Get second opinions on code and technical decisions
    - **Conversation History**: Maintain context across consultations
    - **Auto-consultation**: Automatic AI consultation on uncertainty detection
    - **Comparison Mode**: Compare responses with previous Claude outputs
    - **Rate Limiting**: Built-in rate limiting to avoid API quota issues
    - **Dual Mode Support**: Runs in both stdio (for Claude Desktop) and HTTP modes

    # Install Gemini CLI globally
    npm install -g @google/gemini-cli
    ## Important Requirements

    # Test installation
    gemini --help
    ⚠️ **This server MUST run on the host system, not in a container!**

    # Authenticate with Google account (free tier: 60 req/min, 1,000/day)
    # Authentication happens automatically on first use
    ```
The Gemini CLI requires Docker access to function properly, which means it cannot run inside a container itself; that would require Docker-in-Docker. Always launch this server directly on your host machine.

    ### 2. Direct Usage (Fastest)
    ```bash
    # Direct consultation (no container setup needed)
    echo "Your question here" | gemini
    ## Prerequisites

    # Example: Technical questions
    echo "Best practices for microservice authentication?" | gemini -m gemini-2.5-pro
    ```
    1. **Python 3.8+**: Ensure Python is installed
    2. **Gemini CLI**: Install the Gemini CLI tool and ensure it's in your PATH
    3. **Authentication**: Configure Gemini CLI authentication with `gemini auth`
    4. **Dependencies**: Install required Python packages:
    ```bash
    pip install mcp fastapi uvicorn pydantic
    ```

    ## Host-Based MCP Integration
    ## Installation

    ### Architecture Overview
    - **Host-Based Setup**: Both MCP server and Gemini CLI run on host machine
    - **Why Host-Only**: Gemini CLI requires interactive authentication and avoids Docker-in-Docker complexity
    - **Communication Modes**:
    - **stdio (recommended)**: Bidirectional streaming for production use
    - **HTTP**: Simple request/response for testing
    - **Auto-consultation**: Detects uncertainty patterns in Claude responses
    - **Manual consultation**: On-demand second opinions via MCP tools
    - **Response synthesis**: Combines both AI perspectives
    - **Singleton Pattern**: Ensures consistent state management across all tool calls
    1. Clone this repository or download the files
    2. Install dependencies:
    ```bash
    pip install -r requirements.txt
    ```
    3. Configure environment variables (see Configuration section)

    ### Key Files Structure
    ```
    β”œβ”€β”€ gemini_mcp_server.py # stdio-based MCP server with HTTP mode support
    β”œβ”€β”€ gemini_mcp_server_http.py # HTTP server implementation (imported by main)
    β”œβ”€β”€ gemini_integration.py # Core integration module with singleton pattern
    β”œβ”€β”€ gemini-config.json # Gemini configuration
    β”œβ”€β”€ start-gemini-mcp.sh # Startup script for both modes
    └── test_gemini_mcp.py # Test script for both server modes
    ```
    ## Running the Server

    All files should be placed in the same directory for easy deployment.
    ### stdio Mode (Recommended for Claude Desktop)

    ### Host-Based MCP Server Setup

    #### stdio Mode (Recommended for Production)
    ```bash
    # Start MCP server in stdio mode (default)
    cd your-project
    python3 gemini_mcp_server.py --project-root .

    # Or with environment variables
    GEMINI_ENABLED=true \
    GEMINI_AUTO_CONSULT=true \
    GEMINI_CLI_COMMAND=gemini \
    GEMINI_TIMEOUT=200 \
    GEMINI_RATE_LIMIT=2 \
    python3 gemini_mcp_server.py --project-root .
    # Run in stdio mode
    python gemini_mcp_server.py --mode stdio

    # With custom project root
    python gemini_mcp_server.py --mode stdio --project-root /path/to/project
    ```

    #### HTTP Mode (For Testing)
    ### HTTP Mode (For testing and API access)

    ```bash
    # Start MCP server in HTTP mode
    python3 gemini_mcp_server.py --project-root . --port 8006
    # Run in HTTP mode on port 8006
    python gemini_mcp_server.py --mode http

    # The main server automatically:
    # 1. Detects the --port argument
    # 2. Imports gemini_mcp_server_http module
    # 3. Starts the FastAPI server on the specified port
    # The server will be available at:
    # http://localhost:8006
    ```

    ### Claude Code Configuration
    ## Available Tools

    ### consult_gemini

    Get AI assistance from Gemini for code review, problem-solving, or validation.

    #### stdio Configuration (Recommended)
    Add to your Claude Code's MCP settings:
    **Parameters:**
    - `query` (required): The question or code to consult Gemini about
    - `context`: Additional context for the consultation
    - `comparison_mode`: Compare with previous Claude response (default: true)
    - `force`: Force consultation even if disabled (default: false)

    **Example:**
    ```json
    {
    "mcpServers": {
    "gemini": {
    "command": "python3",
    "args": ["/path/to/gemini_mcp_server.py", "--project-root", "."],
    "cwd": "/path/to/your/project",
    "env": {
    "GEMINI_ENABLED": "true",
    "GEMINI_AUTO_CONSULT": "true",
    "GEMINI_CLI_COMMAND": "gemini"
    }
    }
    "tool": "consult_gemini",
    "arguments": {
    "query": "Review this function for potential issues",
    "context": "def factorial(n): return 1 if n <= 1 else n * factorial(n-1)",
    "comparison_mode": true
    }
    }
    ```

    #### HTTP Configuration (For Testing)
    ### gemini_status

    Get current status and statistics of the Gemini integration.

    **Example:**
    ```json
    {
    "mcpServers": {
    "gemini-http": {
    "url": "http://localhost:8006",
    "transport": "http"
    }
    }
    "tool": "gemini_status",
    "arguments": {}
    }
    ```

    ## Server Mode Comparison

    | Feature | stdio Mode | HTTP Mode |
    |---------|-----------|-----------|
    | **Communication** | Bidirectional streaming | Request/Response |
    | **Performance** | Better for long operations | Good for simple queries |
    | **Real-time updates** | βœ… Supported | ❌ Not supported |
    | **Setup complexity** | Moderate | Simple |
    | **Use case** | Production | Testing/Development |

    ## Core Features
    ### clear_gemini_history

    ### 1. Container Detection (Critical Feature)
    Both server modes automatically detect if running inside a container and exit immediately with helpful instructions. This is critical because:
    - Gemini CLI requires Docker access for containerized execution
    - Running Docker-in-Docker causes authentication and performance issues
    - The server must run on the host system to access the Docker daemon
    - Detection happens before any imports to fail fast with clear error messages
    Clear the conversation history to start fresh consultations.

    ### 2. Uncertainty Detection
    Automatically detects patterns like:
    - "I'm not sure", "I think", "possibly", "probably"
    - "Multiple approaches", "trade-offs", "alternatives"
    - Critical operations: "security", "production", "database migration"
    **Example:**
    ```json
    {
    "tool": "clear_gemini_history",
    "arguments": {}
    }
    ```

    ### 3. MCP Tools Available
    ### toggle_gemini_auto_consult

    #### `consult_gemini`
    Manual consultation with Gemini for second opinions or validation.
    Enable or disable automatic Gemini consultation when uncertainty is detected.

    **Parameters:**
    - `query` (required): The question or topic to consult Gemini about
    - `context` (optional): Additional context for the consultation
    - `comparison_mode` (optional, default: true): Whether to request structured comparison format
    - `force` (optional, default: false): Force consultation even if disabled
    - `enable`: true to enable, false to disable, omit to toggle

    **Example:**
    ```python
    # In Claude Code
    Use the consult_gemini tool with:
    query: "Should I use WebSockets or gRPC for real-time communication?"
    context: "Building a multiplayer application with real-time updates"
    comparison_mode: true
```

```json
    {
    "tool": "toggle_gemini_auto_consult",
    "arguments": {"enable": true}
    }
    ```

    #### `gemini_status`
    Check Gemini integration status and statistics.
    ## Configuration

    **Returns:**
    - Configuration status (enabled, auto-consult, CLI command, timeout, rate limit)
    - Gemini CLI availability and version
    - Consultation statistics (total, completed, average time)
    - Conversation history size
    ### Environment Variables

    **Example:**
    ```python
    # Check current status
    Use the gemini_status tool
    ```
    Create a `.env` file in your project root or set these environment variables:

    #### `toggle_gemini_auto_consult`
    Enable or disable automatic Gemini consultation on uncertainty detection.
    ```bash
    # Enable/disable Gemini integration
    GEMINI_ENABLED=true

    **Parameters:**
    - `enable` (optional): true to enable, false to disable. If not provided, toggles current state.
    # Auto-consultation on uncertainty
    GEMINI_AUTO_CONSULT=false

    **Example:**
    ```python
    # Toggle auto-consultation
    Use the toggle_gemini_auto_consult tool
    # Gemini CLI command (if not in PATH)
    GEMINI_CLI_COMMAND=gemini

    # Or explicitly enable/disable
    Use the toggle_gemini_auto_consult tool with:
    enable: false
    ```
    # Request timeout in seconds
    GEMINI_TIMEOUT=300

    #### `clear_gemini_history`
    Clear Gemini conversation history to start fresh.
    # Rate limit delay between requests
    GEMINI_RATE_LIMIT=2

    **Example:**
    ```python
    # Clear all consultation history
    Use the clear_gemini_history tool
    ```
    # Maximum context length
    GEMINI_MAX_CONTEXT=4000

    ### 4. Response Synthesis
    - Identifies agreement/disagreement between Claude and Gemini
    - Provides confidence levels (high/medium/low)
    - Generates combined recommendations
    - Tracks execution time and consultation ID
    # Log consultations
    GEMINI_LOG_CONSULTATIONS=true

    ### 5. Advanced Features
    # Gemini model to use
    GEMINI_MODEL=gemini-2.5-pro

    #### Conversation History
    The integration maintains conversation history across consultations:
    - Configurable history size (default: 10 entries)
    - History included in subsequent consultations for context
    - Can be cleared with `clear_gemini_history` tool
    # Sandbox mode for testing
    GEMINI_SANDBOX=false

    #### Uncertainty Detection API
    The MCP server exposes methods for detecting uncertainty:
    # Debug mode
    GEMINI_DEBUG=false

    ```python
    # Detect uncertainty in responses
    has_uncertainty, patterns = server.detect_response_uncertainty(response_text)
    # Include conversation history
    GEMINI_INCLUDE_HISTORY=true

    # Automatically consult if uncertain
    result = await server.maybe_consult_gemini(response_text, context)
    # Maximum history entries
    GEMINI_MAX_HISTORY=10
    ```

    #### Statistics Tracking
    - Total consultations attempted
    - Successful completions
    - Average execution time per consultation
    - Total execution time across all consultations
    - Conversation history size
    - Last consultation timestamp
    - Error tracking and timeout monitoring
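A sketch of how these statistics could be derived from a consultation log (the entry fields `status`, `execution_time`, and `timestamp` follow the names used in this README; the helper itself is illustrative, not the server's exact implementation):

```python
from typing import Any, Dict, List


def summarize_consultations(log: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Aggregate consultation statistics from a list of log entries."""
    completed = [e for e in log if e.get("status") == "success"]
    total_time = sum(e.get("execution_time", 0) for e in completed)
    stats = {
        "total_consultations": len(log),
        "completed_consultations": len(completed),
        "average_execution_time": total_time / len(completed) if completed else 0,
        "total_execution_time": total_time,
    }
    # Record the most recent consultation timestamp if one exists
    if log and log[-1].get("timestamp"):
        stats["last_consultation"] = log[-1]["timestamp"]
    return stats
```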
    ### Configuration File

    ## Configuration
    Create `gemini-config.json` in your project root:

    ### Environment Variables
    ```bash
    GEMINI_ENABLED=true # Enable integration
    GEMINI_AUTO_CONSULT=true # Auto-consult on uncertainty
    GEMINI_CLI_COMMAND=gemini # CLI command to use
    GEMINI_TIMEOUT=200 # Query timeout in seconds
    GEMINI_RATE_LIMIT=5 # Delay between calls (seconds)
    GEMINI_MAX_CONTEXT=4000 # Max context length
    GEMINI_MODEL=gemini-2.5-flash # Model to use
    GEMINI_SANDBOX=false # Sandboxing isolates operations
    GEMINI_API_KEY= # Optional (blank for free tier)
    GEMINI_LOG_CONSULTATIONS=true # Log consultation details
    GEMINI_DEBUG=false # Debug mode
    GEMINI_INCLUDE_HISTORY=true # Include conversation history
    GEMINI_MAX_HISTORY=10 # Max history entries to maintain
    GEMINI_MCP_PORT=8006 # Port for HTTP mode (if used)
    GEMINI_MCP_HOST=127.0.0.1 # Host for HTTP mode (if used)
    ```

    ### Gemini Configuration File
    Create `gemini-config.json`:
    ```json
    {
    "enabled": true,
    "auto_consult": true,
    "cli_command": "gemini",
    "timeout": 300,
    "rate_limit_delay": 5.0,
    "log_consultations": true,
    "timeout": 60,
    "model": "gemini-2.5-flash",
    "sandbox_mode": true,
    "debug_mode": false,
    "include_history": true,
    "max_history_entries": 10,
    "uncertainty_thresholds": {
    "uncertainty_patterns": true,
    "complex_decisions": true,
    "critical_operations": true
    }
    "max_context_length": 4000,
    "rate_limit_delay": 2
    }
    ```

    ## Integration Module Core
    ## Integration with Claude Desktop

    ### Uncertainty Patterns (Python)
    ```python
    UNCERTAINTY_PATTERNS = [
    r"\bI'm not sure\b",
    r"\bI think\b",
    r"\bpossibly\b",
    r"\bprobably\b",
    r"\bmight be\b",
    r"\bcould be\b",
    # ... more patterns
    ]

    COMPLEX_DECISION_PATTERNS = [
    r"\bmultiple approaches\b",
    r"\bseveral options\b",
    r"\btrade-offs?\b",
    r"\balternatives?\b",
    # ... more patterns
    ]

    CRITICAL_OPERATION_PATTERNS = [
    r"\bproduction\b",
    r"\bdatabase migration\b",
    r"\bsecurity\b",
    r"\bauthentication\b",
    # ... more patterns
    ]
    ```
    Add to your Claude Desktop configuration (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS):

    ### Basic Integration Class Structure
    ```python
class GeminiIntegration:
    def __init__(self, config: Optional[Dict[str, Any]] = None):
        self.config = config or {}
        self.enabled = self.config.get('enabled', True)
        self.auto_consult = self.config.get('auto_consult', True)
        self.cli_command = self.config.get('cli_command', 'gemini')
        self.timeout = self.config.get('timeout', 60)
        self.rate_limit_delay = self.config.get('rate_limit_delay', 2)
        self.include_history = self.config.get('include_history', True)
        self.conversation_history = []
        self.max_history_entries = self.config.get('max_history_entries', 10)

    async def consult_gemini(self, query: str, context: str = "") -> Dict[str, Any]:
        """Consult Gemini CLI for a second opinion"""
        # Rate limiting
        await self._enforce_rate_limit()

        # Prepare query with context and history
        full_query = self._prepare_query(query, context)

        # Execute Gemini CLI command
        result = await self._execute_gemini_cli(full_query)

        # Update conversation history
        if self.include_history and result.get("output"):
            self.conversation_history.append((query, result["output"]))
            # Trim history if needed
            if len(self.conversation_history) > self.max_history_entries:
                self.conversation_history = self.conversation_history[-self.max_history_entries:]

        return result

    def detect_uncertainty(self, text: str) -> Tuple[bool, List[str]]:
        """Detect if text contains uncertainty patterns"""
        found_patterns = []
        # Check all pattern categories
        # Returns (has_uncertainty, list_of_matched_patterns)


# Singleton pattern implementation
_integration = None

def get_integration(config: Optional[Dict[str, Any]] = None) -> GeminiIntegration:
    """Get or create the global Gemini integration instance"""
    global _integration
    if _integration is None:
        _integration = GeminiIntegration(config)
    return _integration
```

```json
    {
    "mcpServers": {
    "gemini": {
    "command": "python",
    "args": [
    "/path/to/gemini_mcp_server.py",
    "--mode", "stdio",
    "--project-root", "/path/to/your/project"
    ]
    }
    }
    }
    ```

    ### Singleton Pattern Benefits
    The singleton pattern ensures:
    - **Consistent Rate Limiting**: All MCP tool calls share the same rate limiter
    - **Unified Configuration**: Changes to config affect all usage points
    - **State Persistence**: Consultation history and statistics are maintained
    - **Resource Efficiency**: Only one instance manages the Gemini CLI connection
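The accessor behind these guarantees reduces to a module-level variable and a lazy constructor; a minimal self-contained sketch (class body simplified here for illustration):

```python
from typing import Any, Dict, Optional


class GeminiIntegration:
    """Stand-in for the real integration class; only config handling shown."""

    def __init__(self, config: Optional[Dict[str, Any]] = None):
        self.config = config or {}
        self.conversation_history: list = []


_integration: Optional[GeminiIntegration] = None


def get_integration(config: Optional[Dict[str, Any]] = None) -> GeminiIntegration:
    """Return the shared instance, creating it on first call."""
    global _integration
    if _integration is None:
        _integration = GeminiIntegration(config)
    return _integration
```

Every MCP tool handler that calls `get_integration()` therefore shares one history list and one rate limiter; note that a config passed after the first call is ignored, which is the point of the pattern.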
    ## Testing

    ## Example Workflows
    ### Test stdio Mode
    ```bash
    python test_gemini_mcp.py --mode stdio
    ```

    ### Manual Consultation
    ```python
    # In Claude Code
    Use the consult_gemini tool with:
    query: "Should I use WebSockets or gRPC for real-time communication?"
    context: "Building a multiplayer application with real-time updates"
```

### Test HTTP Mode
    ```bash
    # Start the server in HTTP mode first
    python gemini_mcp_server.py --mode http

    # In another terminal, run tests
    python test_gemini_mcp.py --mode http
    ```

    ### Automatic Consultation Flow
    ### Test State Management
    ```bash
    python test_gemini_state.py
    ```
    User: "How should I handle authentication?"

    Claude: "I think OAuth might work, but I'm not certain about the security implications..."
    ## HTTP API Endpoints

    [Auto-consultation triggered]
    When running in HTTP mode, the following endpoints are available:

    Gemini: "For authentication, consider these approaches: 1) OAuth 2.0 with PKCE for web apps..."
    - `GET /health` - Health check
    - `GET /mcp/tools` - List available tools
    - `POST /mcp/execute` - Execute a tool
    - `GET /mcp/stats` - Server statistics
    - `POST /messages` - MCP protocol messages
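A tool invocation against `/mcp/execute` is a plain JSON POST; a minimal client sketch using only the standard library (the request body shape `{"tool": ..., "arguments": ...}` is inferred from the tool examples above — verify the exact schema against the running server):

```python
import json
from urllib import request


def build_payload(tool: str, arguments: dict) -> bytes:
    """Encode a tool invocation body; shape assumed from the examples above."""
    return json.dumps({"tool": tool, "arguments": arguments}).encode()


def execute_tool(tool: str, arguments: dict,
                 base_url: str = "http://localhost:8006") -> dict:
    """POST a tool call to the HTTP server and return the decoded JSON reply."""
    req = request.Request(
        f"{base_url}/mcp/execute",
        data=build_payload(tool, arguments),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())
```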

Synthesis: Both suggest OAuth, but Claude is uncertain about the security implications. Gemini provides specific implementation details. Recommendation: follow Gemini's OAuth 2.0 with PKCE approach.
    ## Uncertainty Detection

    ## Testing
    The server automatically detects uncertainty in responses and can trigger automatic Gemini consultation. Detected patterns include:

    ### Test Both Server Modes
    ```bash
    # Test stdio mode (default)
    python3 test_gemini_mcp.py
    - Uncertainty phrases: "I'm not sure", "I think", "possibly", "probably"
    - Complex decisions: "multiple approaches", "trade-offs", "alternatives"
    - Critical operations: "production", "security", "authentication"
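A regex-based detector over phrase lists like these fits in a few lines (the pattern lists below are illustrative excerpts, not the server's full set):

```python
import re
from typing import List, Tuple

# Illustrative excerpts of the three pattern categories described above
UNCERTAINTY_PATTERNS = [r"\bI'm not sure\b", r"\bI think\b", r"\bpossibly\b", r"\bprobably\b"]
COMPLEX_DECISION_PATTERNS = [r"\bmultiple approaches\b", r"\btrade-offs?\b", r"\balternatives?\b"]
CRITICAL_OPERATION_PATTERNS = [r"\bproduction\b", r"\bsecurity\b", r"\bauthentication\b"]


def detect_uncertainty(text: str) -> Tuple[bool, List[str]]:
    """Return (matched?, matched_patterns) across all categories, case-insensitively."""
    found = [
        p
        for p in UNCERTAINTY_PATTERNS + COMPLEX_DECISION_PATTERNS + CRITICAL_OPERATION_PATTERNS
        if re.search(p, text, re.IGNORECASE)
    ]
    return bool(found), found
```

When the first element is true, the server can trigger an automatic `consult_gemini` call with the matched patterns as context.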

    # Test HTTP mode
    python3 test_gemini_mcp.py --mode http
    ## Example Workflow

    # Test specific server
    python3 test_gemini_mcp.py --mode stdio --verbose
    ```
    ```python
    # 1. Check status
    status = await client.execute_tool("gemini_status")

    ### Manual Testing via HTTP
    ```bash
    # Start HTTP server
    python3 gemini_mcp_server.py --port 8006
    # 2. Clear history for fresh start
    await client.execute_tool("clear_gemini_history")

    # Test endpoints
    curl http://localhost:8006/health
    curl http://localhost:8006/mcp/tools
    # 3. Consult about code
    result = await client.execute_tool("consult_gemini", {
    "query": "Review this Python function for best practices",
    "context": "def process_data(data): return [x*2 for x in data if x > 0]"
    })

    # Test Gemini consultation
    curl -X POST http://localhost:8006/mcp/tools/consult_gemini \
    -H "Content-Type: application/json" \
    -d '{"query": "What is the best Python web framework?"}'
    # 4. Disable auto-consult if needed
    await client.execute_tool("toggle_gemini_auto_consult", {"enable": False})
    ```

    ## Troubleshooting

    | Issue | Solution |
    |-------|----------|
    | Gemini CLI not found | Install Node.js 18+ and `npm install -g @google/gemini-cli` |
    | Authentication errors | Run `gemini` and sign in with Google account |
    | Node version issues | Use `nvm use 22.16.0` |
    | Timeout errors | Increase `GEMINI_TIMEOUT` (default: 60s) |
    | Auto-consult not working | Check `GEMINI_AUTO_CONSULT=true` |
    | Rate limiting | Adjust `GEMINI_RATE_LIMIT` (default: 2s) |
    | Container detection error | Ensure running on host system, not in Docker |
    | stdio connection issues | Check Claude Code MCP configuration |
    | HTTP connection refused | Verify port availability and firewall settings |
    ### "Cannot run in container" Error

    If you see this error, you're trying to run the server inside Docker. Exit the container and run on your host system.

    ### Gemini CLI Not Found

    1. Install the Gemini CLI tool
    2. Add it to your PATH
    3. Or set `GEMINI_CLI_COMMAND` to the full path

    ### Authentication Issues

    1. Run `gemini auth` to configure authentication
2. Ensure your credentials are valid; avoid setting an API key if you want to stay on the free tier
    3. Check the Gemini CLI documentation

    ### Timeout Errors

    1. Increase `GEMINI_TIMEOUT` for complex queries
    2. Simplify your queries
    3. Check network connectivity

    ## Security Considerations

    1. **API Credentials**: Store securely, use environment variables
    2. **Data Privacy**: Be cautious about sending proprietary code
    3. **Input Sanitization**: Sanitize queries before sending
    4. **Rate Limiting**: Respect API limits (free tier: 60/min, 1000/day)
    5. **Host-Based Architecture**: Both Gemini CLI and MCP server run on host for auth compatibility
    6. **Network Security**: HTTP mode binds to 127.0.0.1 by default (not 0.0.0.0)
    - API keys are managed by the Gemini CLI
    - No credentials are stored in the MCP server
    - Consultation logs can be disabled for sensitive code
    - Sandbox mode available for testing without API calls

    ## Best Practices

    1. **Rate Limiting**: Implement appropriate delays between calls
    2. **Context Management**: Keep context concise and relevant
    3. **Error Handling**: Always handle Gemini failures gracefully
    4. **User Control**: Allow users to disable auto-consultation
    5. **Logging**: Log consultations for debugging and analysis
    6. **History Management**: Periodically clear history to avoid context bloat
    7. **Mode Selection**: Use stdio for production, HTTP for testing
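The rate-limiting practice above amounts to enforcing a minimum delay between consecutive calls, similar in spirit to the server's `_enforce_rate_limit`; a sketch (this class is illustrative, not the server's actual code):

```python
import asyncio
import time


class RateLimiter:
    """Enforce a minimum delay between consecutive consultations (sketch)."""

    def __init__(self, delay: float = 2.0):
        self.delay = delay
        self._last_call = 0.0

    async def wait(self) -> None:
        # Sleep only for the remainder of the delay window, if any
        elapsed = time.monotonic() - self._last_call
        if elapsed < self.delay:
            await asyncio.sleep(self.delay - elapsed)
        self._last_call = time.monotonic()
```

Using `asyncio.sleep` rather than `time.sleep` keeps the event loop free to serve other MCP requests while waiting.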

    ## Use Cases

    - **Architecture Decisions**: Get second opinions on design choices
    - **Security Reviews**: Validate security implementations
    - **Performance Optimization**: Compare optimization strategies
    - **Code Quality**: Review complex algorithms or patterns
    - **Troubleshooting**: Debug complex technical issues
    - **API Design**: Validate REST/GraphQL/gRPC decisions
    - **Database Schema**: Review data modeling choices
    1. **Clear History Regularly**: Clear conversation history when switching contexts
    2. **Provide Context**: Include relevant context for better AI responses
    3. **Rate Limiting**: Respect rate limits to avoid API quota issues
    4. **Error Handling**: Always handle potential timeout or API errors
    5. **Comparison Mode**: Use comparison mode to get diverse perspectives

    ## Acknowledgments

    - Built for the MCP (Model Context Protocol) ecosystem
    - Integrates with Google's Gemini AI
    - Designed for Claude Desktop integration
    14 changes: 14 additions & 0 deletions gemini-config.example.json
    @@ -0,0 +1,14 @@
    {
    "enabled": true,
    "auto_consult": false,
    "cli_command": "gemini",
    "timeout": 300,
    "rate_limit_delay": 2.0,
    "max_context_length": 4000,
    "log_consultations": true,
    "model": "gemini-2.5-pro",
    "sandbox_mode": false,
    "debug_mode": false,
    "include_history": true,
    "max_history_entries": 10
    }
    19 changes: 0 additions & 19 deletions gemini-config.json
    @@ -1,19 +0,0 @@
    {
    "enabled": true,
    "auto_consult": true,
    "cli_command": "gemini",
    "timeout": 300,
    "rate_limit_delay": 5.0,
    "max_context_length": 4000,
    "log_consultations": true,
    "model": "gemini-2.5-flash",
    "sandbox_mode": true,
    "debug_mode": false,
    "include_history": true,
    "max_history_entries": 10,
    "uncertainty_thresholds": {
    "uncertainty_patterns": true,
    "complex_decisions": true,
    "critical_operations": true
    }
    }
    32 changes: 7 additions & 25 deletions gemini_integration.py
@@ -71,12 +71,12 @@ def __init__(self, config: Optional[Dict[str, Any]] = None):
         self.timeout = self.config.get("timeout", 60)
         self.rate_limit_delay = self.config.get("rate_limit_delay", 2.0)
         self.last_consultation = 0
-        self.consultation_log = []
+        self.consultation_log: List[Dict[str, Any]] = []
         self.max_context_length = self.config.get("max_context_length", 4000)
         self.model = self.config.get("model", "gemini-2.5-flash")
 
         # Conversation history for maintaining state
-        self.conversation_history = []
+        self.conversation_history: List[Tuple[str, str]] = []
         self.max_history_entries = self.config.get("max_history_entries", 10)
         self.include_history = self.config.get("include_history", True)
 
@@ -108,9 +108,7 @@ async def consult_gemini(
             self.conversation_history.append((query, result["output"]))
             # Trim history if it exceeds max entries
             if len(self.conversation_history) > self.max_history_entries:
-                self.conversation_history = self.conversation_history[
-                    -self.max_history_entries :
-                ]
+                self.conversation_history = self.conversation_history[-self.max_history_entries :]
 
             # Log consultation
             if self.config.get("log_consultations", True):
@@ -177,27 +175,15 @@ def get_consultation_stats(self) -> Dict[str, Any]:
             return {"total_consultations": 0}
 
         completed = [e for e in self.consultation_log if e.get("status") == "success"]
-        total_execution_time = sum(e.get("execution_time", 0) for e in completed)
-
-        stats = {
+        return {
             "total_consultations": len(self.consultation_log),
             "completed_consultations": len(completed),
             "average_execution_time": (
-                total_execution_time / len(completed)
-                if completed
-                else 0
+                sum(e.get("execution_time", 0) for e in completed) / len(completed) if completed else 0
             ),
-            "total_execution_time": total_execution_time,
             "conversation_history_size": len(self.conversation_history),
         }
 
-        # Add last consultation timestamp if available
-        if self.consultation_log:
-            last_entry = self.consultation_log[-1]
-            if last_entry.get("timestamp"):
-                stats["last_consultation"] = last_entry["timestamp"]
-
-        return stats
 
     async def _enforce_rate_limit(self):
         """Enforce rate limiting between consultations"""
@@ -222,9 +208,7 @@ def _prepare_query(self, query: str, context: str, comparison_mode: bool) -> str
         if self.include_history and self.conversation_history:
             parts.append("Previous conversation:")
             parts.append("-" * 40)
-            for i, (prev_q, prev_a) in enumerate(
-                self.conversation_history[-self.max_history_entries :], 1
-            ):
+            for i, (prev_q, prev_a) in enumerate(self.conversation_history[-self.max_history_entries :], 1):
                 parts.append(f"Q{i}: {prev_q}")
                 # Truncate long responses in history
                 if len(prev_a) > 500:
@@ -276,9 +260,7 @@ async def _execute_gemini_cli(self, query: str) -> Dict[str, Any]:
                 *cmd, stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE
             )
 
-            stdout, stderr = await asyncio.wait_for(
-                process.communicate(), timeout=self.timeout
-            )
+            stdout, stderr = await asyncio.wait_for(process.communicate(), timeout=self.timeout)
 
             execution_time = time.time() - start_time
 
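The `rate_limit_delay` setting above is enforced by sleeping off whatever remains of the delay between consecutive consultations. A minimal sketch of that pattern — the `RateLimiter` class and its attribute names are illustrative, not the module's own:

```python
import asyncio
import time

class RateLimiter:
    """Ensure at least `delay` seconds elapse between successive async operations."""

    def __init__(self, delay: float):
        self.delay = delay
        self._last = 0.0

    async def wait(self) -> None:
        # Sleep only for the unexpired remainder of the delay window.
        elapsed = time.monotonic() - self._last
        if elapsed < self.delay:
            await asyncio.sleep(self.delay - elapsed)
        self._last = time.monotonic()

async def demo() -> float:
    limiter = RateLimiter(delay=0.2)
    start = time.monotonic()
    await limiter.wait()  # first call passes immediately
    await limiter.wait()  # second call sleeps out the remaining delay
    return time.monotonic() - start

elapsed = asyncio.run(demo())
```

Using a monotonic clock keeps the limiter correct even if the system wall clock is adjusted mid-run.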
    1,307 changes: 988 additions & 319 deletions gemini_mcp_server.py
    @@ -1,391 +1,1060 @@
    #!/usr/bin/env python3
    """
    MCP Server with Gemini Integration
    Gemini AI Integration MCP Server
    Provides development workflow automation with AI second opinions
    """

    import asyncio
    import json
    import logging
    import os
    import sys
    from datetime import datetime
    from pathlib import Path
    from typing import Any, Dict, List, Optional, Tuple
    from typing import Any, Dict, List, Optional

    import mcp.server.stdio
    import mcp.types as types
    from mcp.server import NotificationOptions, Server, InitializationOptions
    from fastapi import FastAPI, HTTPException, Request
    from fastapi.responses import JSONResponse, RedirectResponse, Response
    from mcp.server import InitializationOptions, NotificationOptions, Server
    from pydantic import BaseModel

    # Setup logging
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger(__name__)

    # Check if running in container BEFORE any other imports or operations
    def check_container_and_exit():
    """Check if running in a container and exit immediately if true."""

    def check_container_environment():
    """Check if running in a container"""
    if os.path.exists("/.dockerenv") or os.environ.get("CONTAINER_ENV"):
    print("ERROR: Gemini MCP Server cannot run inside a container!", file=sys.stderr)
    print(
    "The Gemini CLI requires Docker access and must run on the host system.",
    file=sys.stderr,
    )
    print("Please launch this server directly on the host with:", file=sys.stderr)
    print(" python gemini_mcp_server.py", file=sys.stderr)
    sys.exit(1)
    return True
    return False


    def setup_logging(name: str):
    """Setup logging for the server"""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)

    # Create console handler with formatting
    ch = logging.StreamHandler()
    ch.setLevel(logging.INFO)
    formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
    ch.setFormatter(formatter)

    # Add handler to logger
    logger.addHandler(ch)

    return logger


    class ToolRequest(BaseModel):
    """Model for tool execution requests"""
    tool: str
    arguments: Optional[Dict[str, Any]] = None
    parameters: Optional[Dict[str, Any]] = None
    client_id: Optional[str] = None

    # Perform container check immediately
    check_container_and_exit()
    def get_args(self) -> Dict[str, Any]:
    """Get arguments, supporting both 'arguments' and 'parameters' fields"""
    return self.arguments or self.parameters or {}

    # Now import the integration module
    from gemini_integration import get_integration

    class ToolResponse(BaseModel):
    """Model for tool execution responses"""
    success: bool
    result: Any
    error: Optional[str] = None


    class GeminiMCPServer:
    """MCP Server for Gemini AI integration and consultation"""

    def __init__(self, project_root: Optional[str] = None):
    # Check if running in container and exit if true
    if check_container_environment():
    print(
    "ERROR: Gemini MCP Server cannot run inside a container!",
    file=sys.stderr,
    )
    print(
    "The Gemini CLI requires Docker access and must run on the host system.",
    file=sys.stderr,
    )
    print("Please launch this server directly on the host with:", file=sys.stderr)
    print(" python gemini_mcp_server.py", file=sys.stderr)
    sys.exit(1)

    # Initialize base server attributes
    self.name = "Gemini MCP Server"
    self.version = "1.0.0"
    self.port = 8006 # Standard Gemini MCP port
    self.logger = setup_logging("GeminiMCP")
    self.app = FastAPI(title=self.name, version=self.version)
    self._setup_routes()
    self._setup_events()

    class MCPServer:
    def __init__(self, project_root: str = None):
    self.project_root = Path(project_root) if project_root else Path.cwd()
    self.server = Server("gemini-mcp-server")

    # Initialize Gemini integration with singleton pattern

    # Initialize Gemini integration
    self.gemini_config = self._load_gemini_config()
    # Get the singleton instance, passing config on first call
    self.gemini = get_integration(self.gemini_config)

    self.gemini = self._initialize_gemini()

    # Track uncertainty for auto-consultation
    self.last_response_uncertainty = None

    self._setup_tools()

    def _setup_events(self):
    """Setup startup/shutdown events"""
    @self.app.on_event("startup")
    async def startup_event():
    self.logger.info(f"{self.name} starting on port {self.port}")
    self.logger.info(f"Server version: {self.version}")
    self.logger.info("Server initialized successfully")

    def _setup_routes(self):
    """Setup common HTTP routes"""
    self.app.get("/health")(self.health_check)
    self.app.get("/mcp/tools")(self.list_tools)
    self.app.post("/mcp/execute")(self.execute_tool)
    self.app.post("/mcp/register")(self.register_client)
    self.app.post("/register")(self.register_client_oauth)
    self.app.post("/oauth/register")(self.register_client_oauth)
    self.app.get("/authorize")(self.oauth_authorize_bypass)
    self.app.post("/authorize")(self.oauth_authorize_bypass)
    self.app.get("/oauth/authorize")(self.oauth_authorize_bypass)
    self.app.post("/oauth/authorize")(self.oauth_authorize_bypass)
    self.app.post("/token")(self.oauth_token_bypass)
    self.app.post("/oauth/token")(self.oauth_token_bypass)
    self.app.get("/mcp/clients")(self.list_clients)
    self.app.get("/mcp/clients/{client_id}")(self.get_client_info)
    self.app.get("/mcp/stats")(self.get_stats)
    self.app.get("/.well-known/oauth-authorization-server")(self.oauth_discovery)
    self.app.get("/.well-known/oauth-authorization-server/mcp")(self.oauth_discovery)
    self.app.get("/.well-known/oauth-authorization-server/messages")(self.oauth_discovery)
    self.app.get("/.well-known/oauth-protected-resource")(self.oauth_protected_resource)
    self.app.get("/.well-known/mcp")(self.mcp_discovery)
    self.app.post("/mcp/initialize")(self.mcp_initialize)
    self.app.get("/mcp/capabilities")(self.mcp_capabilities)
    self.app.get("/messages")(self.handle_messages_get)
    self.app.post("/messages")(self.handle_messages)
    self.app.get("/mcp")(self.handle_mcp_get)
    self.app.post("/mcp")(self.handle_jsonrpc)
    self.app.options("/mcp")(self.handle_options)
    self.app.post("/mcp/rpc")(self.handle_jsonrpc)
    self.app.get("/mcp/sse")(self.handle_mcp_sse)

    async def health_check(self):
    """Health check endpoint"""
    return {"status": "healthy", "server": self.name, "version": self.version}

    async def register_client(self, request: Dict[str, Any]):
    """Register a client - simplified for home lab use"""
    client_name = request.get("client", request.get("client_name", "unknown"))
    client_id = request.get("client_id", f"{client_name}_simple")

    self.logger.info(f"Client registration request from: {client_name}")

    return {
    "status": "registered",
    "client": client_name,
    "client_id": client_id,
    "server": self.name,
    "version": self.version,
    "registration": {
    "client_id": client_id,
    "client_name": client_name,
    "registered": True,
    "is_update": False,
    "registration_time": datetime.utcnow().isoformat(),
    "server_time": datetime.utcnow().isoformat(),
    },
    }

    async def register_client_oauth(self, request_data: Dict[str, Any], request: Request):
    """OAuth2-style client registration - simplified for home lab use"""
    redirect_uris = request_data.get("redirect_uris", [])
    client_name = request_data.get("client_name", request_data.get("client", "claude-code"))
    client_id = f"{client_name}_oauth"

    self.logger.info(f"OAuth registration request from: {client_name}")

    return {
    "client_id": client_id,
    "client_name": client_name,
    "redirect_uris": redirect_uris if redirect_uris else ["http://localhost"],
    "grant_types": request_data.get("grant_types", ["authorization_code"]),
    "response_types": request_data.get("response_types", ["code"]),
    "token_endpoint_auth_method": request_data.get("token_endpoint_auth_method", "none"),
    "registration_access_token": "not-required-for-local-mcp",
    "registration_client_uri": f"{request.url.scheme}://{request.url.netloc}/mcp/clients/{client_id}",
    "client_id_issued_at": int(datetime.utcnow().timestamp()),
    "client_secret_expires_at": 0,
    }

    async def oauth_authorize_bypass(self, request: Request):
    """Bypass OAuth2 authorization - immediately approve without auth"""
    params = dict(request.query_params)
    redirect_uri = params.get("redirect_uri", "http://localhost")
    state = params.get("state", "")

    auth_code = "bypass-auth-code-no-auth-required"

    separator = "&" if "?" in redirect_uri else "?"
    redirect_url = f"{redirect_uri}{separator}code={auth_code}"
    if state:
    redirect_url += f"&state={state}"

    return RedirectResponse(url=redirect_url, status_code=302)

    async def oauth_token_bypass(self, request: Request):
    """Bypass OAuth2 token exchange - immediately return access token"""
    try:
    if request.headers.get("content-type", "").startswith("application/json"):
    request_data = await request.json()
    else:
    form_data = await request.form()
    request_data = dict(form_data)
    except Exception:
    request_data = {}

    self.logger.info(f"Token request data: {request_data}")

    return {
    "access_token": "bypass-token-no-auth-required",
    "token_type": "Bearer",
    "expires_in": 31536000,
    "scope": "full_access",
    "refresh_token": "bypass-refresh-token-no-auth-required",
    }

    async def oauth_discovery(self, request: Request):
    """OAuth 2.0 authorization server metadata"""
    base_url = f"{request.url.scheme}://{request.url.netloc}"
    return {
    "issuer": base_url,
    "authorization_endpoint": f"{base_url}/authorize",
    "token_endpoint": f"{base_url}/token",
    "registration_endpoint": f"{base_url}/register",
    "token_endpoint_auth_methods_supported": ["none"],
    "response_types_supported": ["code"],
    "grant_types_supported": ["authorization_code"],
    "code_challenge_methods_supported": ["S256"],
    "registration_endpoint_auth_methods_supported": ["none"],
    }

    async def oauth_protected_resource(self, request: Request):
    """OAuth 2.0 protected resource metadata"""
    base_url = f"{request.url.scheme}://{request.url.netloc}"
    return {
    "resource": f"{base_url}/messages",
    "authorization_servers": [base_url],
    }

    async def handle_mcp_get(self, request: Request):
    """Handle GET requests to /mcp endpoint for SSE streaming"""
    import uuid
    from fastapi.responses import StreamingResponse

    session_id = request.headers.get("Mcp-Session-Id", str(uuid.uuid4()))

    async def event_generator():
    connection_data = {
    "type": "connection",
    "sessionId": session_id,
    "status": "connected",
    }
    yield f"data: {json.dumps(connection_data)}\n\n"

    while True:
    await asyncio.sleep(15)
    ping_data = {"type": "ping", "timestamp": datetime.utcnow().isoformat()}
    yield f"data: {json.dumps(ping_data)}\n\n"

    return StreamingResponse(
    event_generator(),
    media_type="text/event-stream",
    headers={
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
    "X-Accel-Buffering": "no",
    "Mcp-Session-Id": session_id,
    },
    )

    async def handle_mcp_sse(self, request: Request):
    """Handle SSE requests for authenticated clients"""
    from fastapi.responses import StreamingResponse

    auth_header = request.headers.get("authorization", "")
    if not auth_header.startswith("Bearer "):
    raise HTTPException(status_code=401, detail="Unauthorized")

    async def event_generator():
    yield f"data: {json.dumps({'type': 'connected', 'message': 'SSE connection established'})}\n\n"

    while True:
    await asyncio.sleep(30)
    yield f"data: {json.dumps({'type': 'ping'})}\n\n"

    return StreamingResponse(
    event_generator(),
    media_type="text/event-stream",
    headers={
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
    "X-Accel-Buffering": "no",
    },
    )

    async def handle_messages_get(self, request: Request):
    """Handle GET requests to /messages endpoint"""
    return {
    "protocol": "mcp",
    "version": "1.0",
    "server": {
    "name": self.name,
    "version": self.version,
    "description": f"{self.name} MCP Server",
    },
    "auth": {
    "required": False,
    "type": "none",
    },
    "transport": {
    "type": "streamable-http",
    "endpoint": "/messages",
    },
    }

    async def handle_messages(self, request: Request):
    """Handle POST requests to /messages endpoint (HTTP Stream Transport)"""
    session_id = request.headers.get("Mcp-Session-Id")
    response_mode = request.headers.get("Mcp-Response-Mode", "batch").lower()
    protocol_version = request.headers.get("MCP-Protocol-Version")

    self.logger.info(f"Messages request headers: {dict(request.headers)}")
    self.logger.info(f"Session ID: {session_id}, Response Mode: {response_mode}, Protocol Version: {protocol_version}")

    try:
    body = await request.json()
    self.logger.info(f"Messages request body: {json.dumps(body)}")

    is_init_request = False
    if isinstance(body, dict) and body.get("method") == "initialize":
    is_init_request = True
    if not session_id:
    import uuid
    session_id = str(uuid.uuid4())
    self.logger.info(f"Generated new session ID: {session_id}")

    if response_mode == "stream":
    from fastapi.responses import StreamingResponse

    async def event_generator():
    if session_id:
    yield f"data: {json.dumps({'type': 'session', 'sessionId': session_id})}\n\n"

    if isinstance(body, list):
    for req in body:
    response = await self._process_jsonrpc_request(req)
    if response:
    yield f"data: {json.dumps(response)}\n\n"
    else:
    response = await self._process_jsonrpc_request(body)
    if response:
    yield f"data: {json.dumps(response)}\n\n"

    yield f"data: {json.dumps({'type': 'completion'})}\n\n"

    return StreamingResponse(
    event_generator(),
    media_type="text/event-stream",
    headers={
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
    "X-Accel-Buffering": "no",
    "Mcp-Session-Id": session_id or "",
    },
    )
    else:
    if isinstance(body, list):
    responses = []
    has_notifications = False
    for req in body:
    response = await self._process_jsonrpc_request(req)
    if response is None:
    has_notifications = True
    else:
    responses.append(response)

    if not responses and has_notifications:
    return Response(
    status_code=202,
    headers={
    "Mcp-Session-Id": session_id or "",
    },
    )

    return JSONResponse(
    content=responses,
    headers={
    "Content-Type": "application/json",
    "Mcp-Session-Id": session_id or "",
    },
    )
    else:
    response = await self._process_jsonrpc_request(body)
    if response is None:
    return Response(
    status_code=202,
    headers={
    "Mcp-Session-Id": session_id or "",
    },
    )
    else:
    if is_init_request and session_id:
    self.logger.info(f"Returning session ID in response: {session_id}")

    return JSONResponse(
    content=response,
    headers={
    "Content-Type": "application/json",
    "Mcp-Session-Id": session_id or "",
    },
    )
    except Exception as e:
    self.logger.error(f"Messages endpoint error: {e}")
    return JSONResponse(
    content={
    "jsonrpc": "2.0",
    "error": {"code": -32700, "message": "Parse error", "data": str(e)},
    "id": None,
    },
    status_code=400,
    headers={
    "Content-Type": "application/json",
    "Mcp-Session-Id": session_id or "",
    },
    )

    async def handle_jsonrpc(self, request: Request):
    """Handle JSON-RPC 2.0 requests for MCP protocol"""
    return await self.handle_messages(request)

    async def handle_options(self, request: Request):
    """Handle OPTIONS requests for CORS preflight"""
    return Response(
    content="",
    headers={
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
    "Access-Control-Allow-Headers": "Content-Type, Authorization, Mcp-Session-Id, Mcp-Response-Mode",
    "Access-Control-Max-Age": "86400",
    },
    )

    async def _process_jsonrpc_request(self, request: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Process a single JSON-RPC request"""
    jsonrpc = request.get("jsonrpc", "2.0")
    method = request.get("method")
    params = request.get("params", {})
    req_id = request.get("id")

    self.logger.info(f"JSON-RPC request: method={method}, id={req_id}")

    is_notification = req_id is None

    try:
    if method == "initialize":
    result = await self._jsonrpc_initialize(params)
    elif method == "initialized":
    self.logger.info("Client sent initialized notification")
    if is_notification:
    return None
    result = {"status": "acknowledged"}
    elif method == "tools/list":
    result = await self._jsonrpc_list_tools(params)
    elif method == "tools/call":
    result = await self._jsonrpc_call_tool(params)
    elif method == "completion/complete":
    result = {"error": "Completions not supported"}
    elif method == "ping":
    result = {"pong": True}
    else:
    if not is_notification:
    return {
    "jsonrpc": jsonrpc,
    "error": {
    "code": -32601,
    "message": f"Method not found: {method}",
    },
    "id": req_id,
    }
    return None

    if not is_notification:
    response = {"jsonrpc": jsonrpc, "result": result, "id": req_id}
    self.logger.info(f"JSON-RPC response: {json.dumps(response)}")

    if method == "initialize" and "protocolVersion" in result:
    self.logger.info("Initialization complete, ready for tools/list request")
    self.logger.info("Expecting client to send 'tools/list' request next")

    return response
    return None

    except Exception as e:
    self.logger.error(f"Error processing method {method}: {e}")
    if not is_notification:
    return {
    "jsonrpc": jsonrpc,
    "error": {
    "code": -32603,
    "message": "Internal error",
    "data": str(e),
    },
    "id": req_id,
    }
    return None

    async def _jsonrpc_initialize(self, params: Dict[str, Any]) -> Dict[str, Any]:
    """Handle initialize request"""
    client_info = params.get("clientInfo", {})
    protocol_version = params.get("protocolVersion", "2024-11-05")

    self.logger.info(f"Client info: {client_info}, requested protocol: {protocol_version}")

    self._protocol_version = protocol_version

    return {
    "protocolVersion": protocol_version,
    "serverInfo": {"name": self.name, "version": self.version},
    "capabilities": {
    "tools": {"listChanged": True},
    "resources": {},
    "prompts": {},
    },
    }

    async def _jsonrpc_list_tools(self, params: Dict[str, Any]) -> Dict[str, Any]:
    """Handle tools/list request"""
    tools = self.get_tools()
    self.logger.info(f"Available tools from get_tools(): {list(tools.keys())}")

    tool_list = []

    for tool_name, tool_info in tools.items():
    tool_list.append(
    {
    "name": tool_name,
    "description": tool_info.get("description", ""),
    "inputSchema": tool_info.get("parameters", {}),
    }
    )

    self.logger.info(f"Returning {len(tool_list)} tools to client")
    return {"tools": tool_list}

    async def _jsonrpc_call_tool(self, params: Dict[str, Any]) -> Dict[str, Any]:
    """Handle tools/call request"""
    tool_name = params.get("name")
    arguments = params.get("arguments", {})

    if not tool_name:
    raise ValueError("Tool name is required")

    tools = self.get_tools()
    if tool_name not in tools:
    raise ValueError(f"Tool '{tool_name}' not found")

    tool_func = getattr(self, tool_name, None)
    if not tool_func:
    raise ValueError(f"Tool '{tool_name}' not implemented")

    try:
    result = await tool_func(**arguments)

    if isinstance(result, dict):
    content_text = json.dumps(result, indent=2)
    else:
    content_text = str(result)

    return {"content": [{"type": "text", "text": content_text}]}
    except Exception as e:
    self.logger.error(f"Error calling tool {tool_name}: {e}")
    return {
    "content": [{"type": "text", "text": f"Error executing {tool_name}: {str(e)}"}],
    "isError": True,
    }

    async def mcp_discovery(self):
    """MCP protocol discovery endpoint"""
    return {
    "mcp_version": "1.0",
    "server_name": self.name,
    "server_version": self.version,
    "capabilities": {
    "tools": True,
    "prompts": False,
    "resources": False,
    },
    "endpoints": {
    "tools": "/mcp/tools",
    "execute": "/mcp/execute",
    "initialize": "/mcp/initialize",
    "capabilities": "/mcp/capabilities",
    },
    }

    async def mcp_info(self):
    """MCP server information"""
    return {
    "protocol": "mcp",
    "version": "1.0",
    "server": {
    "name": self.name,
    "version": self.version,
    "description": f"{self.name} MCP Server",
    },
    "auth": {
    "required": False,
    "type": "none",
    },
    }

    async def mcp_initialize(self, request: Dict[str, Any]):
    """Initialize MCP session"""
    client_info = request.get("client", {})
    return {
    "session_id": f"session-{client_info.get('name', 'unknown')}-{int(datetime.utcnow().timestamp())}",
    "server": {
    "name": self.name,
    "version": self.version,
    },
    "capabilities": {
    "tools": True,
    "prompts": False,
    "resources": False,
    },
    }

    async def mcp_capabilities(self):
    """Return server capabilities"""
    tools = self.get_tools()
    return {
    "capabilities": {
    "tools": {
    "list": list(tools.keys()),
    "count": len(tools),
    },
    "prompts": {
    "supported": False,
    },
    "resources": {
    "supported": False,
    },
    },
    }

    async def list_tools(self):
    """List available tools"""
    tools = self.get_tools()
    return {
    "tools": [
    {
    "name": tool_name,
    "description": tool_info.get("description", ""),
    "parameters": tool_info.get("parameters", {}),
    }
    for tool_name, tool_info in tools.items()
    ]
    }

    async def execute_tool(self, request: ToolRequest):
    """Execute a tool with given arguments"""
    try:
    tools = self.get_tools()
    if request.tool not in tools:
    raise HTTPException(status_code=404, detail=f"Tool '{request.tool}' not found")

    tool_func = getattr(self, request.tool, None)
    if not tool_func:
    raise HTTPException(status_code=501, detail=f"Tool '{request.tool}' not implemented")

    result = await tool_func(**request.get_args())

    return ToolResponse(success=True, result=result)

    except Exception as e:
    self.logger.error(f"Error executing tool {request.tool}: {str(e)}")
    return ToolResponse(success=False, result=None, error=str(e))

    async def list_clients(self, active_only: bool = True):
    """List clients - returns empty for home lab use"""
    return {"clients": [], "count": 0, "active_only": active_only}

    async def get_client_info(self, client_id: str):
    """Get client info - returns simple response for home lab use"""
    return {
    "client_id": client_id,
    "client_name": client_id.replace("_oauth", "").replace("_simple", ""),
    "active": True,
    "registered_at": datetime.utcnow().isoformat(),
    }

    async def get_stats(self):
    """Get server statistics - simplified for home lab use"""
    return {
    "server": {
    "name": self.name,
    "version": self.version,
    "tools_count": len(self.get_tools()),
    },
    "clients": {
    "total_clients": 0,
    "active_clients": 0,
    "inactive_clients": 0,
    "clients_active_last_hour": 0,
    "total_requests": 0,
    },
    }

    def _load_gemini_config(self) -> Dict[str, Any]:
    """Load Gemini configuration from environment or config file."""
    """Load Gemini configuration from environment or config file"""
    # Try to load .env file if it exists
    env_file = self.project_root / '.env'
    env_file = self.project_root / ".env"
    if env_file.exists():
    try:
    with open(env_file, 'r') as f:
    with open(env_file, "r") as f:
    for line in f:
    line = line.strip()
    if line and not line.startswith('#') and '=' in line:
    key, value = line.split('=', 1)
    if line and not line.startswith("#") and "=" in line:
    key, value = line.split("=", 1)
    # Only set if not already in environment
    if key not in os.environ:
    os.environ[key] = value
    except Exception as e:
    print(f"Warning: Could not load .env file: {e}")
    self.logger.warning(f"Could not load .env file: {e}")

    config = {
    'enabled': os.getenv('GEMINI_ENABLED', 'true').lower() == 'true',
    'auto_consult': os.getenv('GEMINI_AUTO_CONSULT', 'true').lower() == 'true',
    'cli_command': os.getenv('GEMINI_CLI_COMMAND', 'gemini'),
    'timeout': int(os.getenv('GEMINI_TIMEOUT', '60')),
    'rate_limit_delay': float(os.getenv('GEMINI_RATE_LIMIT', '2')),
    'max_context_length': int(os.getenv('GEMINI_MAX_CONTEXT', '4000')),
    'log_consultations': os.getenv('GEMINI_LOG_CONSULTATIONS', 'true').lower() == 'true',
    'model': os.getenv('GEMINI_MODEL', 'gemini-2.5-flash'),
    'sandbox_mode': os.getenv('GEMINI_SANDBOX', 'false').lower() == 'true',
    'debug_mode': os.getenv('GEMINI_DEBUG', 'false').lower() == 'true',
    'include_history': os.getenv('GEMINI_INCLUDE_HISTORY', 'true').lower() == 'true',
    'max_history_entries': int(os.getenv('GEMINI_MAX_HISTORY', '10')),
    "enabled": os.getenv("GEMINI_ENABLED", "true").lower() == "true",
    "auto_consult": os.getenv("GEMINI_AUTO_CONSULT", "true").lower() == "true",
    "cli_command": os.getenv("GEMINI_CLI_COMMAND", "gemini"),
    "timeout": int(os.getenv("GEMINI_TIMEOUT", "60")),
    "rate_limit_delay": float(os.getenv("GEMINI_RATE_LIMIT", "2")),
    "max_context_length": int(os.getenv("GEMINI_MAX_CONTEXT", "4000")),
    "log_consultations": os.getenv("GEMINI_LOG_CONSULTATIONS", "true").lower() == "true",
    "model": os.getenv("GEMINI_MODEL", "gemini-2.5-flash"),
    "sandbox_mode": os.getenv("GEMINI_SANDBOX", "false").lower() == "true",
    "debug_mode": os.getenv("GEMINI_DEBUG", "false").lower() == "true",
    "include_history": os.getenv("GEMINI_INCLUDE_HISTORY", "true").lower() == "true",
    "max_history_entries": int(os.getenv("GEMINI_MAX_HISTORY", "10")),
    }

    # Try to load from config file
    config_file = self.project_root / 'gemini-config.json'
    config_file = self.project_root / "gemini-config.json"
    if config_file.exists():
    try:
    with open(config_file, 'r') as f:
    with open(config_file, "r") as f:
    file_config = json.load(f)
    config.update(file_config)
    except Exception as e:
    print(f"Warning: Could not load gemini-config.json: {e}")
    self.logger.warning(f"Could not load gemini-config.json: {e}")

    return config

    def _setup_tools(self):
    """Register all MCP tools"""

    # Gemini consultation tool
    @self.server.call_tool()
    async def consult_gemini(arguments: Dict[str, Any]) -> List[types.TextContent]:
    """Consult Gemini CLI for a second opinion or validation."""
    query = arguments.get('query', '')
    context = arguments.get('context', '')
    comparison_mode = arguments.get('comparison_mode', True)
    force_consult = arguments.get('force', False)

    if not query:
    return [types.TextContent(
    type="text",
    text="❌ Error: 'query' parameter is required for Gemini consultation"
    )]

    # Consult Gemini
    result = await self.gemini.consult_gemini(
    query=query,
    context=context,
    comparison_mode=comparison_mode,
    force_consult=force_consult
    )

    # Format the response
    return await self._format_gemini_response(result)

    @self.server.call_tool()
    async def gemini_status(arguments: Dict[str, Any]) -> List[types.TextContent]:
    """Get Gemini integration status and statistics."""
    return await self._get_gemini_status()

    @self.server.call_tool()
    async def toggle_gemini_auto_consult(arguments: Dict[str, Any]) -> List[types.TextContent]:
    """Toggle automatic Gemini consultation on uncertainty detection."""
    enable = arguments.get('enable', None)

    if enable is None:
    # Toggle current state
    self.gemini.auto_consult = not self.gemini.auto_consult
    else:
    self.gemini.auto_consult = bool(enable)

    status = "enabled" if self.gemini.auto_consult else "disabled"
    return [types.TextContent(
    type="text",
    text=f"βœ… Gemini auto-consultation is now {status}"
    )]

    @self.server.call_tool()
    async def clear_gemini_history(arguments: Dict[str, Any]) -> List[types.TextContent]:
    """Clear Gemini conversation history."""
    result = self.gemini.clear_conversation_history()
    return [types.TextContent(
    type="text",
    text=f"βœ… {result['message']}"
    )]

    def _initialize_gemini(self):
    """Initialize Gemini integration with lazy loading"""
    try:
    from gemini_integration import get_integration

    return get_integration(self.gemini_config)
    except ImportError as e:
    self.logger.error(f"Failed to import Gemini integration: {e}")

    # Return a mock object that always returns disabled status
    class MockGemini:
    def __init__(self):
    self.auto_consult = False
    self.enabled = False

    async def consult_gemini(self, **kwargs):
    return {
    "status": "disabled",
    "error": "Gemini integration not available",
    }

    def clear_conversation_history(self):
    return {"message": "Gemini integration not available"}

    def get_statistics(self):
    return {}

    return MockGemini()

    def get_tools(self) -> Dict[str, Dict[str, Any]]:
    """Return available Gemini tools"""
    return {
    "consult_gemini": {
    "description": "Consult Gemini AI for a second opinion or validation",
    "parameters": {
    "type": "object",
    "properties": {
    "query": {
    "type": "string",
    "description": "The question or code to consult Gemini about",
    },
    "context": {
    "type": "string",
    "description": "Additional context for the consultation",
    },
    "comparison_mode": {
    "type": "boolean",
    "default": True,
    "description": "Compare with previous Claude response",
    },
    "force": {
    "type": "boolean",
    "default": False,
    "description": "Force consultation even if disabled",
    },
    },
    "required": ["query"],
    },
    },
    "clear_gemini_history": {
    "description": "Clear Gemini conversation history",
    "parameters": {"type": "object", "properties": {}},
    },
    "gemini_status": {
    "description": "Get Gemini integration status and statistics",
    "parameters": {"type": "object", "properties": {}},
    },
    "toggle_gemini_auto_consult": {
    "description": "Toggle automatic Gemini consultation on uncertainty detection",
    "parameters": {
    "type": "object",
    "properties": {
    "enable": {
    "type": "boolean",
    "description": "Enable or disable auto-consultation",
    }
    },
    },
    },
    }

    async def consult_gemini(
    self,
    query: str,
    context: str = "",
    comparison_mode: bool = True,
    force: bool = False,
    ) -> Dict[str, Any]:
    """Consult Gemini AI for a second opinion
    Args:
    query: The question or code to consult about
    context: Additional context
    comparison_mode: Compare with previous Claude response
    force: Force consultation even if disabled
    Returns:
    Dictionary with consultation results
    """
    if not query:
    return {
    "success": False,
    "error": "'query' parameter is required for Gemini consultation",
    }

    # Consult Gemini
    result = await self.gemini.consult_gemini(
    query=query,
    context=context,
    comparison_mode=comparison_mode,
    force_consult=force,
    )

    # Format the response
    formatted_response = self._format_gemini_response(result)

    return {
    "success": result.get("status") == "success",
    "result": formatted_response,
    "raw_result": result,
    }

    async def clear_gemini_history(self) -> Dict[str, Any]:
    """Clear Gemini conversation history"""
    result = self.gemini.clear_conversation_history()
    return {"success": True, "message": result.get("message", "History cleared")}

    async def gemini_status(self) -> Dict[str, Any]:
    """Get Gemini integration status and statistics"""
    stats = self.gemini.get_statistics() if hasattr(self.gemini, "get_statistics") else {}

    status_info = {
    "enabled": getattr(self.gemini, "enabled", False),
    "auto_consult": getattr(self.gemini, "auto_consult", False),
    "model": self.gemini_config.get("model", "unknown"),
    "timeout": self.gemini_config.get("timeout", 60),
    "statistics": stats,
    }

    return {"success": True, "status": status_info}

    async def toggle_gemini_auto_consult(self, enable: Optional[bool] = None) -> Dict[str, Any]:
    """Toggle automatic Gemini consultation
    Args:
    enable: True to enable, False to disable, None to toggle
    Returns:
    Dictionary with new status
    """
    if enable is None:
    # Toggle current state
    self.gemini.auto_consult = not getattr(self.gemini, "auto_consult", False)
    else:
    self.gemini.auto_consult = bool(enable)

    status = "enabled" if self.gemini.auto_consult else "disabled"
    return {
    "success": True,
    "status": status,
    "message": f"Gemini auto-consultation is now {status}",
    }

    def _format_gemini_response(self, result: Dict[str, Any]) -> str:
    """Format Gemini consultation response"""
    output_lines = []
    output_lines.append("πŸ€– Gemini Consultation Response")
    output_lines.append("=" * 40)
    output_lines.append("")
    if result["status"] == "success":
    output_lines.append(f"βœ… Consultation ID: {result.get('consultation_id', 'N/A')}")
    output_lines.append(f"⏱️ Execution time: {result.get('execution_time', 0):.2f}s")
    output_lines.append("")

    # Display the raw response
    response = result.get("response", "")
    if response:
    output_lines.append("πŸ“„ Response:")
    output_lines.append(response)

    elif result["status"] == "disabled":
    output_lines.append("ℹ️ Gemini consultation is currently disabled")
    output_lines.append("πŸ’‘ Enable with: toggle_gemini_auto_consult")

    elif result["status"] == "timeout":
    output_lines.append(f"❌ {result.get('error', 'Timeout error')}")
    output_lines.append("πŸ’‘ Try increasing the timeout or simplifying the query")

    else: # error
    output_lines.append(f"❌ Error: {result.get('error', 'Unknown error')}")
    output_lines.append("")
    output_lines.append("πŸ’‘ Troubleshooting:")
    output_lines.append(" 1. Check if Gemini CLI is installed and in PATH")
    output_lines.append(" 2. Verify Gemini CLI authentication")
    output_lines.append(" 3. Check the logs for more details")

    return "\n".join(output_lines)

    async def _get_gemini_status(self) -> List[types.TextContent]:
    """Get Gemini integration status and statistics."""
    output_lines = []
    output_lines.append("πŸ€– Gemini Integration Status")
    output_lines.append("=" * 40)
    output_lines.append("")

    # Configuration status
    output_lines.append("βš™οΈ Configuration:")
    output_lines.append(f" β€’ Enabled: {'βœ… Yes' if self.gemini.enabled else '❌ No'}")
    output_lines.append(f" β€’ Auto-consult: {'βœ… Yes' if self.gemini.auto_consult else '❌ No'}")
    output_lines.append(f" β€’ CLI command: {self.gemini.cli_command}")
    output_lines.append(f" β€’ Timeout: {self.gemini.timeout}s")
    output_lines.append(f" β€’ Rate limit: {self.gemini.rate_limit_delay}s")
    output_lines.append("")

    # Check if Gemini CLI is available
    try:
    # Test with a simple prompt rather than --version (which may not be supported)
    check_process = await asyncio.create_subprocess_exec(
    self.gemini.cli_command, "-p", "test",
    stdout=asyncio.subprocess.PIPE,
    stderr=asyncio.subprocess.PIPE
    )
    stdout, stderr = await asyncio.wait_for(check_process.communicate(), timeout=10)

    if check_process.returncode == 0:
    output_lines.append("βœ… Gemini CLI is available and working")
    # Try to get version info from help or other means
    try:
    help_process = await asyncio.create_subprocess_exec(
    self.gemini.cli_command, "--help",
    stdout=asyncio.subprocess.PIPE,
    stderr=asyncio.subprocess.PIPE
    )

    async def run_stdio(self):
    """Run the server in stdio mode (for Claude desktop app)"""
    server = Server(self.name)

    # Store tools and their functions for later access
    self._tools = self.get_tools()
    self._tool_funcs = {}
    for tool_name, tool_info in self._tools.items():
    tool_func = getattr(self, tool_name, None)
    if tool_func:
    self._tool_funcs[tool_name] = tool_func

    @server.list_tools()
    async def list_tools() -> List[types.Tool]:
    """List available tools"""
    tools = []
    for tool_name, tool_info in self._tools.items():
    tools.append(
    types.Tool(
    name=tool_name,
    description=tool_info.get("description", ""),
    inputSchema=tool_info.get("parameters", {}),
    )
    help_stdout, _ = await help_process.communicate()
    help_text = help_stdout.decode()
    # Look for version in help output
    if "version" in help_text.lower():
    for line in help_text.split('\n'):
    if 'version' in line.lower():
    output_lines.append(f" {line.strip()}")
    break
    except Exception:
    pass
    else:
    error_msg = stderr.decode() if stderr else "Unknown error"
    output_lines.append("❌ Gemini CLI found but not working properly")
    output_lines.append(f" Command tested: {self.gemini.cli_command}")
    output_lines.append(f" Error: {error_msg}")

    # Check for authentication issues
    if "authentication" in error_msg.lower() or "api key" in error_msg.lower():
    output_lines.append("")
    output_lines.append("πŸ”‘ Authentication required:")
    output_lines.append(" 1. Set GEMINI_API_KEY environment variable, or")
    output_lines.append(" 2. Run 'gemini' interactively to authenticate with Google")

    except asyncio.TimeoutError:
    output_lines.append("❌ Gemini CLI test timed out")
    output_lines.append(" This may indicate authentication is required")
    except FileNotFoundError:
    output_lines.append("❌ Gemini CLI not found in PATH")
    output_lines.append(f" Expected command: {self.gemini.cli_command}")
    output_lines.append("")
    output_lines.append("πŸ“¦ Installation:")
    output_lines.append(" npm install -g @google/gemini-cli")
    output_lines.append(" OR")
    output_lines.append(" npx @google/gemini-cli")
    except Exception as e:
    output_lines.append(f"❌ Error checking Gemini CLI: {str(e)}")

    output_lines.append("")

    # Consultation statistics
    stats = self.gemini.get_consultation_stats()
    output_lines.append("πŸ“Š Consultation Statistics:")
    output_lines.append(f" β€’ Total consultations: {stats.get('total_consultations', 0)}")

    completed = stats.get('completed_consultations', 0)
    output_lines.append(f" β€’ Completed: {completed}")

    if completed > 0:
    avg_time = stats.get('average_execution_time', 0)
    output_lines.append(f" β€’ Average time: {avg_time:.2f}s")
    total_time = sum(
    e.get("execution_time", 0)
    for e in self.gemini.consultation_log
    if e.get("status") == "success"
    )
    output_lines.append(f" β€’ Total time: {total_time:.2f}s")

    output_lines.append("")
    output_lines.append("πŸ’‘ Usage:")
    output_lines.append(" β€’ Direct: Use 'consult_gemini' tool")
    output_lines.append(" β€’ Auto: Enable auto-consult for uncertainty detection")
    output_lines.append(" β€’ Toggle: Use 'toggle_gemini_auto_consult' tool")

    return [types.TextContent(type="text", text="\n".join(output_lines))]

    def detect_response_uncertainty(self, response: str) -> Tuple[bool, List[str]]:
    """
    Detect uncertainty in a response for potential auto-consultation.
    This is a wrapper around the GeminiIntegration's detection.
    """
    return self.gemini.detect_uncertainty(response)

    async def maybe_consult_gemini(self, response: str, context: str = "") -> Optional[Dict[str, Any]]:
    """
    Check if response contains uncertainty and consult Gemini if needed.
    Args:
    response: The response to check for uncertainty
    context: Additional context for the consultation
    Returns:
    Gemini consultation result if consulted, None otherwise
    """
    if not self.gemini.auto_consult or not self.gemini.enabled:
    return None

    has_uncertainty, patterns = self.detect_response_uncertainty(response)

    if has_uncertainty:
    # Extract the main question or topic from the response
    query = f"Please provide a second opinion on this analysis:\n\n{response}"

    # Add uncertainty patterns to context
    enhanced_context = f"{context}\n\nUncertainty detected in: {', '.join(patterns)}"

    result = await self.gemini.consult_gemini(
    query=query,
    context=enhanced_context,
    comparison_mode=True
    )

    return result

    return None

    def run(self):
    """Run the MCP server."""
    async def main():
    async with mcp.server.stdio.stdio_server() as (read_stream, write_stream):
    await self.server.run(
    read_stream,
    write_stream,
    InitializationOptions(
    server_name="gemini-mcp-server",
    server_version="1.0.0",
    capabilities=self.server.get_capabilities(
    notification_options=NotificationOptions(),
    experimental_capabilities={},
    ),
    ),
    )

    asyncio.run(main())
    return tools

    @server.call_tool()
    async def call_tool(name: str, arguments: Dict[str, Any]) -> List[types.TextContent]:
    """Call a tool with given arguments"""
    if name not in self._tool_funcs:
    return [types.TextContent(type="text", text=f"Tool '{name}' not found")]

    try:
    # Call the tool function
    result = await self._tool_funcs[name](**arguments)

    # Convert result to MCP response format
    if isinstance(result, dict):
    return [types.TextContent(type="text", text=json.dumps(result, indent=2))]
    return [types.TextContent(type="text", text=str(result))]
    except Exception as e:
    self.logger.error(f"Error calling tool {name}: {str(e)}")
    return [types.TextContent(type="text", text=f"Error: {str(e)}")]

    # Run the stdio server
    async with mcp.server.stdio.stdio_server() as (read_stream, write_stream):
    await server.run(
    read_stream,
    write_stream,
    InitializationOptions(
    server_name=self.name,
    server_version=self.version,
    capabilities=server.get_capabilities(
    notification_options=NotificationOptions(),
    experimental_capabilities={},
    ),
    ),
    )

    def run_http(self):
    """Run the server in HTTP mode"""
    import uvicorn

    uvicorn.run(self.app, host="0.0.0.0", port=self.port)

    def run(self, mode: str = "http"):
    """Run the server in specified mode"""
    if mode == "stdio":
    asyncio.run(self.run_stdio())
    elif mode == "http":
    self.run_http()
    else:
    raise ValueError(f"Unknown mode: {mode}. Use 'stdio' or 'http'.")


    def main():
    """Run the Gemini MCP Server"""
    import argparse

    parser = argparse.ArgumentParser(description="Gemini AI Integration MCP Server")
    parser.add_argument(
    "--mode",
    choices=["http", "stdio"],
    default="stdio",  # Default to stdio for Gemini
    help="Server mode (http or stdio)",
    )
    parser.add_argument("--project-root", default=None, help="Project root directory")
    args = parser.parse_args()

    # Check if running in container - exit with instructions if true
    check_container_and_exit()

    server = GeminiMCPServer(project_root=args.project_root)
    server.run(mode=args.mode)


    if __name__ == "__main__":
    main()
    237 changes: 0 additions & 237 deletions gemini_mcp_server_http.py
    @@ -1,237 +0,0 @@
    #!/usr/bin/env python3
    """
    HTTP-based Gemini MCP Server
    Provides REST API interface for Gemini consultation (for testing/development)
    """

    import os
    import sys
    from pathlib import Path
    from typing import Any, Dict, Optional

    from fastapi import FastAPI, HTTPException
    from pydantic import BaseModel
    import uvicorn

    # Check if running in container BEFORE any other imports or operations
    def check_container_and_exit():
    """Check if running in a container and exit immediately if true."""
    if os.path.exists("/.dockerenv") or os.environ.get("CONTAINER_ENV"):
    print("ERROR: Gemini MCP Server cannot run inside a container!", file=sys.stderr)
    print(
    "The Gemini CLI requires Docker access and must run on the host system.",
    file=sys.stderr,
    )
    print("Please launch this server directly on the host with:", file=sys.stderr)
    print(" python gemini_mcp_server_http.py", file=sys.stderr)
    sys.exit(1)


    # Perform container check immediately
    check_container_and_exit()

    # Now import the integration module
    from gemini_integration import get_integration

    # Initialize FastAPI app
    app = FastAPI(
    title="Gemini MCP Server (HTTP Mode)",
    version="1.0.0",
    description="HTTP interface for Gemini CLI integration"
    )


    class ConsultRequest(BaseModel):
    query: str
    context: Optional[str] = ""
    comparison_mode: Optional[bool] = True
    force: Optional[bool] = False


    class ToggleRequest(BaseModel):
    enable: Optional[bool] = None


    class ConsultResponse(BaseModel):
    status: str
    response: Optional[str] = None
    error: Optional[str] = None
    consultation_id: Optional[str] = None
    execution_time: Optional[float] = None
    timestamp: Optional[str] = None


    # Initialize Gemini integration
    project_root = Path(os.environ.get("PROJECT_ROOT", "."))
    config = {
    'enabled': os.getenv('GEMINI_ENABLED', 'true').lower() == 'true',
    'auto_consult': os.getenv('GEMINI_AUTO_CONSULT', 'true').lower() == 'true',
    'cli_command': os.getenv('GEMINI_CLI_COMMAND', 'gemini'),
    'timeout': int(os.getenv('GEMINI_TIMEOUT', '60')),
    'rate_limit_delay': float(os.getenv('GEMINI_RATE_LIMIT', '2')),
    'max_context_length': int(os.getenv('GEMINI_MAX_CONTEXT', '4000')),
    'log_consultations': os.getenv('GEMINI_LOG_CONSULTATIONS', 'true').lower() == 'true',
    'model': os.getenv('GEMINI_MODEL', 'gemini-2.5-flash'),
    }

    # Get the singleton instance
    gemini = get_integration(config)


    @app.get("/")
    async def root():
    """Root endpoint showing server info"""
    return {
    "name": "Gemini MCP Server (HTTP Mode)",
    "version": "1.0.0",
    "mode": "http",
    "endpoints": {
    "health": "/health",
    "tools": "/mcp/tools",
    "consult": "/mcp/tools/consult_gemini",
    "status": "/mcp/tools/gemini_status",
    "toggle": "/mcp/tools/toggle_gemini_auto_consult",
    "clear_history": "/mcp/tools/clear_gemini_history"
    }
    }


    @app.get("/health")
    async def health():
    """Health check endpoint"""
    return {"status": "healthy", "mode": "http"}


    @app.get("/mcp/tools")
    async def list_tools():
    """List available MCP tools"""
    return {
    "tools": [
    {
    "name": "consult_gemini",
    "description": "Consult Gemini CLI for a second opinion or validation",
    "parameters": {
    "query": {"type": "string", "required": True},
    "context": {"type": "string", "required": False},
    "comparison_mode": {"type": "boolean", "required": False, "default": True},
    "force": {"type": "boolean", "required": False, "default": False}
    }
    },
    {
    "name": "gemini_status",
    "description": "Get Gemini integration status and statistics",
    "parameters": {}
    },
    {
    "name": "toggle_gemini_auto_consult",
    "description": "Toggle automatic Gemini consultation on uncertainty detection",
    "parameters": {
    "enable": {"type": "boolean", "required": False}
    }
    },
    {
    "name": "clear_gemini_history",
    "description": "Clear Gemini conversation history",
    "parameters": {}
    }
    ]
    }


    @app.post("/mcp/tools/consult_gemini", response_model=ConsultResponse)
    async def consult_gemini_endpoint(request: ConsultRequest):
    """Consult Gemini for a second opinion"""
    if not request.query:
    raise HTTPException(status_code=400, detail="Query parameter is required")

    result = await gemini.consult_gemini(
    query=request.query,
    context=request.context,
    comparison_mode=request.comparison_mode,
    force_consult=request.force
    )

    return ConsultResponse(**result)


    @app.get("/mcp/tools/gemini_status")
    async def gemini_status_endpoint():
    """Get Gemini integration status and statistics"""
    import asyncio

    # Build status information
    status = {
    "configuration": {
    "enabled": gemini.enabled,
    "auto_consult": gemini.auto_consult,
    "cli_command": gemini.cli_command,
    "timeout": gemini.timeout,
    "rate_limit": gemini.rate_limit_delay,
    "model": gemini.model
    },
    "statistics": gemini.get_consultation_stats()
    }

    # Check CLI availability
    try:
    check_process = await asyncio.create_subprocess_exec(
    gemini.cli_command, "-p", "test",
    stdout=asyncio.subprocess.PIPE,
    stderr=asyncio.subprocess.PIPE
    )
    stdout, stderr = await asyncio.wait_for(check_process.communicate(), timeout=5)

    status["cli_available"] = check_process.returncode == 0
    if not status["cli_available"]:
    status["cli_error"] = stderr.decode() if stderr else "Unknown error"
    except Exception as e:
    status["cli_available"] = False
    status["cli_error"] = str(e)

    return status


    @app.post("/mcp/tools/toggle_gemini_auto_consult")
    async def toggle_auto_consult_endpoint(request: ToggleRequest):
    """Toggle automatic Gemini consultation"""
    if request.enable is None:
    # Toggle current state
    gemini.auto_consult = not gemini.auto_consult
    else:
    gemini.auto_consult = bool(request.enable)

    return {
    "status": "success",
    "auto_consult": gemini.auto_consult,
    "message": f"Gemini auto-consultation is now {'enabled' if gemini.auto_consult else 'disabled'}"
    }


    @app.post("/mcp/tools/clear_gemini_history")
    async def clear_history_endpoint():
    """Clear Gemini conversation history"""
    result = gemini.clear_conversation_history()
    return result


    def run_http_server(port: int):
    """Run the HTTP server"""
    host = os.environ.get("GEMINI_MCP_HOST", "127.0.0.1")
    print(f"Starting Gemini MCP Server in HTTP mode on {host}:{port}")
    print("WARNING: HTTP mode is for testing only. Use stdio mode for production.")
    print(f"Access the API at: http://{host}:{port}")
    print(f"Health check: http://{host}:{port}/health")
    print(f"API docs: http://{host}:{port}/docs")

    uvicorn.run(app, host=host, port=port)


    if __name__ == "__main__":
    # Check if running in container
    check_container_and_exit()

    # Default port from environment or 8006
    default_port = int(os.environ.get("GEMINI_MCP_PORT", "8006"))

    # Run server
    run_http_server(default_port)
    6 changes: 6 additions & 0 deletions requirements.txt
    @@ -0,0 +1,6 @@
    # Gemini MCP Server Requirements
    mcp>=0.9.0
    fastapi>=0.104.0
    uvicorn>=0.24.0
    pydantic>=2.0.0
    python-multipart>=0.0.6
    131 changes: 71 additions & 60 deletions start-gemini-mcp.sh
    @@ -1,69 +1,80 @@
    #!/bin/bash
    #
    # Start script for Gemini MCP Server
    # Supports both stdio and HTTP modes
    #

    # Default values
    MODE="stdio"
    PROJECT_ROOT="."
    PORT="8006"

    # Parse command line arguments
    while [[ $# -gt 0 ]]; do
    case $1 in
    --mode)
    MODE="$2"
    shift 2
    ;;
    --project-root)
    PROJECT_ROOT="$2"
    shift 2
    ;;
    --port)
    PORT="$2"
    shift 2
    ;;
    --help)
    echo "Usage: $0 [options]"
    echo "Options:"
    echo " --mode <stdio|http> Server mode (default: stdio)"
    echo " --project-root <path> Project root directory (default: .)"
    echo " --port <port> Port for HTTP mode (default: 8006)"
    echo " --help Show this help message"
    exit 0
    ;;
    *)
    echo "Unknown option: $1"
    exit 1
    ;;
    esac
    done

    # Check if running in container
    if [ -f /.dockerenv ] || [ -n "$CONTAINER_ENV" ]; then
    echo "ERROR: Gemini MCP Server cannot run inside a container!"
    echo "The Gemini CLI requires Docker access and must run on the host system."
    echo "Please run this script on your host machine."
    exit 1
    fi

    # Check if Python is available
    if ! command -v python &> /dev/null && ! command -v python3 &> /dev/null; then
    echo "ERROR: Python is not installed or not in PATH"
    exit 1
    fi

    # Use python3 if available, otherwise python
    PYTHON_CMD="python"
    if command -v python3 &> /dev/null; then
    PYTHON_CMD="python3"
    fi

    # Set default environment variables if not already set
    export GEMINI_ENABLED="${GEMINI_ENABLED:-true}"
    export GEMINI_AUTO_CONSULT="${GEMINI_AUTO_CONSULT:-true}"
    export GEMINI_CLI_COMMAND="${GEMINI_CLI_COMMAND:-gemini}"
    export GEMINI_TIMEOUT="${GEMINI_TIMEOUT:-200}"
    export GEMINI_RATE_LIMIT="${GEMINI_RATE_LIMIT:-2}"
    export GEMINI_MODEL="${GEMINI_MODEL:-gemini-2.5-flash}"
    export GEMINI_MCP_PORT="${GEMINI_MCP_PORT:-8006}"
    export GEMINI_MCP_HOST="${GEMINI_MCP_HOST:-127.0.0.1}"
    # Get the directory where this script is located
    SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"

    # Change to script directory
    cd "$SCRIPT_DIR"

    # Start the server
    echo "Starting Gemini MCP Server..."
    echo "Mode: $MODE"
    echo "Project Root: $PROJECT_ROOT"

    if [ "$MODE" = "http" ]; then
    echo "Port: $PORT"
    export UVICORN_PORT=$PORT
    fi

    if [[ "$MODE" == "http" ]]; then
    echo "Starting HTTP server on $GEMINI_MCP_HOST:$GEMINI_MCP_PORT..."
    echo ""

    # Start HTTP server
    python3 gemini_mcp_server.py --port $GEMINI_MCP_PORT

    # If server stops, show how to test it
    echo ""
    echo "Server stopped. To test the HTTP server:"
    echo " curl http://$GEMINI_MCP_HOST:$GEMINI_MCP_PORT/health"
    echo " curl http://$GEMINI_MCP_HOST:$GEMINI_MCP_PORT/mcp/tools"
    else
    echo "stdio mode instructions:"
    echo ""
    echo "To use with Claude Code, add this to your MCP settings:"
    echo ""
    echo '{'
    echo ' "mcpServers": {'
    echo ' "gemini": {'
    echo ' "command": "python3",'
    echo " \"args\": [\"$(pwd)/gemini_mcp_server.py\", \"--project-root\", \".\"],"
    echo " \"cwd\": \"$(pwd)\","
    echo ' "env": {'
    echo ' "GEMINI_ENABLED": "true",'
    echo ' "GEMINI_AUTO_CONSULT": "true",'
    echo ' "GEMINI_CLI_COMMAND": "gemini"'
    echo ' }'
    echo ' }'
    echo ' }'
    echo '}'
    echo ""
    echo "Or run directly for testing:"
    echo " python3 gemini_mcp_server.py --project-root ."
    echo ""
    echo "For HTTP mode, run: $0 --http"
    fi
    # Run the server
    exec $PYTHON_CMD gemini_mcp_server.py --mode "$MODE" --project-root "$PROJECT_ROOT"
    314 changes: 159 additions & 155 deletions test_gemini_mcp.py
    @@ -1,205 +1,209 @@
    #!/usr/bin/env python3
    """
    Test script for Gemini MCP Server
    Tests both HTTP and stdio modes
    """

    import argparse
    import asyncio
    import json
    import os
    import sys
    from typing import Any, Dict

    import httpx


    async def test_http_mode(base_url: str = "http://localhost:8006"):
    """Test the HTTP mode of the server"""
    print("Testing HTTP mode...")

    async with httpx.AsyncClient() as client:
    # Test health endpoint
    print("\n1. Testing health check...")
    response = await client.get(f"{base_url}/health")
    if response.status_code == 200:
    print(f"βœ… Health check passed: {response.json()}")
    else:
    print(f"❌ Health check failed: {response.status_code}")

    # Test tools listing
    print("\n2. Testing tools listing...")
    response = await client.get(f"{base_url}/mcp/tools")
    if response.status_code == 200:
    tools = response.json()
    print(f"βœ… Found {len(tools['tools'])} tools:")
    for tool in tools['tools']:
    print(f" - {tool['name']}: {tool['description']}")
    else:
    print(f"❌ Tools listing failed: {response.status_code}")

    # Test Gemini status
    print("\n3. Testing Gemini status...")
    response = await client.post(
    f"{base_url}/mcp/execute",
    json={"tool": "gemini_status"}
    )
    if response.status_code == 200:
    result = response.json()
    if result['success']:
    print(f"βœ… Gemini status: {json.dumps(result['result'], indent=2)}")
    else:
    print(f"❌ Gemini status failed: {result.get('error')}")
    else:
    print(f"❌ Status check failed: {response.status_code}")
    except Exception as e:
    print(f"❌ Status check error: {e}")

    # Test 5: Simple consultation (if Gemini is available)
    print("\n5. Testing Gemini consultation...")
    try:
    payload = {
    "query": "What is 2+2?",
    "context": "This is a test query",
    "comparison_mode": False
    }
    response = requests.post(
    f"{base_url}/mcp/tools/consult_gemini",
    json=payload,
    timeout=30
    print(f"❌ Status request failed: {response.status_code}")

    # Test simple consultation
    print("\n4. Testing Gemini consultation...")
    response = await client.post(
    f"{base_url}/mcp/execute",
    json={
    "tool": "consult_gemini",
    "arguments": {
    "query": "What is 2 + 2?",
    "force": True
    }
    }
    )
    if response.status_code == 200:
    result = response.json()
    if result["status"] == "success":
    print("βœ… Consultation successful")
    if verbose:
    print(f" Response preview: {result['response'][:100]}...")
    print(f" Execution time: {result.get('execution_time', 0):.2f}s")
    if result['success']:
    print(f"βœ… Consultation successful")
    print(f" Response: {result['result'][:200]}...")
    else:
    print(f"⚠️ Consultation returned status: {result['status']}")
    if result.get("error"):
    print(f" Error: {result['error']}")
    print(f"❌ Consultation failed: {result.get('error')}")
    else:
    print(f"❌ Consultation failed: {response.status_code}")
    if verbose:
    print(f" Response: {response.text}")
    except Exception as e:
    print(f"❌ Consultation error: {e}")

    print("\n" + "=" * 50)
    print("βœ… HTTP server tests completed")
    return True
    print(f"❌ Consultation request failed: {response.status_code}")


async def test_stdio_mode():
    """Test the stdio mode of the server"""
    print("Testing stdio mode...")
    print("\n⚠️ Note: stdio mode requires manual testing")
    print("To test stdio mode:")
    print("1. Run: python gemini_mcp_server.py --mode stdio")
    print("2. Send JSON-RPC messages via stdin")
    print("3. Example initialization message:")

    init_message = {
        "jsonrpc": "2.0",
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",
            "clientInfo": {
                "name": "test-client",
                "version": "1.0.0"
            }
        },
        "id": 1
    }

    print(json.dumps(init_message, indent=2))
    print("\n4. Then send tools/list request:")

    list_message = {
        "jsonrpc": "2.0",
        "method": "tools/list",
        "params": {},
        "id": 2
    }

    print(json.dumps(list_message, indent=2))


async def test_mcp_protocol(base_url: str = "http://localhost:8006"):
    """Test MCP protocol endpoints"""
    print("\nTesting MCP Protocol endpoints...")

    async with httpx.AsyncClient() as client:
        # Test messages endpoint
        print("\n1. Testing /messages endpoint...")

        # Initialize session
        init_request = {
            "jsonrpc": "2.0",
            "method": "initialize",
            "params": {
                "protocolVersion": "2024-11-05",
                "clientInfo": {
                    "name": "test-client",
                    "version": "1.0.0"
                }
            },
            "id": 1
        }

        response = await client.post(
            f"{base_url}/messages",
            json=init_request,
            headers={"Content-Type": "application/json"}
        )

        if response.status_code == 200:
            result = response.json()
            print(f"βœ… Initialization successful: {json.dumps(result, indent=2)}")
            session_id = response.headers.get("Mcp-Session-Id")
            print(f" Session ID: {session_id}")

            # List tools
            print("\n2. Listing tools via MCP protocol...")
            list_request = {
                "jsonrpc": "2.0",
                "method": "tools/list",
                "params": {},
                "id": 2
            }

            response = await client.post(
                f"{base_url}/messages",
                json=list_request,
                headers={
                    "Content-Type": "application/json",
                    "Mcp-Session-Id": session_id
                }
            )

            if response.status_code == 200:
                result = response.json()
                print(f"βœ… Tools listed: {len(result['result']['tools'])} tools available")
            else:
                print(f"❌ Tools listing failed: {response.status_code}")
        else:
            print(f"❌ Initialization failed: {response.status_code}")


async def main():
    parser = argparse.ArgumentParser(description="Test Gemini MCP Server")
    parser.add_argument(
        "--mode",
        choices=["http", "stdio", "both"],
        default="http",
        help="Test mode"
    )
    parser.add_argument(
        "--url",
        default="http://localhost:8006",
        help="Server URL for HTTP mode"
    )

    args = parser.parse_args()

    print("πŸ§ͺ Gemini MCP Server Test Suite")
    print("=" * 40)

    if args.mode in ["http", "both"]:
        try:
            await test_http_mode(args.url)
            await test_mcp_protocol(args.url)
        except httpx.ConnectError:
            print(f"\n❌ Could not connect to server at {args.url}")
            print(" Make sure the server is running: python gemini_mcp_server.py --mode http")
            sys.exit(1)

    if args.mode in ["stdio", "both"]:
        await test_stdio_mode()

    print("\nβœ… Test suite completed!")


if __name__ == "__main__":
    asyncio.run(main())
    131 changes: 0 additions & 131 deletions test_gemini_state.py
    @@ -1,131 +0,0 @@
    #!/usr/bin/env python3
    """Test Gemini state management with automatic history inclusion."""

    import asyncio
    import sys
    from pathlib import Path

    from gemini_integration import get_integration

    async def test_automatic_state():
    """Test if Gemini automatically maintains state through history."""
    print("πŸ§ͺ Testing Automatic Gemini State Management")
    print("=" * 50)

    # Use singleton
    gemini = get_integration({
    'enabled': True,
    'cli_command': 'gemini',
    'timeout': 30,
    'include_history': True, # This enables automatic history inclusion
    'max_history_entries': 10,
    'debug_mode': False
    })

    # Clear any existing history
    gemini.clear_conversation_history()

    print("\n1️⃣ First Question: What is 2+2?")
    print("-" * 30)

    try:
    response1 = await gemini.consult_gemini(
    query="What is 2+2?",
    context="", # No context needed
    force_consult=True
    )

    if response1.get('status') == 'success':
    response_text = response1.get('response', '')
    print(f"βœ… Success! Gemini responded.")

    # Find and show the answer
    lines = response_text.strip().split('\n')
    for line in lines:
    if '4' in line or 'four' in line.lower():
    print(f"πŸ“ Found answer: {line.strip()[:100]}...")
    break
    else:
    print(f"❌ Error: {response1.get('error', 'Unknown error')}")
    return

    except Exception as e:
    print(f"❌ Exception: {e}")
    return

    print(f"\nπŸ“Š Conversation history size: {len(gemini.conversation_history)}")

    print("\n2️⃣ Second Question: What is that doubled?")
    print(" (No context provided - relying on automatic history)")
    print("-" * 30)

    try:
    # This time, provide NO context at all - let the history do the work
    response2 = await gemini.consult_gemini(
    query="What is that doubled?",
    context="", # Empty context - history should provide the context
    force_consult=True
    )

    if response2.get('status') == 'success':
    response_text = response2.get('response', '')
    print(f"βœ… Success! Gemini responded.")

    # Check if it understood the context
    if '8' in response_text or 'eight' in response_text.lower():
    print("πŸŽ‰ STATE MAINTAINED! Gemini understood 'that' referred to 4")
    print("πŸ“ Found reference to 8 in the response")

    # Find and show where 8 appears
    for line in response_text.split('\n'):
    if '8' in line or 'eight' in line.lower():
    print(f"πŸ“ Context: {line.strip()[:100]}...")
    break
    else:
    print("⚠️ Gemini may not have maintained state properly")
    print("πŸ“ Response doesn't clearly reference 8")
    print(f"First 200 chars: {response_text[:200]}...")

    else:
    print(f"❌ Error: {response2.get('error', 'Unknown error')}")

    except Exception as e:
    print(f"❌ Exception: {e}")

    print(f"\nπŸ“Š Final conversation history size: {len(gemini.conversation_history)}")

    # Test with history disabled
    print("\n3️⃣ Testing with History Disabled")
    print("-" * 30)

    # Disable history
    gemini.include_history = False
    gemini.clear_conversation_history()

    # Ask first question
    await gemini.consult_gemini(
    query="What is 3+3?",
    context="",
    force_consult=True
    )

    # Ask follow-up
    response3 = await gemini.consult_gemini(
    query="What is that tripled?",
    context="",
    force_consult=True
    )

    if response3.get('status') == 'success':
    response_text = response3.get('response', '')
    if '18' in response_text or 'eighteen' in response_text.lower():
    print("❌ UNEXPECTED: Found 18 even without history!")
    else:
    print("βœ… EXPECTED: Without history, Gemini doesn't understand 'that'")
    # Show what Gemini says when it doesn't have context
    print(f"Response preview: {response_text[:200]}...")

    print("\nβœ… Test complete!")

    if __name__ == "__main__":
    asyncio.run(test_automatic_state())
  8. AndrewAltimit revised this gist Jul 26, 2025. 1 changed file with 5 additions and 2 deletions.
    7 changes: 5 additions & 2 deletions README.md
    @@ -6,9 +6,12 @@ A complete setup guide for integrating Google's Gemini CLI with Claude Code thro
    <img src="https://gist.github.com/user-attachments/assets/507ce5cd-30cd-4408-bb96-77508e7e4ac6" />
    </div>

    - [Template repository](https://github.com/AndrewAltimit/template-repo) with Gemini CLI Integration
    - Gemini CLI Automated PR Reviews : [Example PR](https://github.com/AndrewAltimit/template-repo/pull/9) , [Automation Script](https://github.com/AndrewAltimit/template-repo/blob/main/scripts/gemini-pr-review.py)

    ## Usage

    See the [template repository](https://github.com/AndrewAltimit/template-repo) for a complete example, including Gemini CLI automated PR reviews : [Example PR](https://github.com/AndrewAltimit/template-repo/pull/9) , [Script](https://github.com/AndrewAltimit/template-repo/blob/main/scripts/gemini-pr-review.py).

    ![mcp-demo](https://gist.github.com/user-attachments/assets/a5646586-5b12-4d1f-bcfc-28ed84275c1f)

    ## Quick Start

  9. AndrewAltimit revised this gist Jul 16, 2025. 1 changed file with 12 additions and 12 deletions.
    24 changes: 12 additions & 12 deletions README.md
    @@ -10,7 +10,7 @@ A complete setup guide for integrating Google's Gemini CLI with Claude Code thro
    - Gemini CLI Automated PR Reviews : [Example PR](https://github.com/AndrewAltimit/template-repo/pull/9) , [Automation Script](https://github.com/AndrewAltimit/template-repo/blob/main/scripts/gemini-pr-review.py)


## Quick Start

    ### 1. Install Gemini CLI (Host-based)
    ```bash
    @@ -36,7 +36,7 @@ echo "Your question here" | gemini
    echo "Best practices for microservice authentication?" | gemini -m gemini-2.5-pro
    ```

## Host-Based MCP Integration

    ### Architecture Overview
    - **Host-Based Setup**: Both MCP server and Gemini CLI run on host machine
    @@ -122,7 +122,7 @@ Add to your Claude Code's MCP settings:
    }
    ```

## Server Mode Comparison

    | Feature | stdio Mode | HTTP Mode |
    |---------|-----------|-----------|
    @@ -132,7 +132,7 @@ Add to your Claude Code's MCP settings:
    | **Setup complexity** | Moderate | Simple |
    | **Use case** | Production | Testing/Development |

## Core Features

    ### 1. Container Detection (Critical Feature)
    Both server modes automatically detect if running inside a container and exit immediately with helpful instructions. This is critical because:
    @@ -241,7 +241,7 @@ result = await server.maybe_consult_gemini(response_text, context)
    - Last consultation timestamp
    - Error tracking and timeout monitoring

    ## βš™οΈ Configuration
    ## Configuration

    ### Environment Variables
    ```bash
    @@ -285,7 +285,7 @@ Create `gemini-config.json`:
    }
    ```

## Integration Module Core

    ### Uncertainty Patterns (Python)
    ```python
    @@ -373,7 +373,7 @@ The singleton pattern ensures:
    - **State Persistence**: Consultation history and statistics are maintained
    - **Resource Efficiency**: Only one instance manages the Gemini CLI connection

## Example Workflows

    ### Manual Consultation
    ```python
    @@ -396,7 +396,7 @@ Gemini: "For authentication, consider these approaches: 1) OAuth 2.0 with PKCE f
    Synthesis: Both suggest OAuth but Claude uncertain about security. Gemini provides specific implementation details. Recommendation: Follow Gemini's OAuth 2.0 with PKCE approach.
    ```

## Testing

    ### Test Both Server Modes
    ```bash
    @@ -425,7 +425,7 @@ curl -X POST http://localhost:8006/mcp/tools/consult_gemini \
    -d '{"query": "What is the best Python web framework?"}'
    ```

## Troubleshooting

    | Issue | Solution |
    |-------|----------|
    @@ -439,7 +439,7 @@ curl -X POST http://localhost:8006/mcp/tools/consult_gemini \
    | stdio connection issues | Check Claude Code MCP configuration |
    | HTTP connection refused | Verify port availability and firewall settings |

    ## πŸ” Security Considerations
    ## Security Considerations

    1. **API Credentials**: Store securely, use environment variables
    2. **Data Privacy**: Be cautious about sending proprietary code
    @@ -448,7 +448,7 @@ curl -X POST http://localhost:8006/mcp/tools/consult_gemini \
    5. **Host-Based Architecture**: Both Gemini CLI and MCP server run on host for auth compatibility
    6. **Network Security**: HTTP mode binds to 127.0.0.1 by default (not 0.0.0.0)

## Best Practices

    1. **Rate Limiting**: Implement appropriate delays between calls
    2. **Context Management**: Keep context concise and relevant
    @@ -458,7 +458,7 @@ curl -X POST http://localhost:8006/mcp/tools/consult_gemini \
    6. **History Management**: Periodically clear history to avoid context bloat
    7. **Mode Selection**: Use stdio for production, HTTP for testing

## Use Cases

    - **Architecture Decisions**: Get second opinions on design choices
    - **Security Reviews**: Validate security implementations
  10. AndrewAltimit revised this gist Jul 16, 2025. 8 changed files with 713 additions and 189 deletions.
    230 changes: 137 additions & 93 deletions README.md
    @@ -41,44 +41,64 @@ echo "Best practices for microservice authentication?" | gemini -m gemini-2.5-pr
    ### Architecture Overview
    - **Host-Based Setup**: Both MCP server and Gemini CLI run on host machine
    - **Why Host-Only**: Gemini CLI requires interactive authentication and avoids Docker-in-Docker complexity
    - **Communication Modes**:
    - **stdio (recommended)**: Bidirectional streaming for production use
    - **HTTP**: Simple request/response for testing
    - **Auto-consultation**: Detects uncertainty patterns in Claude responses
    - **Manual consultation**: On-demand second opinions via MCP tools
    - **Response synthesis**: Combines both AI perspectives
    - **Singleton Pattern**: Ensures consistent state management across all tool calls

    ### Key Files Structure
    ```
β”œβ”€β”€ gemini_mcp_server.py # stdio-based MCP server with HTTP mode support
β”œβ”€β”€ gemini_mcp_server_http.py # HTTP server implementation (imported by main)
β”œβ”€β”€ gemini_integration.py # Core integration module with singleton pattern
β”œβ”€β”€ gemini-config.json # Gemini configuration
β”œβ”€β”€ start-gemini-mcp.sh # Startup script for both modes
└── test_gemini_mcp.py # Test script for both server modes
    ```

    All files should be placed in the same directory for easy deployment.

    ### Host-Based MCP Server Setup

    #### stdio Mode (Recommended for Production)
    ```bash
# Start MCP server in stdio mode (default)
    cd your-project
python3 gemini_mcp_server.py --project-root .

    # Or with environment variables
    GEMINI_ENABLED=true \
    GEMINI_AUTO_CONSULT=true \
    GEMINI_CLI_COMMAND=gemini \
    GEMINI_TIMEOUT=200 \
    GEMINI_RATE_LIMIT=2 \
python3 gemini_mcp_server.py --project-root .
    ```

    #### HTTP Mode (For Testing)
    ```bash
    # Start MCP server in HTTP mode
    python3 gemini_mcp_server.py --project-root . --port 8006

    # The main server automatically:
    # 1. Detects the --port argument
    # 2. Imports gemini_mcp_server_http module
    # 3. Starts the FastAPI server on the specified port
    ```

### Claude Code Configuration

#### stdio Configuration (Recommended)
Add to your Claude Code's MCP settings:
```json
{
"mcpServers": {
"gemini": {
"command": "python3",
"args": ["/path/to/gemini_mcp_server.py", "--project-root", "."],
    "cwd": "/path/to/your/project",
    "env": {
    "GEMINI_ENABLED": "true",
    @@ -90,15 +110,44 @@ Create `mcp-config.json`:
    }
    ```

    #### HTTP Configuration (For Testing)
    ```json
    {
    "mcpServers": {
    "gemini-http": {
    "url": "http://localhost:8006",
    "transport": "http"
    }
    }
    }
    ```

    ## πŸ”„ Server Mode Comparison

    | Feature | stdio Mode | HTTP Mode |
    |---------|-----------|-----------|
    | **Communication** | Bidirectional streaming | Request/Response |
    | **Performance** | Better for long operations | Good for simple queries |
    | **Real-time updates** | βœ… Supported | ❌ Not supported |
    | **Setup complexity** | Moderate | Simple |
    | **Use case** | Production | Testing/Development |

    ## πŸ€– Core Features

### 1. Container Detection (Critical Feature)
    Both server modes automatically detect if running inside a container and exit immediately with helpful instructions. This is critical because:
    - Gemini CLI requires Docker access for containerized execution
    - Running Docker-in-Docker causes authentication and performance issues
    - The server must run on the host system to access the Docker daemon
    - Detection happens before any imports to fail fast with clear error messages
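As an illustration only (the actual check lives in `gemini_mcp_server.py` and may differ in detail), a host-vs-container check can be sketched like this:

```python
import os

def running_in_container() -> bool:
    """Heuristic check: Docker creates /.dockerenv, and container cgroups
    usually mention docker/containerd/kubepods in /proc/1/cgroup."""
    if os.path.exists("/.dockerenv"):
        return True
    try:
        with open("/proc/1/cgroup") as f:
            content = f.read()
        return any(marker in content for marker in ("docker", "containerd", "kubepods"))
    except OSError:
        # /proc/1/cgroup is unavailable (e.g. on non-Linux hosts)
        return False

if running_in_container():
    print("ERROR: this MCP server must run on the host system, not inside a container.")
```

Running the check before other imports keeps the failure fast and the error message unambiguous.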

    ### 2. Uncertainty Detection
    Automatically detects patterns like:
    - "I'm not sure", "I think", "possibly", "probably"
    - "Multiple approaches", "trade-offs", "alternatives"
    - Critical operations: "security", "production", "database migration"
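A minimal sketch of how such pattern matching might look (the patterns below are an illustrative subset, not the full list shipped in `gemini_integration.py`):

```python
import re
from typing import List, Tuple

# Illustrative subset of uncertainty patterns
UNCERTAINTY_PATTERNS = [
    r"\bI'?m not sure\b",
    r"\bI think\b",
    r"\bpossibly\b",
    r"\bprobably\b",
    r"\bmultiple approaches\b",
    r"\btrade-?offs?\b",
]

def detect_uncertainty(text: str) -> Tuple[bool, List[str]]:
    """Return (has_uncertainty, matched_patterns) for the given text."""
    found = [p for p in UNCERTAINTY_PATTERNS if re.search(p, text, re.IGNORECASE)]
    return (bool(found), found)
```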

### 3. MCP Tools Available

    #### `consult_gemini`
    Manual consultation with Gemini for second opinions or validation.
    @@ -125,7 +174,7 @@ Check Gemini integration status and statistics.
    - Configuration status (enabled, auto-consult, CLI command, timeout, rate limit)
    - Gemini CLI availability and version
    - Consultation statistics (total, completed, average time)
- Conversation history size

    **Example:**
    ```python
    @@ -158,13 +207,19 @@ Clear Gemini conversation history to start fresh.
    Use the clear_gemini_history tool
    ```

### 4. Response Synthesis
    - Identifies agreement/disagreement between Claude and Gemini
    - Provides confidence levels (high/medium/low)
    - Generates combined recommendations
    - Tracks execution time and consultation ID

### 5. Advanced Features

    #### Conversation History
    The integration maintains conversation history across consultations:
    - Configurable history size (default: 10 entries)
    - History included in subsequent consultations for context
    - Can be cleared with `clear_gemini_history` tool
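The trimming behaviour can be sketched with a bounded deque (names here are illustrative; the real implementation keeps a plain list and slices it):

```python
from collections import deque

class ConversationHistory:
    """Bounded history of (query, response) pairs; deque drops the oldest entry."""

    def __init__(self, max_entries: int = 10):
        self.entries = deque(maxlen=max_entries)

    def add(self, query: str, response: str) -> None:
        self.entries.append((query, response))

    def clear(self) -> None:
        self.entries.clear()

history = ConversationHistory(max_entries=2)
history.add("q1", "r1")
history.add("q2", "r2")
history.add("q3", "r3")  # "q1" is evicted once the limit is reached
```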

    #### Uncertainty Detection API
    The MCP server exposes methods for detecting uncertainty:
    @@ -179,8 +234,10 @@ result = await server.maybe_consult_gemini(response_text, context)

    #### Statistics Tracking
    - Total consultations attempted
- Successful completions
- Average execution time per consultation
    - Total execution time across all consultations
    - Conversation history size
    - Last consultation timestamp
    - Error tracking and timeout monitoring
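The aggregation itself is straightforward; with hypothetical log entries it might look like:

```python
# Hypothetical log entries; real ones are produced by GeminiIntegration
consultation_log = [
    {"status": "success", "execution_time": 2.0},
    {"status": "success", "execution_time": 4.0},
    {"status": "timeout", "execution_time": 60.0},
]

completed = [e for e in consultation_log if e.get("status") == "success"]
total_time = sum(e.get("execution_time", 0) for e in completed)

stats = {
    "total_consultations": len(consultation_log),
    "completed_consultations": len(completed),
    "average_execution_time": total_time / len(completed) if completed else 0,
    "total_execution_time": total_time,
}
```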

    @@ -193,10 +250,16 @@ GEMINI_AUTO_CONSULT=true # Auto-consult on uncertainty
    GEMINI_CLI_COMMAND=gemini # CLI command to use
    GEMINI_TIMEOUT=200 # Query timeout in seconds
    GEMINI_RATE_LIMIT=5 # Delay between calls (seconds)
GEMINI_MAX_CONTEXT=4000 # Max context length
GEMINI_MODEL=gemini-2.5-flash # Model to use
GEMINI_SANDBOX=false # Sandboxing isolates operations
GEMINI_API_KEY= # Optional (blank for free tier)
    GEMINI_LOG_CONSULTATIONS=true # Log consultation details
    GEMINI_DEBUG=false # Debug mode
    GEMINI_INCLUDE_HISTORY=true # Include conversation history
    GEMINI_MAX_HISTORY=10 # Max history entries to maintain
    GEMINI_MCP_PORT=8006 # Port for HTTP mode (if used)
    GEMINI_MCP_HOST=127.0.0.1 # Host for HTTP mode (if used)
    ```
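One way the server might fold these variables into its config (the helper names here are assumptions for illustration, not taken from the actual source):

```python
import os

def env_bool(name: str, default: bool) -> bool:
    """Parse a boolean environment variable with a fallback default."""
    value = os.environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in ("1", "true", "yes", "on")

config = {
    "enabled": env_bool("GEMINI_ENABLED", True),
    "auto_consult": env_bool("GEMINI_AUTO_CONSULT", True),
    "cli_command": os.environ.get("GEMINI_CLI_COMMAND", "gemini"),
    "timeout": int(os.environ.get("GEMINI_TIMEOUT", "200")),
    "rate_limit_delay": float(os.environ.get("GEMINI_RATE_LIMIT", "5")),
    "model": os.environ.get("GEMINI_MODEL", "gemini-2.5-flash"),
}
```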

    ### Gemini Configuration File
    @@ -212,6 +275,8 @@ Create `gemini-config.json`:
    "model": "gemini-2.5-flash",
    "sandbox_mode": true,
    "debug_mode": false,
    "include_history": true,
    "max_history_entries": 10,
    "uncertainty_thresholds": {
    "uncertainty_patterns": true,
    "complex_decisions": true,
    @@ -259,26 +324,36 @@ class GeminiIntegration:
    self.enabled = self.config.get('enabled', True)
    self.auto_consult = self.config.get('auto_consult', True)
    self.cli_command = self.config.get('cli_command', 'gemini')
self.timeout = self.config.get('timeout', 60)
self.rate_limit_delay = self.config.get('rate_limit_delay', 2)
    self.conversation_history = []
    self.max_history_entries = self.config.get('max_history_entries', 10)

    async def consult_gemini(self, query: str, context: str = "") -> Dict[str, Any]:
    """Consult Gemini CLI for second opinion"""
# Rate limiting
await self._enforce_rate_limit()

# Prepare query with context and history
full_query = self._prepare_query(query, context)

# Execute Gemini CLI command
result = await self._execute_gemini_cli(full_query)

    # Update conversation history
    if self.include_history and result.get("output"):
    self.conversation_history.append((query, result["output"]))
    # Trim history if needed
    if len(self.conversation_history) > self.max_history_entries:
    self.conversation_history = self.conversation_history[-self.max_history_entries:]

    return result

def detect_uncertainty(self, text: str) -> Tuple[bool, List[str]]:
    """Detect if text contains uncertainty patterns"""
    found_patterns = []
    # Check all pattern categories
    # Returns (has_uncertainty, list_of_matched_patterns)

    # Singleton pattern implementation
    _integration = None
    @@ -298,14 +373,6 @@ The singleton pattern ensures:
    - **State Persistence**: Consultation history and statistics are maintained
    - **Resource Efficiency**: Only one instance manages the Gemini CLI connection
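A sketch of the accessor (the class body is reduced to a stub here; the real `GeminiIntegration` lives in `gemini_integration.py`):

```python
class GeminiIntegration:
    """Stub standing in for the real integration class."""

    def __init__(self, config=None):
        self.config = config or {}
        self.consultation_log = []

_integration = None

def get_integration(config=None):
    """Return the shared instance, creating it on first use."""
    global _integration
    if _integration is None:
        _integration = GeminiIntegration(config)
    return _integration
```

Because every tool call goes through `get_integration()`, statistics and conversation history accumulate in one place.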

    ### Usage in MCP Server
    ```python
    from gemini_integration import get_integration

    # Get the singleton instance
    self.gemini = get_integration(config)
    ```

    ## πŸ“‹ Example Workflows

    ### Manual Consultation
    @@ -329,63 +396,33 @@ Gemini: "For authentication, consider these approaches: 1) OAuth 2.0 with PKCE f
    Synthesis: Both suggest OAuth but Claude uncertain about security. Gemini provides specific implementation details. Recommendation: Follow Gemini's OAuth 2.0 with PKCE approach.
    ```

## πŸ”§ Testing

    ### Tool Definitions
    ```python
    @server.list_tools()
    async def handle_list_tools():
    return [
    types.Tool(
    name="consult_gemini",
    description="Consult Gemini CLI for a second opinion or validation",
    inputSchema={
    "type": "object",
    "properties": {
    "query": {
    "type": "string",
    "description": "The question or topic to consult Gemini about"
    },
    "context": {
    "type": "string",
    "description": "Additional context for the consultation"
    },
    "comparison_mode": {
    "type": "boolean",
    "description": "Whether to request structured comparison format",
    "default": True
    },
    "force": {
    "type": "boolean",
    "description": "Force consultation even if Gemini is disabled",
    "default": False
    }
    },
    "required": ["query"]
    }
    ),
    types.Tool(
    name="gemini_status",
    description="Get Gemini integration status and statistics"
    ),
    types.Tool(
    name="toggle_gemini_auto_consult",
    description="Toggle automatic Gemini consultation on uncertainty detection",
    inputSchema={
    "type": "object",
    "properties": {
    "enable": {
    "type": "boolean",
    "description": "Enable (true) or disable (false) auto-consultation. If not provided, toggles current state"
    }
    }
    }
    ),
    types.Tool(
    name="clear_gemini_history",
    description="Clear Gemini conversation history"
    )
    ]
    ### Test Both Server Modes
    ```bash
    # Test stdio mode (default)
    python3 test_gemini_mcp.py

    # Test HTTP mode
    python3 test_gemini_mcp.py --mode http

    # Test specific server
    python3 test_gemini_mcp.py --mode stdio --verbose
    ```

    ### Manual Testing via HTTP
    ```bash
    # Start HTTP server
    python3 gemini_mcp_server.py --port 8006

    # Test endpoints
    curl http://localhost:8006/health
    curl http://localhost:8006/mcp/tools

    # Test Gemini consultation
    curl -X POST http://localhost:8006/mcp/tools/consult_gemini \
    -H "Content-Type: application/json" \
    -d '{"query": "What is the best Python web framework?"}'
    ```

    ## 🚨 Troubleshooting
    @@ -398,14 +435,18 @@ async def handle_list_tools():
    | Timeout errors | Increase `GEMINI_TIMEOUT` (default: 60s) |
    | Auto-consult not working | Check `GEMINI_AUTO_CONSULT=true` |
    | Rate limiting | Adjust `GEMINI_RATE_LIMIT` (default: 2s) |
    | Container detection error | Ensure running on host system, not in Docker |
    | stdio connection issues | Check Claude Code MCP configuration |
    | HTTP connection refused | Verify port availability and firewall settings |

    ## πŸ” Security Considerations

    1. **API Credentials**: Store securely, use environment variables
    2. **Data Privacy**: Be cautious about sending proprietary code
    3. **Input Sanitization**: Sanitize queries before sending
    4. **Rate Limiting**: Respect API limits (free tier: 60/min, 1000/day)
5. **Host-Based Architecture**: Both Gemini CLI and MCP server run on host for auth compatibility
    6. **Network Security**: HTTP mode binds to 127.0.0.1 by default (not 0.0.0.0)

    ## πŸ“ˆ Best Practices

    @@ -414,7 +455,8 @@ async def handle_list_tools():
    3. **Error Handling**: Always handle Gemini failures gracefully
    4. **User Control**: Allow users to disable auto-consultation
    5. **Logging**: Log consultations for debugging and analysis
6. **History Management**: Periodically clear history to avoid context bloat
7. **Mode Selection**: Use stdio for production, HTTP for testing

    ## 🎯 Use Cases

    @@ -423,3 +465,5 @@ async def handle_list_tools():
    - **Performance Optimization**: Compare optimization strategies
    - **Code Quality**: Review complex algorithms or patterns
    - **Troubleshooting**: Debug complex technical issues
    - **API Design**: Validate REST/GraphQL/gRPC decisions
    - **Database Schema**: Review data modeling choices
    7 changes: 4 additions & 3 deletions gemini-config.json
    @@ -2,11 +2,12 @@
    "enabled": true,
    "auto_consult": true,
    "cli_command": "gemini",
    "timeout": 30,
    "timeout": 300,
    "rate_limit_delay": 5.0,
    "max_context_length": 4000,
    "log_consultations": true,
    "model": "gemini-2.5-pro",
    "sandbox_mode": false,
    "model": "gemini-2.5-flash",
    "sandbox_mode": true,
    "debug_mode": false,
    "include_history": true,
    "max_history_entries": 10,
    14 changes: 12 additions & 2 deletions gemini_integration.py
    @@ -177,17 +177,27 @@ def get_consultation_stats(self) -> Dict[str, Any]:
    return {"total_consultations": 0}

    completed = [e for e in self.consultation_log if e.get("status") == "success"]
    total_execution_time = sum(e.get("execution_time", 0) for e in completed)

    return {
    stats = {
    "total_consultations": len(self.consultation_log),
    "completed_consultations": len(completed),
    "average_execution_time": (
    sum(e.get("execution_time", 0) for e in completed) / len(completed)
    total_execution_time / len(completed)
    if completed
    else 0
    ),
    "total_execution_time": total_execution_time,
    "conversation_history_size": len(self.conversation_history),
    }

    # Add last consultation timestamp if available
    if self.consultation_log:
    last_entry = self.consultation_log[-1]
    if last_entry.get("timestamp"):
    stats["last_consultation"] = last_entry["timestamp"]

    return stats

    async def _enforce_rate_limit(self):
    """Enforce rate limiting between consultations"""
    58 changes: 49 additions & 9 deletions mcp-server.py β†’ gemini_mcp_server.py
    @@ -15,7 +15,25 @@
    import mcp.types as types
    from mcp.server import NotificationOptions, Server, InitializationOptions

    # Assuming gemini_integration.py is in the same directory or properly installed

    # Check if running in container BEFORE any other imports or operations
    def check_container_and_exit():
    """Check if running in a container and exit immediately if true."""
    if os.path.exists("/.dockerenv") or os.environ.get("CONTAINER_ENV"):
    print("ERROR: Gemini MCP Server cannot run inside a container!", file=sys.stderr)
    print(
    "The Gemini CLI requires Docker access and must run on the host system.",
    file=sys.stderr,
    )
    print("Please launch this server directly on the host with:", file=sys.stderr)
    print(" python gemini_mcp_server.py", file=sys.stderr)
    sys.exit(1)


    # Perform container check immediately
    check_container_and_exit()

    # Now import the integration module
    from gemini_integration import get_integration


    @@ -217,7 +235,7 @@ async def _get_gemini_status(self) -> List[types.TextContent]:
    if 'version' in line.lower():
    output_lines.append(f" {line.strip()}")
    break
    except:
    except Exception:
    pass
    else:
    error_msg = stderr.decode() if stderr else "Unknown error"
    @@ -258,14 +276,14 @@ async def _get_gemini_status(self) -> List[types.TextContent]:

    if completed > 0:
    avg_time = stats.get('average_execution_time', 0)
    total_time = stats.get('total_execution_time', 0)
    output_lines.append(f" β€’ Average time: {avg_time:.2f}s")
    total_time = sum(
    e.get("execution_time", 0)
    for e in self.gemini.consultation_log
    if e.get("status") == "success"
    )
    output_lines.append(f" β€’ Total time: {total_time:.2f}s")

    last_consultation = stats.get('last_consultation')
    if last_consultation:
    output_lines.append(f" β€’ Last consultation: {last_consultation}")

    output_lines.append("")
    output_lines.append("πŸ’‘ Usage:")
    output_lines.append(" β€’ Direct: Use 'consult_gemini' tool")
    @@ -344,8 +362,30 @@ async def main():
    default=".",
    help="Path to the project root directory"
    )
    parser.add_argument(
    "--port",
    type=int,
    default=None,
    help="Port for HTTP mode (if specified, runs as HTTP server instead of stdio)"
    )

    args = parser.parse_args()

    server = MCPServer(args.project_root)
    server.run()
    # Check if running in container - exit with instructions if true
    check_container_and_exit()

    # If port is specified, run as HTTP server (for backward compatibility/testing)
    if args.port:
    print("Warning: Running in HTTP mode. For production, use stdio mode (no --port argument)")
    # Import and run the HTTP server
    try:
    from gemini_mcp_server_http import run_http_server
    run_http_server(args.port)
    except ImportError:
    print("Error: gemini_mcp_server_http.py not found", file=sys.stderr)
    print("HTTP mode requires the HTTP server implementation file", file=sys.stderr)
    sys.exit(1)
    else:
    # Run as stdio MCP server (recommended)
    server = MCPServer(args.project_root)
    server.run()
    237 changes: 237 additions & 0 deletions gemini_mcp_server_http.py
    @@ -0,0 +1,237 @@
    #!/usr/bin/env python3
    """
    HTTP-based Gemini MCP Server
    Provides REST API interface for Gemini consultation (for testing/development)
    """

    import os
    import sys
    from pathlib import Path
    from typing import Any, Dict, Optional

    from fastapi import FastAPI, HTTPException
    from pydantic import BaseModel
    import uvicorn

    # Check if running in container BEFORE any other imports or operations
    def check_container_and_exit():
    """Check if running in a container and exit immediately if true."""
    if os.path.exists("/.dockerenv") or os.environ.get("CONTAINER_ENV"):
    print("ERROR: Gemini MCP Server cannot run inside a container!", file=sys.stderr)
    print(
    "The Gemini CLI requires Docker access and must run on the host system.",
    file=sys.stderr,
    )
    print("Please launch this server directly on the host with:", file=sys.stderr)
    print(" python gemini_mcp_server_http.py", file=sys.stderr)
    sys.exit(1)


    # Perform container check immediately
    check_container_and_exit()

    # Now import the integration module
    from gemini_integration import get_integration

    # Initialize FastAPI app
    app = FastAPI(
    title="Gemini MCP Server (HTTP Mode)",
    version="1.0.0",
    description="HTTP interface for Gemini CLI integration"
    )


    class ConsultRequest(BaseModel):
    query: str
    context: Optional[str] = ""
    comparison_mode: Optional[bool] = True
    force: Optional[bool] = False


    class ToggleRequest(BaseModel):
    enable: Optional[bool] = None


    class ConsultResponse(BaseModel):
    status: str
    response: Optional[str] = None
    error: Optional[str] = None
    consultation_id: Optional[str] = None
    execution_time: Optional[float] = None
    timestamp: Optional[str] = None


    # Initialize Gemini integration
    project_root = Path(os.environ.get("PROJECT_ROOT", "."))
    config = {
    'enabled': os.getenv('GEMINI_ENABLED', 'true').lower() == 'true',
    'auto_consult': os.getenv('GEMINI_AUTO_CONSULT', 'true').lower() == 'true',
    'cli_command': os.getenv('GEMINI_CLI_COMMAND', 'gemini'),
    'timeout': int(os.getenv('GEMINI_TIMEOUT', '60')),
    'rate_limit_delay': float(os.getenv('GEMINI_RATE_LIMIT', '2')),
    'max_context_length': int(os.getenv('GEMINI_MAX_CONTEXT', '4000')),
    'log_consultations': os.getenv('GEMINI_LOG_CONSULTATIONS', 'true').lower() == 'true',
    'model': os.getenv('GEMINI_MODEL', 'gemini-2.5-flash'),
    }

    # Get the singleton instance
    gemini = get_integration(config)


    @app.get("/")
    async def root():
    """Root endpoint showing server info"""
    return {
    "name": "Gemini MCP Server (HTTP Mode)",
    "version": "1.0.0",
    "mode": "http",
    "endpoints": {
    "health": "/health",
    "tools": "/mcp/tools",
    "consult": "/mcp/tools/consult_gemini",
    "status": "/mcp/tools/gemini_status",
    "toggle": "/mcp/tools/toggle_gemini_auto_consult",
    "clear_history": "/mcp/tools/clear_gemini_history"
    }
    }


    @app.get("/health")
    async def health():
    """Health check endpoint"""
    return {"status": "healthy", "mode": "http"}


    @app.get("/mcp/tools")
    async def list_tools():
    """List available MCP tools"""
    return {
    "tools": [
    {
    "name": "consult_gemini",
    "description": "Consult Gemini CLI for a second opinion or validation",
    "parameters": {
    "query": {"type": "string", "required": True},
    "context": {"type": "string", "required": False},
    "comparison_mode": {"type": "boolean", "required": False, "default": True},
    "force": {"type": "boolean", "required": False, "default": False}
    }
    },
    {
    "name": "gemini_status",
    "description": "Get Gemini integration status and statistics",
    "parameters": {}
    },
    {
    "name": "toggle_gemini_auto_consult",
    "description": "Toggle automatic Gemini consultation on uncertainty detection",
    "parameters": {
    "enable": {"type": "boolean", "required": False}
    }
    },
    {
    "name": "clear_gemini_history",
    "description": "Clear Gemini conversation history",
    "parameters": {}
    }
    ]
    }


    @app.post("/mcp/tools/consult_gemini", response_model=ConsultResponse)
    async def consult_gemini_endpoint(request: ConsultRequest):
    """Consult Gemini for a second opinion"""
    if not request.query:
    raise HTTPException(status_code=400, detail="Query parameter is required")

    result = await gemini.consult_gemini(
    query=request.query,
    context=request.context,
    comparison_mode=request.comparison_mode,
    force_consult=request.force
    )

    return ConsultResponse(**result)


    @app.get("/mcp/tools/gemini_status")
    async def gemini_status_endpoint():
    """Get Gemini integration status and statistics"""
    import asyncio

    # Build status information
    status = {
    "configuration": {
    "enabled": gemini.enabled,
    "auto_consult": gemini.auto_consult,
    "cli_command": gemini.cli_command,
    "timeout": gemini.timeout,
    "rate_limit": gemini.rate_limit_delay,
    "model": gemini.model
    },
    "statistics": gemini.get_consultation_stats()
    }

    # Check CLI availability
    try:
    check_process = await asyncio.create_subprocess_exec(
    gemini.cli_command, "-p", "test",
    stdout=asyncio.subprocess.PIPE,
    stderr=asyncio.subprocess.PIPE
    )
    stdout, stderr = await asyncio.wait_for(check_process.communicate(), timeout=5)

    status["cli_available"] = check_process.returncode == 0
    if not status["cli_available"]:
    status["cli_error"] = stderr.decode() if stderr else "Unknown error"
    except Exception as e:
    status["cli_available"] = False
    status["cli_error"] = str(e)

    return status


    @app.post("/mcp/tools/toggle_gemini_auto_consult")
    async def toggle_auto_consult_endpoint(request: ToggleRequest):
    """Toggle automatic Gemini consultation"""
    if request.enable is None:
    # Toggle current state
    gemini.auto_consult = not gemini.auto_consult
    else:
    gemini.auto_consult = bool(request.enable)

    return {
    "status": "success",
    "auto_consult": gemini.auto_consult,
    "message": f"Gemini auto-consultation is now {'enabled' if gemini.auto_consult else 'disabled'}"
    }


    @app.post("/mcp/tools/clear_gemini_history")
    async def clear_history_endpoint():
    """Clear Gemini conversation history"""
    result = gemini.clear_conversation_history()
    return result


    def run_http_server(port: int):
    """Run the HTTP server"""
    host = os.environ.get("GEMINI_MCP_HOST", "127.0.0.1")
    print(f"Starting Gemini MCP Server in HTTP mode on {host}:{port}")
    print("WARNING: HTTP mode is for testing only. Use stdio mode for production.")
    print(f"Access the API at: http://{host}:{port}")
    print(f"Health check: http://{host}:{port}/health")
    print(f"API docs: http://{host}:{port}/docs")

    uvicorn.run(app, host=host, port=port)


    if __name__ == "__main__":
    # Check if running in container
    check_container_and_exit()

    # Default port from environment or 8006
    default_port = int(os.environ.get("GEMINI_MCP_PORT", "8006"))

    # Run server
    run_http_server(default_port)
    82 changes: 0 additions & 82 deletions setup-gemini-integration.sh
    @@ -1,82 +0,0 @@
    #!/bin/bash
    set -e

    echo "πŸš€ Setting up Gemini CLI Integration..."

    # Check Node.js version
    if ! command -v node &> /dev/null; then
    echo "❌ Node.js not found. Please install Node.js 18+ first."
    exit 1
    fi

    NODE_VERSION=$(node --version | cut -d'v' -f2 | cut -d'.' -f1)
    if [ "$NODE_VERSION" -lt 18 ]; then
    echo "❌ Node.js version $NODE_VERSION found. Please use Node.js 18+ (recommended: 22.16.0)"
    echo " Use: nvm install 22.16.0 && nvm use 22.16.0"
    exit 1
    fi

    echo "βœ… Node.js version check passed"

    # Install Gemini CLI
    echo "πŸ“¦ Installing Gemini CLI..."
    npm install -g @google/gemini-cli

    # Test installation
    echo "πŸ§ͺ Testing Gemini CLI installation..."
    if gemini --help > /dev/null 2>&1; then
    echo "βœ… Gemini CLI installed successfully"
    else
    echo "❌ Gemini CLI installation failed"
    exit 1
    fi

    # Files can be placed in the same directory - no complex structure needed
    echo "πŸ“ Setting up in current directory..."

    # Create default configuration
    echo "βš™οΈ Creating default configuration..."
    cat > gemini-config.json << 'EOF'
    {
    "enabled": true,
    "auto_consult": true,
    "cli_command": "gemini",
    "timeout": 60,
    "rate_limit_delay": 2.0,
    "max_context_length": 4000,
    "log_consultations": true,
    "model": "gemini-2.5-flash",
    "sandbox_mode": false,
    "debug_mode": false
    }
    EOF

    # Create MCP configuration for Claude Code
    echo "πŸ”§ Creating Claude Code MCP configuration..."
    cat > mcp-config.json << 'EOF'
    {
    "mcpServers": {
    "project": {
    "command": "python3",
    "args": ["mcp-server.py", "--project-root", "."],
    "env": {
    "GEMINI_ENABLED": "true",
    "GEMINI_AUTO_CONSULT": "true"
    }
    }
    }
    }
    EOF

    echo ""
    echo "πŸŽ‰ Gemini CLI Integration setup complete!"
    echo ""
    echo "πŸ“‹ Next steps:"
    echo "1. Copy the provided code files to your project:"
    echo " - gemini_integration.py"
    echo " - mcp-server.py"
    echo "2. Install Python dependencies: pip install mcp pydantic"
    echo "3. Test with: python3 mcp-server.py --project-root ."
    echo "4. Configure Claude Code to use the MCP server"
    echo ""
    echo "πŸ’‘ Tip: First run 'gemini' command to authenticate with your Google account"
    69 changes: 69 additions & 0 deletions start-gemini-mcp.sh
    @@ -0,0 +1,69 @@
    #!/bin/bash
    # Start Gemini MCP Server in either stdio or HTTP mode

    # Default to stdio mode
    MODE="stdio"

    # Check for --http flag
    if [[ "$1" == "--http" ]]; then
    MODE="http"
    fi

    # Set default environment variables if not already set
    export GEMINI_ENABLED="${GEMINI_ENABLED:-true}"
    export GEMINI_AUTO_CONSULT="${GEMINI_AUTO_CONSULT:-true}"
    export GEMINI_CLI_COMMAND="${GEMINI_CLI_COMMAND:-gemini}"
    export GEMINI_TIMEOUT="${GEMINI_TIMEOUT:-200}"
    export GEMINI_RATE_LIMIT="${GEMINI_RATE_LIMIT:-2}"
    export GEMINI_MODEL="${GEMINI_MODEL:-gemini-2.5-flash}"
    export GEMINI_MCP_PORT="${GEMINI_MCP_PORT:-8006}"
    export GEMINI_MCP_HOST="${GEMINI_MCP_HOST:-127.0.0.1}"

    # Show current configuration
    echo "πŸ€– Gemini MCP Server Startup"
    echo "============================"
    echo "Mode: $MODE"
    echo "Environment:"
    echo " GEMINI_ENABLED=$GEMINI_ENABLED"
    echo " GEMINI_AUTO_CONSULT=$GEMINI_AUTO_CONSULT"
    echo " GEMINI_CLI_COMMAND=$GEMINI_CLI_COMMAND"
    echo " GEMINI_MODEL=$GEMINI_MODEL"
    echo ""

    if [[ "$MODE" == "http" ]]; then
    echo "Starting HTTP server on $GEMINI_MCP_HOST:$GEMINI_MCP_PORT..."
    echo ""

    # Start HTTP server
    python3 gemini_mcp_server.py --port $GEMINI_MCP_PORT

    # If server stops, show how to test it
    echo ""
    echo "Server stopped. To test the HTTP server:"
    echo " curl http://$GEMINI_MCP_HOST:$GEMINI_MCP_PORT/health"
    echo " curl http://$GEMINI_MCP_HOST:$GEMINI_MCP_PORT/mcp/tools"
    else
    echo "stdio mode instructions:"
    echo ""
    echo "To use with Claude Code, add this to your MCP settings:"
    echo ""
    echo '{'
    echo ' "mcpServers": {'
    echo ' "gemini": {'
    echo ' "command": "python3",'
    echo " \"args\": [\"$(pwd)/gemini_mcp_server.py\", \"--project-root\", \".\"],"
    echo " \"cwd\": \"$(pwd)\","
    echo ' "env": {'
    echo ' "GEMINI_ENABLED": "true",'
    echo ' "GEMINI_AUTO_CONSULT": "true",'
    echo ' "GEMINI_CLI_COMMAND": "gemini"'
    echo ' }'
    echo ' }'
    echo ' }'
    echo '}'
    echo ""
    echo "Or run directly for testing:"
    echo " python3 gemini_mcp_server.py --project-root ."
    echo ""
    echo "For HTTP mode, run: $0 --http"
    fi
    205 changes: 205 additions & 0 deletions test_gemini_mcp.py
    @@ -0,0 +1,205 @@
    #!/usr/bin/env python3
    """
    Test script for Gemini MCP Server
    Tests both stdio and HTTP modes
    """

    import argparse
    import asyncio
    import json
    import os
    import sys

    import requests


    def test_http_server(verbose=False):
    """Test HTTP server endpoints"""
    host = os.environ.get("GEMINI_MCP_HOST", "127.0.0.1")
    port = int(os.environ.get("GEMINI_MCP_PORT", "8006"))
    base_url = f"http://{host}:{port}"

    print(f"πŸ§ͺ Testing HTTP server at {base_url}")
    print("=" * 50)

    # Test 1: Health check
    print("\n1. Testing health endpoint...")
    try:
    response = requests.get(f"{base_url}/health", timeout=5)
    if response.status_code == 200:
    print("βœ… Health check passed")
    if verbose:
    print(f" Response: {response.json()}")
    else:
    print(f"❌ Health check failed: {response.status_code}")
    return False
    except Exception as e:
    print(f"❌ Cannot connect to server: {e}")
    print("\nMake sure the HTTP server is running:")
    print(" python3 gemini_mcp_server.py --port 8006")
    return False

    # Test 2: Root endpoint
    print("\n2. Testing root endpoint...")
    try:
    response = requests.get(base_url)
    if response.status_code == 200:
    print("βœ… Root endpoint accessible")
    if verbose:
    print(f" Endpoints: {json.dumps(response.json()['endpoints'], indent=2)}")
    else:
    print(f"❌ Root endpoint failed: {response.status_code}")
    except Exception as e:
    print(f"❌ Root endpoint error: {e}")

    # Test 3: List tools
    print("\n3. Testing tools listing...")
    try:
    response = requests.get(f"{base_url}/mcp/tools")
    if response.status_code == 200:
    tools = response.json()["tools"]
    print(f"βœ… Found {len(tools)} tools:")
    for tool in tools:
    print(f" - {tool['name']}: {tool['description']}")
    else:
    print(f"❌ Tools listing failed: {response.status_code}")
    except Exception as e:
    print(f"❌ Tools listing error: {e}")

    # Test 4: Gemini status
    print("\n4. Testing Gemini status...")
    try:
    response = requests.get(f"{base_url}/mcp/tools/gemini_status")
    if response.status_code == 200:
    status = response.json()
    print("βœ… Status endpoint working")
    print(f" Gemini enabled: {status['configuration']['enabled']}")
    print(f" Auto-consult: {status['configuration']['auto_consult']}")
    print(f" CLI available: {status.get('cli_available', 'Unknown')}")
    if not status.get('cli_available') and verbose:
    print(f" CLI error: {status.get('cli_error', 'Unknown')}")
    else:
    print(f"❌ Status check failed: {response.status_code}")
    except Exception as e:
    print(f"❌ Status check error: {e}")

    # Test 5: Simple consultation (if Gemini is available)
    print("\n5. Testing Gemini consultation...")
    try:
    payload = {
    "query": "What is 2+2?",
    "context": "This is a test query",
    "comparison_mode": False
    }
    response = requests.post(
    f"{base_url}/mcp/tools/consult_gemini",
    json=payload,
    timeout=30
    )
    if response.status_code == 200:
    result = response.json()
    if result["status"] == "success":
    print("βœ… Consultation successful")
    if verbose:
    print(f" Response preview: {result['response'][:100]}...")
    print(f" Execution time: {result.get('execution_time', 0):.2f}s")
    else:
    print(f"⚠️ Consultation returned status: {result['status']}")
    if result.get("error"):
    print(f" Error: {result['error']}")
    else:
    print(f"❌ Consultation failed: {response.status_code}")
    if verbose:
    print(f" Response: {response.text}")
    except Exception as e:
    print(f"❌ Consultation error: {e}")

    print("\n" + "=" * 50)
    print("βœ… HTTP server tests completed")
    return True


    async def test_stdio_server(verbose=False):
    """Test stdio server basic connectivity"""
    print("πŸ§ͺ Testing stdio server")
    print("=" * 50)

    print("\n1. Testing basic MCP protocol...")
    print(" Note: Full stdio testing requires MCP client setup")

    # Try to import and initialize the server
    try:
    from gemini_mcp_server import MCPServer
    server = MCPServer()
    print("βœ… Server initialization successful")

    # Test configuration loading
    print("\n2. Testing configuration...")
    print(f" Gemini enabled: {server.gemini.enabled}")
    print(f" Auto-consult: {server.gemini.auto_consult}")
    print(f" CLI command: {server.gemini.cli_command}")

    # Test uncertainty detection
    print("\n3. Testing uncertainty detection...")
    test_phrases = [
    "I'm not sure about this approach",
    "This is definitely the right way",
    "We should consider multiple options"
    ]
    for phrase in test_phrases:
    has_uncertainty, patterns = server.detect_response_uncertainty(phrase)
    status = "βœ… Uncertain" if has_uncertainty else "❌ Certain"
    print(f" '{phrase[:30]}...' -> {status}")
    if verbose and has_uncertainty:
    print(f" Patterns: {', '.join(patterns)}")

    print("\n" + "=" * 50)
    print("βœ… stdio server tests completed")
    print("\nFor full stdio testing, configure Claude Code with:")
    print(' python3 gemini_mcp_server.py --project-root .')

    except ImportError as e:
    print(f"❌ Cannot import server: {e}")
    print("\nMake sure you're in the correct directory with:")
    print(" - gemini_mcp_server.py")
    print(" - gemini_integration.py")
    return False
    except Exception as e:
    print(f"❌ Server initialization error: {e}")
    return False

    return True


    def main():
    parser = argparse.ArgumentParser(description="Test Gemini MCP Server")
    parser.add_argument(
    "--mode",
    choices=["http", "stdio"],
    default="stdio",
    help="Server mode to test (default: stdio)"
    )
    parser.add_argument(
    "--verbose",
    "-v",
    action="store_true",
    help="Show detailed output"
    )

    args = parser.parse_args()

    print("πŸ€– Gemini MCP Server Test Suite")
    print("==============================")
    print(f"Testing mode: {args.mode}")
    print("")

    if args.mode == "http":
    success = test_http_server(args.verbose)
    else:
    success = asyncio.run(test_stdio_server(args.verbose))

    sys.exit(0 if success else 1)


    if __name__ == "__main__":
    main()
  11. AndrewAltimit revised this gist Jul 13, 2025. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions README.md
    @@ -6,8 +6,8 @@ A complete setup guide for integrating Google's Gemini CLI with Claude Code thro
    <img src="https://gist.github.com/user-attachments/assets/507ce5cd-30cd-4408-bb96-77508e7e4ac6" />
    </div>

    - [Example Template Repo](https://github.com/AndrewAltimit/template-repo) with Gemini CLI Integration
    - Gemini CLI Automated PR Reviews [Example PR](https://github.com/AndrewAltimit/template-repo/pull/9) , [Automation Script](https://github.com/AndrewAltimit/template-repo/blob/main/scripts/gemini-pr-review.py)
    - [Template repository](https://github.com/AndrewAltimit/template-repo) with Gemini CLI Integration
    - Gemini CLI Automated PR Reviews : [Example PR](https://github.com/AndrewAltimit/template-repo/pull/9) , [Automation Script](https://github.com/AndrewAltimit/template-repo/blob/main/scripts/gemini-pr-review.py)


    ## πŸš€ Quick Start
  12. AndrewAltimit revised this gist Jul 13, 2025. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion README.md
    @@ -7,7 +7,7 @@ A complete setup guide for integrating Google's Gemini CLI with Claude Code thro
    </div>

    - [Example Template Repo](https://github.com/AndrewAltimit/template-repo) with Gemini CLI Integration
    - Gemini CLI Automated PR Reviews [Example PR](https://github.com/AndrewAltimit/template-repo/pull/3) , [Automation Script](https://github.com/AndrewAltimit/template-repo/blob/main/scripts/gemini-pr-review.py)
    - Gemini CLI Automated PR Reviews [Example PR](https://github.com/AndrewAltimit/template-repo/pull/9) , [Automation Script](https://github.com/AndrewAltimit/template-repo/blob/main/scripts/gemini-pr-review.py)


    ## πŸš€ Quick Start
  13. AndrewAltimit revised this gist Jul 12, 2025. 1 changed file with 4 additions and 0 deletions.
    4 changes: 4 additions & 0 deletions README.md
    @@ -6,6 +6,10 @@ A complete setup guide for integrating Google's Gemini CLI with Claude Code thro
    <img src="https://gist.github.com/user-attachments/assets/507ce5cd-30cd-4408-bb96-77508e7e4ac6" />
    </div>

    - [Example Template Repo](https://github.com/AndrewAltimit/template-repo) with Gemini CLI Integration
    - Gemini CLI Automated PR Reviews [Example PR](https://github.com/AndrewAltimit/template-repo/pull/3) , [Automation Script](https://github.com/AndrewAltimit/template-repo/blob/main/scripts/gemini-pr-review.py)


    ## πŸš€ Quick Start

    ### 1. Install Gemini CLI (Host-based)
  14. AndrewAltimit revised this gist Jul 12, 2025. 1 changed file with 141 additions and 112 deletions.
    253 changes: 141 additions & 112 deletions gemini_integration.py
    @@ -1,17 +1,14 @@
    #!/usr/bin/env python3
    """
    Gemini CLI Integration Module MCP Server
    Gemini CLI Integration Module
    Provides automatic consultation with Gemini for second opinions and validation
    """

    import asyncio
    import json
    import logging
    import re
    import subprocess
    import time
    from datetime import datetime
    from pathlib import Path
    from typing import Any, Dict, List, Optional, Tuple

    # Setup logging
    @@ -20,168 +17,204 @@

    # Uncertainty patterns that trigger automatic Gemini consultation
    UNCERTAINTY_PATTERNS = [
    r"\bI'm not sure\b", r"\bI think\b", r"\bpossibly\b", r"\bprobably\b",
    r"\bmight be\b", r"\bcould be\b", r"\bI believe\b", r"\bIt seems\b",
    r"\bappears to be\b", r"\buncertain\b", r"\bI would guess\b",
    r"\blikely\b", r"\bperhaps\b", r"\bmaybe\b", r"\bI assume\b"
    r"\bI'm not sure\b",
    r"\bI think\b",
    r"\bpossibly\b",
    r"\bprobably\b",
    r"\bmight be\b",
    r"\bcould be\b",
    r"\bI believe\b",
    r"\bIt seems\b",
    r"\bappears to be\b",
    r"\buncertain\b",
    r"\bI would guess\b",
    r"\blikely\b",
    r"\bperhaps\b",
    r"\bmaybe\b",
    r"\bI assume\b",
    ]

    # Complex decision patterns that benefit from second opinions
    COMPLEX_DECISION_PATTERNS = [
    r"\bmultiple approaches\b", r"\bseveral options\b", r"\btrade-offs?\b",
    r"\bconsider(?:ing)?\b", r"\balternatives?\b", r"\bpros and cons\b",
    r"\bweigh(?:ing)? the options\b", r"\bchoice between\b", r"\bdecision\b"
    r"\bmultiple approaches\b",
    r"\bseveral options\b",
    r"\btrade-offs?\b",
    r"\bconsider(?:ing)?\b",
    r"\balternatives?\b",
    r"\bpros and cons\b",
    r"\bweigh(?:ing)? the options\b",
    r"\bchoice between\b",
    r"\bdecision\b",
    ]

    # Critical operations that should trigger consultation
    CRITICAL_OPERATION_PATTERNS = [
    r"\bproduction\b", r"\bdatabase migration\b", r"\bsecurity\b",
    r"\bauthentication\b", r"\bencryption\b", r"\bAPI key\b",
    r"\bcredentials?\b", r"\bperformance\s+critical\b"
    r"\bproduction\b",
    r"\bdatabase migration\b",
    r"\bsecurity\b",
    r"\bauthentication\b",
    r"\bencryption\b",
    r"\bAPI key\b",
    r"\bcredentials?\b",
    r"\bperformance\s+critical\b",
    ]


    class GeminiIntegration:
    """Handles Gemini CLI integration for second opinions and validation"""

    def __init__(self, config: Optional[Dict[str, Any]] = None):
    self.config = config or {}
    self.enabled = self.config.get('enabled', True)
    self.auto_consult = self.config.get('auto_consult', True)
    self.cli_command = self.config.get('cli_command', 'gemini')
    self.timeout = self.config.get('timeout', 60)
    self.rate_limit_delay = self.config.get('rate_limit_delay', 2.0)
    self.enabled = self.config.get("enabled", True)
    self.auto_consult = self.config.get("auto_consult", True)
    self.cli_command = self.config.get("cli_command", "gemini")
    self.timeout = self.config.get("timeout", 60)
    self.rate_limit_delay = self.config.get("rate_limit_delay", 2.0)
    self.last_consultation = 0
    self.consultation_log = []
    self.max_context_length = self.config.get('max_context_length', 4000)
    self.model = self.config.get('model', 'gemini-2.5-flash')
    self.max_context_length = self.config.get("max_context_length", 4000)
    self.model = self.config.get("model", "gemini-2.5-flash")

    # Conversation history for maintaining state
    self.conversation_history = []
    self.max_history_entries = self.config.get('max_history_entries', 10)
    self.include_history = self.config.get('include_history', True)
    self.max_history_entries = self.config.get("max_history_entries", 10)
    self.include_history = self.config.get("include_history", True)

    async def consult_gemini(self, query: str, context: str = "",
    comparison_mode: bool = True,
    force_consult: bool = False) -> Dict[str, Any]:
    async def consult_gemini(
    self,
    query: str,
    context: str = "",
    comparison_mode: bool = True,
    force_consult: bool = False,
    ) -> Dict[str, Any]:
    """Consult Gemini CLI for second opinion"""
    if not self.enabled and not force_consult:
    return {'status': 'disabled', 'message': 'Gemini integration is disabled'}
    return {"status": "disabled", "message": "Gemini integration is disabled"}

    if not force_consult:
    await self._enforce_rate_limit()

    consultation_id = f"consult_{int(time.time())}_{len(self.consultation_log)}"

    try:
    # Prepare query with context
    full_query = self._prepare_query(query, context, comparison_mode)

    # Execute Gemini CLI command
    result = await self._execute_gemini_cli(full_query)

    # Save to conversation history
    if self.include_history and result.get('output'):
    self.conversation_history.append((query, result['output']))
    if self.include_history and result.get("output"):
    self.conversation_history.append((query, result["output"]))
    # Trim history if it exceeds max entries
    if len(self.conversation_history) > self.max_history_entries:
    self.conversation_history = self.conversation_history[-self.max_history_entries:]

    self.conversation_history = self.conversation_history[
    -self.max_history_entries :
    ]

    # Log consultation
    if self.config.get('log_consultations', True):
    self.consultation_log.append({
    'id': consultation_id,
    'timestamp': datetime.now().isoformat(),
    'query': query[:200] + "..." if len(query) > 200 else query,
    'status': 'success',
    'execution_time': result.get('execution_time', 0)
    })

    if self.config.get("log_consultations", True):
    self.consultation_log.append(
    {
    "id": consultation_id,
    "timestamp": datetime.now().isoformat(),
    "query": (query[:200] + "..." if len(query) > 200 else query),
    "status": "success",
    "execution_time": result.get("execution_time", 0),
    }
    )

    return {
    'status': 'success',
    'response': result['output'],
    'execution_time': result['execution_time'],
    'consultation_id': consultation_id,
    'timestamp': datetime.now().isoformat()
    "status": "success",
    "response": result["output"],
    "execution_time": result["execution_time"],
    "consultation_id": consultation_id,
    "timestamp": datetime.now().isoformat(),
    }

    except Exception as e:
    logger.error(f"Error consulting Gemini: {str(e)}")
    return {
    'status': 'error',
    'error': str(e),
    'consultation_id': consultation_id
    "status": "error",
    "error": str(e),
    "consultation_id": consultation_id,
    }

    def detect_uncertainty(self, text: str) -> Tuple[bool, List[str]]:
    """Detect if text contains uncertainty patterns"""
    found_patterns = []

    # Check uncertainty patterns
    for pattern in UNCERTAINTY_PATTERNS:
    if re.search(pattern, text, re.IGNORECASE):
    found_patterns.append(f"uncertainty: {pattern}")

    # Check complex decision patterns
    for pattern in COMPLEX_DECISION_PATTERNS:
    if re.search(pattern, text, re.IGNORECASE):
    found_patterns.append(f"complex_decision: {pattern}")

    # Check critical operation patterns
    for pattern in CRITICAL_OPERATION_PATTERNS:
    if re.search(pattern, text, re.IGNORECASE):
    found_patterns.append(f"critical_operation: {pattern}")

    return len(found_patterns) > 0, found_patterns

    def clear_conversation_history(self) -> Dict[str, Any]:
    """Clear the conversation history"""
    old_count = len(self.conversation_history)
    self.conversation_history = []
    return {
    'status': 'success',
    'cleared_entries': old_count,
    'message': f'Cleared {old_count} conversation entries'
    "status": "success",
    "cleared_entries": old_count,
    "message": f"Cleared {old_count} conversation entries",
    }

    def get_consultation_stats(self) -> Dict[str, Any]:
    """Get statistics about consultations"""
    if not self.consultation_log:
    return {'total_consultations': 0}
    completed = [e for e in self.consultation_log if e.get('status') == 'success']
    return {"total_consultations": 0}

    completed = [e for e in self.consultation_log if e.get("status") == "success"]

    return {
    'total_consultations': len(self.consultation_log),
    'completed_consultations': len(completed),
    'average_execution_time': sum(e.get('execution_time', 0) for e in completed) / len(completed) if completed else 0,
    'conversation_history_size': len(self.conversation_history)
    "total_consultations": len(self.consultation_log),
    "completed_consultations": len(completed),
    "average_execution_time": (
    sum(e.get("execution_time", 0) for e in completed) / len(completed)
    if completed
    else 0
    ),
    "conversation_history_size": len(self.conversation_history),
    }

    async def _enforce_rate_limit(self):
    """Enforce rate limiting between consultations"""
    current_time = time.time()
    time_since_last = current_time - self.last_consultation

    if time_since_last < self.rate_limit_delay:
    sleep_time = self.rate_limit_delay - time_since_last
    await asyncio.sleep(sleep_time)

    self.last_consultation = time.time()

    def _prepare_query(self, query: str, context: str, comparison_mode: bool) -> str:
    """Prepare the full query for Gemini CLI"""
    parts = []

    if comparison_mode:
    parts.append("Please provide a technical analysis and second opinion:")
    parts.append("")

    # Include conversation history if enabled and available
    if self.include_history and self.conversation_history:
    parts.append("Previous conversation:")
    parts.append("-" * 40)
    for i, (prev_q, prev_a) in enumerate(self.conversation_history[-self.max_history_entries:], 1):
    for i, (prev_q, prev_a) in enumerate(
    self.conversation_history[-self.max_history_entries :], 1
    ):
    parts.append(f"Q{i}: {prev_q}")
    # Truncate long responses in history
    if len(prev_a) > 500:
    @@ -191,66 +224,62 @@ def _prepare_query(self, query: str, context: str, comparison_mode: bool) -> str
    parts.append("")
    parts.append("-" * 40)
    parts.append("")

    # Truncate context if too long
    if len(context) > self.max_context_length:
    context = context[:self.max_context_length] + "\n[Context truncated...]"
    context = context[: self.max_context_length] + "\n[Context truncated...]"

    if context:
    parts.append("Context:")
    parts.append(context)
    parts.append("")

    parts.append("Current Question/Topic:")
    parts.append(query)

    if comparison_mode:
    parts.extend([
    "",
    "Please structure your response with:",
    "1. Your analysis and understanding",
    "2. Recommendations or approach",
    "3. Any concerns or considerations",
    "4. Alternative approaches (if applicable)"
    ])

    parts.extend(
    [
    "",
    "Please structure your response with:",
    "1. Your analysis and understanding",
    "2. Recommendations or approach",
    "3. Any concerns or considerations",
    "4. Alternative approaches (if applicable)",
    ]
    )

    return "\n".join(parts)

    async def _execute_gemini_cli(self, query: str) -> Dict[str, Any]:
    """Execute Gemini CLI command and return results"""
    start_time = time.time()

    # Build command
    cmd = [self.cli_command]
    if self.model:
    cmd.extend(['-m', self.model])
    cmd.extend(['-p', query]) # Non-interactive mode
    cmd.extend(["-m", self.model])
    cmd.extend(["-p", query]) # Non-interactive mode

    try:
    process = await asyncio.create_subprocess_exec(
    *cmd,
    stdout=asyncio.subprocess.PIPE,
    stderr=asyncio.subprocess.PIPE
    *cmd, stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE
    )

    stdout, stderr = await asyncio.wait_for(
    process.communicate(),
    timeout=self.timeout
    process.communicate(), timeout=self.timeout
    )

    execution_time = time.time() - start_time

    if process.returncode != 0:
    error_msg = stderr.decode() if stderr else "Unknown error"
    if "authentication" in error_msg.lower():
    error_msg += "\nTip: Run 'gemini' interactively to authenticate"
    raise Exception(f"Gemini CLI failed: {error_msg}")

    return {
    'output': stdout.decode().strip(),
    'execution_time': execution_time
    }


    return {"output": stdout.decode().strip(), "execution_time": execution_time}

    except asyncio.TimeoutError:
    raise Exception(f"Gemini CLI timed out after {self.timeout} seconds")

    @@ -262,18 +291,18 @@ async def _execute_gemini_cli(self, query: str) -> Dict[str, Any]:
    def get_integration(config: Optional[Dict[str, Any]] = None) -> GeminiIntegration:
    """
    Get or create the global Gemini integration instance.
    This ensures that all parts of the application share the same instance,
    maintaining consistent state for rate limiting, consultation history,
    and configuration across all tool calls.
    Args:
    config: Optional configuration dict. Only used on first call.
    Returns:
    The singleton GeminiIntegration instance
    """
    global _integration
    if _integration is None:
    _integration = GeminiIntegration(config)
    return _integration
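The singleton accessor above means configuration is only honored on the first call; every later caller shares the same instance, which is what keeps rate limiting and consultation history consistent across tool calls. A minimal self-contained sketch of that behavior (stand-in class, not the full module):

```python
from typing import Any, Dict, Optional

_integration = None


class GeminiIntegration:
    # Stand-in for the real class defined in gemini_integration.py
    def __init__(self, config: Optional[Dict[str, Any]] = None):
        self.config = config or {}


def get_integration(config: Optional[Dict[str, Any]] = None) -> GeminiIntegration:
    # Config is used only when the instance is first created;
    # subsequent calls return the existing instance unchanged.
    global _integration
    if _integration is None:
        _integration = GeminiIntegration(config)
    return _integration


first = get_integration({"timeout": 60})
second = get_integration({"timeout": 120})  # config ignored: instance exists
assert first is second
assert first.config["timeout"] == 60
```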
  15. AndrewAltimit revised this gist Jul 11, 2025. 2 changed files with 2 additions and 1 deletion.
    1 change: 1 addition & 0 deletions README.md
    Original file line number Diff line number Diff line change
    @@ -191,6 +191,7 @@ GEMINI_TIMEOUT=200 # Query timeout in seconds
    GEMINI_RATE_LIMIT=5 # Delay between calls (seconds)
    GEMINI_MAX_CONTEXT= # Max context length
    GEMINI_MODEL=gemini-2.5-flash # Model to use
    GEMINI_SANDBOX=false # Sandboxing isolates operations (such as shell commands or file modifications) from your host system
    GEMINI_API_KEY= # Optional (blank for free tier, keys disable free mode!)
    ```
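These variables can also live in a `.env` file next to the server; a minimal parser sketch in the spirit of the loader added to `mcp-server.py` in this revision (the function name here is illustrative, and already-set environment variables win over file values):

```python
import os
import tempfile


def load_env_file(path: str) -> dict:
    """Parse KEY=VALUE lines; comments are skipped, existing env vars win."""
    loaded = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, value = line.split("=", 1)
                if key not in os.environ:
                    os.environ[key] = value
                loaded[key] = value
    return loaded


# Example: a throwaway .env file with one comment and two settings
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as f:
    f.write("# Gemini settings\nGEMINI_TIMEOUT=200\nGEMINI_SANDBOX=false\n")
    env_path = f.name

settings = load_env_file(env_path)
```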

    2 changes: 1 addition & 1 deletion mcp-server.py
    Original file line number Diff line number Diff line change
    @@ -60,7 +60,7 @@ def _load_gemini_config(self) -> Dict[str, Any]:
    'max_context_length': int(os.getenv('GEMINI_MAX_CONTEXT', '4000')),
    'log_consultations': os.getenv('GEMINI_LOG_CONSULTATIONS', 'true').lower() == 'true',
    'model': os.getenv('GEMINI_MODEL', 'gemini-2.5-flash'),
    'sandbox_mode': os.getenv('GEMINI_SANDBOX', 'true').lower() == 'true',
    'sandbox_mode': os.getenv('GEMINI_SANDBOX', 'false').lower() == 'true',
    'debug_mode': os.getenv('GEMINI_DEBUG', 'false').lower() == 'true',
    'include_history': os.getenv('GEMINI_INCLUDE_HISTORY', 'true').lower() == 'true',
    'max_history_entries': int(os.getenv('GEMINI_MAX_HISTORY', '10')),
  16. AndrewAltimit revised this gist Jul 11, 2025. 1 changed file with 29 additions and 6 deletions.
    35 changes: 29 additions & 6 deletions README.md
    Original file line number Diff line number Diff line change
    @@ -333,29 +333,52 @@ async def handle_list_tools():
    return [
    types.Tool(
    name="consult_gemini",
    description="Consult Gemini for a second opinion",
    description="Consult Gemini CLI for a second opinion or validation",
    inputSchema={
    "type": "object",
    "properties": {
    "query": {"type": "string", "description": "Question for Gemini"},
    "context": {"type": "string", "description": "Additional context"}
    "query": {
    "type": "string",
    "description": "The question or topic to consult Gemini about"
    },
    "context": {
    "type": "string",
    "description": "Additional context for the consultation"
    },
    "comparison_mode": {
    "type": "boolean",
    "description": "Whether to request structured comparison format",
    "default": True
    },
    "force": {
    "type": "boolean",
    "description": "Force consultation even if Gemini is disabled",
    "default": False
    }
    },
    "required": ["query"]
    }
    ),
    types.Tool(
    name="gemini_status",
    description="Check Gemini integration status"
    description="Get Gemini integration status and statistics"
    ),
    types.Tool(
    name="toggle_gemini_auto_consult",
    description="Enable/disable automatic consultation",
    description="Toggle automatic Gemini consultation on uncertainty detection",
    inputSchema={
    "type": "object",
    "properties": {
    "enable": {"type": "boolean", "description": "Enable or disable"}
    "enable": {
    "type": "boolean",
    "description": "Enable (true) or disable (false) auto-consultation. If not provided, toggles current state"
    }
    }
    }
    ),
    types.Tool(
    name="clear_gemini_history",
    description="Clear Gemini conversation history"
    )
    ]
    ```
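Given the `inputSchema` above, a server can reject malformed calls before dispatch. A small sketch of the `required`-field check (the helper name is illustrative; full JSON Schema validation would also check types and defaults):

```python
# Schema mirrors the consult_gemini tool definition above
consult_gemini_schema = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "context": {"type": "string"},
        "comparison_mode": {"type": "boolean", "default": True},
        "force": {"type": "boolean", "default": False},
    },
    "required": ["query"],
}


def missing_required(arguments: dict, schema: dict) -> list:
    # Return the names of required keys absent from the arguments
    return [key for key in schema.get("required", []) if key not in arguments]


assert missing_required({"query": "WebSockets or gRPC?"}, consult_gemini_schema) == []
assert missing_required({"context": "realtime app"}, consult_gemini_schema) == ["query"]
```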
  17. AndrewAltimit revised this gist Jul 11, 2025. 5 changed files with 569 additions and 162 deletions.
    82 changes: 79 additions & 3 deletions README.md
    Original file line number Diff line number Diff line change
    @@ -95,14 +95,90 @@ Automatically detects patterns like:
    - Critical operations: "security", "production", "database migration"

    ### 2. MCP Tools Available
    - `consult_gemini` - Manual consultation with context
    - `gemini_status` - Check integration status and statistics
    - `toggle_gemini_auto_consult` - Enable/disable auto-consultation

    #### `consult_gemini`
    Manual consultation with Gemini for second opinions or validation.

    **Parameters:**
    - `query` (required): The question or topic to consult Gemini about
    - `context` (optional): Additional context for the consultation
    - `comparison_mode` (optional, default: true): Whether to request structured comparison format
    - `force` (optional, default: false): Force consultation even if disabled

    **Example:**
    ```python
    # In Claude Code
    Use the consult_gemini tool with:
    query: "Should I use WebSockets or gRPC for real-time communication?"
    context: "Building a multiplayer application with real-time updates"
    comparison_mode: true
    ```

    #### `gemini_status`
    Check Gemini integration status and statistics.

    **Returns:**
    - Configuration status (enabled, auto-consult, CLI command, timeout, rate limit)
    - Gemini CLI availability and version
    - Consultation statistics (total, completed, average time)
    - Last consultation timestamp

    **Example:**
    ```python
    # Check current status
    Use the gemini_status tool
    ```

    #### `toggle_gemini_auto_consult`
    Enable or disable automatic Gemini consultation on uncertainty detection.

    **Parameters:**
    - `enable` (optional): true to enable, false to disable. If not provided, toggles current state.

    **Example:**
    ```python
    # Toggle auto-consultation
    Use the toggle_gemini_auto_consult tool

    # Or explicitly enable/disable
    Use the toggle_gemini_auto_consult tool with:
    enable: false
    ```

    #### `clear_gemini_history`
    Clear Gemini conversation history to start fresh.

    **Example:**
    ```python
    # Clear all consultation history
    Use the clear_gemini_history tool
    ```

    ### 3. Response Synthesis
    - Identifies agreement/disagreement between Claude and Gemini
    - Provides confidence levels (high/medium/low)
    - Generates combined recommendations
    - Tracks execution time and consultation ID

    ### 4. Advanced Features

    #### Uncertainty Detection API
    The MCP server exposes methods for detecting uncertainty:

    ```python
    # Detect uncertainty in responses
    has_uncertainty, patterns = server.detect_response_uncertainty(response_text)

    # Automatically consult if uncertain
    result = await server.maybe_consult_gemini(response_text, context)
    ```
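A self-contained illustration of the detection side (the regex patterns here are placeholders, not the module's actual `UNCERTAINTY_PATTERNS` list):

```python
import re

# Illustrative patterns only; the real lists live in gemini_integration.py
UNCERTAINTY_PATTERNS = [
    r"\bI'?m not sure\b",
    r"\bmight be\b",
    r"\bpossibly\b",
]


def detect_uncertainty(text: str):
    # Return (flag, matched patterns), case-insensitive
    found = [p for p in UNCERTAINTY_PATTERNS if re.search(p, text, re.IGNORECASE)]
    return bool(found), found


has_unc, patterns = detect_uncertainty("This might be a race condition")
```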

    #### Statistics Tracking
    - Total consultations attempted
    - Successful completions
    - Average execution time
    - Last consultation timestamp
    - Error tracking and timeout monitoring

    ## βš™οΈ Configuration

    4 changes: 3 additions & 1 deletion gemini-config.json
    Original file line number Diff line number Diff line change
    @@ -6,8 +6,10 @@
    "rate_limit_delay": 5.0,
    "log_consultations": true,
    "model": "gemini-2.5-pro",
    "sandbox_mode": true,
    "sandbox_mode": false,
    "debug_mode": false,
    "include_history": true,
    "max_history_entries": 10,
    "uncertainty_thresholds": {
    "uncertainty_patterns": true,
    "complex_decisions": true,
    67 changes: 60 additions & 7 deletions gemini_integration.py
    Original file line number Diff line number Diff line change
    @@ -1,6 +1,6 @@
    #!/usr/bin/env python3
    """
    Gemini CLI Integration Module
Gemini CLI Integration Module for the MCP Server
    Provides automatic consultation with Gemini for second opinions and validation
    """

    @@ -55,18 +55,23 @@ def __init__(self, config: Optional[Dict[str, Any]] = None):
    self.consultation_log = []
    self.max_context_length = self.config.get('max_context_length', 4000)
    self.model = self.config.get('model', 'gemini-2.5-flash')

    # Conversation history for maintaining state
    self.conversation_history = []
    self.max_history_entries = self.config.get('max_history_entries', 10)
    self.include_history = self.config.get('include_history', True)

    async def consult_gemini(self, query: str, context: str = "",
    comparison_mode: bool = True,
    force_consult: bool = False) -> Dict[str, Any]:
    """Consult Gemini CLI for second opinion"""
    if not self.enabled:
    if not self.enabled and not force_consult:
    return {'status': 'disabled', 'message': 'Gemini integration is disabled'}

    if not force_consult:
    await self._enforce_rate_limit()

    consultation_id = f"consult_{int(time.time())}"
    consultation_id = f"consult_{int(time.time())}_{len(self.consultation_log)}"

    try:
    # Prepare query with context
    @@ -75,6 +80,13 @@ async def consult_gemini(self, query: str, context: str = "",
    # Execute Gemini CLI command
    result = await self._execute_gemini_cli(full_query)

    # Save to conversation history
    if self.include_history and result.get('output'):
    self.conversation_history.append((query, result['output']))
    # Trim history if it exceeds max entries
    if len(self.conversation_history) > self.max_history_entries:
    self.conversation_history = self.conversation_history[-self.max_history_entries:]

    # Log consultation
    if self.config.get('log_consultations', True):
    self.consultation_log.append({
    @@ -122,6 +134,30 @@ def detect_uncertainty(self, text: str) -> Tuple[bool, List[str]]:

    return len(found_patterns) > 0, found_patterns

    def clear_conversation_history(self) -> Dict[str, Any]:
    """Clear the conversation history"""
    old_count = len(self.conversation_history)
    self.conversation_history = []
    return {
    'status': 'success',
    'cleared_entries': old_count,
    'message': f'Cleared {old_count} conversation entries'
    }

    def get_consultation_stats(self) -> Dict[str, Any]:
    """Get statistics about consultations"""
    if not self.consultation_log:
    return {'total_consultations': 0}

    completed = [e for e in self.consultation_log if e.get('status') == 'success']

    return {
    'total_consultations': len(self.consultation_log),
    'completed_consultations': len(completed),
    'average_execution_time': sum(e.get('execution_time', 0) for e in completed) / len(completed) if completed else 0,
    'conversation_history_size': len(self.conversation_history)
    }

    async def _enforce_rate_limit(self):
    """Enforce rate limiting between consultations"""
    current_time = time.time()
    @@ -135,20 +171,37 @@ async def _enforce_rate_limit(self):

    def _prepare_query(self, query: str, context: str, comparison_mode: bool) -> str:
    """Prepare the full query for Gemini CLI"""
    if len(context) > self.max_context_length:
    context = context[:self.max_context_length] + "\n[Context truncated...]"

    parts = []

    if comparison_mode:
    parts.append("Please provide a technical analysis and second opinion:")
    parts.append("")

    # Include conversation history if enabled and available
    if self.include_history and self.conversation_history:
    parts.append("Previous conversation:")
    parts.append("-" * 40)
    for i, (prev_q, prev_a) in enumerate(self.conversation_history[-self.max_history_entries:], 1):
    parts.append(f"Q{i}: {prev_q}")
    # Truncate long responses in history
    if len(prev_a) > 500:
    parts.append(f"A{i}: {prev_a[:500]}... [truncated]")
    else:
    parts.append(f"A{i}: {prev_a}")
    parts.append("")
    parts.append("-" * 40)
    parts.append("")

    # Truncate context if too long
    if len(context) > self.max_context_length:
    context = context[:self.max_context_length] + "\n[Context truncated...]"

    if context:
    parts.append("Context:")
    parts.append(context)
    parts.append("")

    parts.append("Question/Topic:")
    parts.append("Current Question/Topic:")
    parts.append(query)

    if comparison_mode:
    447 changes: 296 additions & 151 deletions mcp-server.py
    Original file line number Diff line number Diff line change
    @@ -9,198 +9,343 @@
    import os
    import sys
    from pathlib import Path
    from typing import Any, Dict, List
    from typing import Any, Dict, List, Optional, Tuple

    import mcp.server.stdio
    import mcp.types as types
    from mcp.server import Server
    from mcp.server import NotificationOptions, Server, InitializationOptions

    # Import Gemini integration
    # Assuming gemini_integration.py is in the same directory or properly installed
    from gemini_integration import get_integration


    class MCPServer:
    def __init__(self, project_root: str = None):
    self.project_root = Path(project_root) if project_root else Path.cwd()
    self.server = Server("mcp-server")
    self.server = Server("gemini-mcp-server")

    # Initialize Gemini integration with singleton pattern
    self.gemini_config = self._load_gemini_config()
    # Get the singleton instance, passing config on first call
    self.gemini = get_integration(self.gemini_config)

    # Track uncertainty for auto-consultation
    self.last_response_uncertainty = None

    self._setup_tools()

    def _load_gemini_config(self) -> Dict[str, Any]:
    """Load Gemini configuration from file and environment"""
    config = {}
    """Load Gemini configuration from environment or config file."""
    # Try to load .env file if it exists
    env_file = self.project_root / '.env'
    if env_file.exists():
    try:
    with open(env_file, 'r') as f:
    for line in f:
    line = line.strip()
    if line and not line.startswith('#') and '=' in line:
    key, value = line.split('=', 1)
    # Only set if not already in environment
    if key not in os.environ:
    os.environ[key] = value
    except Exception as e:
    print(f"Warning: Could not load .env file: {e}")

    # Load from config file if exists
    config_file = self.project_root / "gemini-config.json"
    if config_file.exists():
    with open(config_file) as f:
    config = json.load(f)

    # Override with environment variables
    env_mapping = {
    'GEMINI_ENABLED': ('enabled', lambda x: x.lower() == 'true'),
    'GEMINI_AUTO_CONSULT': ('auto_consult', lambda x: x.lower() == 'true'),
    'GEMINI_CLI_COMMAND': ('cli_command', str),
    'GEMINI_TIMEOUT': ('timeout', int),
    'GEMINI_RATE_LIMIT': ('rate_limit_delay', float),
    'GEMINI_MODEL': ('model', str),
    config = {
    'enabled': os.getenv('GEMINI_ENABLED', 'true').lower() == 'true',
    'auto_consult': os.getenv('GEMINI_AUTO_CONSULT', 'true').lower() == 'true',
    'cli_command': os.getenv('GEMINI_CLI_COMMAND', 'gemini'),
    'timeout': int(os.getenv('GEMINI_TIMEOUT', '60')),
    'rate_limit_delay': float(os.getenv('GEMINI_RATE_LIMIT', '2')),
    'max_context_length': int(os.getenv('GEMINI_MAX_CONTEXT', '4000')),
    'log_consultations': os.getenv('GEMINI_LOG_CONSULTATIONS', 'true').lower() == 'true',
    'model': os.getenv('GEMINI_MODEL', 'gemini-2.5-flash'),
    'sandbox_mode': os.getenv('GEMINI_SANDBOX', 'true').lower() == 'true',
    'debug_mode': os.getenv('GEMINI_DEBUG', 'false').lower() == 'true',
    'include_history': os.getenv('GEMINI_INCLUDE_HISTORY', 'true').lower() == 'true',
    'max_history_entries': int(os.getenv('GEMINI_MAX_HISTORY', '10')),
    }

    for env_key, (config_key, converter) in env_mapping.items():
    value = os.getenv(env_key)
    if value is not None:
    config[config_key] = converter(value)
    # Try to load from config file
    config_file = self.project_root / 'gemini-config.json'
    if config_file.exists():
    try:
    with open(config_file, 'r') as f:
    file_config = json.load(f)
    config.update(file_config)
    except Exception as e:
    print(f"Warning: Could not load gemini-config.json: {e}")

    return config

    def _setup_tools(self):
    """Register all MCP tools"""
    @self.server.list_tools()
    async def handle_list_tools():
    return [
    types.Tool(
    name="consult_gemini",
    description="Consult Gemini for a second opinion or validation",
    inputSchema={
    "type": "object",
    "properties": {
    "query": {
    "type": "string",
    "description": "The question or topic to consult Gemini about"
    },
    "context": {
    "type": "string",
    "description": "Additional context for the consultation"
    },
    "comparison_mode": {
    "type": "boolean",
    "description": "Whether to request structured comparison format",
    "default": True
    }
    },
    "required": ["query"]
    }
    ),
    types.Tool(
    name="gemini_status",
    description="Check Gemini integration status and statistics"
    ),
    types.Tool(
    name="toggle_gemini_auto_consult",
    description="Enable or disable automatic Gemini consultation",
    inputSchema={
    "type": "object",
    "properties": {
    "enable": {
    "type": "boolean",
    "description": "Enable (true) or disable (false) auto-consultation"
    }
    }
    }
    )
    ]


    # Gemini consultation tool
    @self.server.call_tool()
    async def handle_call_tool(name: str, arguments: Dict[str, Any]):
    if name == "consult_gemini":
    return await self._handle_consult_gemini(arguments)
    elif name == "gemini_status":
    return await self._handle_gemini_status(arguments)
    elif name == "toggle_gemini_auto_consult":
    return await self._handle_toggle_auto_consult(arguments)
    else:
    raise ValueError(f"Unknown tool: {name}")

    async def _handle_consult_gemini(self, arguments: Dict[str, Any]) -> List[types.TextContent]:
    """Handle Gemini consultation requests"""
    query = arguments.get('query', '')
    context = arguments.get('context', '')
    comparison_mode = arguments.get('comparison_mode', True)
    async def consult_gemini(arguments: Dict[str, Any]) -> List[types.TextContent]:
    """Consult Gemini CLI for a second opinion or validation."""
    query = arguments.get('query', '')
    context = arguments.get('context', '')
    comparison_mode = arguments.get('comparison_mode', True)
    force_consult = arguments.get('force', False)

    if not query:
    return [types.TextContent(
    type="text",
    text="❌ Error: 'query' parameter is required for Gemini consultation"
    )]

    # Consult Gemini
    result = await self.gemini.consult_gemini(
    query=query,
    context=context,
    comparison_mode=comparison_mode,
    force_consult=force_consult
    )

    # Format the response
    return await self._format_gemini_response(result)

    @self.server.call_tool()
    async def gemini_status(arguments: Dict[str, Any]) -> List[types.TextContent]:
    """Get Gemini integration status and statistics."""
    return await self._get_gemini_status()

    if not query:
    @self.server.call_tool()
    async def toggle_gemini_auto_consult(arguments: Dict[str, Any]) -> List[types.TextContent]:
    """Toggle automatic Gemini consultation on uncertainty detection."""
    enable = arguments.get('enable', None)

    if enable is None:
    # Toggle current state
    self.gemini.auto_consult = not self.gemini.auto_consult
    else:
    self.gemini.auto_consult = bool(enable)

    status = "enabled" if self.gemini.auto_consult else "disabled"
    return [types.TextContent(
    type="text",
    text="❌ Error: 'query' parameter is required for Gemini consultation"
    text=f"βœ… Gemini auto-consultation is now {status}"
    )]

    result = await self.gemini.consult_gemini(
    query=query,
    context=context,
    comparison_mode=comparison_mode
    )
    @self.server.call_tool()
    async def clear_gemini_history(arguments: Dict[str, Any]) -> List[types.TextContent]:
    """Clear Gemini conversation history."""
    result = self.gemini.clear_conversation_history()
    return [types.TextContent(
    type="text",
    text=f"βœ… {result['message']}"
    )]

    async def _format_gemini_response(self, result: Dict[str, Any]) -> List[types.TextContent]:
    """Format Gemini consultation response for MCP output."""
    output_lines = []
    output_lines.append("πŸ€– Gemini Consultation Response")
    output_lines.append("=" * 40)
    output_lines.append("")

    if result['status'] == 'success':
    response_text = f"πŸ€– **Gemini Second Opinion**\n\n{result['response']}\n\n"
    response_text += f"⏱️ *Consultation completed in {result['execution_time']:.2f}s*"
    else:
    response_text = f"❌ **Gemini Consultation Failed**\n\nError: {result.get('error', 'Unknown error')}"
    output_lines.append(f"βœ… Consultation ID: {result['consultation_id']}")
    output_lines.append(f"⏱️ Execution time: {result['execution_time']:.2f}s")
    output_lines.append("")

    # Display the raw response (simplified format)
    response = result.get('response', '')
    if response:
    output_lines.append("πŸ“„ Response:")
    output_lines.append(response)

    return [types.TextContent(type="text", text=response_text)]

    async def _handle_gemini_status(self, arguments: Dict[str, Any]) -> List[types.TextContent]:
    """Handle Gemini status requests"""
    status_lines = [
    "πŸ€– **Gemini Integration Status**",
    "",
    f"β€’ **Enabled**: {'βœ… Yes' if self.gemini.enabled else '❌ No'}",
    f"β€’ **Auto-consult**: {'βœ… Yes' if self.gemini.auto_consult else '❌ No'}",
    f"β€’ **CLI Command**: `{self.gemini.cli_command}`",
    f"β€’ **Model**: {self.gemini.model}",
    f"β€’ **Rate Limit**: {self.gemini.rate_limit_delay}s between calls",
    f"β€’ **Timeout**: {self.gemini.timeout}s",
    "",
    f"πŸ“Š **Statistics**:",
    f"β€’ **Total Consultations**: {len(self.gemini.consultation_log)}",
    ]

    if self.gemini.consultation_log:
    recent = self.gemini.consultation_log[-1]
    status_lines.append(f"β€’ **Last Consultation**: {recent['timestamp']}")

    return [types.TextContent(type="text", text="\n".join(status_lines))]

    async def _handle_toggle_auto_consult(self, arguments: Dict[str, Any]) -> List[types.TextContent]:
    """Handle toggle auto-consultation requests"""
    enable = arguments.get('enable')

    if enable is None:
    # Toggle current state
    self.gemini.auto_consult = not self.gemini.auto_consult
    else:
    self.gemini.auto_consult = enable

    status = "enabled" if self.gemini.auto_consult else "disabled"
    return [types.TextContent(
    type="text",
    text=f"πŸ”„ Auto-consultation has been **{status}**"
    )]

    async def run(self):
    """Run the MCP server"""
    async with mcp.server.stdio.stdio_server() as (read_stream, write_stream):
    await self.server.run(
    read_stream,
    write_stream,
    self.server.create_initialization_options()
    elif result['status'] == 'disabled':
    output_lines.append("ℹ️ Gemini consultation is currently disabled")
    output_lines.append("πŸ’‘ Enable with: toggle_gemini_auto_consult")

    elif result['status'] == 'timeout':
    output_lines.append(f"❌ {result['error']}")
    output_lines.append("πŸ’‘ Try increasing the timeout or simplifying the query")

    else: # error
    output_lines.append(f"❌ Error: {result.get('error', 'Unknown error')}")
    output_lines.append("")
    output_lines.append("πŸ’‘ Troubleshooting:")
    output_lines.append(" 1. Check if Gemini CLI is installed and in PATH")
    output_lines.append(" 2. Verify Gemini CLI authentication")
    output_lines.append(" 3. Check the logs for more details")

    return [types.TextContent(type="text", text="\n".join(output_lines))]

    async def _get_gemini_status(self) -> List[types.TextContent]:
    """Get Gemini integration status and statistics."""
    output_lines = []
    output_lines.append("πŸ€– Gemini Integration Status")
    output_lines.append("=" * 40)
    output_lines.append("")

    # Configuration status
    output_lines.append("βš™οΈ Configuration:")
    output_lines.append(f" β€’ Enabled: {'βœ… Yes' if self.gemini.enabled else '❌ No'}")
    output_lines.append(f" β€’ Auto-consult: {'βœ… Yes' if self.gemini.auto_consult else '❌ No'}")
    output_lines.append(f" β€’ CLI command: {self.gemini.cli_command}")
    output_lines.append(f" β€’ Timeout: {self.gemini.timeout}s")
    output_lines.append(f" β€’ Rate limit: {self.gemini.rate_limit_delay}s")
    output_lines.append("")

    # Check if Gemini CLI is available
    try:
    # Test with a simple prompt rather than --version (which may not be supported)
    check_process = await asyncio.create_subprocess_exec(
    self.gemini.cli_command, "-p", "test",
    stdout=asyncio.subprocess.PIPE,
    stderr=asyncio.subprocess.PIPE
    )
    stdout, stderr = await asyncio.wait_for(check_process.communicate(), timeout=10)

    if check_process.returncode == 0:
    output_lines.append("βœ… Gemini CLI is available and working")
    # Try to get version info from help or other means
    try:
    help_process = await asyncio.create_subprocess_exec(
    self.gemini.cli_command, "--help",
    stdout=asyncio.subprocess.PIPE,
    stderr=asyncio.subprocess.PIPE
    )
    help_stdout, _ = await help_process.communicate()
    help_text = help_stdout.decode()
    # Look for version in help output
    if "version" in help_text.lower():
    for line in help_text.split('\n'):
    if 'version' in line.lower():
    output_lines.append(f" {line.strip()}")
    break
    except Exception:
    pass  # Version info is optional; ignore lookup failures
    else:
    error_msg = stderr.decode() if stderr else "Unknown error"
    output_lines.append("❌ Gemini CLI found but not working properly")
    output_lines.append(f" Command tested: {self.gemini.cli_command}")
    output_lines.append(f" Error: {error_msg}")

    # Check for authentication issues
    if "authentication" in error_msg.lower() or "api key" in error_msg.lower():
    output_lines.append("")
    output_lines.append("πŸ”‘ Authentication required:")
    output_lines.append(" 1. Set GEMINI_API_KEY environment variable, or")
    output_lines.append(" 2. Run 'gemini' interactively to authenticate with Google")

    except asyncio.TimeoutError:
    output_lines.append("❌ Gemini CLI test timed out")
    output_lines.append(" This may indicate authentication is required")
    except FileNotFoundError:
    output_lines.append("❌ Gemini CLI not found in PATH")
    output_lines.append(f" Expected command: {self.gemini.cli_command}")
    output_lines.append("")
    output_lines.append("πŸ“¦ Installation:")
    output_lines.append(" npm install -g @google/gemini-cli")
    output_lines.append(" OR")
    output_lines.append(" npx @google/gemini-cli")
    except Exception as e:
    output_lines.append(f"❌ Error checking Gemini CLI: {str(e)}")

    output_lines.append("")

    # Consultation statistics
    stats = self.gemini.get_consultation_stats()
    output_lines.append("πŸ“Š Consultation Statistics:")
    output_lines.append(f" β€’ Total consultations: {stats.get('total_consultations', 0)}")

    completed = stats.get('completed_consultations', 0)
    output_lines.append(f" β€’ Completed: {completed}")

    if completed > 0:
    avg_time = stats.get('average_execution_time', 0)
    total_time = stats.get('total_execution_time', 0)
    output_lines.append(f" β€’ Average time: {avg_time:.2f}s")
    output_lines.append(f" β€’ Total time: {total_time:.2f}s")

    last_consultation = stats.get('last_consultation')
    if last_consultation:
    output_lines.append(f" β€’ Last consultation: {last_consultation}")

    output_lines.append("")
    output_lines.append("πŸ’‘ Usage:")
    output_lines.append(" β€’ Direct: Use 'consult_gemini' tool")
    output_lines.append(" β€’ Auto: Enable auto-consult for uncertainty detection")
    output_lines.append(" β€’ Toggle: Use 'toggle_gemini_auto_consult' tool")

    return [types.TextContent(type="text", text="\n".join(output_lines))]

    def detect_response_uncertainty(self, response: str) -> Tuple[bool, List[str]]:
    """
    Detect uncertainty in a response for potential auto-consultation.
    This is a wrapper around the GeminiIntegration's detection.
    """
    return self.gemini.detect_uncertainty(response)

    async def maybe_consult_gemini(self, response: str, context: str = "") -> Optional[Dict[str, Any]]:
    """
    Check if response contains uncertainty and consult Gemini if needed.
    Args:
    response: The response to check for uncertainty
    context: Additional context for the consultation
    Returns:
    Gemini consultation result if consulted, None otherwise
    """
    if not self.gemini.auto_consult or not self.gemini.enabled:
    return None

    has_uncertainty, patterns = self.detect_response_uncertainty(response)

    if has_uncertainty:
    # Extract the main question or topic from the response
    query = f"Please provide a second opinion on this analysis:\n\n{response}"

    # Add uncertainty patterns to context
    enhanced_context = f"{context}\n\nUncertainty detected in: {', '.join(patterns)}"

    result = await self.gemini.consult_gemini(
    query=query,
    context=enhanced_context,
    comparison_mode=True
    )

    return result

    return None

    def run(self):
    """Run the MCP server."""
    async def main():
    async with mcp.server.stdio.stdio_server() as (read_stream, write_stream):
    await self.server.run(
    read_stream,
    write_stream,
    InitializationOptions(
    server_name="gemini-mcp-server",
    server_version="1.0.0",
    capabilities=self.server.get_capabilities(
    notification_options=NotificationOptions(),
    experimental_capabilities={},
    ),
    ),
    )

    asyncio.run(main())


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser(description="MCP Server with Gemini Integration")
    parser.add_argument(
        "--project-root",
        type=str,
        default=".",
        help="Path to the project root directory"
    )
    args = parser.parse_args()

    server = MCPServer(project_root=args.project_root)
    server.run()
    131 changes: 131 additions & 0 deletions test_gemini_state.py
    @@ -0,0 +1,131 @@
    #!/usr/bin/env python3
    """Test Gemini state management with automatic history inclusion."""

    import asyncio

    from gemini_integration import get_integration

    async def test_automatic_state():
    """Test if Gemini automatically maintains state through history."""
    print("πŸ§ͺ Testing Automatic Gemini State Management")
    print("=" * 50)

    # Use singleton
    gemini = get_integration({
    'enabled': True,
    'cli_command': 'gemini',
    'timeout': 30,
    'include_history': True, # This enables automatic history inclusion
    'max_history_entries': 10,
    'debug_mode': False
    })

    # Clear any existing history
    gemini.clear_conversation_history()

    print("\n1️⃣ First Question: What is 2+2?")
    print("-" * 30)

    try:
    response1 = await gemini.consult_gemini(
    query="What is 2+2?",
    context="", # No context needed
    force_consult=True
    )

    if response1.get('status') == 'success':
    response_text = response1.get('response', '')
    print("✅ Success! Gemini responded.")

    # Find and show the answer
    lines = response_text.strip().split('\n')
    for line in lines:
    if '4' in line or 'four' in line.lower():
    print(f"πŸ“ Found answer: {line.strip()[:100]}...")
    break
    else:
    print(f"❌ Error: {response1.get('error', 'Unknown error')}")
    return

    except Exception as e:
    print(f"❌ Exception: {e}")
    return

    print(f"\nπŸ“Š Conversation history size: {len(gemini.conversation_history)}")

    print("\n2️⃣ Second Question: What is that doubled?")
    print(" (No context provided - relying on automatic history)")
    print("-" * 30)

    try:
    # This time, provide NO context at all - let the history do the work
    response2 = await gemini.consult_gemini(
    query="What is that doubled?",
    context="", # Empty context - history should provide the context
    force_consult=True
    )

    if response2.get('status') == 'success':
    response_text = response2.get('response', '')
    print("✅ Success! Gemini responded.")

    # Check if it understood the context
    if '8' in response_text or 'eight' in response_text.lower():
    print("πŸŽ‰ STATE MAINTAINED! Gemini understood 'that' referred to 4")
    print("πŸ“ Found reference to 8 in the response")

    # Find and show where 8 appears
    for line in response_text.split('\n'):
    if '8' in line or 'eight' in line.lower():
    print(f"πŸ“ Context: {line.strip()[:100]}...")
    break
    else:
    print("⚠️ Gemini may not have maintained state properly")
    print("πŸ“ Response doesn't clearly reference 8")
    print(f"First 200 chars: {response_text[:200]}...")

    else:
    print(f"❌ Error: {response2.get('error', 'Unknown error')}")

    except Exception as e:
    print(f"❌ Exception: {e}")

    print(f"\nπŸ“Š Final conversation history size: {len(gemini.conversation_history)}")

    # Test with history disabled
    print("\n3️⃣ Testing with History Disabled")
    print("-" * 30)

    # Disable history
    gemini.include_history = False
    gemini.clear_conversation_history()

    # Ask first question
    await gemini.consult_gemini(
    query="What is 3+3?",
    context="",
    force_consult=True
    )

    # Ask follow-up
    response3 = await gemini.consult_gemini(
    query="What is that tripled?",
    context="",
    force_consult=True
    )

    if response3.get('status') == 'success':
    response_text = response3.get('response', '')
    if '18' in response_text or 'eighteen' in response_text.lower():
    print("❌ UNEXPECTED: Found 18 even without history!")
    else:
    print("βœ… EXPECTED: Without history, Gemini doesn't understand 'that'")
    # Show what Gemini says when it doesn't have context
    print(f"Response preview: {response_text[:200]}...")

    print("\nβœ… Test complete!")

    if __name__ == "__main__":
    asyncio.run(test_automatic_state())
  18. AndrewAltimit revised this gist Jul 3, 2025. 1 changed file with 3 additions and 1 deletion.
    4 changes: 3 additions & 1 deletion README.md
    @@ -2,7 +2,9 @@

    A complete setup guide for integrating Google's Gemini CLI with Claude Code through an MCP (Model Context Protocol) server. This provides automatic second opinion consultation when Claude expresses uncertainty or encounters complex technical decisions.

    ![image](https://gist.github.com/user-attachments/assets/507ce5cd-30cd-4408-bb96-77508e7e4ac6)
    <div align="center">
    <img src="https://gist.github.com/user-attachments/assets/507ce5cd-30cd-4408-bb96-77508e7e4ac6" />
    </div>

    ## πŸš€ Quick Start

  19. AndrewAltimit revised this gist Jul 1, 2025. 2 changed files with 7 additions and 7 deletions.
    6 changes: 3 additions & 3 deletions README.md
    @@ -54,15 +54,15 @@ All files should be placed in the same directory for easy deployment.
    ```bash
    # Start MCP server directly on host
    cd your-project
    python3 tools/mcp/mcp-server.py --project-root .
    python3 mcp-server.py --project-root .

    # Or with environment variables
    GEMINI_ENABLED=true \
    GEMINI_AUTO_CONSULT=true \
    GEMINI_CLI_COMMAND=gemini \
    GEMINI_TIMEOUT=200 \
    GEMINI_RATE_LIMIT=2 \
    python3 tools/mcp/mcp-server.py --project-root .
    python3 mcp-server.py --project-root .
    ```

    ### Claude Code Configuration
    @@ -72,7 +72,7 @@ Create `mcp-config.json`:
    "mcpServers": {
    "project": {
    "command": "python3",
    "args": ["tools/mcp/mcp-server.py", "--project-root", "."],
    "args": ["mcp-server.py", "--project-root", "."],
    "cwd": "/path/to/your/project",
    "env": {
    "GEMINI_ENABLED": "true",
    8 changes: 4 additions & 4 deletions setup-gemini-integration.sh
    @@ -58,7 +58,7 @@ cat > mcp-config.json << 'EOF'
    "mcpServers": {
    "project": {
    "command": "python3",
    "args": ["tools/mcp/mcp-server.py", "--project-root", "."],
    "args": ["mcp-server.py", "--project-root", "."],
    "env": {
    "GEMINI_ENABLED": "true",
    "GEMINI_AUTO_CONSULT": "true"
    @@ -73,10 +73,10 @@ echo "πŸŽ‰ Gemini CLI Integration setup complete!"
    echo ""
    echo "πŸ“‹ Next steps:"
    echo "1. Copy the provided code files to your project:"
    echo " - tools/gemini/gemini_integration.py"
    echo " - tools/mcp/mcp-server.py"
    echo " - gemini_integration.py"
    echo " - mcp-server.py"
    echo "2. Install Python dependencies: pip install mcp pydantic"
    echo "3. Test with: python3 tools/mcp/mcp-server.py --project-root ."
    echo "3. Test with: python3 mcp-server.py --project-root ."
    echo "4. Configure Claude Code to use the MCP server"
    echo ""
    echo "πŸ’‘ Tip: First run 'gemini' command to authenticate with your Google account"
  20. AndrewAltimit revised this gist Jul 1, 2025. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion README.md
    @@ -113,7 +113,7 @@ GEMINI_TIMEOUT=200 # Query timeout in seconds
    GEMINI_RATE_LIMIT=5 # Delay between calls (seconds)
    GEMINI_MAX_CONTEXT= # Max context length
    GEMINI_MODEL=gemini-2.5-flash # Model to use
    GEMINI_API_KEY= # Optional (blank for free tier)
    GEMINI_API_KEY= # Optional (blank for free tier, keys disable free mode!)
    ```

    ### Gemini Configuration File
  21. AndrewAltimit revised this gist Jul 1, 2025. 5 changed files with 79 additions and 12 deletions.
    36 changes: 32 additions & 4 deletions README.md
    @@ -38,15 +38,18 @@ echo "Best practices for microservice authentication?" | gemini -m gemini-2.5-pr
    - **Auto-consultation**: Detects uncertainty patterns in Claude responses
    - **Manual consultation**: On-demand second opinions via MCP tools
    - **Response synthesis**: Combines both AI perspectives
    - **Singleton Pattern**: Ensures consistent state management across all tool calls

    ### Key Files Structure
    ```
    β”œβ”€β”€ tools/mcp/mcp-server.py # Enhanced MCP server (host-based)
    β”œβ”€β”€ tools/gemini/gemini_integration.py # Core integration module
    β”œβ”€β”€ gemini-config.json # Gemini configuration
    └── mcp-config.json # Claude Code MCP config
    β”œβ”€β”€ mcp-server.py # Enhanced MCP server with Gemini tools
    β”œβ”€β”€ gemini_integration.py # Core integration module with singleton pattern
    β”œβ”€β”€ gemini-config.json # Gemini configuration
    └── setup-gemini-integration.sh # Setup script
    ```

    All files should be placed in the same directory for easy deployment.

    ### Host-Based MCP Server Setup
    ```bash
    # Start MCP server directly on host
    @@ -193,6 +196,31 @@ class GeminiIntegration:
    """Detect if text contains uncertainty patterns"""
    return any(re.search(pattern, text, re.IGNORECASE)
    for pattern in UNCERTAINTY_PATTERNS)

    # Singleton pattern implementation
    _integration = None

    def get_integration(config: Optional[Dict[str, Any]] = None) -> GeminiIntegration:
    """Get or create the global Gemini integration instance"""
    global _integration
    if _integration is None:
    _integration = GeminiIntegration(config)
    return _integration
    ```

    ### Singleton Pattern Benefits
    The singleton pattern ensures:
    - **Consistent Rate Limiting**: All MCP tool calls share the same rate limiter
    - **Unified Configuration**: Changes to config affect all usage points
    - **State Persistence**: Consultation history and statistics are maintained
    - **Resource Efficiency**: Only one instance manages the Gemini CLI connection

    ### Usage in MCP Server
    ```python
    from gemini_integration import get_integration

    # Get the singleton instance
    self.gemini = get_integration(config)
    ```

    ## πŸ“‹ Example Workflows
    16 changes: 16 additions & 0 deletions gemini-config.json
    @@ -0,0 +1,16 @@
    {
    "enabled": true,
    "auto_consult": true,
    "cli_command": "gemini",
    "timeout": 30,
    "rate_limit_delay": 5.0,
    "log_consultations": true,
    "model": "gemini-2.5-pro",
    "sandbox_mode": true,
    "debug_mode": false,
    "uncertainty_thresholds": {
    "uncertainty_patterns": true,
    "complex_decisions": true,
    "critical_operations": true
    }
    }
    26 changes: 25 additions & 1 deletion gemini_integration.py
    @@ -199,4 +199,28 @@ async def _execute_gemini_cli(self, query: str) -> Dict[str, Any]:
    }

    except asyncio.TimeoutError:
    raise Exception(f"Gemini CLI timed out after {self.timeout} seconds")
    raise Exception(f"Gemini CLI timed out after {self.timeout} seconds")


    # Singleton pattern implementation
    _integration = None


    def get_integration(config: Optional[Dict[str, Any]] = None) -> GeminiIntegration:
    """
    Get or create the global Gemini integration instance.
    This ensures that all parts of the application share the same instance,
    maintaining consistent state for rate limiting, consultation history,
    and configuration across all tool calls.
    Args:
    config: Optional configuration dict. Only used on first call.
    Returns:
    The singleton GeminiIntegration instance
    """
    global _integration
    if _integration is None:
    _integration = GeminiIntegration(config)
    return _integration
    8 changes: 4 additions & 4 deletions mcp-server.py
    @@ -16,18 +16,18 @@
    from mcp.server import Server

    # Import Gemini integration
    sys.path.append(str(Path(__file__).parent.parent / "gemini"))
    from gemini_integration import GeminiIntegration
    from gemini_integration import get_integration


    class MCPServer:
    def __init__(self, project_root: str = None):
    self.project_root = Path(project_root) if project_root else Path.cwd()
    self.server = Server("mcp-server")

    # Initialize Gemini integration
    # Initialize Gemini integration with singleton pattern
    self.gemini_config = self._load_gemini_config()
    self.gemini = GeminiIntegration(self.gemini_config)
    # Get the singleton instance, passing config on first call
    self.gemini = get_integration(self.gemini_config)

    self._setup_tools()

    5 changes: 2 additions & 3 deletions setup-gemini-integration.sh
    @@ -31,9 +31,8 @@ else
    exit 1
    fi

    # Create directories
    echo "πŸ“ Creating project structure..."
    mkdir -p tools/mcp tools/gemini
    # Files can be placed in the same directory - no complex structure needed
    echo "πŸ“ Setting up in current directory..."

    # Create default configuration
    echo "βš™οΈ Creating default configuration..."
  22. AndrewAltimit revised this gist Jun 28, 2025. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion README.md
    @@ -2,7 +2,7 @@

    A complete setup guide for integrating Google's Gemini CLI with Claude Code through an MCP (Model Context Protocol) server. This provides automatic second opinion consultation when Claude expresses uncertainty or encounters complex technical decisions.

    ![image](https://gist.github.com/user-attachments/assets/5aec2b9e-9102-4e65-b9b0-ea3e08b9c25e)
    ![image](https://gist.github.com/user-attachments/assets/507ce5cd-30cd-4408-bb96-77508e7e4ac6)

    ## πŸš€ Quick Start

  23. AndrewAltimit revised this gist Jun 28, 2025. 1 changed file with 2 additions and 0 deletions.
    2 changes: 2 additions & 0 deletions README.md
    @@ -2,6 +2,8 @@

    A complete setup guide for integrating Google's Gemini CLI with Claude Code through an MCP (Model Context Protocol) server. This provides automatic second opinion consultation when Claude expresses uncertainty or encounters complex technical decisions.

    ![image](https://gist.github.com/user-attachments/assets/5aec2b9e-9102-4e65-b9b0-ea3e08b9c25e)

    ## πŸš€ Quick Start

    ### 1. Install Gemini CLI (Host-based)
  24. AndrewAltimit revised this gist Jun 28, 2025. 1 changed file with 6 additions and 7 deletions.
    13 changes: 6 additions & 7 deletions README.md
    @@ -55,7 +55,7 @@ python3 tools/mcp/mcp-server.py --project-root .
    GEMINI_ENABLED=true \
    GEMINI_AUTO_CONSULT=true \
    GEMINI_CLI_COMMAND=gemini \
    GEMINI_TIMEOUT=60 \
    GEMINI_TIMEOUT=200 \
    GEMINI_RATE_LIMIT=2 \
    python3 tools/mcp/mcp-server.py --project-root .
    ```
    @@ -104,9 +104,9 @@ Automatically detects patterns like:
    GEMINI_ENABLED=true # Enable integration
    GEMINI_AUTO_CONSULT=true # Auto-consult on uncertainty
    GEMINI_CLI_COMMAND=gemini # CLI command to use
    GEMINI_TIMEOUT=60 # Query timeout in seconds
    GEMINI_RATE_LIMIT=2 # Delay between calls (seconds)
    GEMINI_MAX_CONTEXT=4000 # Max context length
    GEMINI_TIMEOUT=200 # Query timeout in seconds
    GEMINI_RATE_LIMIT=5 # Delay between calls (seconds)
    GEMINI_MAX_CONTEXT= # Max context length
    GEMINI_MODEL=gemini-2.5-flash # Model to use
    GEMINI_API_KEY= # Optional (blank for free tier)
    ```
    @@ -118,9 +118,8 @@ Create `gemini-config.json`:
    "enabled": true,
    "auto_consult": true,
    "cli_command": "gemini",
    "timeout": 60,
    "rate_limit_delay": 2.0,
    "max_context_length": 4000,
    "timeout": 300,
    "rate_limit_delay": 5.0,
    "log_consultations": true,
    "model": "gemini-2.5-flash",
    "sandbox_mode": true,
  25. AndrewAltimit revised this gist Jun 28, 2025. 1 changed file with 3 additions and 3 deletions.
    6 changes: 3 additions & 3 deletions README.md
    @@ -39,10 +39,10 @@ echo "Best practices for microservice authentication?" | gemini -m gemini-2.5-pr

    ### Key Files Structure
    ```
    β”œβ”€β”€ tools/mcp/mcp-server.py # Enhanced MCP server (host-based)
    β”œβ”€β”€ tools/mcp/mcp-server.py # Enhanced MCP server (host-based)
    β”œβ”€β”€ tools/gemini/gemini_integration.py # Core integration module
    β”œβ”€β”€ gemini-config.json # Gemini configuration
    └── mcp-config.json # Claude Code MCP config
    β”œβ”€β”€ gemini-config.json # Gemini configuration
    └── mcp-config.json # Claude Code MCP config
    ```

    ### Host-Based MCP Server Setup
  26. AndrewAltimit revised this gist Jun 28, 2025. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion README.md
    @@ -101,7 +101,7 @@ Automatically detects patterns like:

    ### Environment Variables
    ```bash
    GEMINI_ENABLED=true # Enable integration
    GEMINI_ENABLED=true # Enable integration
    GEMINI_AUTO_CONSULT=true # Auto-consult on uncertainty
    GEMINI_CLI_COMMAND=gemini # CLI command to use
    GEMINI_TIMEOUT=60 # Query timeout in seconds
  27. AndrewAltimit revised this gist Jun 28, 2025. 3 changed files with 4 additions and 4 deletions.
    4 changes: 2 additions & 2 deletions README.md
    @@ -107,7 +107,7 @@ GEMINI_CLI_COMMAND=gemini # CLI command to use
    GEMINI_TIMEOUT=60 # Query timeout in seconds
    GEMINI_RATE_LIMIT=2 # Delay between calls (seconds)
    GEMINI_MAX_CONTEXT=4000 # Max context length
    GEMINI_MODEL=gemini-1.5-pro-latest # Model to use
    GEMINI_MODEL=gemini-2.5-flash # Model to use
    GEMINI_API_KEY= # Optional (blank for free tier)
    ```

    @@ -122,7 +122,7 @@ Create `gemini-config.json`:
    "rate_limit_delay": 2.0,
    "max_context_length": 4000,
    "log_consultations": true,
    "model": "gemini-1.5-pro-latest",
    "model": "gemini-2.5-flash",
    "sandbox_mode": true,
    "debug_mode": false,
    "uncertainty_thresholds": {
    2 changes: 1 addition & 1 deletion gemini_integration.py
    @@ -54,7 +54,7 @@ def __init__(self, config: Optional[Dict[str, Any]] = None):
    self.last_consultation = 0
    self.consultation_log = []
    self.max_context_length = self.config.get('max_context_length', 4000)
    self.model = self.config.get('model', 'gemini-1.5-pro-latest')
    self.model = self.config.get('model', 'gemini-2.5-flash')

    async def consult_gemini(self, query: str, context: str = "",
    comparison_mode: bool = True,
    2 changes: 1 addition & 1 deletion setup-gemini-integration.sh
    @@ -46,7 +46,7 @@ cat > gemini-config.json << 'EOF'
    "rate_limit_delay": 2.0,
    "max_context_length": 4000,
    "log_consultations": true,
    "model": "gemini-1.5-pro-latest",
    "model": "gemini-2.5-flash",
    "sandbox_mode": false,
    "debug_mode": false
    }
  28. AndrewAltimit created this gist Jun 28, 2025.
    290 changes: 290 additions & 0 deletions README.md
    @@ -0,0 +1,290 @@
    # Gemini CLI Integration for Claude Code MCP Server

    A complete setup guide for integrating Google's Gemini CLI with Claude Code through an MCP (Model Context Protocol) server. This provides automatic second opinion consultation when Claude expresses uncertainty or encounters complex technical decisions.

    ## 🚀 Quick Start

    ### 1. Install Gemini CLI (Host-based)
    ```bash
    # Switch to Node.js 22.16.0
    nvm use 22.16.0

    # Install Gemini CLI globally
    npm install -g @google/gemini-cli

    # Test installation
    gemini --help

    # Authenticate with Google account (free tier: 60 req/min, 1,000/day)
    # Authentication happens automatically on first use
    ```

    ### 2. Direct Usage (Fastest)
    ```bash
    # Direct consultation (no container setup needed)
    echo "Your question here" | gemini

    # Example: Technical questions
    echo "Best practices for microservice authentication?" | gemini -m gemini-2.5-pro
    ```

    ## 🏠 Host-Based MCP Integration

    ### Architecture Overview
    - **Host-Based Setup**: Both MCP server and Gemini CLI run on host machine
    - **Why Host-Only**: Gemini CLI requires interactive authentication and avoids Docker-in-Docker complexity
    - **Auto-consultation**: Detects uncertainty patterns in Claude responses
    - **Manual consultation**: On-demand second opinions via MCP tools
    - **Response synthesis**: Combines both AI perspectives

    ### Key Files Structure
    ```
    β”œβ”€β”€ tools/mcp/mcp-server.py # Enhanced MCP server (host-based)
    β”œβ”€β”€ tools/gemini/gemini_integration.py # Core integration module
    β”œβ”€β”€ gemini-config.json # Gemini configuration
    └── mcp-config.json # Claude Code MCP config
    ```

    ### Host-Based MCP Server Setup
    ```bash
    # Start MCP server directly on host
    cd your-project
    python3 tools/mcp/mcp-server.py --project-root .

    # Or with environment variables
    GEMINI_ENABLED=true \
    GEMINI_AUTO_CONSULT=true \
    GEMINI_CLI_COMMAND=gemini \
    GEMINI_TIMEOUT=60 \
    GEMINI_RATE_LIMIT=2 \
    python3 tools/mcp/mcp-server.py --project-root .
    ```

    ### Claude Code Configuration
    Create `mcp-config.json`:
    ```json
    {
    "mcpServers": {
    "project": {
    "command": "python3",
    "args": ["tools/mcp/mcp-server.py", "--project-root", "."],
    "cwd": "/path/to/your/project",
    "env": {
    "GEMINI_ENABLED": "true",
    "GEMINI_AUTO_CONSULT": "true",
    "GEMINI_CLI_COMMAND": "gemini"
    }
    }
    }
    }
    ```

    ## 🤖 Core Features

    ### 1. Uncertainty Detection
    Automatically detects patterns like:
    - "I'm not sure", "I think", "possibly", "probably"
    - "Multiple approaches", "trade-offs", "alternatives"
    - Critical operations: "security", "production", "database migration"

    ### 2. MCP Tools Available
    - `consult_gemini` - Manual consultation with context
    - `gemini_status` - Check integration status and statistics
    - `toggle_gemini_auto_consult` - Enable/disable auto-consultation

    ### 3. Response Synthesis
    - Identifies agreement/disagreement between Claude and Gemini
    - Provides confidence levels (high/medium/low)
    - Generates combined recommendations
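
    A minimal, self-contained sketch of what this synthesis step could look like. The keyword-overlap heuristic and the field names here are illustrative assumptions, not the server's actual algorithm:
    ```python
    def synthesize_responses(claude_response: str, gemini_response: str) -> dict:
        """Illustrative synthesis: score agreement by shared keywords and
        map the overlap to a rough confidence level (assumed heuristic)."""
        stopwords = {"the", "a", "an", "to", "of", "and", "for", "with", "is"}
        claude_terms = {w for w in claude_response.lower().split() if w not in stopwords}
        gemini_terms = {w for w in gemini_response.lower().split() if w not in stopwords}
        shared = claude_terms & gemini_terms

        if len(shared) >= 5:
            confidence = "high"
        elif len(shared) >= 2:
            confidence = "medium"
        else:
            confidence = "low"

        return {"agreement_terms": sorted(shared), "confidence": confidence}

    result = synthesize_responses(
        "Use OAuth 2.0 with PKCE for authentication",
        "OAuth 2.0 with PKCE is the standard choice for authentication",
    )
    print(result["confidence"])  # → medium
    ```
    A real implementation would compare claims semantically rather than lexically, but the output shape (agreement points plus a confidence level) is the useful part.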

    ## βš™οΈ Configuration

    ### Environment Variables
    ```bash
    GEMINI_ENABLED=true # Enable integration
    GEMINI_AUTO_CONSULT=true # Auto-consult on uncertainty
    GEMINI_CLI_COMMAND=gemini # CLI command to use
    GEMINI_TIMEOUT=60 # Query timeout in seconds
    GEMINI_RATE_LIMIT=2 # Delay between calls (seconds)
    GEMINI_MAX_CONTEXT=4000 # Max context length
    GEMINI_MODEL=gemini-1.5-pro-latest # Model to use
    GEMINI_API_KEY= # Optional (blank for free tier)
    ```
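
    One plausible way for the server to map these variables onto its runtime config dict (variable names come from the table above; the parsing details are an assumed sketch, and the actual server may read them differently):
    ```python
    import os

    def load_gemini_env_config() -> dict:
        """Build a config dict from GEMINI_* environment variables,
        falling back to the defaults listed above."""
        def env_bool(name: str, default: str) -> bool:
            return os.environ.get(name, default).lower() in ("1", "true", "yes")

        return {
            "enabled": env_bool("GEMINI_ENABLED", "true"),
            "auto_consult": env_bool("GEMINI_AUTO_CONSULT", "true"),
            "cli_command": os.environ.get("GEMINI_CLI_COMMAND", "gemini"),
            "timeout": int(os.environ.get("GEMINI_TIMEOUT", "60")),
            "rate_limit_delay": float(os.environ.get("GEMINI_RATE_LIMIT", "2")),
            "model": os.environ.get("GEMINI_MODEL", "gemini-1.5-pro-latest"),
        }

    os.environ["GEMINI_TIMEOUT"] = "120"
    print(load_gemini_env_config()["timeout"])  # → 120
    ```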

    ### Gemini Configuration File
    Create `gemini-config.json`:
    ```json
    {
    "enabled": true,
    "auto_consult": true,
    "cli_command": "gemini",
    "timeout": 60,
    "rate_limit_delay": 2.0,
    "max_context_length": 4000,
    "log_consultations": true,
    "model": "gemini-1.5-pro-latest",
    "sandbox_mode": true,
    "debug_mode": false,
    "uncertainty_thresholds": {
    "uncertainty_patterns": true,
    "complex_decisions": true,
    "critical_operations": true
    }
    }
    ```
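
    Loading this file defensively might look like the following sketch (the defaults and merge order are assumptions; the real server may also overlay environment variables on top):
    ```python
    import json
    from pathlib import Path

    DEFAULTS = {"enabled": True, "auto_consult": True, "cli_command": "gemini"}

    def load_gemini_config(path: str = "gemini-config.json") -> dict:
        """Read gemini-config.json if present, overlaying it on the defaults."""
        config = dict(DEFAULTS)
        p = Path(path)
        if p.exists():
            config.update(json.loads(p.read_text()))
        return config

    print(load_gemini_config("missing.json")["cli_command"])  # → gemini
    ```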

    ## 🧠 Integration Module Core

    ### Uncertainty Patterns (Python)
    ```python
    UNCERTAINTY_PATTERNS = [
    r"\bI'm not sure\b",
    r"\bI think\b",
    r"\bpossibly\b",
    r"\bprobably\b",
    r"\bmight be\b",
    r"\bcould be\b",
    # ... more patterns
    ]

    COMPLEX_DECISION_PATTERNS = [
    r"\bmultiple approaches\b",
    r"\bseveral options\b",
    r"\btrade-offs?\b",
    r"\balternatives?\b",
    # ... more patterns
    ]

    CRITICAL_OPERATION_PATTERNS = [
    r"\bproduction\b",
    r"\bdatabase migration\b",
    r"\bsecurity\b",
    r"\bauthentication\b",
    # ... more patterns
    ]
    ```
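
    These pattern lists can be exercised directly with a small regex scan; a self-contained check (using abbreviated copies of the lists above — the full lists contain more entries):
    ```python
    import re

    # Abbreviated copies of the pattern lists above (illustrative subset)
    UNCERTAINTY_PATTERNS = [r"\bI'm not sure\b", r"\bI think\b", r"\bpossibly\b"]
    CRITICAL_OPERATION_PATTERNS = [r"\bproduction\b", r"\bsecurity\b"]

    def matches_any(text: str, patterns: list) -> bool:
        """Return True if the text matches any pattern, case-insensitively."""
        return any(re.search(p, text, re.IGNORECASE) for p in patterns)

    print(matches_any("I think this might work", UNCERTAINTY_PATTERNS))             # → True
    print(matches_any("Deploying to production tonight", CRITICAL_OPERATION_PATTERNS))  # → True
    print(matches_any("The answer is 42", UNCERTAINTY_PATTERNS))                    # → False
    ```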

    ### Basic Integration Class Structure
    ```python
    class GeminiIntegration:
    def __init__(self, config: Optional[Dict[str, Any]] = None):
    self.config = config or {}
    self.enabled = self.config.get('enabled', True)
    self.auto_consult = self.config.get('auto_consult', True)
    self.cli_command = self.config.get('cli_command', 'gemini')
    self.timeout = self.config.get('timeout', 30)
    self.rate_limit_delay = self.config.get('rate_limit_delay', 1)

    async def consult_gemini(self, query: str, context: str = "") -> Dict[str, Any]:
    """Consult Gemini CLI for second opinion"""
    # Rate limiting
    await self._enforce_rate_limit()

    # Prepare query with context
    full_query = self._prepare_query(query, context)

    # Execute Gemini CLI command
    result = await self._execute_gemini_command(full_query)

    return result

    def detect_uncertainty(self, text: str) -> bool:
    """Detect if text contains uncertainty patterns"""
    return any(re.search(pattern, text, re.IGNORECASE)
    for pattern in UNCERTAINTY_PATTERNS)
    ```
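
    The `_enforce_rate_limit` call referenced above is not shown; one way it might work is to track the last call on a monotonic clock and sleep only for the remainder of the window (an assumed sketch, not the actual implementation):
    ```python
    import asyncio
    import time

    class RateLimiter:
        """Enforce a minimum delay between consecutive calls."""

        def __init__(self, delay: float):
            self.delay = delay
            self._last_call = 0.0

        async def wait(self) -> None:
            # Sleep only for what remains of the window since the last call.
            elapsed = time.monotonic() - self._last_call
            if elapsed < self.delay:
                await asyncio.sleep(self.delay - elapsed)
            self._last_call = time.monotonic()

    async def demo() -> float:
        limiter = RateLimiter(delay=0.05)
        start = time.monotonic()
        for _ in range(3):
            await limiter.wait()
        return time.monotonic() - start

    print(asyncio.run(demo()) >= 0.1)  # → True (two enforced gaps of 0.05s)
    ```
    Using `time.monotonic()` rather than `time.time()` keeps the limiter immune to wall-clock adjustments.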

    ## 📋 Example Workflows

    ### Manual Consultation
    ```text
    # In Claude Code
    Use the consult_gemini tool with:
    query: "Should I use WebSockets or gRPC for real-time communication?"
    context: "Building a multiplayer application with real-time updates"
    ```

    ### Automatic Consultation Flow
    ```
    User: "How should I handle authentication?"
    Claude: "I think OAuth might work, but I'm not certain about the security implications..."
    [Auto-consultation triggered]
    Gemini: "For authentication, consider these approaches: 1) OAuth 2.0 with PKCE for web apps..."
    Synthesis: Both suggest OAuth but Claude uncertain about security. Gemini provides specific implementation details. Recommendation: Follow Gemini's OAuth 2.0 with PKCE approach.
    ```
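The "[Auto-consultation triggered]" step is plain pattern matching on Claude's reply. A stripped-down version (with a shortened pattern list for brevity) behaves like this:

```python
import re

# Abbreviated pattern list; the full module defines many more
UNCERTAINTY_PATTERNS = [
    r"\bI'm not sure\b", r"\bI think\b", r"\bmight be\b", r"\bnot certain\b",
]

def detect_uncertainty(text: str) -> bool:
    """Return True when the text hedges enough to warrant a second opinion."""
    return any(re.search(p, text, re.IGNORECASE) for p in UNCERTAINTY_PATTERNS)

print(detect_uncertainty("I think OAuth might work, but I'm not certain..."))  # True
print(detect_uncertainty("Use OAuth 2.0 with PKCE."))                          # False
```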

    ## πŸ”§ MCP Server Integration

    ### Tool Definitions
```python
@server.list_tools()
async def handle_list_tools():
    return [
        types.Tool(
            name="consult_gemini",
            description="Consult Gemini for a second opinion",
            inputSchema={
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Question for Gemini"},
                    "context": {"type": "string", "description": "Additional context"}
                },
                "required": ["query"]
            }
        ),
        types.Tool(
            name="gemini_status",
            description="Check Gemini integration status",
            inputSchema={"type": "object", "properties": {}}
        ),
        types.Tool(
            name="toggle_gemini_auto_consult",
            description="Enable/disable automatic consultation",
            inputSchema={
                "type": "object",
                "properties": {
                    "enable": {"type": "boolean", "description": "Enable or disable"}
                }
            }
        )
    ]
```
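A matching `call_tool` handler then dispatches by tool name. The handler bodies below are placeholders for illustration only; the full server routes to real `_handle_*` methods:

```python
import asyncio
from typing import Any, Dict

async def handle_call_tool(name: str, arguments: Dict[str, Any]) -> str:
    """Route a tool invocation to its handler (placeholder implementations)."""
    if name == "consult_gemini":
        return f"consulting Gemini about: {arguments['query']}"
    if name == "gemini_status":
        return "integration enabled"
    if name == "toggle_gemini_auto_consult":
        return f"auto-consult set to {arguments.get('enable', True)}"
    raise ValueError(f"Unknown tool: {name}")

result = asyncio.run(handle_call_tool("gemini_status", {}))
print(result)  # integration enabled
```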

    ## 🚨 Troubleshooting

    | Issue | Solution |
    |-------|----------|
    | Gemini CLI not found | Install Node.js 18+ and `npm install -g @google/gemini-cli` |
    | Authentication errors | Run `gemini` and sign in with Google account |
    | Node version issues | Use `nvm use 22.16.0` |
    | Timeout errors | Increase `GEMINI_TIMEOUT` (default: 60s) |
    | Auto-consult not working | Check `GEMINI_AUTO_CONSULT=true` |
    | Rate limiting | Adjust `GEMINI_RATE_LIMIT` (default: 2s) |
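The last three rows map to environment variables the server reads at startup. The values below are examples for a slow connection, not recommendations:

```shell
# Tune the integration without editing gemini-config.json
export GEMINI_TIMEOUT=120        # allow slower responses (default: 60)
export GEMINI_RATE_LIMIT=5       # seconds between consultations (default: 2)
export GEMINI_AUTO_CONSULT=true  # keep auto-consultation on
echo "timeout=${GEMINI_TIMEOUT}s rate_limit=${GEMINI_RATE_LIMIT}s"
```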

    ## πŸ” Security Considerations

    1. **API Credentials**: Store securely, use environment variables
    2. **Data Privacy**: Be cautious about sending proprietary code
    3. **Input Sanitization**: Sanitize queries before sending
    4. **Rate Limiting**: Respect API limits (free tier: 60/min, 1000/day)
    5. **Host-Based Architecture**: Both Gemini CLI and MCP server run on host for auth compatibility and simplicity
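For point 3, a sanitizer might strip control characters and mask obvious secrets before the query leaves the machine. The regexes here are illustrative, not exhaustive, and `sanitize_query` is a hypothetical helper rather than part of the module:

```python
import re

def sanitize_query(query: str, max_len: int = 4000) -> str:
    """Best-effort cleanup before sending a query to an external CLI."""
    # Drop non-printable control characters
    query = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f]", "", query)
    # Mask values that look like credentials (illustrative, not exhaustive)
    query = re.sub(r"(?i)\b(api[_-]?key|token|password)\s*[:=]\s*\S+",
                   r"\1=[REDACTED]", query)
    return query[:max_len]

print(sanitize_query("please review: api_key=sk-12345 in config"))
# -> please review: api_key=[REDACTED] in config
```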

    ## πŸ“ˆ Best Practices

    1. **Rate Limiting**: Implement appropriate delays between calls
    2. **Context Management**: Keep context concise and relevant
    3. **Error Handling**: Always handle Gemini failures gracefully
    4. **User Control**: Allow users to disable auto-consultation
    5. **Logging**: Log consultations for debugging and analysis
    6. **Caching**: Cache similar queries to reduce API calls
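Point 6 can be as simple as a TTL cache keyed on the query and context. The class name and one-hour default below are assumptions for the sketch:

```python
import hashlib
import time
from typing import Dict, Optional, Tuple

class ConsultationCache:
    """Remember recent Gemini responses to avoid duplicate API calls."""

    def __init__(self, ttl_seconds: float = 3600.0):
        self.ttl = ttl_seconds
        self._store: Dict[str, Tuple[float, str]] = {}

    def _key(self, query: str, context: str) -> str:
        return hashlib.sha256(f"{query}\x00{context}".encode()).hexdigest()

    def get(self, query: str, context: str = "") -> Optional[str]:
        entry = self._store.get(self._key(query, context))
        if entry and time.monotonic() - entry[0] < self.ttl:
            return entry[1]
        return None  # missing or expired

    def put(self, query: str, context: str, response: str) -> None:
        self._store[self._key(query, context)] = (time.monotonic(), response)

cache = ConsultationCache()
cache.put("Use gRPC or WebSockets?", "", "gRPC for service-to-service...")
print(cache.get("Use gRPC or WebSockets?"))  # cache hit
print(cache.get("unrelated question"))       # None
```

A real deployment would also bound the cache size; for the free-tier limits noted above, even this sketch cuts repeat traffic noticeably.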

    ## 🎯 Use Cases

    - **Architecture Decisions**: Get second opinions on design choices
    - **Security Reviews**: Validate security implementations
    - **Performance Optimization**: Compare optimization strategies
    - **Code Quality**: Review complex algorithms or patterns
    - **Troubleshooting**: Debug complex technical issues
## πŸ“„ gemini_integration.py

```python
#!/usr/bin/env python3
"""
Gemini CLI Integration Module
Provides automatic consultation with Gemini for second opinions and validation
"""

import asyncio
import json
import logging
import re
import subprocess
import time
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple

# Setup logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Uncertainty patterns that trigger automatic Gemini consultation
UNCERTAINTY_PATTERNS = [
    r"\bI'm not sure\b", r"\bI think\b", r"\bpossibly\b", r"\bprobably\b",
    r"\bmight be\b", r"\bcould be\b", r"\bI believe\b", r"\bIt seems\b",
    r"\bappears to be\b", r"\buncertain\b", r"\bI would guess\b",
    r"\blikely\b", r"\bperhaps\b", r"\bmaybe\b", r"\bI assume\b"
]

# Complex decision patterns that benefit from second opinions
COMPLEX_DECISION_PATTERNS = [
    r"\bmultiple approaches\b", r"\bseveral options\b", r"\btrade-offs?\b",
    r"\bconsider(?:ing)?\b", r"\balternatives?\b", r"\bpros and cons\b",
    r"\bweigh(?:ing)? the options\b", r"\bchoice between\b", r"\bdecision\b"
]

# Critical operations that should trigger consultation
CRITICAL_OPERATION_PATTERNS = [
    r"\bproduction\b", r"\bdatabase migration\b", r"\bsecurity\b",
    r"\bauthentication\b", r"\bencryption\b", r"\bAPI key\b",
    r"\bcredentials?\b", r"\bperformance\s+critical\b"
]


class GeminiIntegration:
    """Handles Gemini CLI integration for second opinions and validation"""

    def __init__(self, config: Optional[Dict[str, Any]] = None):
        self.config = config or {}
        self.enabled = self.config.get('enabled', True)
        self.auto_consult = self.config.get('auto_consult', True)
        self.cli_command = self.config.get('cli_command', 'gemini')
        self.timeout = self.config.get('timeout', 60)
        self.rate_limit_delay = self.config.get('rate_limit_delay', 2.0)
        self.last_consultation = 0
        self.consultation_log = []
        self.max_context_length = self.config.get('max_context_length', 4000)
        self.model = self.config.get('model', 'gemini-1.5-pro-latest')

    async def consult_gemini(self, query: str, context: str = "",
                             comparison_mode: bool = True,
                             force_consult: bool = False) -> Dict[str, Any]:
        """Consult Gemini CLI for second opinion"""
        if not self.enabled:
            return {'status': 'disabled', 'message': 'Gemini integration is disabled'}

        if not force_consult:
            await self._enforce_rate_limit()

        consultation_id = f"consult_{int(time.time())}"

        try:
            # Prepare query with context
            full_query = self._prepare_query(query, context, comparison_mode)

            # Execute Gemini CLI command
            result = await self._execute_gemini_cli(full_query)

            # Log consultation
            if self.config.get('log_consultations', True):
                self.consultation_log.append({
                    'id': consultation_id,
                    'timestamp': datetime.now().isoformat(),
                    'query': query[:200] + "..." if len(query) > 200 else query,
                    'status': 'success',
                    'execution_time': result.get('execution_time', 0)
                })

            return {
                'status': 'success',
                'response': result['output'],
                'execution_time': result['execution_time'],
                'consultation_id': consultation_id,
                'timestamp': datetime.now().isoformat()
            }

        except Exception as e:
            logger.error(f"Error consulting Gemini: {str(e)}")
            return {
                'status': 'error',
                'error': str(e),
                'consultation_id': consultation_id
            }

    def detect_uncertainty(self, text: str) -> Tuple[bool, List[str]]:
        """Detect if text contains uncertainty patterns"""
        found_patterns = []

        # Check uncertainty patterns
        for pattern in UNCERTAINTY_PATTERNS:
            if re.search(pattern, text, re.IGNORECASE):
                found_patterns.append(f"uncertainty: {pattern}")

        # Check complex decision patterns
        for pattern in COMPLEX_DECISION_PATTERNS:
            if re.search(pattern, text, re.IGNORECASE):
                found_patterns.append(f"complex_decision: {pattern}")

        # Check critical operation patterns
        for pattern in CRITICAL_OPERATION_PATTERNS:
            if re.search(pattern, text, re.IGNORECASE):
                found_patterns.append(f"critical_operation: {pattern}")

        return len(found_patterns) > 0, found_patterns

    async def _enforce_rate_limit(self):
        """Enforce rate limiting between consultations"""
        current_time = time.time()
        time_since_last = current_time - self.last_consultation

        if time_since_last < self.rate_limit_delay:
            sleep_time = self.rate_limit_delay - time_since_last
            await asyncio.sleep(sleep_time)

        self.last_consultation = time.time()

    def _prepare_query(self, query: str, context: str, comparison_mode: bool) -> str:
        """Prepare the full query for Gemini CLI"""
        if len(context) > self.max_context_length:
            context = context[:self.max_context_length] + "\n[Context truncated...]"

        parts = []
        if comparison_mode:
            parts.append("Please provide a technical analysis and second opinion:")
            parts.append("")

        if context:
            parts.append("Context:")
            parts.append(context)
            parts.append("")

        parts.append("Question/Topic:")
        parts.append(query)

        if comparison_mode:
            parts.extend([
                "",
                "Please structure your response with:",
                "1. Your analysis and understanding",
                "2. Recommendations or approach",
                "3. Any concerns or considerations",
                "4. Alternative approaches (if applicable)"
            ])

        return "\n".join(parts)

    async def _execute_gemini_cli(self, query: str) -> Dict[str, Any]:
        """Execute Gemini CLI command and return results"""
        start_time = time.time()

        # Build command
        cmd = [self.cli_command]
        if self.model:
            cmd.extend(['-m', self.model])
        cmd.extend(['-p', query])  # Non-interactive mode

        try:
            process = await asyncio.create_subprocess_exec(
                *cmd,
                stdout=asyncio.subprocess.PIPE,
                stderr=asyncio.subprocess.PIPE
            )

            stdout, stderr = await asyncio.wait_for(
                process.communicate(),
                timeout=self.timeout
            )

            execution_time = time.time() - start_time

            if process.returncode != 0:
                error_msg = stderr.decode() if stderr else "Unknown error"
                if "authentication" in error_msg.lower():
                    error_msg += "\nTip: Run 'gemini' interactively to authenticate"
                raise Exception(f"Gemini CLI failed: {error_msg}")

            return {
                'output': stdout.decode().strip(),
                'execution_time': execution_time
            }

        except asyncio.TimeoutError:
            process.kill()  # avoid leaving an orphaned CLI process behind
            raise Exception(f"Gemini CLI timed out after {self.timeout} seconds")
```
## πŸ“„ mcp-server.py

```python
#!/usr/bin/env python3
"""
MCP Server with Gemini Integration
Provides development workflow automation with AI second opinions
"""

import asyncio
import json
import os
import sys
from pathlib import Path
from typing import Any, Dict, List, Optional

import mcp.server.stdio
import mcp.types as types
from mcp.server import Server

# Import Gemini integration
sys.path.append(str(Path(__file__).parent.parent / "gemini"))
from gemini_integration import GeminiIntegration


class MCPServer:
    def __init__(self, project_root: Optional[str] = None):
        self.project_root = Path(project_root) if project_root else Path.cwd()
        self.server = Server("mcp-server")

        # Initialize Gemini integration
        self.gemini_config = self._load_gemini_config()
        self.gemini = GeminiIntegration(self.gemini_config)

        self._setup_tools()

    def _load_gemini_config(self) -> Dict[str, Any]:
        """Load Gemini configuration from file and environment"""
        config = {}

        # Load from config file if exists
        config_file = self.project_root / "gemini-config.json"
        if config_file.exists():
            with open(config_file) as f:
                config = json.load(f)

        # Override with environment variables
        env_mapping = {
            'GEMINI_ENABLED': ('enabled', lambda x: x.lower() == 'true'),
            'GEMINI_AUTO_CONSULT': ('auto_consult', lambda x: x.lower() == 'true'),
            'GEMINI_CLI_COMMAND': ('cli_command', str),
            'GEMINI_TIMEOUT': ('timeout', int),
            'GEMINI_RATE_LIMIT': ('rate_limit_delay', float),
            'GEMINI_MODEL': ('model', str),
        }

        for env_key, (config_key, converter) in env_mapping.items():
            value = os.getenv(env_key)
            if value is not None:
                config[config_key] = converter(value)

        return config

    def _setup_tools(self):
        """Register all MCP tools"""

        @self.server.list_tools()
        async def handle_list_tools():
            return [
                types.Tool(
                    name="consult_gemini",
                    description="Consult Gemini for a second opinion or validation",
                    inputSchema={
                        "type": "object",
                        "properties": {
                            "query": {
                                "type": "string",
                                "description": "The question or topic to consult Gemini about"
                            },
                            "context": {
                                "type": "string",
                                "description": "Additional context for the consultation"
                            },
                            "comparison_mode": {
                                "type": "boolean",
                                "description": "Whether to request structured comparison format",
                                "default": True
                            }
                        },
                        "required": ["query"]
                    }
                ),
                types.Tool(
                    name="gemini_status",
                    description="Check Gemini integration status and statistics",
                    inputSchema={"type": "object", "properties": {}}
                ),
                types.Tool(
                    name="toggle_gemini_auto_consult",
                    description="Enable or disable automatic Gemini consultation",
                    inputSchema={
                        "type": "object",
                        "properties": {
                            "enable": {
                                "type": "boolean",
                                "description": "Enable (true) or disable (false) auto-consultation"
                            }
                        }
                    }
                )
            ]

        @self.server.call_tool()
        async def handle_call_tool(name: str, arguments: Dict[str, Any]):
            if name == "consult_gemini":
                return await self._handle_consult_gemini(arguments)
            elif name == "gemini_status":
                return await self._handle_gemini_status(arguments)
            elif name == "toggle_gemini_auto_consult":
                return await self._handle_toggle_auto_consult(arguments)
            else:
                raise ValueError(f"Unknown tool: {name}")

    async def _handle_consult_gemini(self, arguments: Dict[str, Any]) -> List[types.TextContent]:
        """Handle Gemini consultation requests"""
        query = arguments.get('query', '')
        context = arguments.get('context', '')
        comparison_mode = arguments.get('comparison_mode', True)

        if not query:
            return [types.TextContent(
                type="text",
                text="❌ Error: 'query' parameter is required for Gemini consultation"
            )]

        result = await self.gemini.consult_gemini(
            query=query,
            context=context,
            comparison_mode=comparison_mode
        )

        if result['status'] == 'success':
            response_text = f"πŸ€– **Gemini Second Opinion**\n\n{result['response']}\n\n"
            response_text += f"⏱️ *Consultation completed in {result['execution_time']:.2f}s*"
        else:
            response_text = f"❌ **Gemini Consultation Failed**\n\nError: {result.get('error', 'Unknown error')}"

        return [types.TextContent(type="text", text=response_text)]

    async def _handle_gemini_status(self, arguments: Dict[str, Any]) -> List[types.TextContent]:
        """Handle Gemini status requests"""
        status_lines = [
            "πŸ€– **Gemini Integration Status**",
            "",
            f"β€’ **Enabled**: {'βœ… Yes' if self.gemini.enabled else '❌ No'}",
            f"β€’ **Auto-consult**: {'βœ… Yes' if self.gemini.auto_consult else '❌ No'}",
            f"β€’ **CLI Command**: `{self.gemini.cli_command}`",
            f"β€’ **Model**: {self.gemini.model}",
            f"β€’ **Rate Limit**: {self.gemini.rate_limit_delay}s between calls",
            f"β€’ **Timeout**: {self.gemini.timeout}s",
            "",
            "πŸ“Š **Statistics**:",
            f"β€’ **Total Consultations**: {len(self.gemini.consultation_log)}",
        ]

        if self.gemini.consultation_log:
            recent = self.gemini.consultation_log[-1]
            status_lines.append(f"β€’ **Last Consultation**: {recent['timestamp']}")

        return [types.TextContent(type="text", text="\n".join(status_lines))]

    async def _handle_toggle_auto_consult(self, arguments: Dict[str, Any]) -> List[types.TextContent]:
        """Handle toggle auto-consultation requests"""
        enable = arguments.get('enable')

        if enable is None:
            # Toggle current state
            self.gemini.auto_consult = not self.gemini.auto_consult
        else:
            self.gemini.auto_consult = enable

        status = "enabled" if self.gemini.auto_consult else "disabled"
        return [types.TextContent(
            type="text",
            text=f"πŸ”„ Auto-consultation has been **{status}**"
        )]

    async def run(self):
        """Run the MCP server"""
        async with mcp.server.stdio.stdio_server() as (read_stream, write_stream):
            await self.server.run(
                read_stream,
                write_stream,
                self.server.create_initialization_options()
            )


async def main():
    import argparse

    parser = argparse.ArgumentParser(description="MCP Server with Gemini Integration")
    parser.add_argument("--project-root", type=str, default=".",
                        help="Project root directory")
    args = parser.parse_args()

    server = MCPServer(project_root=args.project_root)
    await server.run()


if __name__ == "__main__":
    asyncio.run(main())
```
## πŸ“„ setup-gemini-integration.sh

```bash
#!/bin/bash
set -e

echo "πŸš€ Setting up Gemini CLI Integration..."

# Check Node.js version
if ! command -v node &> /dev/null; then
    echo "❌ Node.js not found. Please install Node.js 18+ first."
    exit 1
fi

NODE_VERSION=$(node --version | cut -d'v' -f2 | cut -d'.' -f1)
if [ "$NODE_VERSION" -lt 18 ]; then
    echo "❌ Node.js version $NODE_VERSION found. Please use Node.js 18+ (recommended: 22.16.0)"
    echo "   Use: nvm install 22.16.0 && nvm use 22.16.0"
    exit 1
fi

echo "βœ… Node.js version check passed"

# Install Gemini CLI
echo "πŸ“¦ Installing Gemini CLI..."
npm install -g @google/gemini-cli

# Test installation
echo "πŸ§ͺ Testing Gemini CLI installation..."
if gemini --help > /dev/null 2>&1; then
    echo "βœ… Gemini CLI installed successfully"
else
    echo "❌ Gemini CLI installation failed"
    exit 1
fi

# Create directories
echo "πŸ“ Creating project structure..."
mkdir -p tools/mcp tools/gemini

# Create default configuration
echo "βš™οΈ Creating default configuration..."
cat > gemini-config.json << 'EOF'
{
  "enabled": true,
  "auto_consult": true,
  "cli_command": "gemini",
  "timeout": 60,
  "rate_limit_delay": 2.0,
  "max_context_length": 4000,
  "log_consultations": true,
  "model": "gemini-1.5-pro-latest",
  "sandbox_mode": false,
  "debug_mode": false
}
EOF

# Create MCP configuration for Claude Code
echo "πŸ”§ Creating Claude Code MCP configuration..."
cat > mcp-config.json << 'EOF'
{
  "mcpServers": {
    "project": {
      "command": "python3",
      "args": ["tools/mcp/mcp-server.py", "--project-root", "."],
      "env": {
        "GEMINI_ENABLED": "true",
        "GEMINI_AUTO_CONSULT": "true"
      }
    }
  }
}
EOF

echo ""
echo "πŸŽ‰ Gemini CLI Integration setup complete!"
echo ""
echo "πŸ“‹ Next steps:"
echo "1. Copy the provided code files to your project:"
echo "   - tools/gemini/gemini_integration.py"
echo "   - tools/mcp/mcp-server.py"
echo "2. Install Python dependencies: pip install mcp pydantic"
echo "3. Test with: python3 tools/mcp/mcp-server.py --project-root ."
echo "4. Configure Claude Code to use the MCP server"
echo ""
echo "πŸ’‘ Tip: First run 'gemini' command to authenticate with your Google account"
```