# Gemini CLI Integration for Claude Code MCP Server

A complete setup guide for integrating Google's Gemini CLI with Claude Code through an MCP (Model Context Protocol) server. This provides automatic second opinion consultation when Claude expresses uncertainty or encounters complex technical decisions.

- [Template repository](https://github.com/AndrewAltimit/template-repo) with Gemini CLI Integration
- Gemini CLI Automated PR Reviews: [Example PR](https://github.com/AndrewAltimit/template-repo/pull/9), [Automation Script](https://github.com/AndrewAltimit/template-repo/blob/main/scripts/gemini-pr-review.py)

## 🚀 Quick Start

### 1. Install Gemini CLI (Host-based)

```bash
# Switch to Node.js 22.16.0
nvm use 22.16.0

# Install Gemini CLI globally
npm install -g @google/gemini-cli

# Test installation
gemini --help

# Authenticate with Google account (free tier: 60 req/min, 1,000/day)
# Authentication happens automatically on first use
```

### 2. Direct Usage (Fastest)

```bash
# Direct consultation (no container setup needed)
echo "Your question here" | gemini

# Example: Technical questions
echo "Best practices for microservice authentication?" | gemini -m gemini-2.5-pro
```

## 🏠 Host-Based MCP Integration

### Architecture Overview

- **Host-Based Setup**: Both MCP server and Gemini CLI run on the host machine
- **Why Host-Only**: Gemini CLI requires interactive authentication, and running on the host avoids Docker-in-Docker complexity
- **Auto-consultation**: Detects uncertainty patterns in Claude responses
- **Manual consultation**: On-demand second opinions via MCP tools
- **Response synthesis**: Combines both AI perspectives
- **Singleton Pattern**: Ensures consistent state management across all tool calls

### Key Files Structure

```
├── mcp-server.py                # Enhanced MCP server with Gemini tools
├── gemini_integration.py        # Core integration module with singleton pattern
├── gemini-config.json           # Gemini configuration
└── setup-gemini-integration.sh  # Setup script
```

All files should be placed in the same directory for easy deployment.

### Host-Based MCP Server Setup

```bash
# Start MCP server directly on host
cd your-project
python3 mcp-server.py --project-root .

# Or with environment variables
GEMINI_ENABLED=true \
GEMINI_AUTO_CONSULT=true \
GEMINI_CLI_COMMAND=gemini \
GEMINI_TIMEOUT=200 \
GEMINI_RATE_LIMIT=2 \
python3 mcp-server.py --project-root .
```

### Claude Code Configuration

Create `mcp-config.json`:

```json
{
  "mcpServers": {
    "project": {
      "command": "python3",
      "args": ["mcp-server.py", "--project-root", "."],
      "cwd": "/path/to/your/project",
      "env": {
        "GEMINI_ENABLED": "true",
        "GEMINI_AUTO_CONSULT": "true",
        "GEMINI_CLI_COMMAND": "gemini"
      }
    }
  }
}
```

## 🤖 Core Features

### 1. Uncertainty Detection

Automatically detects patterns like:

- "I'm not sure", "I think", "possibly", "probably"
- "Multiple approaches", "trade-offs", "alternatives"
- Critical operations: "security", "production", "database migration"

### 2. MCP Tools Available

#### `consult_gemini`

Manual consultation with Gemini for second opinions or validation.

**Parameters:**

- `query` (required): The question or topic to consult Gemini about
- `context` (optional): Additional context for the consultation
- `comparison_mode` (optional, default: true): Whether to request structured comparison format
- `force` (optional, default: false): Force consultation even if disabled

**Example:**

```python
# In Claude Code
Use the consult_gemini tool with:
query: "Should I use WebSockets or gRPC for real-time communication?"
context: "Building a multiplayer application with real-time updates"
comparison_mode: true
```

#### `gemini_status`

Check Gemini integration status and statistics.

**Returns:**

- Configuration status (enabled, auto-consult, CLI command, timeout, rate limit)
- Gemini CLI availability and version
- Consultation statistics (total, completed, average time)
- Last consultation timestamp

**Example:**

```python
# Check current status
Use the gemini_status tool
```

#### `toggle_gemini_auto_consult`

Enable or disable automatic Gemini consultation on uncertainty detection.

**Parameters:**

- `enable` (optional): true to enable, false to disable.
  If not provided, toggles current state.

**Example:**

```python
# Toggle auto-consultation
Use the toggle_gemini_auto_consult tool

# Or explicitly enable/disable
Use the toggle_gemini_auto_consult tool with:
enable: false
```

#### `clear_gemini_history`

Clear Gemini conversation history to start fresh.

**Example:**

```python
# Clear all consultation history
Use the clear_gemini_history tool
```

### 3. Response Synthesis

- Identifies agreement/disagreement between Claude and Gemini
- Provides confidence levels (high/medium/low)
- Generates combined recommendations
- Tracks execution time and consultation ID

### 4. Advanced Features

#### Uncertainty Detection API

The MCP server exposes methods for detecting uncertainty:

```python
# Detect uncertainty in responses
has_uncertainty, patterns = server.detect_response_uncertainty(response_text)

# Automatically consult if uncertain
result = await server.maybe_consult_gemini(response_text, context)
```

#### Statistics Tracking

- Total consultations attempted
- Successful completions
- Average execution time
- Last consultation timestamp
- Error tracking and timeout monitoring

## ⚙️ Configuration

### Environment Variables

```bash
GEMINI_ENABLED=true               # Enable integration
GEMINI_AUTO_CONSULT=true          # Auto-consult on uncertainty
GEMINI_CLI_COMMAND=gemini         # CLI command to use
GEMINI_TIMEOUT=200                # Query timeout in seconds
GEMINI_RATE_LIMIT=5               # Delay between calls (seconds)
GEMINI_MAX_CONTEXT=               # Max context length
GEMINI_MODEL=gemini-2.5-flash     # Model to use
GEMINI_SANDBOX=false              # Sandboxing isolates operations (such as shell commands or file modifications) from your host system
GEMINI_API_KEY=                   # Optional (blank for free tier, keys disable free mode!)
```

### Gemini Configuration File

Create `gemini-config.json`:

```json
{
  "enabled": true,
  "auto_consult": true,
  "cli_command": "gemini",
  "timeout": 300,
  "rate_limit_delay": 5.0,
  "log_consultations": true,
  "model": "gemini-2.5-flash",
  "sandbox_mode": true,
  "debug_mode": false,
  "uncertainty_thresholds": {
    "uncertainty_patterns": true,
    "complex_decisions": true,
    "critical_operations": true
  }
}
```

## 🧠 Integration Module Core

### Uncertainty Patterns (Python)

```python
UNCERTAINTY_PATTERNS = [
    r"\bI'm not sure\b",
    r"\bI think\b",
    r"\bpossibly\b",
    r"\bprobably\b",
    r"\bmight be\b",
    r"\bcould be\b",
    # ... more patterns
]

COMPLEX_DECISION_PATTERNS = [
    r"\bmultiple approaches\b",
    r"\bseveral options\b",
    r"\btrade-offs?\b",
    r"\balternatives?\b",
    # ... more patterns
]

CRITICAL_OPERATION_PATTERNS = [
    r"\bproduction\b",
    r"\bdatabase migration\b",
    r"\bsecurity\b",
    r"\bauthentication\b",
    # ... more patterns
]
```

### Basic Integration Class Structure

```python
import re
from typing import Any, Dict, Optional


class GeminiIntegration:
    def __init__(self, config: Optional[Dict[str, Any]] = None):
        self.config = config or {}
        self.enabled = self.config.get('enabled', True)
        self.auto_consult = self.config.get('auto_consult', True)
        self.cli_command = self.config.get('cli_command', 'gemini')
        self.timeout = self.config.get('timeout', 30)
        self.rate_limit_delay = self.config.get('rate_limit_delay', 1)

    # The private helpers called below (_enforce_rate_limit, _prepare_query,
    # _execute_gemini_command) are omitted from this structural outline.
    async def consult_gemini(self, query: str, context: str = "") -> Dict[str, Any]:
        """Consult Gemini CLI for second opinion"""
        # Rate limiting
        await self._enforce_rate_limit()

        # Prepare query with context
        full_query = self._prepare_query(query, context)

        # Execute Gemini CLI command
        result = await self._execute_gemini_command(full_query)

        return result

    def detect_uncertainty(self, text: str) -> bool:
        """Detect if text contains uncertainty patterns"""
        return any(re.search(pattern, text, re.IGNORECASE)
                   for pattern in UNCERTAINTY_PATTERNS)


# Singleton pattern implementation
_integration = None


def get_integration(config: Optional[Dict[str, Any]] = None) -> GeminiIntegration:
    """Get or create the global Gemini integration instance"""
    global _integration
    if _integration is None:
        _integration = GeminiIntegration(config)
    return _integration
```

### Singleton Pattern Benefits

The singleton pattern ensures:

- **Consistent Rate Limiting**: All MCP tool calls share the same rate limiter
- **Unified Configuration**: Changes to config affect all usage points
- **State Persistence**: Consultation history and statistics are maintained
- **Resource Efficiency**: Only one instance manages the Gemini CLI connection

### Usage in MCP Server

```python
from gemini_integration import get_integration

# Get the singleton instance (inside the MCP server class, e.g. in __init__)
self.gemini = get_integration(config)
```

## 📋 Example Workflows

### Manual Consultation

```python
# In Claude Code
Use the consult_gemini tool with:
query: "Should I use WebSockets or gRPC for real-time communication?"
context: "Building a multiplayer application with real-time updates"
```

### Automatic Consultation Flow

```
User: "How should I handle authentication?"

Claude: "I think OAuth might work, but I'm not certain about the security implications..."

[Auto-consultation triggered]

Gemini: "For authentication, consider these approaches: 1) OAuth 2.0 with PKCE for web apps..."

Synthesis: Both suggest OAuth but Claude is uncertain about security. Gemini provides specific implementation details.
Recommendation: Follow Gemini's OAuth 2.0 with PKCE approach.
```

## 🔧 MCP Server Integration

### Tool Definitions

```python
import mcp.types as types

# `server` is assumed to be the MCP Server instance created elsewhere in mcp-server.py


@server.list_tools()
async def handle_list_tools():
    return [
        types.Tool(
            name="consult_gemini",
            description="Consult Gemini CLI for a second opinion or validation",
            inputSchema={
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The question or topic to consult Gemini about"
                    },
                    "context": {
                        "type": "string",
                        "description": "Additional context for the consultation"
                    },
                    "comparison_mode": {
                        "type": "boolean",
                        "description": "Whether to request structured comparison format",
                        "default": True
                    },
                    "force": {
                        "type": "boolean",
                        "description": "Force consultation even if Gemini is disabled",
                        "default": False
                    }
                },
                "required": ["query"]
            }
        ),
        types.Tool(
            name="gemini_status",
            description="Get Gemini integration status and statistics",
            # Tool requires an inputSchema; this tool takes no arguments
            inputSchema={"type": "object", "properties": {}}
        ),
        types.Tool(
            name="toggle_gemini_auto_consult",
            description="Toggle automatic Gemini consultation on uncertainty detection",
            inputSchema={
                "type": "object",
                "properties": {
                    "enable": {
                        "type": "boolean",
                        "description": "Enable (true) or disable (false) auto-consultation. If not provided, toggles current state"
                    }
                }
            }
        ),
        types.Tool(
            name="clear_gemini_history",
            description="Clear Gemini conversation history",
            # Tool requires an inputSchema; this tool takes no arguments
            inputSchema={"type": "object", "properties": {}}
        )
    ]
```

## 🚨 Troubleshooting

| Issue | Solution |
|-------|----------|
| Gemini CLI not found | Install Node.js 18+ and `npm install -g @google/gemini-cli` |
| Authentication errors | Run `gemini` and sign in with Google account |
| Node version issues | Use `nvm use 22.16.0` |
| Timeout errors | Increase `GEMINI_TIMEOUT` (default: 60s) |
| Auto-consult not working | Check `GEMINI_AUTO_CONSULT=true` |
| Rate limiting | Adjust `GEMINI_RATE_LIMIT` (default: 2s) |

## 🔐 Security Considerations

1. **API Credentials**: Store securely, use environment variables
2. **Data Privacy**: Be cautious about sending proprietary code
3. **Input Sanitization**: Sanitize queries before sending
4. **Rate Limiting**: Respect API limits (free tier: 60/min, 1,000/day)
5. **Host-Based Architecture**: Both Gemini CLI and MCP server run on host for auth compatibility and simplicity

## 📈 Best Practices

1. **Rate Limiting**: Implement appropriate delays between calls
2. **Context Management**: Keep context concise and relevant
3. **Error Handling**: Always handle Gemini failures gracefully
4. **User Control**: Allow users to disable auto-consultation
5. **Logging**: Log consultations for debugging and analysis
6. **Caching**: Cache similar queries to reduce API calls

## 🎯 Use Cases

- **Architecture Decisions**: Get second opinions on design choices
- **Security Reviews**: Validate security implementations
- **Performance Optimization**: Compare optimization strategies
- **Code Quality**: Review complex algorithms or patterns
- **Troubleshooting**: Debug complex technical issues
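
### Sketch: Rate Limiting, Caching, and Graceful Failure

The rate-limiting, caching, and error-handling items under Best Practices can be combined in one small wrapper. The following is a minimal sketch, not the project's actual implementation: the class name `RateLimitedCache`, its `ask` method, and the whitespace/case normalization used as the cache key are all hypothetical choices made for illustration. It wraps any async function that takes a query string and returns an answer string.

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict, Optional


class RateLimitedCache:
    """Hypothetical wrapper: spaces out calls, caches answers, absorbs failures."""

    def __init__(self, consult: Callable[[str], Awaitable[str]], min_delay: float = 2.0):
        self._consult = consult            # async function that actually queries Gemini
        self._min_delay = min_delay        # minimum seconds between real calls
        self._last_call: Optional[float] = None
        self._cache: Dict[str, str] = {}
        self._lock = asyncio.Lock()        # serializes callers so the delay is shared

    @staticmethod
    def _normalize(query: str) -> str:
        # Treat queries that differ only in case or whitespace as the same query.
        return " ".join(query.lower().split())

    async def ask(self, query: str) -> str:
        key = self._normalize(query)
        async with self._lock:
            if key in self._cache:
                return self._cache[key]    # cache hit: no call, no delay

            # Sleep for whatever remains of the minimum delay since the last call.
            if self._last_call is not None:
                wait = self._min_delay - (time.monotonic() - self._last_call)
                if wait > 0:
                    await asyncio.sleep(wait)

            try:
                answer = await self._consult(query)
            except Exception as exc:
                # Fail gracefully: report the problem instead of raising.
                return f"Gemini consultation failed: {exc}"
            finally:
                self._last_call = time.monotonic()

            self._cache[key] = answer      # only successful answers are cached
            return answer
```

Create the wrapper inside the running event loop (for example in the server's async startup code) and pass it the coroutine that invokes the CLI. Failed consultations are deliberately not cached, so a later retry of the same query can still succeed.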
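
### Sketch: Detecting All Three Pattern Groups

The uncertainty detection described under Core Features covers three groups of patterns, and `gemini-config.json` has an `uncertainty_thresholds` section with one boolean per group. The sketch below shows one way these could fit together. It is an assumption, not the module's confirmed behavior: the function name `detect_patterns`, the reuse of the config keys as group names, and the idea that a `false` threshold skips its group are all hypothetical. The pattern lists contain only the patterns shown earlier in this guide.

```python
import re
from typing import Dict, List, Optional, Tuple

# Group names mirror the keys of "uncertainty_thresholds" (an assumed mapping).
PATTERN_GROUPS: Dict[str, List[str]] = {
    "uncertainty_patterns": [
        r"\bI'm not sure\b", r"\bI think\b", r"\bpossibly\b",
        r"\bprobably\b", r"\bmight be\b", r"\bcould be\b",
    ],
    "complex_decisions": [
        r"\bmultiple approaches\b", r"\bseveral options\b",
        r"\btrade-offs?\b", r"\balternatives?\b",
    ],
    "critical_operations": [
        r"\bproduction\b", r"\bdatabase migration\b",
        r"\bsecurity\b", r"\bauthentication\b",
    ],
}


def detect_patterns(
    text: str, thresholds: Optional[Dict[str, bool]] = None
) -> Tuple[bool, Dict[str, List[str]]]:
    """Return (any_match, {group: [matched patterns]}) for the enabled groups."""
    thresholds = thresholds or {}
    found: Dict[str, List[str]] = {}
    for group, patterns in PATTERN_GROUPS.items():
        if not thresholds.get(group, True):   # groups default to enabled
            continue
        hits = [p for p in patterns if re.search(p, text, re.IGNORECASE)]
        if hits:
            found[group] = hits
    return bool(found), found
```

Applied to Claude's reply in the Automatic Consultation Flow example, this matches `I think` in the uncertainty group and `security` in the critical-operations group, which is enough to trigger a consultation under this scheme.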