dataflow.mcp.health
MCP Health Monitoring for haive-dataflow.
This module provides health monitoring and management capabilities for MCP servers, including connection status tracking, performance metrics, and automatic recovery.
- Classes:
MCPHealthMonitor: Main health monitoring service
MCPHealthChecker: Individual server health checker
Classes
MCPHealthChecker | Health checker for individual MCP servers.
MCPHealthMonitor | Health monitoring service for MCP servers.
Module Contents
- class dataflow.mcp.health.MCPHealthChecker(server_name, server_config)
Health checker for individual MCP servers.
This class handles health checking for a single MCP server including connection testing, response time measurement, and recovery attempts.
Initialize health checker.
- Parameters:
server_name (str) – Name of the server to monitor
server_config (haive.dataflow.registry.models.MCPServerConfig) – Server configuration
- async attempt_recovery()
Attempt to recover the server connection.
- async check_health()
Perform health check on the server.
- Returns:
Current health status
- Return type:
haive.dataflow.registry.models.MCPServerHealth
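The connection testing, response time measurement, and recovery attempts described for MCPHealthChecker can be sketched as follows. This is a minimal illustration, not the library's implementation: `SketchHealth`, `SketchHealthChecker`, the `ping` callable, and the `timeout`, `retries`, and `delay` parameters are all hypothetical stand-ins, and the real `MCPServerHealth` model may expose different fields.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class SketchHealth:
    """Hypothetical stand-in for MCPServerHealth; real field names may differ."""
    server_name: str
    is_healthy: bool
    response_time_ms: Optional[float] = None
    error: Optional[str] = None


class SketchHealthChecker:
    """Hypothetical stand-in for MCPHealthChecker (assumed design)."""

    def __init__(self, server_name: str,
                 ping: Callable[[], Awaitable[None]],
                 timeout: float = 5.0) -> None:
        self.server_name = server_name
        self._ping = ping        # assumed connection test supplied by the caller
        self._timeout = timeout  # assumed per-check timeout in seconds

    async def check_health(self) -> SketchHealth:
        """Time one ping; a timeout or error marks the server unhealthy."""
        start = time.perf_counter()
        try:
            await asyncio.wait_for(self._ping(), timeout=self._timeout)
        except Exception as exc:
            return SketchHealth(self.server_name, False, error=repr(exc))
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        return SketchHealth(self.server_name, True, response_time_ms=elapsed_ms)

    async def attempt_recovery(self, retries: int = 3, delay: float = 0.01) -> bool:
        """Re-check up to `retries` times; return True once a check succeeds."""
        for _ in range(retries):
            health = await self.check_health()
            if health.is_healthy:
                return True
            await asyncio.sleep(delay)
        return False
```

Passing the connection test in as a callable keeps the sketch independent of any particular MCP transport; the actual class takes a `server_config` instead.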
- class dataflow.mcp.health.MCPHealthMonitor(mcp_client=None, monitoring_interval=30)
Health monitoring service for MCP servers.
This class provides comprehensive health monitoring for MCP servers, including periodic health checks, performance metric tracking, automatic recovery attempts, and health status reporting.
- mcp_client
Reference to the MCP client
- health_checkers
Dictionary of server health checkers
- monitoring_interval
Seconds between health checks
- is_monitoring
Whether monitoring is currently active
Examples
monitor = MCPHealthMonitor(mcp_client)
await monitor.start_monitoring()

# Get health status
status = await monitor.get_health_summary()
print(f"Healthy servers: {status['healthy_count']}")
Initialize health monitor.
- Parameters:
mcp_client – MCP client instance to monitor
monitoring_interval (int) – Seconds between health checks
- async check_all_servers()
Perform health check on all servers.
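One plausible way to check every server in a single pass is to run the per-server checks concurrently. The sketch below assumes each checker is an async callable returning a boolean; the function name `check_all` and that signature are hypothetical, and the source does not say whether the real method runs checks concurrently or sequentially.

```python
import asyncio
from typing import Awaitable, Callable, Dict


async def check_all(
    checkers: Dict[str, Callable[[], Awaitable[bool]]]
) -> Dict[str, bool]:
    """Run all checks concurrently and map each server name to its result."""
    names = list(checkers)
    results = await asyncio.gather(
        *(checkers[name]() for name in names),
        return_exceptions=True,  # one failing check must not abort the others
    )
    # Only an explicit True counts as healthy; exceptions count as unhealthy.
    return {name: (result is True) for name, result in zip(names, results)}
```

With `return_exceptions=True`, a check that raises is reported as unhealthy rather than propagating and cancelling the whole pass.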
- async get_health_summary()
Get summary of health status across all servers.
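The shape of the summary is not documented here beyond the `healthy_count` key used in the Examples section. As a rough sketch of how such a summary could be aggregated from per-server results, the helper below is hypothetical, and every key other than `healthy_count` is an assumption.

```python
from typing import Dict


def summarize_health(statuses: Dict[str, bool]) -> Dict[str, object]:
    """Aggregate per-server health flags into a summary dictionary."""
    healthy = sorted(name for name, ok in statuses.items() if ok)
    unhealthy = sorted(name for name, ok in statuses.items() if not ok)
    return {
        "healthy_count": len(healthy),      # key shown in the Examples section
        "unhealthy_count": len(unhealthy),  # hypothetical key
        "total_count": len(statuses),       # hypothetical key
        "unhealthy_servers": unhealthy,     # hypothetical key
    }
```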
- async recover_failed_servers()
Attempt to recover failed servers.
- async start_monitoring()
Start health monitoring for all connected servers.
- async stop_monitoring()
Stop health monitoring.
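The start/stop lifecycle, together with `monitoring_interval` and `is_monitoring`, suggests a background task that repeats a check pass at a fixed interval. The sketch below shows one common asyncio pattern for this. `SketchMonitor` and its `check_all` callable are hypothetical stand-ins, and the real MCPHealthMonitor may be structured differently.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class SketchMonitor:
    """Hypothetical stand-in for the MCPHealthMonitor start/stop lifecycle."""

    def __init__(self, check_all: Callable[[], Awaitable[None]],
                 monitoring_interval: float = 30) -> None:
        self._check_all = check_all  # assumed callable running one check pass
        self.monitoring_interval = monitoring_interval
        self.is_monitoring = False
        self._task: Optional[asyncio.Task] = None

    async def _loop(self) -> None:
        # Run a check pass, then sleep for the interval, until stopped.
        while self.is_monitoring:
            await self._check_all()
            await asyncio.sleep(self.monitoring_interval)

    async def start_monitoring(self) -> None:
        if self.is_monitoring:
            return  # already running; avoid spawning a second loop
        self.is_monitoring = True
        self._task = asyncio.create_task(self._loop())

    async def stop_monitoring(self) -> None:
        self.is_monitoring = False
        if self._task is not None:
            self._task.cancel()  # interrupt the sleep instead of waiting it out
            try:
                await self._task
            except asyncio.CancelledError:
                pass
            self._task = None
```

Cancelling the task on stop means shutdown does not have to wait for the current sleep to finish, which matters with the default 30-second interval.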