dataflow.mcp.health

MCP Health Monitoring for haive-dataflow.

This module provides health monitoring and management capabilities for MCP servers, including connection status tracking, performance metrics, and automatic recovery.

Classes:

  • MCPHealthMonitor: Main health monitoring service

  • MCPHealthChecker: Individual server health checker

Classes

MCPHealthChecker

Health checker for individual MCP servers.

MCPHealthMonitor

Health monitoring service for MCP servers.

Module Contents

class dataflow.mcp.health.MCPHealthChecker(server_name, server_config)

Health checker for individual MCP servers.

This class handles health checking for a single MCP server including connection testing, response time measurement, and recovery attempts.

Initialize health checker.

Parameters:
  • server_name (str) – Name of the server to monitor

  • server_config (haive.dataflow.registry.models.MCPServerConfig) – Server configuration

async attempt_recovery()

Attempt to recover the server connection.

async check_health()

Perform health check on the server.

Returns:

Current health status

Return type:

haive.dataflow.registry.models.MCPServerHealth
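The connection test and response time measurement described above can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: `HealthResult` is a hypothetical stand-in for `MCPServerHealth` (whose actual fields are not documented here), and `probe` and `timeout` are assumed names for whatever connection test and time limit the real checker uses.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class HealthResult:
    """Hypothetical stand-in for MCPServerHealth; the real model may differ."""

    server_name: str
    healthy: bool
    response_time_ms: Optional[float] = None
    error: Optional[str] = None


async def timed_health_check(
    server_name: str,
    probe: Callable[[], Awaitable[None]],
    timeout: float = 5.0,
) -> HealthResult:
    """Await `probe` (an assumed connection test) and record how long it took."""
    start = time.perf_counter()
    try:
        # Bound the probe so a hung server is reported instead of blocking.
        await asyncio.wait_for(probe(), timeout=timeout)
    except asyncio.TimeoutError:
        return HealthResult(server_name, healthy=False, error="timeout")
    except Exception as exc:
        # Any other probe failure marks the server as unhealthy.
        return HealthResult(server_name, healthy=False, error=str(exc))
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return HealthResult(server_name, healthy=True, response_time_ms=elapsed_ms)
```

A failed or timed-out probe is returned as an unhealthy result rather than raised, so a caller can collect statuses for many servers without wrapping each call in its own error handling.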

class dataflow.mcp.health.MCPHealthMonitor(mcp_client=None, monitoring_interval=30)

Health monitoring service for MCP servers.

This class provides comprehensive health monitoring for MCP servers, including:

  • Periodic health checks

  • Performance metric tracking

  • Automatic recovery attempts

  • Health status reporting

mcp_client

Reference to the MCP client

health_checkers

Dictionary of server health checkers

monitoring_interval

Seconds between health checks

is_monitoring

Whether monitoring is currently active

Examples

    monitor = MCPHealthMonitor(mcp_client)
    await monitor.start_monitoring()

    # Get health status
    status = await monitor.get_health_summary()
    print(f"Healthy servers: {status['healthy_count']}")

Initialize health monitor.

Parameters:
  • mcp_client – MCP client instance to monitor

  • monitoring_interval (int) – Seconds between health checks

async check_all_servers()

Perform health check on all servers.

Returns:

Dictionary of server name to health status

Return type:

dict[str, haive.dataflow.registry.models.MCPServerHealth]
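One plausible way to build a name-to-status mapping like this is to run the per-server checks concurrently. The sketch below assumes each checker is an async callable returning a status string; the function name `check_all` and the `"unhealthy"` fallback are illustrative choices, not part of the documented API.

```python
import asyncio
from typing import Awaitable, Callable, Dict


async def check_all(
    checkers: Dict[str, Callable[[], Awaitable[str]]],
) -> Dict[str, str]:
    """Run every checker concurrently and map server name to its status."""
    names = list(checkers)
    # return_exceptions=True keeps one failing server from aborting the rest.
    results = await asyncio.gather(
        *(checkers[name]() for name in names),
        return_exceptions=True,
    )
    return {
        name: "unhealthy" if isinstance(result, BaseException) else result
        for name, result in zip(names, results)
    }
```

`asyncio.gather` returns results in the order the awaitables were passed, which is what makes the `zip` with `names` safe.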

async get_health_summary()

Get summary of health status across all servers.

Returns:

Summary dictionary with health metrics

Return type:

dict[str, Any]
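The exact keys of the summary dictionary are not listed in this reference; only `healthy_count` appears in the class example. As a rough sketch of how such a summary might be aggregated, assuming a simple mapping of server name to a boolean health flag (the other keys below are hypothetical):

```python
from typing import Any, Dict


def summarize(statuses: Dict[str, bool]) -> Dict[str, Any]:
    """Aggregate per-server health flags into a summary dictionary."""
    healthy = sum(1 for ok in statuses.values() if ok)
    return {
        "healthy_count": healthy,  # key used in the module's own example
        "total_count": len(statuses),  # hypothetical key
        "unhealthy_servers": sorted(  # hypothetical key
            name for name, ok in statuses.items() if not ok
        ),
    }
```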

async recover_failed_servers()

Attempt to recover failed servers.

Returns:

List of server names that were successfully recovered

Return type:

list[str]
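A recovery pass that returns the recovered server names could look something like the sketch below. It assumes each recovery attempt is an async callable returning `True` on success; the return value of `attempt_recovery()` is not documented here, so that is an assumption made for illustration.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def recover_failed(
    failed: Dict[str, Callable[[], Awaitable[bool]]],
) -> List[str]:
    """Try each failed server in turn and return the names that recovered."""
    recovered: List[str] = []
    for name, attempt in failed.items():
        try:
            if await attempt():
                recovered.append(name)
        except Exception:
            # A failed attempt leaves the server marked as failed.
            continue
    return recovered
```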

async start_monitoring()

Start health monitoring for all connected servers.

async stop_monitoring()

Stop health monitoring.
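The start/stop pair, together with `monitoring_interval` and `is_monitoring`, suggests a background task that repeats a check on a timer. The class below is a hypothetical illustration of that pattern using an `asyncio` task; `PeriodicMonitor` and its `check` argument are invented for the sketch and are not the `MCPHealthMonitor` implementation.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class PeriodicMonitor:
    """Hypothetical illustration of a start/stop monitoring loop."""

    def __init__(
        self,
        check: Callable[[], Awaitable[None]],
        interval: float = 30.0,
    ) -> None:
        self.check = check
        self.interval = interval
        self.is_monitoring = False
        self._task: Optional[asyncio.Task] = None

    async def _loop(self) -> None:
        while self.is_monitoring:
            try:
                await self.check()
            except Exception:
                # Keep the loop alive if a single check fails.
                pass
            await asyncio.sleep(self.interval)

    async def start_monitoring(self) -> None:
        if self.is_monitoring:
            return  # already running; starting twice is a no-op
        self.is_monitoring = True
        self._task = asyncio.create_task(self._loop())

    async def stop_monitoring(self) -> None:
        self.is_monitoring = False
        if self._task is not None:
            # Cancel the sleeping task and wait for it to finish unwinding.
            self._task.cancel()
            try:
                await self._task
            except asyncio.CancelledError:
                pass
            self._task = None
```

Cancelling the task in `stop_monitoring()` means the caller does not have to wait out the remainder of the current interval before monitoring actually stops.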