Implement circuit breaker for vMCP backend health monitoring #3136
yrobla wants to merge 8 commits into feat/issue-3036-healthcheck-3 from
author taskbot <taskbot@users.noreply.github.com> 1766072123 +0100
committer taskbot <taskbot@users.noreply.github.com> 1766158585 +0100

Integrate health monitoring into vMCP server

Integrates the health monitoring infrastructure (from previous into the vMCP server, enabling periodic backend health checks with configurable intervals and thresholds.

Related-to: #3036

changes from review

changes from review

add missing method

Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Adds health monitoring integration to the Kubernetes operator controller, enabling real-time backend health status tracking and reporting in the VirtualMCPServer CRD status.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation
Alternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
Codecov Report
❌ Patch coverage is

| Metric | feat/issue-3036-healthcheck-3 | #3136 | +/- |
|---|---|---|---|
| Coverage | 57.11% | 57.05% | -0.06% |
| Files | 341 | 342 | +1 |
| Lines | 33949 | 34332 | +383 |
| Hits | 19389 | 19589 | +200 |
| Misses | 12951 | 13126 | +175 |
| Partials | 1609 | 1617 | +8 |
Add circuit breaker pattern to the vMCP health monitoring system to prevent cascading failures and enable graceful degradation when backends become unhealthy. The circuit breaker fast-fails health checks when backends are down, reducing unnecessary network calls and allowing faster recovery.

Implementation:
- Add circuit breaker to health monitoring system (pkg/vmcp/health/)
- Circuit states map to health states (Closed→Healthy, Open→Unhealthy, HalfOpen→Degraded)
- Per-backend circuit isolation: each backend has independent circuit state
- Configurable failure threshold and timeout for circuit transitions
- Fast-fail behavior skips health checks when circuit is open

Configuration:
- Disabled by default for backward compatibility
- Optional circuit_breaker config in VirtualMCPServer operational settings
- Configurable failure_threshold and timeout parameters

Testing:
- Unit tests for circuit state machine logic (circuit_breaker_test.go)
- Integration tests with health monitor (monitor_test.go)
- End-to-end tests in Kubernetes environment (virtualmcp_circuit_breaker_test.go)
- All tests run in parallel for faster execution

The circuit breaker opens after consecutive failures, transitions to half-open after timeout, and closes on successful recovery. This prevents overwhelming failing backends while maintaining healthy backend availability.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
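The transitions described in this commit message (open after consecutive failures, half-open after the timeout, closed on recovery) can be sketched roughly as below. This is an illustration only: `Breaker`, `NewBreaker`, `Allow`, `RecordSuccess`, and `RecordFailure` are hypothetical names, not the identifiers used in pkg/vmcp/health, and reopening the circuit on a failed half-open probe is an assumption the description does not spell out.

```go
package main

import (
	"sync"
	"time"
)

// State is a circuit state; the three names follow the PR description.
type State int

const (
	Closed State = iota
	Open
	HalfOpen
)

// Breaker is a hypothetical, minimal circuit breaker for a single backend.
type Breaker struct {
	mu        sync.Mutex
	state     State
	failures  int           // consecutive failures seen while Closed
	threshold int           // failures needed to open the circuit
	timeout   time.Duration // how long to stay Open before probing again
	openedAt  time.Time
}

// NewBreaker returns a breaker that starts in the Closed state.
func NewBreaker(threshold int, timeout time.Duration) *Breaker {
	return &Breaker{threshold: threshold, timeout: timeout}
}

// Allow reports whether a health check should run. While Open it fast-fails
// until the timeout has elapsed, then moves to HalfOpen and lets probes through.
func (b *Breaker) Allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.state == Open {
		if time.Since(b.openedAt) < b.timeout {
			return false
		}
		b.state = HalfOpen
	}
	return true
}

// RecordSuccess closes the circuit and clears the failure count.
func (b *Breaker) RecordSuccess() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.state = Closed
	b.failures = 0
}

// RecordFailure counts consecutive failures and opens the circuit at the threshold.
func (b *Breaker) RecordFailure() {
	b.mu.Lock()
	defer b.mu.Unlock()
	switch b.state {
	case HalfOpen:
		b.open() // assumption: a failed probe reopens the circuit
	case Closed:
		b.failures++
		if b.failures >= b.threshold {
			b.open()
		}
	}
}

// open must be called with b.mu held.
func (b *Breaker) open() {
	b.state = Open
	b.failures = 0
	b.openedAt = time.Now()
}

// State returns the current circuit state.
func (b *Breaker) State() State {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.state
}
```

A caller would check `Allow()` before each probe and report the outcome with `RecordSuccess()` or `RecordFailure()`. Using the wall clock directly keeps the sketch short; an injectable clock would make the timeout path easier to test.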
cc090f9 to aee49c0
Large PR justification has been provided. Thank you!
✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review.
Pull request overview
This PR implements a circuit breaker pattern for vMCP backend health monitoring to prevent cascading failures and enable graceful degradation. The circuit breaker fast-fails health checks when backends are down, reducing unnecessary network calls and allowing faster recovery.
Key changes:
- New circuit breaker state machine with three states (Closed, Open, HalfOpen) that map to health states (Healthy, Unhealthy, Degraded)
- Per-backend circuit isolation with independent state tracking for each backend
- Disabled by default for backward compatibility with configurable threshold and timeout parameters
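The state mapping and per-backend isolation listed above might look something like the following sketch. `Tracker`, `HealthFor`, and the string values of `HealthStatus` are assumptions made for illustration; only the Closed/Open/HalfOpen to Healthy/Unhealthy/Degraded pairing comes from the PR text.

```go
package main

import "sync"

// CircuitState names follow the PR overview.
type CircuitState int

const (
	Closed CircuitState = iota
	Open
	HalfOpen
)

// HealthStatus values are illustrative strings, not the project's constants.
type HealthStatus string

const (
	Healthy   HealthStatus = "healthy"
	Unhealthy HealthStatus = "unhealthy"
	Degraded  HealthStatus = "degraded"
)

// HealthFor applies the mapping Closed→Healthy, Open→Unhealthy, HalfOpen→Degraded.
func HealthFor(s CircuitState) HealthStatus {
	switch s {
	case Open:
		return Unhealthy
	case HalfOpen:
		return Degraded
	default:
		return Healthy
	}
}

// Tracker is a hypothetical registry holding one circuit state per backend ID,
// so a failing backend never changes the state recorded for another.
type Tracker struct {
	mu     sync.RWMutex
	states map[string]CircuitState
}

// NewTracker returns an empty tracker.
func NewTracker() *Tracker {
	return &Tracker{states: make(map[string]CircuitState)}
}

// Set records the circuit state for a single backend.
func (t *Tracker) Set(backendID string, s CircuitState) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.states[backendID] = s
}

// Health returns the health status for a backend. Unknown backends read as
// Closed (the zero value), and therefore Healthy.
func (t *Tracker) Health(backendID string) HealthStatus {
	t.mu.RLock()
	defer t.mu.RUnlock()
	return HealthFor(t.states[backendID])
}
```

Keying the state by backend ID is what gives the isolation: opening the circuit for one backend leaves every other entry untouched.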
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| pkg/vmcp/health/config.go | Defines circuit breaker configuration, validation logic, and state constants |
| pkg/vmcp/health/config_test.go | Unit tests for circuit breaker configuration validation and state string conversion |
| pkg/vmcp/health/circuit_breaker_test.go | Comprehensive unit tests for circuit breaker state machine logic and transitions |
| pkg/vmcp/health/status.go | Integrates circuit breaker into status tracker with state management methods |
| pkg/vmcp/health/status_test.go | Updates status tracker tests to pass circuit breaker config parameter |
| pkg/vmcp/health/monitor.go | Implements fast-fail logic by checking circuit state before health checks |
| pkg/vmcp/health/monitor_test.go | Integration tests for circuit breaker with health monitor including full cycle and backward compatibility tests |
| cmd/vmcp/app/commands.go | Maps circuit breaker configuration from YAML to health monitor config |
| cmd/vmcp/README.md | Comprehensive documentation of health monitoring and circuit breaker features |
| examples/operator/virtual-mcps/vmcp_health_monitoring.yaml | Example configuration demonstrating circuit breaker settings |
| test/e2e/thv-operator/virtualmcp/virtualmcp_circuit_breaker_test.go | End-to-end Kubernetes tests verifying circuit breaker behavior in production-like environment |
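The fast-fail behavior attributed to monitor.go in the table (check the circuit state before running a health check) could be sketched as follows. `Gate`, `CheckFunc`, `CheckWithBreaker`, and `ErrCircuitOpen` are hypothetical names, and `stubGate` exists only to exercise the sketch; none of them are taken from the actual change.

```go
package main

import "errors"

// ErrCircuitOpen is returned when a check is skipped because the circuit is open.
var ErrCircuitOpen = errors.New("circuit open: health check skipped")

// Gate is the hypothetical slice of a circuit breaker that the monitor needs.
type Gate interface {
	Allow() bool
	RecordSuccess()
	RecordFailure()
}

// CheckFunc performs the real (network) health check for one backend.
type CheckFunc func(backendID string) error

// CheckWithBreaker consults the gate first. If the circuit is open it returns
// immediately without calling check; otherwise it runs the check and feeds the
// outcome back into the gate.
func CheckWithBreaker(g Gate, backendID string, check CheckFunc) error {
	if !g.Allow() {
		return ErrCircuitOpen
	}
	if err := check(backendID); err != nil {
		g.RecordFailure()
		return err
	}
	g.RecordSuccess()
	return nil
}

// stubGate is a fixed-answer Gate used only to exercise the sketch.
type stubGate struct {
	open                bool
	successes, failures int
}

func (s *stubGate) Allow() bool    { return !s.open }
func (s *stubGate) RecordSuccess() { s.successes++ }
func (s *stubGate) RecordFailure() { s.failures++ }
```

Returning a sentinel error lets callers tell a skipped check apart from a failed one with `errors.Is`, which matters if skipped checks should not count toward failure metrics.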
5d3dc03 to 7c7fc41
Pending changes to the status reporter abstraction.
## Large PR Justification
This PR implements the circuit breaker pattern for vMCP backend health monitoring as a single atomic feature. The circuit breaker logic is tightly coupled across multiple health system components (status tracker, monitor, config) that must work together - splitting would create intermediate states where the feature is incomplete or broken. The configuration flows through multiple layers (YAML → commands → server → monitor → status), and separating config from implementation would leave the system non-functional. Additionally, the circuit states (Closed/Open/HalfOpen) map directly to health states (Healthy/Unhealthy/Degraded), a contract that must be established atomically.
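To make the configuration flow mentioned above more concrete, here is a rough sketch of a config type that is validated once and then handed to the monitor layer. The `failure_threshold` and `timeout` keys come from the PR description; the `Enabled` field, its `enabled` tag, `MonitorConfig`, `ToMonitorConfig`, and the specific validation rules are assumptions for illustration.

```go
package main

import (
	"errors"
	"time"
)

// CircuitBreakerConfig mirrors the options named in the PR description.
// The Enabled field and its yaml tag are assumptions.
type CircuitBreakerConfig struct {
	Enabled          bool          `yaml:"enabled"`
	FailureThreshold int           `yaml:"failure_threshold"`
	Timeout          time.Duration `yaml:"timeout"`
}

// Validate checks the settings only when the feature is enabled, so a zero
// value (feature off) stays valid for backward compatibility.
func (c CircuitBreakerConfig) Validate() error {
	if !c.Enabled {
		return nil
	}
	if c.FailureThreshold < 1 {
		return errors.New("circuit_breaker.failure_threshold must be at least 1")
	}
	if c.Timeout <= 0 {
		return errors.New("circuit_breaker.timeout must be positive")
	}
	return nil
}

// MonitorConfig is a hypothetical downstream config. A nil CircuitBreaker
// means the monitor runs without a breaker.
type MonitorConfig struct {
	CircuitBreaker *CircuitBreakerConfig
}

// ToMonitorConfig validates the parsed settings and maps them to the monitor layer.
func ToMonitorConfig(c CircuitBreakerConfig) (MonitorConfig, error) {
	if err := c.Validate(); err != nil {
		return MonitorConfig{}, err
	}
	if !c.Enabled {
		return MonitorConfig{}, nil
	}
	return MonitorConfig{CircuitBreaker: &c}, nil
}
```

Validating at the boundary where the YAML is mapped means the lower layers can trust the values they receive, which is one reason config and implementation are hard to land separately.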