Conversation
This comment has been minimized.
This comment has been minimized.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
3cb842d to
87a9d9c
Compare
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
- Add deploy-langfuse-openshift.sh script with ClickHouse resource fixes - Add Langfuse configuration manifests for ambient-code integration - README.md with complete setup instructions - Add Makefile target: deploy-langfuse-openshift - Update .gitignore to exclude Langfuse secrets 🤖 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: sallyom <somalley@redhat.com>
Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: sallyom <somalley@redhat.com>
Remove npm installation of @anthropic-ai/claude-code and Node.js dependency from Dockerfile. The claude-agent-sdk Python package bundles its own CLI (v2.0.49) which is automatically used by default. pyproject.toml unpinned dependency versions remain: - anthropic[vertex]>=0.68.0 (from ==0.73.0) - claude-agent-sdk>=0.1.4 (from ==0.1.6) Benefits: - Smaller container image (no Node.js/npm) - Automatic version sync between SDK and CLI - Simpler build process 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: sallyom <somalley@redhat.com>
Signed-off-by: sallyom <somalley@redhat.com> Co-Authored-By: Claude <noreply@anthropic.com>
… children
Problem: All interactions were being appended to a single claude_turn_1 trace
instead of creating separate claude_turn_1, claude_turn_2, etc. traces.
Root Cause:
- Using start_as_current_observation() sets the new observation as the current
OpenTelemetry context, causing subsequent observations to become children
- This created a nested hierarchy instead of sibling observations
Solution:
1. Split turn tracking into two phases:
- start_turn() - Called when AssistantMessage arrives (BEFORE tools)
- end_turn() - Called when ResultMessage arrives (with usage data)
2. Use session_trace.start_observation() instead of start_as_current_observation()
- Creates observations as SIBLINGS under the session trace
- Does NOT change the current context
- All turns are children of session, not children of each other
3. Tool span parenting:
- Tools now properly attach to their parent turn via
current_turn_generation.start_observation()
- Added fallback to session trace if no active turn
4. Proper observation lifecycle:
- Call .end() on both generations and spans to close them
Expected Result:
session_{session_id} (span)
├── claude_turn_1 (generation) - Sibling
│ ├── tool_Read (span)
│ └── tool_Write (span)
├── claude_turn_2 (generation) - Sibling
│ └── tool_Bash (span)
└── claude_turn_3 (generation) - Sibling
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>
Signed-off-by: sallyom <somalley@redhat.com> Co-Authored-By: Claude <noreply@anthropic.com>
Co-authored-by: Claude Code <noreply@anthropic.com> Signed-off-by: sallyom <somalley@redhat.com>
Signed-off-by: sallyom <somalley@redhat.com> Co-Authored-By: Claude <noreply@anthropic.com>
Addresses GitHub Actions security review findings: 1. Enhanced Secret Exposure Prevention (observability.py:164) 2. Model Parameter Input Validation (observability.py:143-155) 3. Complete Error Recovery Logging (observability.py:341-342) Co-authored-by: GitHub Actions Security Review Bot Signed-off-by: sallyom <somalley@redhat.com>
Addresses GitHub Actions security review:
Risk: Logging secret_name reveals credential context to attackers
Fix:
- Changed log message from revealing secret type to generic warning
- Before: "SECURITY: Secret '{secret_name}' found in sanitized message"
- After: "SECURITY: Credential sanitization validation failed"
- Maintains security alerting without revealing attack surface
Signed-off-by: sallyom <somalley@redhat.com>
…ers only 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: sallyom <somalley@redhat.com>
…secret Signed-off-by: sallyom <somalley@redhat.com> Co-Authored-By: Claude <noreply@anthropic.com>
Claude Code ReviewSummaryThis PR adds comprehensive Langfuse observability infrastructure to track LLM usage, costs, and session traces. The implementation is well-architected with strong security practices, extensive test coverage, and production-ready deployment tooling. Overall Assessment: ✅ Strong implementation with excellent security practices and comprehensive testing. A few critical security and architectural recommendations should be addressed before merge. Issues by Severity🚫 Blocker IssuesNone - No blocking issues found. 🔴 Critical Issues1. Kubernetes Name Validation May Be Too Restrictive (middleware.go:37-48)Issue: The Location: var kubernetesNameRegex = regexp.MustCompile(`^[a-z0-9]([-a-z0-9]*[a-z0-9])?$`)Impact: Legitimate users with uppercase namespace names will be rejected with Recommendation:
References:
2. Sanitization Validation Logic Has False Positive Risk (security_utils.py:60-67)Issue: Post-sanitization validation uses Location: for secret_name, secret_value in secrets_to_redact.items():
if secret_value and secret_value.strip() and secret_value in error_msg:
logging.error("SECURITY: Credential sanitization validation failed")
return "Operation failed - check configuration and credentials"Impact:
Recommendation: # Add minimum length check to avoid false positives
MIN_SECRET_LENGTH = 8 # Langfuse keys are typically >20 chars
if secret_value and len(secret_value.strip()) >= MIN_SECRET_LENGTH and secret_value in error_msg:
logging.error("SECURITY: Credential sanitization validation failed")
return "Operation failed - check configuration and credentials"Test Coverage: Add test for short secrets (e.g., 3. Missing Secret Validation in Operator (sessions.go:298-328)Issue: The operator copies Location: Impact:
Recommendation: // Validate secret contains required keys before copying
requiredKeys := []string{"LANGFUSE_HOST", "LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY", "LANGFUSE_ENABLED"}
for _, key := range requiredKeys {
if _, ok := langfuseSecret.Data[key]; \!ok {
log.Printf("Warning: ambient-admin-langfuse-secret missing required key %s", key)
// Skip copy or log error
}
}Consistency: Matches URL validation pattern in 🟡 Major Issues4. Context Manager Cleanup on Partial Initialization Failure (observability.py:163-179)Issue: If Location: try:
self._propagate_ctx = propagate_attributes(...)
self._propagate_ctx.__enter__()
except Exception:
if self._propagate_ctx:
try:
self._propagate_ctx.__exit__(None, None, None)
except Exception:
pass
self._propagate_ctx = None
raiseIssue: If Recommendation: ctx = None
try:
ctx = propagate_attributes(...)
ctx.__enter__()
self._propagate_ctx = ctx # Only assign after successful __enter__
except Exception:
if ctx:
try:
ctx.__exit__(None, None, None)
except Exception:
pass
raise5. Hardcoded Model Default Should Be Configurable (wrapper.py:269)Issue: Default model Location: configured_model = model or 'claude-sonnet-4-5@20250929'Recommendation: DEFAULT_MODEL = os.getenv('DEFAULT_LLM_MODEL', 'claude-sonnet-4-5@20250929')
configured_model = model or DEFAULT_MODELBenefit: Allows platform admins to update default via ConfigMap without code changes. 6. Flush Timeout Configuration Lacks Documentation (observability.py:482)Issue: Location: Recommendation: Add to # Optional: Langfuse flush timeout (default: 30s)
# Increase for large traces or constrained networks
# LANGFUSE_FLUSH_TIMEOUT: "60.0"7. Test Coverage GapsMissing Tests:
Recommendation: Add unit tests for:
Note: Excellent test coverage for 🔵 Minor Issues8. Inconsistent Error Logging PatternsIssue: Some functions log errors before returning, others don't. Example: # observability.py:186 - logs before returning
except Exception as e:
error_msg = sanitize_exception_message(e, secrets)
logging.warning(f"Langfuse init failed: {error_msg}") # ✅
# observability.py:407 - catches but doesn't log
except Exception as e:
logging.debug(f"Langfuse: Failed to track tool use: {e}") # Debug levelRecommendation: Standardize error levels:
9. Magic String "ambient-admin-langfuse-secret" RepeatedIssue: Secret name appears 12+ times across operator, docs, and examples. No constant definition. Recommendation: Define in const AmbientLangfuseSecretName = "ambient-admin-langfuse-secret"Benefit: Single source of truth, easier refactoring. 10. Deployment Script Has Weak Credential DefaultsIssue: Test mode uses simple passwords ( Location: Recommendation: # Add warning banner
if [[ "$USE_TEST_CREDS" =~ ^[Yy]$ ]]; then
echo "⚠️ WARNING: Using test credentials (INSECURE - dev/test only\!)"
echo " DO NOT use in production environments"
read -p "Continue? (y/n): " CONFIRM
[[ "$CONFIRM" =~ ^[Yy]$ ]] || exit 1
# ... existing code
fiPositive Highlights🌟 Excellent Security Practices
🧪 Strong Test Coverage
📚 Excellent Documentation
🏗️ Clean Architecture
🔧 Production-Ready Features
RecommendationsPriority 1 (Before Merge)
Priority 2 (Near-term)
Priority 3 (Nice-to-have)
Final NotesThis is a high-quality PR with exceptional attention to security, testing, and documentation. The code follows repository standards from CLAUDE.md (secret sanitization, timeout wrappers, RBAC documentation). The critical issues are primarily about edge case handling (namespace validation, secret validation) rather than fundamental design flaws. Once addressed, this will be a production-ready observability system. Estimated review effort: 2-3 hours for fixes + testing Reviewed by: Claude Code (Sonnet 4.5) |



This PR