Skip to content

[SDBM-2310] Extend Mysql replication metric reporting to support mixed replication topologies#22485

Merged
eric-weaver merged 6 commits intomasterfrom
eric.weaver/SDBM-2310
Feb 11, 2026
Merged

[SDBM-2310] Extend Mysql replication metric reporting to support mixed replication topologies#22485
eric-weaver merged 6 commits intomasterfrom
eric.weaver/SDBM-2310

Conversation

@eric-weaver
Copy link
Copy Markdown
Contributor

What does this PR do?

Currently when replication metric reporting is enabled we report back group replication OR traditional primary/replica replication. It's possible for Mysql topologies to make use of both replication methods. The changes in this pull request extend support so that we collect both if they are both identified as active.

A new test harness for integration tests was added which spins up a 3 node group replication setup with a 4th instance that is configured as a traditional replication replica of node1.

Motivation

The ability to collect metrics for mixed replication topologies was requested in https://datadoghq.atlassian.net/browse/SDBM-2310

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • Add the qa/skip-qa label if the PR doesn't need to be tested during QA.
  • If you need to backport this PR to another branch, you can add the backport/<branch-name> label to the PR and it will automatically open a backport PR once this one is merged

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 51485de760

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread mysql/datadog_checks/mysql/mysql.py Outdated
@datadog-official

This comment has been minimized.

@codecov
Copy link
Copy Markdown

codecov bot commented Jan 30, 2026

Codecov Report

❌ Patch coverage is 99.42197% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 89.15%. Comparing base (ae85c62) to head (24b812e).
⚠️ Report is 17 commits behind head on master.

Additional details and impacted files
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

lu-zhengda
lu-zhengda previously approved these changes Feb 3, 2026
@temporal-github-worker-1 temporal-github-worker-1 bot dismissed lu-zhengda’s stale review February 9, 2026 20:46

Review from lu-zhengda is dismissed. Related teams and files:

  • database-monitoring-agent
    • mysql/datadog_checks/mysql/mysql.py
    • mysql/tests/common.py
    • mysql/tests/compose/mysql8-hybrid.yaml
    • mysql/tests/conftest.py
    • mysql/tests/test_mysql.py
    • mysql/tests/test_unit.py
    • mysql/tests/variables.py
@eric-weaver
Copy link
Copy Markdown
Contributor Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 594db4ae48

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines 769 to +777
def _collect_replication_metrics(self, db, results, above_560):
# Get replica stats
results.update(self._get_replica_stats(db))
results.update(self._get_replicas_connected_count(db, above_560))
return REPLICA_VARS
replica_stats = self._get_replica_stats(db)
replicas_connected = self._get_replicas_connected_count(db, above_560)

# When group replication is not active, always report traditional replication metrics
# so _check_replication_status can emit WARNING for sources with 0 replicas.
has_traditional_replication = (
replica_stats or replicas_connected.get('Replicas_connected', 0) > 0 or not self._group_replication_active
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid reporting classic replication metrics on group-only setups

When group replication is active without any classic replicas, the replicas_connected probe can still be non‑zero because SQL_REPLICA_WORKER_THREADS matches any PROCESSLIST_COMMAND LIKE 'Binlog dump%', which group replication also uses for its internal channels. With the new gate, that replicas_connected > 0 path treats a pure group‑replication node as having traditional replicas and returns REPLICA_VARS, which in turn emits mysql.replication.* metrics and source/replica service checks that were previously suppressed for group-only nodes. This causes false positives on group‑only clusters; consider filtering out group replication channels or basing the decision on replica status rows instead of binlog dump thread counts.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks to be a very minimal edge case that might show up only while a replication group member is bootstrapping replication from the primary, which is technically done via traditional replication

@eric-weaver eric-weaver added this pull request to the merge queue Feb 11, 2026
Merged via the queue into master with commit ff50372 Feb 11, 2026
332 of 335 checks passed
@eric-weaver eric-weaver deleted the eric.weaver/SDBM-2310 branch February 11, 2026 15:13
lukepatrick pushed a commit to lukepatrick/integrations-core that referenced this pull request Mar 17, 2026
…d replication topologies (DataDog#22485)

* Add support for collecting traditional replication metrics when group replication is enabled

* Add test harness for hybrid replication topology

* Add changelog

* Fix replica offline and add tests

* Run ddev validate ci --sync

Signed-off-by: lukepatrick <lukephilips@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants