author: Cédric Villemain, 2012-01-22 12:46:43 +0000
committer: Cédric Villemain, 2012-01-22 12:56:33 +0000
commit: 06c9f6d4ae80ac5fefca66c51dc7487f2f60f24e (patch)
tree: 5ba3c1869f233b5148b7f365295d31d2c890a677 /check_postgres.pl.html
parent: a0ea364a1c6c534e2eec23992e5d8ef67f98d5a8 (diff)
Add `pgagent_jobs` test.
From: "David E. Wheeler" <david@justatheory.com>

This patch adds support for checking for failed pgAgent jobs within a specified period of time. You can specify either --critical or --warning as a period of time, and it will report on failures within that period of time previous to the current time. Job failures are determined by a non-0 status in a job step record. Using this test obviously requires that the pgAgent schema be installed.

I've also included a bunch of unit tests to make sure it works the way I would expect (the test will create a schema for testing) and documentation.

As part of this, I've introduced the `any_warning` argument to `validate_range()`. The `pgagent_jobs` test does not care if you specify a warning value greater than the critical value (indeed, I expect that if one used both at all, the warning would be much longer). So this new argument prevents the `range-warnbigtime` or `range-warnbigsize` failures from being triggered.

Cedric: I sorted the POD and added the action_info so that t/05_docs.t is ok. I also built and pushed the new .html
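As a rough illustration of the behaviour described here and in the documentation hunk of this diff (periods given in seconds, minutes, hours, or days, abbreviable to the first letter, with seconds assumed when no unit is given), the following Python sketch parses a period and classifies a failure by its age. It is not the Perl code in check_postgres.pl: the names `parse_period` and `classify_failure` are hypothetical, and accepting any prefix of a unit name (such as `hour`) is an assumption made for the example.

```python
import re

# Seconds per unit, keyed by the first letter of the unit name.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
_UNIT_NAMES = ("seconds", "minutes", "hours", "days")


def parse_period(text):
    """Convert a period such as '1d', '4 hours', or '30' to seconds.

    Hypothetical helper: prefix matching of unit names is an assumption.
    """
    match = re.fullmatch(r"\s*(\d+)\s*([a-zA-Z]*)\s*", text)
    if match is None:
        raise ValueError(f"invalid period: {text!r}")
    number, unit = match.groups()
    if not unit:
        return int(number)  # no unit given: seconds are assumed
    unit = unit.lower()
    if not any(name.startswith(unit) for name in _UNIT_NAMES):
        raise ValueError(f"unknown unit in period: {text!r}")
    return int(number) * _UNIT_SECONDS[unit[0]]


def classify_failure(age_seconds, critical=None, warning=None):
    """Return the status for a job step that failed age_seconds ago.

    The warning period may be longer than the critical period, which is
    the case the any_warning argument is meant to allow.
    """
    if critical is not None and age_seconds <= parse_period(critical):
        return "CRITICAL"
    if warning is not None and age_seconds <= parse_period(warning):
        return "WARNING"
    return "OK"
```

With `--critical=2h --warning=4h`, a step that failed one hour ago would be classified as critical under this sketch, one that failed three hours ago as a warning, and one that failed five hours ago would not be reported.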
Diffstat (limited to 'check_postgres.pl.html')
-rw-r--r-- check_postgres.pl.html 22
1 files changed, 22 insertions, 0 deletions
diff --git a/check_postgres.pl.html b/check_postgres.pl.html
index 3ffef6360..ea19b1ff0 100644
--- a/check_postgres.pl.html
+++ b/check_postgres.pl.html
@@ -79,6 +79,7 @@
<li><a href="#pgb_pool_maxwait"><strong>pgb_pool_maxwait</strong></a></li>
<li><a href="#pgbouncer_backends"><strong>pgbouncer_backends</strong></a></li>
<li><a href="#pgbouncer_checksum"><strong>pgbouncer_checksum</strong></a></li>
+ <li><a href="#pgagent_jobs"><strong>pgagent_jobs</strong></a></li>
<li><a href="#prepared_txns"><strong>prepared_txns</strong></a></li>
<li><a href="#query_runtime"><strong>query_runtime</strong></a></li>
<li><a href="#query_time"><strong>query_time</strong></a></li>
@@ -1234,6 +1235,27 @@ checksum must be provided as the <code>--mrtg</code> argument. The fourth line a
current checksum.</p>
<p>
</p>
+<h2><a name="pgagent_jobs"><strong>pgagent_jobs</strong></a></h2>
+<p>(<code>symlink: check_postgres_pgagent_jobs</code>) Checks that all the pgAgent jobs
+that have executed in the preceding interval of time have succeeded. This is
+done by checking for any steps that have a non-zero result.</p>
+<p>Either <code>--warning</code> or <code>--critical</code>, or both, may be specified as times, and
+jobs will be checked for failures within the specified periods of time before
+the current time. Valid units are seconds, minutes, hours, and days; all can
+be abbreviated to the first letter. If no units are given, 'seconds' are
+assumed.</p>
+<p>Example 1: Give a critical when any jobs executed in the last day have failed.</p>
+<pre>
+ check_postgres_pgagent_jobs --critical=1d</pre>
+<p>Example 2: Give a warning when any jobs executed in the last week have failed.</p>
+<pre>
+ check_postgres_pgagent_jobs --warning=7d</pre>
+<p>Example 3: Give a critical for jobs that have failed in the last 2 hours and a
+warning for jobs that have failed in the last 4 hours:</p>
+<pre>
+ check_postgres_pgagent_jobs --critical=2h --warning=4h</pre>
+<p>
+</p>
<h2><a name="prepared_txns"><strong>prepared_txns</strong></a></h2>
<p>(<code>symlink: check_postgres_prepared_txns</code>) Check on the age of any existing prepared transactions.
Note that most people will NOT use prepared transactions, as they are part of two-part commit