Lists: | pgsql-hackers |
---|
From: | Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Operation log for major operations |
Date: | 2022-11-21 08:41:22 |
Message-ID: | 02fe0063-bf77-90d0-3cf5-e9fe7c2a487b@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi, hackers!
It is important for customer support to know what system operations
(pg_resetwal, pg_rewind, pg_upgrade, ...) have been executed on the
database. A variant of implementation of the log for system operations
(operation log) is attached to this email.
Introduction.
-------------
Operation log is designed to store information about critical system
events (like pg_upgrade, pg_resetwal, pg_resetwal, etc.).
This information is not interesting to the ordinary user, but it is very
important for the vendor's technical support.
An example: client complains about DB damage to technical support
(damage was caused by pg_resetwal which was "silent" executed by one of
administrators).
Concepts.
---------
* operation log is placed in the file 'global/pg_control', starting from
position 4097 (log size is 4kB);
* user can not modify the operation log; log can be changed by
function call only (from postgres or from postgres utilities);
* operation log is a ring buffer (with CRC-32 protection), deleting
entries from the operation log is possible only when the buffer is
overflowed;
* SQL-function is used to read data of operation log.
Example of operation log data.
------------------------------
>select * from pg_operation_log();
event |edition|version| lsn | last |count
------------+-------+-------+---------+----------------------+------
startup |vanilla|10.22.0|0/8000028|2022-11-18 23:06:27+03| 1
pg_upgrade |vanilla|15.0.0 |0/8000028|2022-11-18 23:06:27+03| 1
startup |vanilla|15.0.0 |0/80001F8|2022-11-18 23:11:53+03| 3
pg_resetwal|vanilla|15.0.0 |0/80001F8|2022-11-18 23:09:53+03| 2
(4 rows)
Some details about inserting data to operation log.
---------------------------------------------------
There are two modes of inserting information about events in the
operation log.
* MERGE mode (events "startup", "pg_resetwal", "pg_rewind").
We searches in ring buffer of operation log an event with the same type
("startup" for example) with the same version number.
If event was found, we will increment event counter by 1 and update the
date/time of event ("last" field) with the current value.
If event was not found, we will add this event to the ring buffer (see
INSERT mode).
* INSERT mode (events "bootstrap", "pg_upgrade", "promoted").
We will add an event to the ring buffer (without searching).
P.S. File 'global/pg_control' was chosen as operation log storage
because data of this file cannot be removed or modified in a simple way
and no need to change any extensions and utilities to support this file.
I attached the patch (v1-0001-Operation-log.patch) and extended
description of operation log (Operation-log.txt).
With best regards,
Dmitry Koval
Postgres Professional: http://postgrespro.com
Attachment | Content-Type | Size |
---|---|---|
Operation-log.txt | text/plain | 6.3 KB |
v1-0001-Operation-log.patch | text/plain | 38.2 KB |
From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Operation log for major operations |
Date: | 2022-11-21 11:03:15 |
Message-ID: | 20221121110315.GV11463@telsasoft.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
See also prior discussions:
https://www.postgresql.org/message-id/flat/62750df5b126e1d8ee039a79ef3cc64ac3d47cd5.camel%40j-davis.com
https://www.postgresql.org/message-id/flat/20180228214311.jclah37cwh572t2c%40alap3.anarazel.de
--
Justin
From: | Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> |
---|---|
To: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Operation log for major operations |
Date: | 2022-11-21 13:55:36 |
Message-ID: | 30979c9c-b804-d346-18d5-2e3ff26a9055@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Thanks for references, Justin!
Couple comments about these references.
1) "Make unlogged table resets detectable".
https://www.postgresql.org/message-id/flat/62750df5b126e1d8ee039a79ef3cc64ac3d47cd5.camel%40j-davis.com
This conversation is about specific problem (unlogged table repairing).
Operation log has another use - it is primary a helper for diagnostics.
2) "RFC: Add 'taint' field to pg_control."
https://www.postgresql.org/message-id/flat/20180228214311.jclah37cwh572t2c%40alap3.anarazel.de
This is almost the same problem that we want to solve with operation
log. Differences between the operation log and what is discussed in the
thread:
* there suggested to store operation log in pg_control file - but
separate from pg_control main data (and write data separately too);
* operation log data can be represented in relational form (not flags),
this is more usable for RDBMS;
* number of registered event types can be increased easy (without
changes of storage).
--
With best regards,
Dmitry Koval
Postgres Professional: http://postgrespro.com
From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Operation log for major operations |
Date: | 2022-11-23 09:00:21 |
Message-ID: | 20221123090021.ba5johlgwwdx2nzm@alvherre.pgsql |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2022-Nov-21, Dmitry Koval wrote:
> Concepts.
> ---------
> * operation log is placed in the file 'global/pg_control', starting from
> position 4097 (log size is 4kB);
I think storing this in pg_control is a bad idea. That file is
extremely critical and if you break it, you're pretty much SOL on
recovering your data. I suggest that this should use a separate file.
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"All rings of power are equal,
But some rings of power are more equal than others."
(George Orwell's The Lord of the Rings)
From: | Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Operation log for major operations |
Date: | 2022-12-05 08:11:51 |
Message-ID: | 066b68a1-6e25-6e10-a88f-4f24e7b60a51@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi!
>I think storing this in pg_control is a bad idea. That file is
>extremely critical and if you break it, you're pretty much SOL on
>recovering your data. I suggest that this should use a separate file.
Thanks. Operation log location changed to 'global/pg_control_log' (if
the name is not pretty, it can be changed).
I attached the patch (v2-0001-Operation-log.patch) and description of
operation log (Operation-log.txt).
With best regards,
Dmitry Koval
Postgres Professional: http://postgrespro.com
Attachment | Content-Type | Size |
---|---|---|
Operation-log.txt | text/plain | 8.7 KB |
v2-0001-Operation-log.patch | text/plain | 47.7 KB |
From: | vignesh C <vignesh21(at)gmail(dot)com> |
---|---|
To: | Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Operation log for major operations |
Date: | 2023-01-14 06:53:55 |
Message-ID: | CALDaNm2tO7sgHy-2Os8sgTWfk=YwwQTcGa-fnJ=Ri69_hLvw8A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Mon, 5 Dec 2022 at 13:42, Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> wrote:
>
> Hi!
>
> >I think storing this in pg_control is a bad idea. That file is
> >extremely critical and if you break it, you're pretty much SOL on
> >recovering your data. I suggest that this should use a separate file.
>
> Thanks. Operation log location changed to 'global/pg_control_log' (if
> the name is not pretty, it can be changed).
>
> I attached the patch (v2-0001-Operation-log.patch) and description of
> operation log (Operation-log.txt).
The patch does not apply on top of HEAD as in [1], please post a rebased patch:
=== Applying patches on top of PostgreSQL commit ID
ff23b592ad6621563d3128b26860bcb41daf9542 ===
=== applying patch ./v2-0001-Operation-log.patch
....
patching file src/tools/msvc/Mkvcbuild.pm
Hunk #1 FAILED at 134.
1 out of 1 hunk FAILED -- saving rejects to file src/tools/msvc/Mkvcbuild.pm.rej
[1] - http://cfbot.cputube.org/patch_41_4018.log
Regards,
Vignesh
From: | Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> |
---|---|
To: | vignesh C <vignesh21(at)gmail(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Operation log for major operations |
Date: | 2023-01-14 10:17:07 |
Message-ID: | 919243a3-4811-6fef-32fe-f03d89f80ffb@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi!
>The patch does not apply on top of HEAD ...
Here is a fixed version.
Small additional fixes:
1) added CRC calculation for empty 'pg_control_log' file;
2) added saving 'errno' before calling LWLockRelease and restoring after
that;
3) corrected pg_upgrade for case old cluster does not have
'pg_control_log' file.
--
With best regards,
Dmitry Koval
Postgres Professional: http://postgrespro.com
Attachment | Content-Type | Size |
---|---|---|
v3-0001-Operation-log.patch | text/plain | 48.8 KB |
From: | vignesh C <vignesh21(at)gmail(dot)com> |
---|---|
To: | Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Operation log for major operations |
Date: | 2023-01-19 11:22:58 |
Message-ID: | CALDaNm10-wzPZKpFH_TezTWYJMRPHz8P0PWE8GDTOvpmof+NTw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Sat, 14 Jan 2023 at 15:47, Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> wrote:
>
> Hi!
>
> >The patch does not apply on top of HEAD ...
>
> Here is a fixed version.
> Small additional fixes:
> 1) added CRC calculation for empty 'pg_control_log' file;
> 2) added saving 'errno' before calling LWLockRelease and restoring after
> that;
> 3) corrected pg_upgrade for case old cluster does not have
> 'pg_control_log' file.
The patch does not apply on top of HEAD as in [1], please post a rebased patch:
=== Applying patches on top of PostgreSQL commit ID
14bdb3f13de16523609d838b725540af5e23ddd3 ===
=== applying patch ./v3-0001-Operation-log.patch
...
patching file src/tools/msvc/Mkvcbuild.pm
Hunk #1 FAILED at 134.
1 out of 1 hunk FAILED -- saving rejects to file src/tools/msvc/Mkvcbuild.pm.rej
[1] - http://cfbot.cputube.org/patch_41_4018.log
Regards,
Vignesh
From: | Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> |
---|---|
To: | vignesh C <vignesh21(at)gmail(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Operation log for major operations |
Date: | 2023-01-19 21:11:58 |
Message-ID: | 4cbc15b2-40d4-70a5-ec20-c2044439b309@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
>The patch does not apply on top of HEAD ...
Thanks!
Here is a fixed version.
Additional changes:
1) get_operation_log() function doesn't create empty operation log file;
2) removed extra unlink() call.
--
With best regards,
Dmitry Koval
Postgres Professional: http://postgrespro.com
Attachment | Content-Type | Size |
---|---|---|
v4-0001-Operation-log.patch | text/plain | 47.9 KB |
From: | Ted Yu <yuzhihong(at)gmail(dot)com> |
---|---|
To: | Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> |
Cc: | vignesh C <vignesh21(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Operation log for major operations |
Date: | 2023-01-19 21:36:24 |
Message-ID: | CALte62x_3x4LyFDS3CEtQBBmALFayLTheMQ_swMs7YDprEu4xQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Thu, Jan 19, 2023 at 1:12 PM Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> wrote:
> >The patch does not apply on top of HEAD ...
>
> Thanks!
> Here is a fixed version.
>
> Additional changes:
> 1) get_operation_log() function doesn't create empty operation log file;
> 2) removed extra unlink() call.
>
> --
> With best regards,
> Dmitry Koval
>
> Postgres Professional: http://postgrespro.com
Hi,
Copyright (c) 1996-2022
Please update year for the license in pg_controllog.c
+check_file_exists(const char *datadir, const char *path)
There is existing helper function such as:
src/backend/utils/fmgr/dfmgr.c:static bool file_exists(const char *name);
Is it possible to reuse that code ?
Cheers
From: | Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> |
---|---|
To: | Ted Yu <yuzhihong(at)gmail(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Operation log for major operations |
Date: | 2023-01-20 09:19:23 |
Message-ID: | 69d49e4f-a7ab-e36b-cf2c-1e0809d0234a@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Thanks, Ted Yu!
> Please update year for the license in pg_controllog.c
License updated for files pg_controllog.c, controllog_utils.c,
pg_controllog.h, controllog_utils.h.
> +check_file_exists(const char *datadir, const char *path)
> There is existing helper function such as:
> src/backend/utils/fmgr/dfmgr.c:static bool file_exists(const char *name);
> Is it possible to reuse that code ?
There are a lot of functions that check the file existence:
1) src/backend/utils/fmgr/dfmgr.c:static bool file_exists(const char *name);
2) src/backend/jit/jit.c:static bool file_exists(const char *name);
3) src/test/regress/pg_regress.c:bool file_exists(const char *file);
4) src/bin/pg_upgrade/exec.c:bool pid_lock_file_exists(const char *datadir);
5) src/backend/commands/extension.c:bool extension_file_exists(const
char *extensionName);
But there is no unified function: different components use their own
function with their own specific.
Probably we can not reuse code of dfmgr.c:file_exists() because this
function skip "errno == EACCES" (this error is critical for us).
I copied for src/bin/pg_rewind/file_ops.c:check_file_exists() code of
function jit.c:file_exists() (with adaptation for the utility).
--
With best regards,
Dmitry Koval
Postgres Professional: http://postgrespro.com
Attachment | Content-Type | Size |
---|---|---|
v5-0001-Operation-log.patch | text/plain | 47.8 KB |
From: | Ted Yu <yuzhihong(at)gmail(dot)com> |
---|---|
To: | Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Operation log for major operations |
Date: | 2023-01-20 13:23:16 |
Message-ID: | CALte62w+yo33GxQ9FRc4Ho2Cp-phg0-SBiPoWr15Lc1H8gumJA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Fri, Jan 20, 2023 at 1:19 AM Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> wrote:
> Thanks, Ted Yu!
>
> > Please update year for the license in pg_controllog.c
>
> License updated for files pg_controllog.c, controllog_utils.c,
> pg_controllog.h, controllog_utils.h.
>
> > +check_file_exists(const char *datadir, const char *path)
> > There is existing helper function such as:
> > src/backend/utils/fmgr/dfmgr.c:static bool file_exists(const char
> *name);
> > Is it possible to reuse that code ?
>
> There are a lot of functions that check the file existence:
>
> 1) src/backend/utils/fmgr/dfmgr.c:static bool file_exists(const char
> *name);
> 2) src/backend/jit/jit.c:static bool file_exists(const char *name);
> 3) src/test/regress/pg_regress.c:bool file_exists(const char *file);
> 4) src/bin/pg_upgrade/exec.c:bool pid_lock_file_exists(const char
> *datadir);
> 5) src/backend/commands/extension.c:bool extension_file_exists(const
> char *extensionName);
>
> But there is no unified function: different components use their own
> function with their own specific.
> Probably we can not reuse code of dfmgr.c:file_exists() because this
> function skip "errno == EACCES" (this error is critical for us).
> I copied for src/bin/pg_rewind/file_ops.c:check_file_exists() code of
> function jit.c:file_exists() (with adaptation for the utility).
>
> --
> With best regards,
> Dmitry Koval
>
> Postgres Professional: http://postgrespro.com
Hi,
Maybe another discussion thread can be created for the consolidation of
XX_file_exists functions.
Cheers
From: | Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> |
---|---|
To: | Ted Yu <yuzhihong(at)gmail(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Operation log for major operations |
Date: | 2023-01-25 08:35:24 |
Message-ID: | dd621eb3-b532-25e2-59d7-101861ce9322@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi!
> Maybe another discussion thread can be created for the consolidation of
> XX_file_exists functions.
Usually XX_file_exists functions are simple. They contain single call
stat() or open() and specific error processing after this call.
Likely the unified function will be too complex to cover all the uses of
XX_file_exists functions.
--
With best regards,
Dmitry Koval
Postgres Professional: http://postgrespro.com
From: | "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com> |
---|---|
To: | Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> |
Cc: | vignesh C <vignesh21(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Operation log for major operations |
Date: | 2023-03-01 20:59:23 |
Message-ID: | CAM-w4HP6CEnv6ZYcKsArd5Egds0Fk3zga95_=wYfu3TP8QQ2YQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Thu, 19 Jan 2023 at 16:12, Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> wrote:
>
> >The patch does not apply on top of HEAD ...
>
> Thanks!
> Here is a fixed version.
Sorry to say, but this needs a rebase again... Setting to Waiting on Author...
Are there specific feedback needed to make progress? Once it's rebased
if you think it's ready set it to Ready for Committer or if you still
need feedback then Needs Review -- but it's usually more helpful to do
that with an email expressing what questions you're blocked on.
--
Gregory Stark
As Commitfest Manager
From: | Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Operation log for major operations |
Date: | 2023-03-02 17:57:43 |
Message-ID: | 073f99fa-9a11-788e-cccb-d23f68f7d1ee@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi!
These changes did not interest the community. It was expected (topic is
very specifiс: vendor's technical support). So no sense to distract
developers ...
I'll move patch to Withdrawn.
--
With best regards,
Dmitry Koval
Postgres Professional: http://postgrespro.com
From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Operation log for major operations |
Date: | 2023-03-02 18:37:03 |
Message-ID: | 20230302183703.GH4268@telsasoft.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Thu, Mar 02, 2023 at 08:57:43PM +0300, Dmitry Koval wrote:
> These changes did not interest the community. It was expected (topic is very
> specifiс: vendor's technical support). So no sense to distract developers
Actually, I think there is interest, but it has to be phrased in a
limited sense to go into the control file.
In November, I referenced 2 threads, but I think you misunderstood one
of them. If you skim the first couple mails, you'll find a discussion
about recording crash information in the control file.
https://www.postgresql.org/message-id/666c2599a07addea00ae2d0af96192def8441974.camel%40j-davis.com
It's come up several times now, and there seems to be ample support for
adding some limited information.
But a "log" which might exceed a few dozen bytes (now or later), that's
inconsistent with the pre-existing purpose served by pg_control.
--
Justin
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Cc: | Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Operation log for major operations |
Date: | 2023-03-02 19:36:43 |
Message-ID: | 2130093.1677785803@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Justin Pryzby <pryzby(at)telsasoft(dot)com> writes:
> On Thu, Mar 02, 2023 at 08:57:43PM +0300, Dmitry Koval wrote:
>> These changes did not interest the community. It was expected (topic is very
>> specifiс: vendor's technical support). So no sense to distract developers
> Actually, I think there is interest, but it has to be phrased in a
> limited sense to go into the control file.
> In November, I referenced 2 threads, but I think you misunderstood one
> of them. If you skim the first couple mails, you'll find a discussion
> about recording crash information in the control file.
> https://www.postgresql.org/message-id/666c2599a07addea00ae2d0af96192def8441974.camel%40j-davis.com
> It's come up several times now, and there seems to be ample support for
> adding some limited information.
> But a "log" which might exceed a few dozen bytes (now or later), that's
> inconsistent with the pre-existing purpose served by pg_control.
I'm pretty dubious about recording *anything* in the control file.
Every time we write to that, we risk the entire database on completing
the write successfully. I don't want to do that more often than once
per checkpoint. If you want to put crash info in some less-critical
place, maybe we could talk.
regards, tom lane
From: | Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> |
---|---|
To: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Operation log for major operations |
Date: | 2023-03-02 21:09:38 |
Message-ID: | 5c8ab42d-d0f7-42fa-6d98-cee4649c0d6d@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
I'll try to expand my explanation.
I fully understand and accept the arguments about "limited sense to go
into the control file" and "about recording *anything* in the control
file". This is totally correct for vanilla.
But vendors have forks of PostgreSQL with custom features and extensions.
Sometimes (especially at the first releases) these custom components
have bugs which can causes rare problems in data.
These problems can migrate with using pg_upgrade and "lazy" upgrade of
pages to higher versions of PostgreSQL fork.
So in error cases "recording crash information" etc. is not the only
important information.
Very important is history of this database (pg_upgrades, promotions,
pg_resets, pg_rewinds etc.).
Often these "history" allows you to determine from which version of the
PostgreSQL fork the error came from and what causes of errors we can
discard immediately.
This "history" is the information that our technical support wants (and
reason of this patch), but this information is not needed for vanilla...
Another important condition is that the user should not have easy ways
to delete information about "history" (about reason to use pg_control
file as "history" storage, but write into it from position 4kB, 8kB,...).
--
With best regards,
Dmitry Koval
Postgres Professional: http://postgrespro.com
From: | Kirk Wolak <wolakk(at)gmail(dot)com> |
---|---|
To: | Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> |
Cc: | Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Operation log for major operations |
Date: | 2023-03-02 21:49:43 |
Message-ID: | CACLU5mRzPOZmZrEEnwYKsd3st_rKEhnHWq2qU0TfdtB0AzCKxg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Thu, Mar 2, 2023 at 4:09 PM Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> wrote:
> I'll try to expand my explanation.
> I fully understand and accept the arguments about "limited sense to go
> into the control file" and "about recording *anything* in the control
> file". This is totally correct for vanilla.
> But vendors have forks of PostgreSQL with custom features and extensions.
> Sometimes (especially at the first releases) these custom components
> have bugs which can causes rare problems in data.
> These problems can migrate with using pg_upgrade and "lazy" upgrade of
> pages to higher versions of PostgreSQL fork.
>
> So in error cases "recording crash information" etc. is not the only
> important information.
> Very important is history of this database (pg_upgrades, promotions,
> pg_resets, pg_rewinds etc.).
> Often these "history" allows you to determine from which version of the
> PostgreSQL fork the error came from and what causes of errors we can
> discard immediately.
>
> This "history" is the information that our technical support wants (and
> reason of this patch), but this information is not needed for vanilla...
>
> Another important condition is that the user should not have easy ways
> to delete information about "history" (about reason to use pg_control
> file as "history" storage, but write into it from position 4kB, 8kB,...).
>
> --
> With best regards,
> Dmitry Koval
>
> Postgres Professional: http://postgrespro.com
>
> Dmitry, this is a great explanation. Thinking outside the box, it feels
like:
We need some kind of semaphore flag that tells us something awkward
happened.
When it happened, and a little bit of extra information.
You also make the point that if such things have happened, it would
probably be a good idea to NOT
allow pg_upgrade to run. It might even be a reason to constantly bother
someone until the issue is repaired.
To that point, this feels like a "postgresql_panic.log" file (within the
postgresql files?)... Something that would prevent pg_upgrade,
etc. That everyone recognizes is serious. Especially 3rd party vendors.
I see the need for such a thing. I have to agree with others about
questioning the proper place to write this.
Are there other places that make sense, that you could use, especially if
knowing it exists means there was a serious issue?
Kirk
From: | Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> |
---|---|
To: | Kirk Wolak <wolakk(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Operation log for major operations |
Date: | 2023-03-13 21:36:14 |
Message-ID: | b7a1dee3-a4f2-b8d6-a020-a0584c40902f@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Kirk, I'm sorry about the long pause in my reply.
>We need some kind of semaphore flag that tells us something awkward
>happened. When it happened, and a little bit of extra information.
I agree that we do not have this kind of information.
Additionally, legal events like start of pg_rewind, pg_reset, ... are
interesting.
>You also make the point that if such things have happened, it would
>probably be a good idea to NOT allow pg_upgrade to run.
>It might even be a reason to constantly bother someone until
>the issue is repaired.
I think no reason to forbid the run of pg_upgrade for the user
(especially in automatic mode).
If we automatically do NOT allow pg_upgrade, what should the user do for
allow pg_upgrade?
Unfortunately, PostgreSQL does not have the utilities to correct errors
in the database (in case of errors users uses copies of the DB or
corrects errors manually).
An ordinary user cannot correct errors on his own ...
So we cannot REQUIRE the user to correct database errors, we can only
INFORM about them.
>To that point, this feels like a "postgresql_panic.log" file (within
>the postgresql files?)... Something that would prevent pg_upgrade,
>etc. That everyone recognizes is serious. Especially 3rd party vendors.
>I see the need for such a thing. I have to agree with others about
>questioning the proper place to write this.
>Are there other places that make sense, that you could use, especially
>if knowing it exists means there was a serious issue?
The location of the operation log (like a "postgresql_panic.log") is not
easy question.
Our technical support is sure that the large number of failures are
caused by "human factor" (actions of database administrators).
It is not difficult for database administrators to delete the
"postgresql_panic.log" file or edit it (for example, replacing it with
an old version; CRC will not save you from such an action).
Therefore, our technical support decided to place the operation log at
the end of the pg_control file, at an offset of 8192 bytes (and protect
this log with CRC).
About writing to the pg_control file what worries Tom Lane: information
in pg_control is written once at system startup (twice in case of
"promote").
Also, some utilities write information to the operation log too -
pg_resetwal, pg_rewind, pg_upgrade (these utilities also modify the
pg_control file without the operation log).
If you are interested, I can attach the current patch (for info - I
think it makes no sense to offer this patch at commitfest).
--
With best regards,
Dmitry Koval
Postgres Professional: http://postgrespro.com