Lists: | pgsql-committers |
---|
From: | Robert Haas <rhaas(at)postgresql(dot)org> |
---|---|
To: | pgsql-committers(at)postgresql(dot)org |
Subject: | pgsql: Recursively fsync() the data directory after a crash. |
Date: | 2015-05-04 18:23:00 |
Message-ID: | E1YpL1A-0007Tz-Av@gemulon.postgresql.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-committers |
Recursively fsync() the data directory after a crash.
Otherwise, if there's another crash, some writes from after the first
crash might make it to disk while writes from before the crash fail
to make it to disk. This could lead to data corruption.
Back-patch to all supported versions.
Abhijit Menon-Sen, reviewed by Andres Freund and slightly revised
by me.
Branch
------
master
Details
-------
http://git.postgresql.org/pg/commitdiff/2ce439f3379aed857517c8ce207485655000fc8e
Modified Files
--------------
src/backend/access/transam/xlog.c | 42 ++++++++++++++
src/backend/storage/file/fd.c | 115 +++++++++++++++++++++++++++++++++++++
src/include/storage/fd.h | 2 +
3 files changed, 159 insertions(+)
From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Robert Haas <rhaas(at)postgresql(dot)org>, pgsql-committers(at)postgresql(dot)org |
Subject: | Re: pgsql: Recursively fsync() the data directory after a crash. |
Date: | 2015-05-05 03:37:21 |
Message-ID: | 55483AF1.50208@dunslane.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-committers |
On 05/04/2015 02:23 PM, Robert Haas wrote:
> Recursively fsync() the data directory after a crash.
>
> Otherwise, if there's another crash, some writes from after the first
> crash might make it to disk while writes from before the crash fail
> to make it to disk. This could lead to data corruption.
>
> Back-patch to all supported versions.
>
This appears to have broken Windows builds. At least it's broken by
using the two argument form of open instead of the three argument form,
for which we have a #define in the win32 case, and pg_win32_is_junction
is a typo - the first _ should not be there.
cheers
andrew
From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
Cc: | pgsql-committers <pgsql-committers(at)postgresql(dot)org> |
Subject: | Re: pgsql: Recursively fsync() the data directory after a crash. |
Date: | 2015-05-05 13:17:07 |
Message-ID: | CA+TgmoZgU7phQDPP7jm6gQ3N5u67Vp7FLiJ56M6dv9TD_+W97w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-committers |
On Mon, May 4, 2015 at 11:37 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> On 05/04/2015 02:23 PM, Robert Haas wrote:
>> Recursively fsync() the data directory after a crash.
>>
>> Otherwise, if there's another crash, some writes from after the first
>> crash might make it to disk while writes from before the crash fail
>> to make it to disk. This could lead to data corruption.
>>
>> Back-patch to all supported versions.
>>
>
> This appears to have broken Windows builds. At least it's broken by using
> the two argument form of open instead of the three argument form, for which
> we have a #define in the win32 case, and pg_win32_is_junction is a typo -
> the first _ should not be there.
Sorry about that. Working on it now.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
---|---|
To: | Robert Haas <rhaas(at)postgresql(dot)org>, pgsql-committers(at)postgresql(dot)org |
Subject: | Re: pgsql: Recursively fsync() the data directory after a crash. |
Date: | 2015-05-18 01:44:27 |
Message-ID: | 555943FB.8090703@gmx.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-committers |
On 5/4/15 2:23 PM, Robert Haas wrote:
> Recursively fsync() the data directory after a crash.
>
> Otherwise, if there's another crash, some writes from after the first
> crash might make it to disk while writes from before the crash fail
> to make it to disk. This could lead to data corruption.
If there a reason why in pre_sync_fname(), the error message does not
contain a %m?
From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
Cc: | pgsql-committers <pgsql-committers(at)postgresql(dot)org> |
Subject: | Re: pgsql: Recursively fsync() the data directory after a crash. |
Date: | 2015-05-18 12:01:52 |
Message-ID: | CA+TgmobK=p4wnikYaSRgz6S3WhqDdFhpBv1Vmysz0cBQskoq=w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-committers |
On Sun, May 17, 2015 at 9:44 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> On 5/4/15 2:23 PM, Robert Haas wrote:
>> Recursively fsync() the data directory after a crash.
>>
>> Otherwise, if there's another crash, some writes from after the first
>> crash might make it to disk while writes from before the crash fail
>> to make it to disk. This could lead to data corruption.
>
> If there a reason why in pre_sync_fname(), the error message does not
> contain a %m?
For consistency with the rest of the file, I suppose. Not sure why
it's like that, but all the functions in the file do it that way.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-committers <pgsql-committers(at)postgresql(dot)org> |
Subject: | Re: pgsql: Recursively fsync() the data directory after a crash. |
Date: | 2015-05-18 15:04:33 |
Message-ID: | 14375.1431961473@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-committers |
Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Sun, May 17, 2015 at 9:44 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>> If there a reason why in pre_sync_fname(), the error message does not
>> contain a %m?
> For consistency with the rest of the file, I suppose. Not sure why
> it's like that, but all the functions in the file do it that way.
Huh? All the ones that are reporting kernel call failures certainly
have %m.
regards, tom lane
From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-committers <pgsql-committers(at)postgresql(dot)org> |
Subject: | Re: pgsql: Recursively fsync() the data directory after a crash. |
Date: | 2015-05-18 16:04:23 |
Message-ID: | CA+TgmobjLyAGC_e6MHn_mNfBVibE7JKYQtE22AS8pLwfsMDoQA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-committers |
On Mon, May 18, 2015 at 11:04 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Sun, May 17, 2015 at 9:44 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>>> If there a reason why in pre_sync_fname(), the error message does not
>>> contain a %m?
>
>> For consistency with the rest of the file, I suppose. Not sure why
>> it's like that, but all the functions in the file do it that way.
>
> Huh? All the ones that are reporting kernel call failures certainly
> have %m.
Oops, you're right. I was looking at the wrong code. Yeah, that
should probably be fixed. I'm not sure it's a good idea to do that
right now though.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | pgsql-committers <pgsql-committers(at)postgresql(dot)org> |
Subject: | Re: pgsql: Recursively fsync() the data directory after a crash. |
Date: | 2015-05-18 16:15:03 |
Message-ID: | 20913.1431965703@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-committers |
Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Mon, May 18, 2015 at 11:04 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Huh? All the ones that are reporting kernel call failures certainly
>> have %m.
> Oops, you're right. I was looking at the wrong code. Yeah, that
> should probably be fixed. I'm not sure it's a good idea to do that
> right now though.
It's a trivial enough change that I wouldn't see a problem with doing it
now. But if you do, please get it in by say 2PM EDT.
regards, tom lane
From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-committers <pgsql-committers(at)postgresql(dot)org> |
Subject: | Re: pgsql: Recursively fsync() the data directory after a crash. |
Date: | 2015-05-18 16:22:01 |
Message-ID: | 20150518162201.GF9458@momjian.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-committers |
On Mon, May 18, 2015 at 12:15:03PM -0400, Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > On Mon, May 18, 2015 at 11:04 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Huh? All the ones that are reporting kernel call failures certainly
> >> have %m.
>
> > Oops, you're right. I was looking at the wrong code. Yeah, that
> > should probably be fixed. I'm not sure it's a good idea to do that
> > right now though.
>
> It's a trivial enough change that I wouldn't see a problem with doing it
> now. But if you do, please get it in by say 2PM EDT.
Uh, doesn't that change the translated strings? Is that a problem?
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Bruce Momjian <bruce(at)momjian(dot)us> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-committers <pgsql-committers(at)postgresql(dot)org> |
Subject: | Re: pgsql: Recursively fsync() the data directory after a crash. |
Date: | 2015-05-18 16:39:13 |
Message-ID: | 21592.1431967153@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-committers |
Bruce Momjian <bruce(at)momjian(dot)us> writes:
> On Mon, May 18, 2015 at 12:15:03PM -0400, Tom Lane wrote:
>> It's a trivial enough change that I wouldn't see a problem with doing it
>> now. But if you do, please get it in by say 2PM EDT.
> Uh, doesn't that change the translated strings? Is that a problem?
Better an untranslated message than a translated one that lacks critical
info. But anyway, there are likely other instances of that same string
*with* %m ...
regards, tom lane
From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-committers <pgsql-committers(at)postgresql(dot)org> |
Subject: | Re: pgsql: Recursively fsync() the data directory after a crash. |
Date: | 2015-05-18 16:43:40 |
Message-ID: | CA+TgmoZLvn4O7BJuB5OJEHYNx1pwkLZncFs+kws_PDoYnJpm0w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-committers |
On Mon, May 18, 2015 at 12:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>> On Mon, May 18, 2015 at 12:15:03PM -0400, Tom Lane wrote:
>>> It's a trivial enough change that I wouldn't see a problem with doing it
>>> now. But if you do, please get it in by say 2PM EDT.
>
>> Uh, doesn't that change the translated strings? Is that a problem?
>
> Better an untranslated message than a translated one that lacks critical
> info. But anyway, there are likely other instances of that same string
> *with* %m ...
Probably not, because of the way the message is worded. But we could do this:
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index 6fa75d1..bed8478 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -2461,7 +2461,7 @@ pre_sync_fname(char *fname, bool isdir)
if (fd < 0)
ereport(FATAL,
- (errmsg("could not open file \"%s\"
before fsync",
+ (errmsg("could not open file \"%s\": %m",
fname)));
pg_flush_data(fd, 0, 0);
Does that sound good?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-committers <pgsql-committers(at)postgresql(dot)org> |
Subject: | Re: pgsql: Recursively fsync() the data directory after a crash. |
Date: | 2015-05-18 16:51:14 |
Message-ID: | 21963.1431967874@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-committers |
Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Mon, May 18, 2015 at 12:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Better an untranslated message than a translated one that lacks critical
>> info. But anyway, there are likely other instances of that same string
>> *with* %m ...
> Probably not, because of the way the message is worded. But we could do this:
> diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
> index 6fa75d1..bed8478 100644
> --- a/src/backend/storage/file/fd.c
> +++ b/src/backend/storage/file/fd.c
> @@ -2461,7 +2461,7 @@ pre_sync_fname(char *fname, bool isdir)
> if (fd < 0)
> ereport(FATAL,
> - (errmsg("could not open file \"%s\"
> before fsync",
> + (errmsg("could not open file \"%s\": %m",
> fname)));
> pg_flush_data(fd, 0, 0);
> Does that sound good?
Works for me. The file/line info for the message would provide the
"before fsync" context anyway, if someone needed it.
regards, tom lane
From: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-committers <pgsql-committers(at)postgresql(dot)org> |
Subject: | Re: pgsql: Recursively fsync() the data directory after a crash. |
Date: | 2015-05-18 17:00:38 |
Message-ID: | 20150518170038.GQ2523@alvh.no-ip.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-committers |
Robert Haas wrote:
> On Mon, May 18, 2015 at 12:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Uh, doesn't that change the translated strings? Is that a problem?
> >
> > Better an untranslated message than a translated one that lacks critical
> > info. But anyway, there are likely other instances of that same string
> > *with* %m ...
>
> Probably not, because of the way the message is worded. But we could do this:
>
> diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
> index 6fa75d1..bed8478 100644
> --- a/src/backend/storage/file/fd.c
> +++ b/src/backend/storage/file/fd.c
> @@ -2461,7 +2461,7 @@ pre_sync_fname(char *fname, bool isdir)
>
> if (fd < 0)
> ereport(FATAL,
> - (errmsg("could not open file \"%s\"
> before fsync",
> + (errmsg("could not open file \"%s\": %m",
> fname)));
>
> pg_flush_data(fd, 0, 0);
>
> Does that sound good?
+1. It's not like the extra two words of context would be of much use
anyway.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-committers <pgsql-committers(at)postgresql(dot)org> |
Subject: | Re: pgsql: Recursively fsync() the data directory after a crash. |
Date: | 2015-05-18 17:18:16 |
Message-ID: | CA+TgmobbVFXERf4Hat47aqUgt1oGGk0E9JSjCeBWg8a3sNz2mg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-committers |
On Mon, May 18, 2015 at 1:00 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> Robert Haas wrote:
>> On Mon, May 18, 2015 at 12:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> >> Uh, doesn't that change the translated strings? Is that a problem?
>> >
>> > Better an untranslated message than a translated one that lacks critical
>> > info. But anyway, there are likely other instances of that same string
>> > *with* %m ...
>>
>> Probably not, because of the way the message is worded. But we could do this:
>>
>> diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
>> index 6fa75d1..bed8478 100644
>> --- a/src/backend/storage/file/fd.c
>> +++ b/src/backend/storage/file/fd.c
>> @@ -2461,7 +2461,7 @@ pre_sync_fname(char *fname, bool isdir)
>>
>> if (fd < 0)
>> ereport(FATAL,
>> - (errmsg("could not open file \"%s\"
>> before fsync",
>> + (errmsg("could not open file \"%s\": %m",
>> fname)));
>>
>> pg_flush_data(fd, 0, 0);
>>
>> Does that sound good?
>
> +1. It's not like the extra two words of context would be of much use
> anyway.
Done.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company