Tom Lane [Wed, 1 Oct 2014 23:30:30 +0000 (19:30 -0400)]
Fix some more problems with nested append relations.
As of commit
a87c72915 (which later got backpatched as far as 9.1),
we're explicitly supporting the notion that append relations can be
nested; this can occur when UNION ALL constructs are nested, or when
a UNION ALL contains a table with inheritance children.
Bug #11457 from Nelson Page, as well as an earlier report from Elvis
Pranskevichus, showed that there were still nasty bugs associated with such
cases: in particular the EquivalenceClass mechanism could try to generate
"join" clauses connecting an appendrel child to some grandparent appendrel,
which would result in assertion failures or bogus plans.
Upon investigation I concluded that all current callers of
find_childrel_appendrelinfo() need to be fixed to explicitly consider
multiple levels of parent appendrels. The most complex fix was in
processing of "broken" EquivalenceClasses, which are ECs for which we have
been unable to generate all the derived equality clauses we would like to
because of missing cross-type equality operators in the underlying btree
operator family. That code path is more or less entirely untested by
the regression tests to date, because no standard opfamilies have such
holes in them. So I wrote a new regression test script to try to exercise
it a bit, which turned out to be quite a worthwhile activity as it exposed
existing bugs in all supported branches.
The present patch is essentially the same as far back as 9.2, which is
where parameterized paths were introduced. In 9.0 and 9.1, we only need
to back-patch a small fragment of commit
5b7b5518d, which fixes failure to
propagate out the original WHERE clauses when a broken EC contains constant
members. (The regression test case results show that these older branches
are noticeably stupider than 9.2+ in terms of the quality of the plans
generated; but we don't really care about plan quality in such cases,
only that the plan not be outright wrong. A more invasive fix in the
older branches would not be a good idea anyway from a plan-stability
standpoint.)
Andres Freund [Wed, 1 Oct 2014 12:23:43 +0000 (14:23 +0200)]
Block signals while computing the sleep time in postmaster's main loop.
DetermineSleepTime() was previously called without blocked
signals. That's not good, because it allows signal handlers to
interrupt its workings.
DetermineSleepTime() was added in 9.3 with the addition of background
workers (
da07a1e856511), where it only read from
BackgroundWorkerList.
Since 9.4, where dynamic background workers were added (
7f7485a0cde),
the list is also manipulated in DetermineSleepTime(). That's bad
because the list now can be persistently corrupted if modified by both
a signal handler and DetermineSleepTime().
This was discovered during the investigation of hangs on buildfarm
member anole. It's unclear whether this bug is the source of these
hangs or not, but it's worth fixing either way. I have confirmed that
it can cause crashes.
It luckily looks like this only can cause problems when bgworkers are
actively used.
Discussion:
20140929193733.GB14400@awork2.anarazel.de
Backpatch to 9.3 where background workers were introduced.
Stephen Frost [Tue, 30 Sep 2014 19:55:28 +0000 (15:55 -0400)]
Correct stdin/stdout usage in COPY .. PROGRAM
The COPY documentation incorrectly stated, for the PROGRAM case,
that we read from stdin and wrote to stdout. Fix that, and improve
consistency by referring to the 'PostgreSQL' user instead of the
'postgres' user, as is done in the rest of the COPY documentation.
Pointed out by Peter van Dijk.
Back-patch to 9.3 where COPY .. PROGRAM was introduced.
Robert Haas [Fri, 26 Sep 2014 15:21:35 +0000 (11:21 -0400)]
Fix identify_locking_dependencies for schema-only dumps.
Without this fix, parallel restore of a schema-only dump can deadlock,
because when the dump is schema-only, the dependency will still be
pointing at the TABLE item rather than the TABLE DATA item.
Robert Haas and Tom Lane
Andres Freund [Thu, 25 Sep 2014 13:22:26 +0000 (15:22 +0200)]
Fix VPATH builds of the replication parser from git for some !gcc compilers.
Some compilers don't automatically search the current directory for
included files.
9cc2c182fc2 fixed that for builds from tarballs by
adding an include to the source directory. But that doesn't work when
the scanner is generated in the VPATH directory. Use the same search
path as the other parsers in the tree.
One compiler that definitely was affected is solaris' sun cc.
Backpatch to 9.1 which introduced using an actual parser for
replication commands.
Tom Lane [Wed, 24 Sep 2014 00:25:36 +0000 (20:25 -0400)]
Fix incorrect search for "x?" style matches in creviterdissect().
When the number of allowed iterations is limited (either a "?" quantifier
or a bound expression), the last sub-match has to reach to the end of the
target string. The previous coding here first tried the shortest possible
match (one character, usually) and then gave up and back-tracked if that
didn't work, typically leading to failure to match overall, as shown in
bug #11478 from Christoph Berg. The minimum change to fix that would be to
not decrement k before "goto backtrack"; but that would be a pretty stupid
solution, because we'd laboriously try each possible sub-match length
before finally discovering that only ending at the end can work. Instead,
force the sub-match endpoint limit up to the end for even the first
shortest() call if we cannot have any more sub-matches after this one.
Bug introduced in my rewrite that added the iterdissect logic, commit
173e29aa5deefd9e71c183583ba37805c8102a72. The shortest-first search code
was too closely modeled on the longest-first code, which hasn't got this
issue since it tries a match reaching to the end to start with anyway.
Back-patch to all affected branches.
Robert Haas [Mon, 22 Sep 2014 20:05:51 +0000 (16:05 -0400)]
Fix mishandling of CreateEventTrigStmt's eventname field.
It's a string, not a scalar.
Petr Jelinek
Tom Lane [Fri, 19 Sep 2014 17:19:02 +0000 (13:19 -0400)]
Fix failure of contrib/auto_explain to print per-node timing information.
This has been broken since commit
af7914c6627bcf0b0ca614e9ce95d3f8056602bf,
which added the EXPLAIN (TIMING) option. Although that commit included
updates to auto_explain, they evidently weren't tested very carefully,
because the code failed to print node timings even when it should, due to
failure to set es.timing in the ExplainState struct. Reported off-list by
Neelakanth Nadgir of Salesforce.
In passing, clean up the documentation for auto_explain's options a
little bit, including re-ordering them into what seems to me a more
logical order.
Andres Freund [Fri, 19 Sep 2014 15:04:00 +0000 (17:04 +0200)]
Mark x86's memory barrier inline assembly as clobbering the cpu flags.
x86's memory barrier assembly was marked as clobbering "memory" but
not "cc" even though 'addl' sets various flags. As it turns out gcc on
x86 implicitly assumes "cc" on every inline assembler statement, so
it's not a bug. But as that's poorly documented and might get copied
to architectures or compilers where that's not the case, it seems
better to be precise.
Discussion:
20140919100016.GH4277@alap3.anarazel.de
To keep the code common, backpatch to 9.2 where explicit memory
barriers were introduced.
Peter Eisentraut [Sun, 14 Sep 2014 14:50:04 +0000 (10:50 -0400)]
doc: Fix documentation of local_preload_libraries
The documentation used to suggest setting this parameter with ALTER ROLE
SET, but that never worked, so replace it with a working suggestion.
Reported-by: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Stephen Frost [Fri, 12 Sep 2014 15:24:09 +0000 (11:24 -0400)]
Handle border = 3 in expanded mode
In psql, expanded mode was not being displayed correctly when using
the normal ascii or unicode linestyles and border set to '3'. Now,
per the documentation, border '3' is really only sensible for HTML
and LaTeX formats, however, that's no excuse for ascii/unicode to
break in that case, and provisions had been made for psql to cleanly
handle this case (and it did, in non-expanded mode).
This was broken when ascii/unicode was initially added a good five
years ago because print_aligned_vertical_line wasn't passed in the
border setting being used by print_aligned_vertical but instead was
given the whole printTableContent. There really isn't a good reason
for vertical_line to have the entire printTableContent structure, so
just pass in the printTextFormat and border setting (similar to how
this is handled in horizontal_line).
Pointed out by Pavel Stehule, fix by me.
Back-patch to all currently-supported versions.
Tom Lane [Fri, 12 Sep 2014 03:30:57 +0000 (23:30 -0400)]
Fix power_var_int() for large integer exponents.
The code for raising a NUMERIC value to an integer power wasn't very
careful about large powers. It got an outright wrong answer for an
exponent of INT_MIN, due to failure to consider overflow of the Abs(exp)
operation; which is fixable by using an unsigned rather than signed
exponent value after that point. Also, even though the number of
iterations of the power-computation loop is pretty limited, it's easy for
the repeated squarings to result in ridiculously enormous intermediate
values, which can take unreasonable amounts of time/memory to process,
or even overflow the internal "weight" field and so produce a wrong answer.
We can forestall misbehaviors of that sort by bailing out as soon as the
weight value exceeds what will fit in int16, since then the final answer
must overflow (if exp > 0) or underflow (if exp < 0) the packed numeric
format.
Per off-list report from Pavel Stehule. Back-patch to all supported
branches.
Bruce Momjian [Thu, 11 Sep 2014 22:39:46 +0000 (18:39 -0400)]
pg_upgrade: preserve the timestamp epoch
This is useful for replication tools like Slony and Skytools. This is a
backpatch of
a74a4aa23bb95b590ff01ee564219d2eacea3706.
Report by Sergey Konoplev
Backpatch through 9.3
Andres Freund [Tue, 9 Sep 2014 11:57:38 +0000 (13:57 +0200)]
Fix typo in solaris spinlock fix.
07968dbfaad03 missed part of the S_UNLOCK define when building for
sparcv8+.
Andres Freund [Mon, 8 Sep 2014 22:47:32 +0000 (00:47 +0200)]
Fix spinlock implementation for some !solaris sparc platforms.
Some Sparc CPUs can be run in various coherence models, ranging from
RMO (relaxed) over PSO (partial) to TSO (total). Solaris has always
run CPUs in TSO mode while in userland, but linux didn't use to and
the various *BSDs still don't. Unfortunately the sparc TAS/S_UNLOCK
were only correct under TSO. Fix that by adding the necessary memory
barrier instructions. On sparcv8+, which should be all relevant CPUs,
these are treated as NOPs if the current consistency model doesn't
require the barriers.
Discussion:
20140630222854.GW26930@awork2.anarazel.de
Will be backpatched to all released branches once a few buildfarm
cycles haven't shown up problems. As I've no access to sparc, this is
blindly written.
Tom Lane [Mon, 8 Sep 2014 20:09:52 +0000 (16:09 -0400)]
Fix psql \s to work with recent libedit, and add pager support.
psql's \s (print command history) doesn't work at all with recent libedit
versions when printing to the terminal, because libedit tries to do an
fchmod() on the target file which will fail if the target is /dev/tty.
(We'd already noted this in the context of the target being /dev/null.)
Even before that, it didn't work pleasantly, because libedit likes to
encode the command history file (to ensure successful reloading), which
renders it nigh unreadable, not to mention significantly different-looking
depending on exactly which libedit version you have. So let's forget using
write_history() for this purpose, and instead print the data ourselves,
using logic similar to that used to iterate over the history for newline
encoding/decoding purposes.
While we're at it, insert the ability to use the pager when \s is printing
to the terminal. This has been an acknowledged shortcoming of \s for many
years, so while you could argue it's not exactly a back-patchable bug fix
it still seems like a good improvement. Anyone who's seriously annoyed
at this can use "\s /dev/tty" or local equivalent to get the old behavior.
Experimentation with this showed that the history iteration logic was
actually rather broken when used with libedit. It turns out that with
libedit you have to use previous_history() not next_history() to advance
to more recent history entries. The easiest and most robust fix for this
seems to be to make a run-time test to verify which function to call.
We had not noticed this because libedit doesn't really need the newline
encoding logic: its own encoding ensures that command entries containing
newlines are reloaded correctly (unlike libreadline). So the effective
behavior with recent libedits was that only the oldest history entry got
newline-encoded or newline-decoded. However, because of yet other bugs in
history_set_pos(), some old versions of libedit allowed the existing loop
logic to reach entries besides the oldest, which means there may be libedit
~/.psql_history files out there containing encoded newlines in more than
just the oldest entry. To ensure we can reload such files, it seems
appropriate to back-patch this fix, even though that will result in some
incompatibility with older psql versions (ie, multiline history entries
written by a psql with this fix will look corrupted to a psql without it,
if its libedit is reasonably up to date).
Stepan Rutz and Tom Lane
Tom Lane [Mon, 8 Sep 2014 02:40:41 +0000 (22:40 -0400)]
Documentation fix: sum(float4) returns float4, not float8.
The old claim is from my commit
d06ebdb8d3425185d7e641d15e45908658a0177d of
2000-07-17, but it seems to have been a plain old thinko; sum(float4) has
been distinct from sum(float8) since Berkeley days. Noted by KaiGai Kohei.
While at it, mention the existence of sum(money), which is also of
embarrassingly ancient vintage.
Fujii Masao [Thu, 4 Sep 2014 17:17:57 +0000 (02:17 +0900)]
Fix segmentation fault that an empty prepared statement could cause.
Back-patch to all supported branches.
Per bug #11335 from Haruka Takatsuka
Kevin Grittner [Sat, 30 Aug 2014 16:03:23 +0000 (11:03 -0500)]
doc: Various typo/grammar fixes
Errors detected using Topy (https://github.com/intgr/topy), all
changes verified by hand and some manual tweaks added.
Marti Raudsepp
Individual changes backpatched, where applicable, as far as 9.0.
Tom Lane [Thu, 28 Aug 2014 22:21:14 +0000 (18:21 -0400)]
Fix citext upgrade script for disallowance of oidvector element assignment.
In commit
45e02e3232ac7cc5ffe36f7986159b5e0b1f6fdc, we intentionally
disallowed updates on individual elements of oidvector columns. While that
still seems like a sane idea in the abstract, we (I) forgot that citext's
"upgrade from unpackaged" script did in fact perform exactly such updates,
in order to fix the problem that citext indexes should have a collation
but would not in databases dumped or upgraded from pre-9.1 installations.
Even if we wanted to add casts to allow such updates, there's no practical
way to do so in the back branches, so the only real alternative is to make
citext's kluge even klugier. In this patch, I cast the oidvector to text,
fix its contents with regexp_replace, and cast back to oidvector. (Ugh!)
Since the aforementioned commit went into all active branches, we have to
fix this in all branches that contain the now-broken update script.
Per report from Eric Malm.
Andres Freund [Mon, 25 Aug 2014 16:30:46 +0000 (18:30 +0200)]
Fix typos in some error messages thrown by extension scripts when fed to psql.
Some of the many error messages introduced in
458857cc missed 'FROM
unpackaged'. Also
e016b724 and
45ffeb7e forgot to quote extension
version numbers.
Backpatch to 9.1, just like
458857cc which introduced the messages. Do
so because the error messages thrown when the wrong command is copy &
pasted aren't easy to understand.
Fujii Masao [Mon, 25 Aug 2014 16:30:46 +0000 (18:30 +0200)]
Backpatch: Fix typo in update scripts for some contrib modules.
Backpatch as discussed in
20140702192641.GD22738@awork2.anarazel.de
ff. as the error messages are user facing and possibly confusing.
Original commit:
6f9e39bc9993c18686f0950f9b9657c7c97c7450
Alvaro Herrera [Fri, 22 Aug 2014 17:55:34 +0000 (13:55 -0400)]
Fix outdated comment
Noah Misch [Tue, 19 Aug 2014 03:00:38 +0000 (23:00 -0400)]
Install libpq DLL with $(INSTALL_SHLIB).
Programs need execute permission on a DLL file to load it. MSYS
"install" ignores the mode argument, and our Cygwin build statically
links libpq into programs. That explains the lack of buildfarm trouble.
Back-patch to 9.0 (all supported versions).
Tom Lane [Mon, 18 Aug 2014 05:17:49 +0000 (01:17 -0400)]
Fix obsolete mention of non-int64 support in CREATE SEQUENCE documentation.
The old text explained what happened if we didn't have working int64
arithmetic. Since that case has been explicitly rejected by configure
since 8.4.3, documenting it in the 9.x branches can only produce confusion.
Tom Lane [Sat, 16 Aug 2014 17:48:46 +0000 (13:48 -0400)]
Fix bogus return macros in range_overright_internal().
PG_RETURN_BOOL() should only be used in functions following the V1 SQL
function API. This coding accidentally fails to fail since letting the
compiler coerce the Datum representation of bool back to plain bool
does give the right answer; but that doesn't make it a good idea.
Back-patch to older branches just to avoid unnecessary code divergence.
Tom Lane [Thu, 14 Aug 2014 20:05:52 +0000 (16:05 -0400)]
Update SysV parameter configuration documentation for FreeBSD.
FreeBSD hasn't made any use of kern.ipc.semmap since 1.1, and newer
releases reject attempts to set it altogether; so stop recommending
that it be adjusted. Per bug #11161.
Back-patch to all supported branches. Before 9.3, also incorporate
commit
7a42dff47, which touches the same text and for some reason
was not back-patched at the time.
Fujii Masao [Thu, 14 Aug 2014 04:57:52 +0000 (13:57 +0900)]
Fix help message in pg_ctl.
Previously the help message described that -m is an option for
"stop", "restart" and "promote" commands in pg_ctl. But actually
that's not an option for "promote". So this commit fixes that
incorrect description in the help message.
Back-patch to 9.3 where the incorrect description was added.
Fujii Masao [Mon, 11 Aug 2014 14:19:23 +0000 (23:19 +0900)]
Fix failure to follow the directions when "init" fork was added.
Specifically this commit updates forkname_to_number() so that the HINT
message includes "init" fork, and also adds the description of "init" fork
into pg_relation_size() document.
This is a part of the commit
2d00190495b22e0d0ba351b2cda9c95fb2e3d083
which has fixed the same oversight in master and 9.4. Back-patch to
9.1 where "init" fork was added.
Fujii Masao [Mon, 11 Aug 2014 13:52:16 +0000 (22:52 +0900)]
Fix documentation oversights about pageinspect and initialization fork.
The initialization fork was added in 9.1, but has not been taken into
consideration in documents of get_raw_page function in pageinspect and
storage layout. This commit fixes those oversights.
get_raw_page can read not only a table but also an index, etc. So it
should be documented that the function can read any relation. This commit
also fixes the document of pageinspect that way.
Back-patch to 9.1 where those oversights existed.
Vik Fearing, review by MauMau
Tom Lane [Sun, 10 Aug 2014 20:13:19 +0000 (16:13 -0400)]
Clarify type resolution behavior for domain types.
The user documentation was vague and not entirely accurate about how
we treat domain inputs for ambiguous operators/functions. Clarify
that, and add an example and some commentary. Per a recent question
from Adam Mackler.
It's acted like this ever since we added domains, so back-patch
to all supported branches.
Tom Lane [Sat, 9 Aug 2014 22:40:34 +0000 (18:40 -0400)]
Fix conversion of domains to JSON in 9.3 and 9.2.
In commit
0ca6bda8e7501947c05f30c127f6d12ff90b5a64, I rewrote the json.c
code that decided how to convert SQL data types into JSON values, so that
it no longer relied on typcategory which is a pretty untrustworthy guide
to the output format of user-defined datatypes. However, I overlooked the
fact that CREATE DOMAIN inherits typcategory from the base type, so that
the old coding did have the desirable property of treating domains like
their base types --- but only in some cases, because not all its decisions
turned on typcategory. The version of the patch that went into 9.4 and
up did a getBaseType() call to ensure that domains were always treated
like their base types, but I omitted that from the older branches, because
it would result in a behavioral change for domains over json or hstore;
a change that's arguably a bug fix, but nonetheless a change that users
had not asked for. What I overlooked was that this meant that domains
over numerics and boolean were no longer treated like their base types,
and that we *did* get a complaint about, ie bug #11103 from David Grelaud.
So let's do the getBaseType() call in the older branches as well, to
restore their previous behavior in these cases. That means 9.2 and 9.3
will now make these decisions just like 9.4. We could probably kluge
things to still ignore the domain's base type if it's json etc, but that
seems a bit silly.
Tom Lane [Sat, 9 Aug 2014 17:46:42 +0000 (13:46 -0400)]
Reject duplicate column names in foreign key referenced-columns lists.
Such cases are disallowed by the SQL spec, and even if we wanted to allow
them, the semantics seem ambiguous: how should the FK columns be matched up
with the columns of a unique index? (The matching could be significant in
the presence of opclasses with different notions of equality, so this issue
isn't just academic.) However, our code did not previously reject such
cases, but instead would either fail to match to any unique index, or
generate a bizarre opclass-lookup error because of sloppy thinking in the
index-matching code.
David Rowley
Bruce Momjian [Thu, 7 Aug 2014 18:56:13 +0000 (14:56 -0400)]
pg_upgrade: prevent oid conflicts with new-cluster TOAST tables
Previously, TOAST tables only required in the new cluster could cause
oid conflicts if they were auto-numbered and a later conflicting oid had
to be assigned.
Backpatch through 9.3
Bruce Momjian [Mon, 4 Aug 2014 15:45:45 +0000 (11:45 -0400)]
pg_upgrade: remove reference to autovacuum_multixact_freeze_max_age
autovacuum_multixact_freeze_max_age was added as a pg_ctl start
parameter in 9.3.X to prevent autovacuum from running. However, only
some 9.3.X releases have autovacuum_multixact_freeze_max_age as it was
added in a minor PG 9.3 release. It also isn't needed because -b turns
off autovacuum in 9.1+.
Without this fix, trying to upgrade from an early 9.3 release to 9.4
would fail.
Report by EDB
Backpatch through 9.3
Fujii Masao [Sat, 2 Aug 2014 06:18:09 +0000 (15:18 +0900)]
Add missing PQclear() calls into pg_receivexlog.
Back-patch to 9.3.
Fujii Masao [Sat, 2 Aug 2014 05:57:21 +0000 (14:57 +0900)]
Fix bug in pg_receivexlog --verbose.
In 9.2, pg_receivexlog with verbose option has emitted the messages
at the end of each WAL file. But the commit
0b63291 suppressed such
messages by mistake. This commit fixes the bug so that pg_receivexlog
--verbose outputs such messages again.
Back-patch to 9.3 where the bug was added.
Heikki Linnakangas [Fri, 1 Aug 2014 18:13:17 +0000 (21:13 +0300)]
Fix typo in user manual
Tom Lane [Wed, 30 Jul 2014 18:41:35 +0000 (14:41 -0400)]
Avoid wholesale autovacuuming when autovacuum is nominally off.
When autovacuum is nominally off, we will still launch autovac workers
to vacuum tables that are at risk of XID wraparound. But after we'd done
that, an autovac worker would proceed to autovacuum every table in the
targeted database, if they meet the usual thresholds for autovacuuming.
This is at best pretty unexpected; at worst it delays response to the
wraparound threat. Fix it so that if autovacuum is nominally off, we
*only* do forced vacuums and not any other work.
Per gripe from Andrey Zhidenkov. This has been like this all along,
so back-patch to all supported branches.
Robert Haas [Wed, 30 Jul 2014 15:25:58 +0000 (11:25 -0400)]
Fix mishandling of background worker PGPROCs in EXEC_BACKEND builds.
InitProcess() relies on IsBackgroundWorker to decide whether the PGPROC
for a new backend should be taken from ProcGlobal's freeProcs or from
bgworkerFreeProcs. In EXEC_BACKEND builds, InitProcess() is called
sooner than in non-EXEC_BACKEND builds, and IsBackgroundWorker wasn't
getting initialized soon enough.
Report by Noah Misch. Diagnosis and fix by me.
Heikki Linnakangas [Tue, 29 Jul 2014 07:33:15 +0000 (10:33 +0300)]
Treat 2PC commit/abort the same as regular xacts in recovery.
There were several oversights in recovery code where COMMIT/ABORT PREPARED
records were ignored:
* pg_last_xact_replay_timestamp() (wasn't updated for 2PC commits)
* recovery_min_apply_delay (2PC commits were applied immediately)
* recovery_target_xid (recovery would not stop if the XID used 2PC)
The first of those was reported by Sergiy Zuban in bug #11032, analyzed by
Tom Lane and Andres Freund. The bug was always there, but was masked before
commit
d19bd29f07aef9e508ff047d128a4046cc8bc1e2, because COMMIT PREPARED
always created an extra regular transaction that was WAL-logged.
Backpatch to all supported versions (older versions didn't have all the
features and therefore didn't have all of the above bugs).
Tom Lane [Fri, 25 Jul 2014 23:48:48 +0000 (19:48 -0400)]
Fix a performance problem in pg_dump's dump order selection logic.
findDependencyLoops() was not bright about cases where there are multiple
dependency paths between the same two dumpable objects. In most scenarios
this did not hurt us too badly; but since the introduction of section
boundary pseudo-objects in commit
a1ef01fe163b304760088e3e30eb22036910a495,
it was possible for this code to take unreasonable amounts of time (tens
of seconds on a database with a couple thousand objects), as reported in
bug #11033 from Joe Van Dyk. Joe's particular problem scenario involved
"pg_dump -a" mode with long chains of foreign key constraints, but I think
that similar problems could arise with other situations as long as there
were enough objects. To fix, add a flag array that lets us notice when we
arrive at the same object again while searching from a given start object.
This simple change seems to be enough to eliminate the performance problem.
Back-patch to 9.1, like the patch that introduced section boundary objects.
Robert Haas [Thu, 24 Jul 2014 12:19:19 +0000 (08:19 -0400)]
Avoid access to already-released lock in LockRefindAndRelease.
Spotted by Tom Lane.
Tom Lane [Wed, 23 Jul 2014 19:20:37 +0000 (15:20 -0400)]
Rearrange documentation paragraph describing pg_relation_size().
Break the list of available options into an <itemizedlist> instead of
inline sentences. This is mostly motivated by wanting to ensure that the
cross-references to the FSM and VM docs don't cross page boundaries in PDF
format; but it seems to me to read more easily this way anyway. I took the
liberty of editorializing a bit further while at it.
Per complaint from Magnus about 9.0.18 docs not building in A4 format.
Patch all active branches so we don't get blind-sided by this particular
issue again in future.
Noah Misch [Wed, 23 Jul 2014 04:35:13 +0000 (00:35 -0400)]
Report success when Windows kill() emulation signals an exiting process.
This is consistent with the POSIX verdict that kill() shall not report
ESRCH for a zombie process. Back-patch to 9.0 (all supported versions).
Test code from commit
d7cdf6ee36adeac9233678fb8f2a112e6678a770 depends
on it, and log messages about kill() reporting "Invalid argument" will
cease to appear for this not-unexpected condition.
Noah Misch [Wed, 23 Jul 2014 04:35:07 +0000 (00:35 -0400)]
MSVC: Substitute $(top_builddir) in REGRESS_OPTS.
Commit
d7cdf6ee36adeac9233678fb8f2a112e6678a770 introduced a usage
thereof. Back-patch to 9.0, like that commit.
Tom Lane [Tue, 22 Jul 2014 17:30:01 +0000 (13:30 -0400)]
Re-enable error for "SELECT ... OFFSET -1".
The executor has thrown errors for negative OFFSET values since 8.4 (see
commit
bfce56eea45b1369b7bb2150a150d1ac109f5073), but in a moment of brain
fade I taught the planner that OFFSET with a constant negative value was a
no-op (commit
1a1832eb085e5bca198735e5d0e766a3cb61b8fc). Reinstate the
former behavior by only discarding OFFSET with a value of exactly 0. In
passing, adjust a planner comment that referenced the ancient behavior.
Back-patch to 9.3 where the mistake was introduced.
Tom Lane [Tue, 22 Jul 2014 15:45:53 +0000 (11:45 -0400)]
Check block number against the correct fork in get_raw_page().
get_raw_page tried to validate the supplied block number against
RelationGetNumberOfBlocks(), which of course is only right when
accessing the main fork. In most cases, the main fork is longer
than the others, so that the check was too weak (allowing a
lower-level error to be reported, but no real harm to be done).
However, very small tables could have an FSM larger than their heap,
in which case the mistake prevented access to some FSM pages.
Per report from Torsten Foertsch.
In passing, make the bad-block-number error into an ereport not elog
(since it's certainly not an internal error); and fix sloppily
maintained comment for RelationGetNumberOfBlocksInFork.
This has been wrong since we invented relation forks, so back-patch
to all supported branches.
Noah Misch [Tue, 22 Jul 2014 15:01:03 +0000 (11:01 -0400)]
Diagnose incompatible OpenLDAP versions during build and test.
With OpenLDAP versions 2.4.24 through 2.4.31, inclusive, PostgreSQL
backends can crash at exit. Raise a warning during "configure" based on
the compile-time OpenLDAP version number, and test the crash scenario in
the dblink test suite. Back-patch to 9.0 (all supported versions).
Tom Lane [Tue, 22 Jul 2014 02:41:27 +0000 (22:41 -0400)]
Reject out-of-range numeric timezone specifications.
In commit
631dc390f49909a5c8ebd6002cfb2bcee5415a9d, we started to handle
simple numeric timezone offsets via the zic library instead of the old
CTimeZone/HasCTZSet kluge. However, we overlooked the fact that the zic
code will reject UTC offsets exceeding a week (which seems a bit arbitrary,
but not because it's too tight ...). This led to possibly setting
session_timezone to NULL, which results in crashes in most timezone-related
operations as of 9.4, and crashes in a small number of places even before
that. So check for NULL return from pg_tzset_offset() and report an
appropriate error message. Per bug #11014 from Duncan Gillis.
Back-patch to all supported branches, like the previous patch.
(Unfortunately, as of today that no longer includes 8.4.)
Tom Lane [Mon, 21 Jul 2014 19:10:42 +0000 (15:10 -0400)]
Stamp 9.3.5.
Tom Lane [Mon, 21 Jul 2014 18:59:29 +0000 (14:59 -0400)]
Release notes for 9.3.5, 9.2.9, 9.1.14, 9.0.18, 8.4.22.
Tom Lane [Mon, 21 Jul 2014 16:58:49 +0000 (12:58 -0400)]
Adjust cutoff points in newly-added sanity tests.
Per recommendation from Andres.
Tom Lane [Mon, 21 Jul 2014 15:41:36 +0000 (11:41 -0400)]
Defend against bad relfrozenxid/relminmxid/datfrozenxid/datminmxid values.
In commit
a61daa14d56867e90dc011bbba52ef771cea6770, we fixed pg_upgrade so
that it would install sane relminmxid and datminmxid values, but that does
not cure the problem for installations that were already pg_upgraded to
9.3; they'll initially have "1" in those fields. This is not a big problem
so long as 1 is "in the past" compared to the current nextMultiXact
counter. But if an installation were more than halfway to the MXID wrap
point at the time of upgrade, 1 would appear to be "in the future" and
that would effectively disable tracking of oldest MXIDs in those
tables/databases, until such time as the counter wrapped around.
While in itself this isn't worse than the situation pre-9.3, where we did
not manage MXID wraparound risk at all, the consequences of premature
truncation of pg_multixact are worse now; so we ought to make some effort
to cope with this. We discussed advising users to fix the tracking values
manually, but that seems both very tedious and very error-prone.
Instead, this patch adopts two amelioration rules. First, a relminmxid
value that is "in the future" is allowed to be overwritten with a
full-table VACUUM's actual freeze cutoff, ignoring the normal rule that
relminmxid should never go backwards. (This essentially assumes that we
have enough defenses in place that wraparound can never occur anymore,
and thus that a value "in the future" must be corrupt.) Second, if we see
any "in the future" values then we refrain from truncating pg_clog and
pg_multixact. This prevents loss of clog data until we have cleaned up
all the broken tracking data. In the worst case that could result in
considerable clog bloat, but in practice we expect that relfrozenxid-driven
freezing will happen soon enough to fix the problem before clog bloat
becomes intolerable. (Users could do manual VACUUM FREEZEs if not.)
Note that this mechanism cannot save us if there are already-wrapped or
already-truncated-away MXIDs in the table; it's only capable of dealing
with corrupt tracking values. But that's the situation we have with the
pg_upgrade bug.
For consistency, apply the same rules to relfrozenxid/datfrozenxid. There
are not known mechanisms for these to get messed up, but if they were, the
same tactics seem appropriate for fixing them.
Peter Eisentraut [Mon, 21 Jul 2014 05:04:46 +0000 (01:04 -0400)]
Translation updates
Tom Lane [Sun, 20 Jul 2014 02:20:42 +0000 (22:20 -0400)]
Fix xreflabel for hot_standby_feedback.
Rather remarkable that this has been wrong since 9.1 and nobody noticed.
Tom Lane [Sat, 19 Jul 2014 19:00:50 +0000 (15:00 -0400)]
Update time zone data files to tzdata release 2014e.
DST law changes in Crimea, Egypt, Morocco. New zone Antarctica/Troll
for Norwegian base in Queen Maud Land.
Tom Lane [Sat, 19 Jul 2014 18:28:30 +0000 (14:28 -0400)]
Partial fix for dropped columns in functions returning composite.
When a view has a function-returning-composite in FROM, and there are
some dropped columns in the underlying composite type, ruleutils.c
printed junk in the column alias list for the reconstructed FROM entry.
Before 9.3, this was prevented by doing get_rte_attribute_is_dropped
tests while printing the column alias list; but that solution is not
currently available to us for reasons I'll explain below. Instead,
check for empty-string entries in the alias list, which can only exist
if that column position had been dropped at the time the view was made.
(The parser fills in empty strings to preserve the invariant that the
aliases correspond to physical column positions.)
While this is sufficient to handle the case of columns dropped before
the view was made, we have still got issues with columns dropped after
the view was made. In particular, the view could contain Vars that
explicitly reference such columns! The dependency machinery really
ought to refuse the column drop attempt in such cases, as it would do
when trying to drop a table column that's explicitly referenced in
views. However, we currently neglect to store dependencies on columns
of composite types, and fixing that is likely to be too big to be
back-patchable (not to mention that existing views in existing databases
would not have the needed pg_depend entries anyway). So I'll leave that
for a separate patch.
Pre-9.3, ruleutils would print such Vars normally (with their original
column names) even though it suppressed their entries in the RTE's
column alias list. This is certainly bogus, since the printed view
definition would fail to reload, but at least it didn't crash. However,
as of 9.3 the printed column alias list is tightly tied to the names
printed for Vars; so we can't treat columns as dropped for one purpose
and not dropped for the other. This is why we can't just put back the
get_rte_attribute_is_dropped test: it results in an assertion failure
if the view in fact contains any Vars referencing the dropped column.
Once we've got dependencies preventing such cases, we'll probably want
to do it that way instead of relying on the empty-string test used here.
This fix turned up a very ancient bug in outfuncs/readfuncs, namely
that T_String nodes containing empty strings were not dumped/reloaded
correctly: the node was printed as "<>" which is read as a string
value of <>. Since (per SQL) we disallow empty-string identifiers,
such nodes don't occur normally, which is why we'd not noticed.
(Such nodes aren't used for literal constants, just identifiers.)
Per report from Marc Schablewski. Back-patch to 9.3 which is where
the rule printing behavior changed. The dangling-variable case is
broken all the way back, but that's not what his complaint is about.
Noah Misch [Fri, 18 Jul 2014 20:05:17 +0000 (16:05 -0400)]
Limit pg_upgrade authentication advice to always-secure techniques.
~/.pgpass is a sound choice everywhere, and "peer" authentication is
safe on every platform it supports. Cease to recommend "trust"
authentication, the safety of which is deeply configuration-specific.
Back-patch to 9.0, where pg_upgrade was introduced.
Tom Lane [Fri, 18 Jul 2014 17:00:27 +0000 (13:00 -0400)]
Fix two low-probability memory leaks in regular expression parsing.
If pg_regcomp failed after having invoked markst/cleanst, it would leak any
"struct subre" nodes it had created. (We've already detected all regex
syntax errors at that point, so the only likely causes of later failure
would be query cancel or out-of-memory.) To fix, make sure freesrnode
knows the difference between the pre-cleanst and post-cleanst cleanup
procedures. Add some documentation of this less-than-obvious point.
Also, newlacon did the wrong thing with an out-of-memory failure from
realloc(), so that the previously allocated array would be leaked.
Both of these are pretty low-probability scenarios, but a bug is a bug,
so patch all the way back.
Per bug #10976 from Arthur O'Dwyer.
Heikki Linnakangas [Wed, 16 Jul 2014 06:10:54 +0000 (09:10 +0300)]
Fix bugs in SP-GiST search with range type's -|- (adjacent) operator.
The consistent function contained several bugs:
* The "if (which2) { ... }" block was broken. It compared the argument's
lower bound against centroid's upper bound, while it was supposed to compare
the argument's upper bound against the centroid's lower bound (the comment
was correct, code was wrong). Also, it cleared bits in the "which1"
variable, while it was supposed to clear bits in "which2".
* If the argument's upper bound was equal to the centroid's lower bound, we
descended to both halves (= all quadrants). That's unnecessary, searching
the right quadrants is sufficient. This didn't lead to incorrect query
results, but was clearly wrong, and slowed down queries unnecessarily.
* In the case that argument's lower bound is adjacent to the centroid's
upper bound, we also don't need to visit all quadrants. Per similar
reasoning as previous point.
* The code where we compare the previous centroid with the current centroid
should match the code where we compare the current centroid with the
argument. The point of that code is to redo the calculation done in the
previous level, to see if we were supposed to traverse left or right (or up
or down), and if we actually did. If we moved in the different direction,
then we know there are no matches for bound.
Refactor the code and adds comments to make it more readable and easier to
reason about.
Backpatch to 9.3 where SP-GiST support for range types was introduced.
Alvaro Herrera [Tue, 15 Jul 2014 17:24:07 +0000 (13:24 -0400)]
Fix REASSIGN OWNED for text search objects
Trying to reassign objects owned by a user that had text search
dictionaries or configurations used to fail with:
ERROR: unexpected classid 3600
or
ERROR: unexpected classid 3602
Fix by adding cases for those object types in a switch in pg_shdepend.c.
Both REASSIGN OWNED and text search objects go back all the way to 8.1,
so backpatch to all supported branches. In 9.3 the alter-owner code was
made generic, so the required change in recent branches is pretty
simple; however, for 9.2 and older ones we need some additional
reshuffling to enable specifying objects by OID rather than name.
Text search templates and parsers are not owned objects, so there's no
change required for them.
Per bug #9749 reported by Michal Novotný
Peter Eisentraut [Tue, 15 Jul 2014 00:37:00 +0000 (20:37 -0400)]
doc: small fixes for REINDEX reference page
From: Josh Kupershmidt <schmiddy@gmail.com>
Magnus Hagander [Sat, 12 Jul 2014 12:19:57 +0000 (14:19 +0200)]
Add autocompletion of locale keywords for CREATE DATABASE
Adds support for autocomplete of LC_COLLATE and LC_CTYPE to
the CREATE DATABASE command in psql.
Tom Lane [Fri, 11 Jul 2014 23:12:42 +0000 (19:12 -0400)]
Fix bug with whole-row references to append subplans.
ExecEvalWholeRowVar incorrectly supposed that it could "bless" the source
TupleTableSlot just once per query. But if the input is coming from an
Append (or, perhaps, other cases?) more than one slot might be returned
over the query run. This led to "record type has not been registered"
errors when a composite datum was extracted from a non-blessed slot.
This bug has been there a long time; I guess it escaped notice because when
dealing with subqueries the planner tends to expand whole-row Vars into
RowExprs, which don't have the same problem. It is possible to trigger
the problem in all active branches, though, as illustrated by the added
regression test.
Tom Lane [Tue, 8 Jul 2014 18:03:19 +0000 (14:03 -0400)]
Don't assume a subquery's output is unique if there's a SRF in its tlist.
While the x output of "select x from t group by x" can be presumed unique,
this does not hold for "select x, generate_series(1,10) from t group by x",
because we may expand the set-returning function after the grouping step.
(Perhaps that should be re-thought; but considering all the other oddities
involved with SRFs in targetlists, it seems unlikely we'll change it.)
Put a check in query_is_distinct_for() so it's not fooled by such cases.
Back-patch to all supported branches.
David Rowley
Bruce Momjian [Mon, 7 Jul 2014 17:24:08 +0000 (13:24 -0400)]
pg_upgrade: allow upgrades for new-only TOAST tables
Previously, when calculations on the need for toast tables changed,
pg_upgrade could not handle cases where the new cluster needed a TOAST
table and the old cluster did not. (It already handled the opposite
case.) This fixes the "OID mismatch" error typically generated in this
case.
Backpatch through 9.2
Bruce Momjian [Wed, 2 Jul 2014 19:29:38 +0000 (15:29 -0400)]
pg_upgrade: preserve database and relation minmxid values
Also set these values for pre-9.3 old clusters that don't have values to
preserve.
Analysis by Alvaro
Backpatch through 9.3
Tom Lane [Wed, 2 Jul 2014 18:20:34 +0000 (14:20 -0400)]
Add some errdetail to checkRuleResultList().
This function wasn't originally thought to be really user-facing,
because converting a table to a view isn't something we expect people
to do manually. So not all that much effort was spent on the error
messages; in particular, while the code will complain that you got
the column types wrong it won't say exactly what they are. But since
we repurposed the code to also check compatibility of rule RETURNING
lists, it's definitely user-facing. It now seems worthwhile to add
errdetail messages showing exactly what the conflict is when there's
a mismatch of column names or types. This is prompted by bug #10836
from Matthias Raffelsieper, which might have been forestalled if the
error message had reported the wrong column type as being "record".
Per Alvaro's advice, back-patch to branches before 9.4, but resist
the temptation to rephrase any existing strings there. Adding new
strings is not really a translation degradation; anyway having the
info presented in English is better than not having it at all.
Bruce Momjian [Wed, 2 Jul 2014 17:11:04 +0000 (13:11 -0400)]
pg_upgrade: no need to remove "members" files for pre-9.3 upgrades
Per analysis by Alvaro
Backpatch through 9.3
Tom Lane [Tue, 1 Jul 2014 15:22:50 +0000 (11:22 -0400)]
Fix inadequately-sized output buffer in contrib/unaccent.
The output buffer size in unaccent_lexize() was calculated as input string
length times pg_database_encoding_max_length(), which effectively assumes
that replacement strings aren't more than one character. While that was
all that we previously documented it to support, the code actually has
always allowed replacement strings of arbitrary length; so if you tried
to make use of longer strings, you were at risk of buffer overrun. To fix,
use an expansible StringInfo buffer instead of trying to determine the
maximum space needed a-priori.
This would be a security issue if unaccent rules files could be installed
by unprivileged users; but fortunately they can't, so in the back branches
the problem can be labeled as improper configuration by a superuser.
Nonetheless, a memory stomp isn't a nice way of reacting to improper
configuration, so let's back-patch the fix.
Noah Misch [Mon, 30 Jun 2014 20:59:19 +0000 (16:59 -0400)]
Don't prematurely free the BufferAccessStrategy in pgstat_heap().
This function continued to use it after heap_endscan() freed it. In
passing, don't explicit create a strategy here. Instead, use the one
created by heap_beginscan_strat(), if any. Back-patch to 9.2, where use
of a BufferAccessStrategy here was introduced.
Alvaro Herrera [Fri, 27 Jun 2014 18:43:52 +0000 (14:43 -0400)]
Have multixact be truncated by checkpoint, not vacuum
Instead of truncating pg_multixact at vacuum time, do it only at
checkpoint time. The reason for doing it this way is twofold: first, we
want it to delete only segments that we're certain will not be required
if there's a crash immediately after the removal; and second, we want to
do it relatively often so that older files are not left behind if
there's an untimely crash.
Per my proposal in
http://www.postgresql.org/message-id/
20140626044519.GJ7340@eldon.alvh.no-ip.org
we now execute the truncation in the checkpointer process rather than as
part of vacuum. Vacuum is in only charge of maintaining in shared
memory the value to which it's possible to truncate the files; that
value is stored as part of checkpoints also, and so upon recovery we can
reuse the same value to re-execute truncate and reset the
oldest-value-still-safe-to-use to one known to remain after truncation.
Per bug reported by Jeff Janes in the course of his tests involving
bug #8673.
While at it, update some comments that hadn't been updated since
multixacts were changed.
Backpatch to 9.3, where persistency of pg_multixact files was
introduced by commit
0ac5ad5134f2.
Alvaro Herrera [Fri, 27 Jun 2014 18:43:45 +0000 (14:43 -0400)]
Don't allow relminmxid to go backwards during VACUUM FULL
We were allowing a table's pg_class.relminmxid value to move backwards
when heaps were swapped by VACUUM FULL or CLUSTER. There is a
similar protection against relfrozenxid going backwards, which we
neglected to clone when the multixact stuff was rejiggered by commit
0ac5ad5134f276.
Backpatch to 9.3, where relminmxid was introduced.
As reported by Heikki in
http://www.postgresql.org/message-id/
52401AEA.
9000608@vmware.com
Alvaro Herrera [Fri, 27 Jun 2014 18:43:38 +0000 (14:43 -0400)]
Fix broken Assert() introduced by
8e9a16ab8f7f0e58
Don't assert MultiXactIdIsRunning if the multi came from a tuple that
had been share-locked and later copied over to the new cluster by
pg_upgrade. Doing that causes an error to be raised unnecessarily:
MultiXactIdIsRunning is not open to the possibility that its argument
came from a pg_upgraded tuple, and all its other callers are already
checking; but such multis cannot, obviously, have transactions still
running, so the assert is pointless.
Noticed while investigating the bogus pg_multixact/offsets/0000 file
left over by pg_upgrade, as reported by Andres Freund in
http://www.postgresql.org/message-id/
20140530121631.GE25431@alap3.anarazel.de
Backpatch to 9.3, as the commit that introduced the buglet.
Tom Lane [Thu, 26 Jun 2014 17:41:01 +0000 (10:41 -0700)]
Back-patch "Fix EquivalenceClass processing for nested append relations".
When we committed
a87c729153e372f3731689a7be007bc2b53f1410, we somehow
failed to notice that it didn't merely improve plan quality for expression
indexes; there were very closely related cases that failed outright with
"could not find pathkey item to sort". The failing cases seem to be those
where the planner was already capable of selecting a MergeAppend plan,
and there was inheritance involved: the lack of appropriate eclass child
members would prevent prepare_sort_from_pathkeys() from succeeding on the
MergeAppend's child plan nodes for inheritance child tables.
Accordingly, back-patch into 9.1 through 9.3, along with an extra
regression test case covering the problem.
Per trouble report from Michael Glaesemann.
Fujii Masao [Thu, 26 Jun 2014 05:27:27 +0000 (14:27 +0900)]
Remove obsolete example of CSV log file name from log_filename document.
7380b63 changed log_filename so that epoch was not appended to it
when no format specifier is given. But the example of CSV log file name
with epoch still left in log_filename document. This commit removes
such obsolete example.
This commit also documents the defaults of log_directory and
log_filename.
Backpatch to all supported versions.
Christoph Berg
Tom Lane [Wed, 25 Jun 2014 04:22:47 +0000 (21:22 -0700)]
Fix handling of nested JSON objects in json_populate_recordset and friends.
populate_recordset_object_start() improperly created a new hash table
(overwriting the link to the existing one) if called at nest levels
greater than one. This resulted in previous fields not appearing in
the final output, as reported by Matti Hameister in bug #10728.
In 9.4 the problem also affects json_to_recordset.
This perhaps missed detection earlier because the default behavior is to
throw an error for nested objects: you have to pass use_json_as_text = true
to see the problem.
In addition, fix query-lifespan leakage of the hashtable created by
json_populate_record(). This is pretty much the same problem recently
fixed in dblink: creating an intended-to-be-temporary context underneath
the executor's per-tuple context isn't enough to make it go away at the
end of the tuple cycle, because MemoryContextReset is not
MemoryContextResetAndDeleteChildren.
Michael Paquier and Tom Lane
Bruce Momjian [Tue, 24 Jun 2014 20:11:06 +0000 (16:11 -0400)]
pg_upgrade: remove pg_multixact files left by initdb
This fixes a bug that caused vacuum to fail when the '0000' files left
by initdb were accessed as part of vacuum's cleanup of old pg_multixact
files.
Backpatch through 9.3
Heikki Linnakangas [Tue, 24 Jun 2014 09:31:36 +0000 (12:31 +0300)]
Don't allow foreign tables with OIDs.
The syntax doesn't let you specify "WITH OIDS" for foreign tables, but it
was still possible with default_with_oids=true. But the rest of the system,
including pg_dump, isn't prepared to handle foreign tables with OIDs
properly.
Backpatch down to 9.1, where foreign tables were introduced. It's possible
that there are databases out there that already have foreign tables with
OIDs. There isn't much we can do about that, but at least we can prevent
them from being created in the future.
Patch by Etsuro Fujita, reviewed by Hadi Moshayedi.
Kevin Grittner [Sat, 21 Jun 2014 14:17:24 +0000 (09:17 -0500)]
Fix documentation template for CREATE TRIGGER.
By using curly braces, the template had specified that one of
"NOT DEFERRABLE", "INITIALLY IMMEDIATE", or "INITIALLY DEFERRED"
was required on any CREATE TRIGGER statement, which is not
accurate. Change to square brackets makes that optional.
Backpatch to 9.1, where the error was introduced.
Joe Conway [Fri, 20 Jun 2014 19:22:50 +0000 (12:22 -0700)]
Clean up data conversion short-lived memory context.
dblink uses a short-lived data conversion memory context. However it
was not deleted when no longer needed, leading to a noticeable memory
leak under some circumstances. Plug the hole, along with minor
refactoring. Backpatch to 9.2 where the leak was introduced.
Report and initial patch by MauMau. Reviewed/modified slightly by
Tom Lane and me.
Andres Freund [Fri, 20 Jun 2014 09:06:53 +0000 (11:06 +0200)]
Do all-visible handling in lazy_vacuum_page() outside its critical section.
Since
fdf9e21196a lazy_vacuum_page() rechecks the all-visible status
of pages in the second pass over the heap. It does so inside a
critical section, but both visibilitymap_test() and
heap_page_is_all_visible() perform operations that should not happen
inside one. The former potentially performs IO and both potentially do
memory allocations.
To fix, simply move all the all-visible handling outside the critical
section. Doing so means that the PD_ALL_VISIBLE on the page won't be
included in the full page image of the HEAP2_CLEAN record anymore. But
that's fine, the flag will be set by the HEAP2_VISIBLE logged later.
Backpatch to 9.3 where the problem was introduced. The bug only came
to light due to the assertion added in
4a170ee9 and isn't likely to
cause problems in production scenarios. The worst outcome is a
avoidable PANIC restart.
This also gets rid of the difference in the order of operations
between master and standby mentioned in
2a8e1ac5.
Per reports from David Leverton and Keith Fiske in bug #10533.
Tom Lane [Fri, 20 Jun 2014 02:13:47 +0000 (22:13 -0400)]
Avoid leaking memory while evaluating arguments for a table function.
ExecMakeTableFunctionResult evaluated the arguments for a function-in-FROM
in the query-lifespan memory context. This is insignificant in simple
cases where the function relation is scanned only once; but if the function
is in a sub-SELECT or is on the inside of a nested loop, any memory
consumed during argument evaluation can add up quickly. (The potential for
trouble here had been foreseen long ago, per existing comments; but we'd
not previously seen a complaint from the field about it.) To fix, create
an additional temporary context just for this purpose.
Per an example from MauMau. Back-patch to all active branches.
Noah Misch [Sat, 14 Jun 2014 13:41:13 +0000 (09:41 -0400)]
Secure Unix-domain sockets of "make check" temporary clusters.
Any OS user able to access the socket can connect as the bootstrap
superuser and proceed to execute arbitrary code as the OS user running
the test. Protect against that by placing the socket in a temporary,
mode-0700 subdirectory of /tmp. The pg_regress-based test suites and
the pg_upgrade test suite were vulnerable; the $(prove_check)-based test
suites were already secure. Back-patch to 8.4 (all supported versions).
The hazard remains wherever the temporary cluster accepts TCP
connections, notably on Windows.
As a convenient side effect, this lets testing proceed smoothly in
builds that override DEFAULT_PGSOCKET_DIR. Popular non-default values
like /var/run/postgresql are often unwritable to the build user.
Security: CVE-2014-0067
Noah Misch [Sat, 14 Jun 2014 13:41:13 +0000 (09:41 -0400)]
Add mkdtemp() to libpgport.
This function is pervasive on free software operating systems; import
NetBSD's implementation. Back-patch to 8.4, like the commit that will
harness it.
Tom Lane [Fri, 13 Jun 2014 00:14:39 +0000 (20:14 -0400)]
Fix pg_restore's processing of old-style BLOB COMMENTS data.
Prior to 9.0, pg_dump handled comments on large objects by dumping a bunch
of COMMENT commands into a single BLOB COMMENTS archive object. With
sufficiently many such comments, some of the commands would likely get
split across bufferloads when restoring, causing failures in
direct-to-database restores (though no problem would be evident in text
output). This is the same type of issue we have with table data dumped as
INSERT commands, and it can be fixed in the same way, by using a mini SQL
lexer to figure out where the command boundaries are. Fortunately, the
COMMENT commands are no more complex to lex than INSERTs, so we can just
re-use the existing lexer for INSERTs.
Per bug #10611 from Jacek Zalewski. Back-patch to all active branches.
Tom Lane [Thu, 12 Jun 2014 20:51:08 +0000 (16:51 -0400)]
Remove inadvertent copyright violation in largeobject regression test.
Robert Frost is no longer with us, but his copyrights still are, so
let's stop using "Stopping by Woods on a Snowy Evening" as test data
before somebody decides to sue us. Wordsworth is more safely dead.
Tom Lane [Wed, 11 Jun 2014 02:48:16 +0000 (22:48 -0400)]
Fix ancient encoding error in hungarian.stop.
When we grabbed this file off the Snowball project's website, we mistakenly
supposed that it was in LATIN1 encoding, but evidently it was actually in
LATIN2. This resulted in ő (o-double-acute, U+0151, which is code 0xF5 in
LATIN2) being misconverted into õ (o-tilde, U+00F5), as complained of in
bug #10589 from Zoltán Sörös. We'd have messed up u-double-acute too,
but there aren't any of those in the file. Other characters used in the
file have the same codes in LATIN1 and LATIN2, which no doubt helped hide
the problem for so long.
The error is not only ours: the Snowball project also was confused about
which encoding is required for Hungarian. But dealing with that will
require source-code changes that I'm not at all sure we'll wish to
back-patch. Fixing the stopword file seems reasonably safe to back-patch
however.
Tom Lane [Tue, 10 Jun 2014 01:37:20 +0000 (21:37 -0400)]
Forward-port regression test for bug #10587 into 9.3 and HEAD.
Although this bug is already fixed in post-9.2 branches, the case
triggering it is quite different from what was under consideration
at the time. It seems worth memorializing this example in HEAD
just to make sure it doesn't get broken again in future.
Extracted from commit
187ae17300776f48b2bd9d0737923b1bf70f606e.
Tom Lane [Mon, 9 Jun 2014 20:30:43 +0000 (16:30 -0400)]
Fix infinite loop when splitting inner tuples in SPGiST text indexes.
Previously, the code used a node label of zero both for strings that
contain no bytes beyond the inner tuple's prefix, and for cases where an
"allTheSame" inner tuple has to be split to allow a string with a different
next byte to be inserted into it. Failing to distinguish these cases meant
that if a string ending with the current prefix needed to be inserted into
an allTheSame tuple, we got into an infinite loop, because after splitting
the tuple we'd descend into the child allTheSame tuple and then find we
need to split again.
To fix, instead use -1 and -2 as the node labels for these two cases.
This requires widening the node label type from "char" to int2, but
fortunately SPGiST stores all pass-by-value node label types in their
Datum representation, which means that this change is transparently upward
compatible so far as the on-disk representation goes. We continue to
recognize zero as a dummy node label for reading purposes, but will not
attempt to push new index entries down into such a label, so that the loop
won't occur even when dealing with an existing index.
Per report from Teodor Sigaev. Back-patch to 9.2 where the faulty
code was introduced.
Alvaro Herrera [Mon, 9 Jun 2014 19:17:23 +0000 (15:17 -0400)]
Wrap multixact/members correctly during extension, take 2
In
a50d97625497b7 I already changed this, but got it wrong for the case
where the number of members is larger than the number of entries that
fit in the last page of the last segment.
As reported by Serge Negodyuck in a followup to bug #8673.
Fujii Masao [Fri, 6 Jun 2014 09:46:32 +0000 (18:46 +0900)]
Fix breakages of hot standby regression test.
This commit changes HS regression test so that it uses
REPEATABLE READ transaction instead of SERIALIZABLE one
because SERIALIZABLE transaction isolation level is not
available in HS. Also this commit fixes VACUUM/ANALYZE
label mixup.
This was fixed in HEAD (commit
2985e16), but it should
have been back-patched to 9.1 which had introduced SSI
and forbidden SERIALIZABLE transaction in HS.
Amit Langote
Tom Lane [Thu, 5 Jun 2014 15:31:09 +0000 (11:31 -0400)]
Add defenses against running with a wrong selection of LOBLKSIZE.
It's critical that the backend's idea of LOBLKSIZE match the way data has
actually been divided up in pg_largeobject. While we don't provide any
direct way to adjust that value, doing so is a one-line source code change
and various people have expressed interest recently in changing it. So,
just as with TOAST_MAX_CHUNK_SIZE, it seems prudent to record the value in
pg_control and cross-check that the backend's compiled-in setting matches
the on-disk data.
Also tweak the code in inv_api.c so that fetches from pg_largeobject
explicitly verify that the length of the data field is not more than
LOBLKSIZE. Formerly we just had Asserts() for that, which is no protection
at all in production builds. In some of the call sites an overlength data
value would translate directly to a security-relevant stack clobber, so it
seems worth one extra runtime comparison to be sure.
In the back branches, we can't change the contents of pg_control; but we
can still make the extra checks in inv_api.c, which will offer some amount
of protection against running with the wrong value of LOBLKSIZE.
Andres Freund [Wed, 4 Jun 2014 21:26:08 +0000 (23:26 +0200)]
Fix longstanding bug in HeapTupleSatisfiesVacuum().
HeapTupleSatisfiesVacuum() didn't properly discern between
DELETE_IN_PROGRESS and INSERT_IN_PROGRESS for rows that have been
inserted in the current transaction and deleted in a aborted
subtransaction of the current backend. At the very least that caused
problems for CLUSTER and CREATE INDEX in transactions that had
aborting subtransactions producing rows, leading to warnings like:
WARNING: concurrent delete in progress within table "..."
possibly in an endless, uninterruptible, loop.
Instead of treating *InProgress xmins the same as *IsCurrent ones,
treat them as being distinct like the other visibility routines. As
implemented this separatation can cause a behaviour change for rows
that have been inserted and deleted in another, still running,
transaction. HTSV will now return INSERT_IN_PROGRESS instead of
DELETE_IN_PROGRESS for those. That's both, more in line with the other
visibility routines and arguably more correct. The latter because a
INSERT_IN_PROGRESS will make callers look at/wait for xmin, instead of
xmax.
The only current caller where that's possibly worse than the old
behaviour is heap_prune_chain() which now won't mark the page as
prunable if a row has concurrently been inserted and deleted. That's
harmless enough.
As a cautionary measure also insert a interrupt check before the gotos
in IndexBuildHeapScan() that lead to the uninterruptible loop. There
are other possible causes, like a row that several sessions try to
update and all fail, for repeated loops and the cost of doing so in
the retry case is low.
As this bug goes back all the way to the introduction of
subtransactions in
573a71a5da backpatch to all supported releases.
Reported-By: Sandro Santilli
Fujii Masao [Wed, 4 Jun 2014 16:46:31 +0000 (01:46 +0900)]
Add description of pg_stat directory into doc.
Back-patch to 9.3 where pg_stat directory was introduced.
Tom Lane [Tue, 3 Jun 2014 16:01:30 +0000 (12:01 -0400)]
Make plpython_unicode regression test work in more database encodings.
This test previously used a data value containing U+0080, and would
therefore fail if the database encoding didn't have an equivalent to
that; which only about half of our supported server encodings do.
We could fall back to using some plain-ASCII character, but that seems
like it's losing most of the point of the test. Instead switch to using
U+00A0 (no-break space), which translates into all our supported encodings
except the four in the EUC_xx family.
Per buildfarm testing. Back-patch to 9.1, which is as far back as this
test is expected to succeed everywhere. (9.0 has the test, but without
back-patching some 9.1 code changes we could not expect to get consistent
results across platforms anyway.)
Andres Freund [Tue, 3 Jun 2014 12:02:54 +0000 (14:02 +0200)]
Set the process latch when processing recovery conflict interrupts.
Because RecoveryConflictInterrupt() didn't set the process latch
anything using the latter to wait for events didn't get notified about
recovery conflicts. Most latch users are never the target of recovery
conflicts, which explains the lack of reports about this until
now.
Since 9.3 two possible affected users exist though: The sql callable
pg_sleep() now uses latches to wait and background workers are
expected to use latches in their main loop. Both would currently wait
until the end of WaitLatch's timeout.
Fix by adding a SetLatch() to RecoveryConflictInterrupt(). It'd also
be possible to fix the issue by having each latch user set
set_latch_on_sigusr1. That seems failure prone and though, as most of
these callsites won't often receive recovery conflicts and thus will
likely only be tested against normal query cancels et al. It'd also be
unnecessarily verbose.
Backpatch to 9.1 where latches were introduced. Arguably 9.3 would be
sufficient, because that's where pg_sleep() was converted to waiting
on the latch and background workers got introduced; but there could be
user level code making use of the latch pre 9.3.
Tom Lane [Sun, 1 Jun 2014 19:03:15 +0000 (15:03 -0400)]
PL/Python: Adjust the regression tests for Python 3.4
Back-patch commit
d0765d50f429472d00554701ac6531c84d324811 into 9.3
and 9.2, which is as far back as we previously bothered to adjust the
regression tests for Python 3.3. Per gripe from Honza Horak.
Tom Lane [Fri, 30 May 2014 22:18:14 +0000 (18:18 -0400)]
On OS X, link libpython normally, ignoring the "framework" framework.
As of Xcode 5.0, Apple isn't including the Python framework as part of the
SDK-level files, which means that linking to it might fail depending on
whether Xcode thinks you've selected a specific SDK version. According to
their Tech Note 2328, they've basically deprecated the framework method of
linking to libpython and are telling people to link to the shared library
normally. (I'm pretty sure this is in direct contradiction to the advice
they were giving a few years ago, but whatever.) Testing says that this
approach works fine at least as far back as OS X 10.4.11, so let's just
rip out the framework special case entirely. We do still need a special
case to decide that OS X provides a shared library at all, unfortunately
(I wonder why the distutils check doesn't work ...). But this is still
less of a special case than before, so it's fine.
Back-patch to all supported branches, since we'll doubtless be hearing
about this more as more people update to recent Xcode.