|
While GTM allows a long jump in case of errors, we were not careful to release
locks currently held by the executing thread. That could lead to threads
leaving a critical section still holding a lock and thus causing deadlocks.
We now properly track currently held locks in the thread-specific information
and release those locks in case of an error. The same is done for mutex locks,
though only one of those is actually used.
This change required using malloc-ed memory for the thread-specific info. While
due care has been taken to free the structure, we should keep an eye on it for
any possible memory leaks.
In passing, also improve handling of bad-protocol startup messages, which could
have caused deadlocks and resource starvation.
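For illustration, a minimal sketch of the per-thread lock bookkeeping this
describes, using plain pthreads; the type and function names are hypothetical,
not the actual GTM identifiers:

    #include <pthread.h>

    #define MAX_HELD_LOCKS 16

    /* Hypothetical per-thread bookkeeping; names are illustrative only. */
    typedef struct ThreadLockInfo
    {
        pthread_rwlock_t *held[MAX_HELD_LOCKS];
        int               nheld;
    } ThreadLockInfo;

    static void
    track_acquire(ThreadLockInfo *my, pthread_rwlock_t *lock)
    {
        pthread_rwlock_wrlock(lock);
        my->held[my->nheld++] = lock;   /* remember the lock for error cleanup */
    }

    static void
    release_all_on_error(ThreadLockInfo *my)
    {
        /*
         * Called from the longjmp/error path, so the thread never leaves a
         * critical section still holding a lock.
         */
        while (my->nheld > 0)
            pthread_rwlock_unlock(my->held[--my->nheld]);
    }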
|
|
Chi Gao and Hengbing Wang reported certain issues around transaction handling
and demonstrated via xlogdump how certain transactions were getting marked
committed/aborted repeatedly on a datanode. When an attempt is made to abort
an already committed transaction, it results in a PANIC. Investigation
uncovered a very serious yet long-standing bug in transaction handling.
If the client is running in autocommit mode, we try to avoid starting a
transaction block on the datanode side if only one datanode is going to be
involved in the transaction. This is an optimisation to speed up short queries
touching only a single node. But when the query rewriter transforms a single
statement into multiple statements, we would still (and incorrectly) run each
statement in autocommit mode on the datanode. This can cause inconsistencies
when one statement commits but the next statement aborts. It may also lead to
the PANIC situations described above if we continue to use the same global
transaction identifier for the statements.
This can also happen when the user invokes a user-defined function. If the
function has multiple statements, each statement runs in autocommit mode if it
is FQSed, again creating an inconsistency if a later statement in the function
fails.
We now have a more elaborate mechanism to tackle autocommit and transaction
block needs. The special casing for force_autocommit is now removed, thus
making it more predictable. We also have specific conditions to check to ensure
that we don't mix up autocommit and transaction block for the same global xid.
Finally, if a query rewriter transforms a single statement into multiple
statements, we run those statements in a transaction block. Together these
changes should help us fix the problems.
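A rough sketch of the kind of decision this implies; the function and
parameter names below are made up for illustration, not the actual
Postgres-XL code:

    #include <stdbool.h>

    static bool
    need_transaction_block(bool client_autocommit,
                           int  num_nodes_involved,
                           int  num_rewritten_statements,
                           bool inside_function)
    {
        /*
         * Single-node, single-statement autocommit queries can keep the fast
         * path and skip the explicit transaction block on the datanode.
         */
        if (client_autocommit &&
            num_nodes_involved == 1 &&
            num_rewritten_statements == 1 &&
            !inside_function)
            return false;

        /*
         * Anything the rewriter expanded into multiple statements, or that
         * runs inside a multi-statement function, must share one transaction
         * block so a later failure also rolls back the earlier statements.
         */
        return true;
    }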
|
|
A number of functions were defined in pgxcnode.h/pgxcnode.c, but
only ever used in poolmgr.c. Those are:
- PGXCNodeConnect - open libpq connection using conn. string
- PGXCNodePing - ping node using connection string
- PGXCNodeClose - close libpq connection
- PGXCNodeConnected - verify connection status
- PGXCNodeConnStr - build connection string
So move them to poolmgr.c and make them static, so that poolmgr
is the only part dealing with libpq connections directly.
|
|
Similarly to a39b06b0c6, this does minor cleanup in the pool manager
code by removing unused functions and adding a lot of comments, both
at the file level (explaining the concepts and basic API methods)
and for individual functions.
|
|
This patch improves comments in gtm_txn.c and gtm_snap.c in three
basic ways:
1) Adds global comments explaining the basics of transaction and
snapshot management APIs - underlying concepts, main methods.
2) Improves (and adds) function-level comments, explaining the
meaning of parameters, return values, and other details.
3) Tweaks the naming of several API functions, to make them more
consistent with the rest of the module.
|
|
The cleanup does two basic things:
* Functions used only in a single source file are made static (and also
removed from the header file, of course). This reduces the size of the
public GTM API.
* Unused functions (identified by the compiler thanks to making other
functions static in the previous step) are removed. The assumption is
that this code was not really tested at all, and would only make
future improvements harder.
|
|
Since commit fb56418d66 the snapshots are computed in thread-local
storage, but we haven't been freeing the memory (on thread exit).
As the memory is allocated in the global TopMostMemoryContext, this
presented a memory leak of 64kB on each GTM connection.
One way to fix this would be to track when the thread-local storage
is used in GTM_GetTransactionSnapshot(), and allocate the memory
in TopMemoryContext (which is per-thread and released on exit).
But there's a simpler way - allocate the thread-specific storage as
part of GTM_ThreadInfo, and just redirect sn_xip from the snapshot.
This way we don't have to worry about palloc/pfree at all, and we
mostly assume that every connection will need to compute at least
one snapshot anyway.
Reported by Rob Canavan <rxcanavan@gmail.com>, investigation and fix
by me. For more discussion see
<CAFTg0q6VC_11+c=Q=gsAcDsBrDjvuGKjzNwH4Lr8vERRDn4Ycw@mail.gmail.com>
Backpatch to Postgres-XL 9.5.
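Roughly, the new layout looks like the following sketch; the real
GTM_ThreadInfo and snapshot structs have more members and different names:

    #include <stdint.h>

    typedef uint32_t GlobalTransactionId;

    #define SNAP_MAX_XCNT 16384     /* 16384 * 4 bytes = the 64kB mentioned above */

    typedef struct SnapshotSketch
    {
        GlobalTransactionId *sn_xip;    /* in-progress XIDs */
        int                  sn_xcnt;
    } SnapshotSketch;

    typedef struct ThreadInfoSketch
    {
        SnapshotSketch      snapshot;                    /* per-thread snapshot */
        GlobalTransactionId snapshot_xip[SNAP_MAX_XCNT]; /* storage lives here */
    } ThreadInfoSketch;

    static void
    thread_init_snapshot(ThreadInfoSketch *thr)
    {
        /*
         * Point sn_xip at storage embedded in the thread info, so it is
         * created and destroyed together with the thread -- no palloc/pfree
         * and no leak in TopMostMemoryContext.
         */
        thr->snapshot.sn_xip = thr->snapshot_xip;
        thr->snapshot.sn_xcnt = 0;
    }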
|
|
Our efforts to improve shared queue synchronization continue. We now have a
per queue producer lwlock that must be held for synchronization between
consumers and the producer. Consumers must hold this lock before setting the
producer latch to ensure the producer does not miss any signals and does not
go into unnecessary waits.
We still can't get rid of all the timeouts; in particular, we sometimes see a
producer finish and try to unbind from the queue even before a consumer gets a
chance to connect to it. We have kept the 10s wait to allow consumers to
connect. There is still a net improvement, because when a consumer is not going
to connect, it now tells the producer and we avoid the 10s timeout we used to
see earlier.
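The consumer-side signalling amounts to something like this sketch; the
SharedQueue layout and field names are illustrative, not the actual squeue.c
definitions:

    #include "postgres.h"
    #include "storage/latch.h"
    #include "storage/lwlock.h"

    typedef struct SharedQueueSketch
    {
        LWLock *sq_producer_lock;    /* per-queue producer lwlock */
        Latch  *sq_producer_latch;   /* latch the producer waits on */
    } SharedQueueSketch;

    static void
    consumer_notify_producer(SharedQueueSketch *sq)
    {
        /*
         * Hold the per-queue producer lock while setting the latch, so the
         * producer cannot check its state and decide to wait in between our
         * state change and the SetLatch() call -- that window is how signals
         * were being missed before.
         */
        LWLockAcquire(sq->sq_producer_lock, LW_EXCLUSIVE);
        SetLatch(sq->sq_producer_latch);
        LWLockRelease(sq->sq_producer_lock);
    }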
|
|
Rules are converted to their string representation and stored in the catalog.
While building a relation descriptor, this information is read back and
converted into a Node representation. Since relation descriptors could be built
while we are reading plan information sent by the remote server in a
stringified representation, trying to read the rules with portable input turned
on may lead to unpleasant behaviour. So we must first reset portable input and
restore it after reading the rules. The same applies to RLS policies (we don't
have a test showing the impact, but it looks like a sane thing to fix anyway).
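The save/reset/restore dance is essentially this sketch; the flag name is
hypothetical and the real code also has to restore the flag on error:

    #include <stdbool.h>

    static bool portable_input_flag = false;   /* hypothetical stand-in for the XL global */

    static void
    read_rules_with_portable_input_off(void (*read_rules) (void))
    {
        bool    saved = portable_input_flag;

        portable_input_flag = false;    /* rules are stored in plain nodeToString() form */
        read_rules();
        portable_input_flag = saved;    /* restore for the remote plan still being read */
    }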
|
|
|
|
We never had this support and we never felt the need, because the use of FQS
was limited to utility statements and simple queries which can be completely
pushed down to the remote node. But in PG 10, we're seeing errors while using
cursors for queries which are FQSed. So instead of forcing a regular remote
subplan on such queries, we are adding support for rescan of the RemoteQuery
node.
Patch by Senhu <senhu@tencent.com>
|
|
The coordinator_lxid GUC is internally stored as uint32, but was defined
as plain int32, triggering a compiler warning. It's also unclear what
would happen for transaction IDs outside the signed range (possibly some
strange issues).
This adds a new GUC type (UInt), used only for this one GUC. The patch
is fairly large, but most of it is boilerplate infrastructure to support
the new GUC type. We have considered simpler workarounds (e.g. treating
the GUC as a string and converting it to/from uint32 using the GUC hooks),
but this seems much cleaner and tidier.
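The new GUC type boils down to an unsigned variant of the existing per-type
GUC structs; roughly the following shape, which is illustrative only and not
the actual definition:

    #include <stdbool.h>
    #include <stdint.h>

    typedef struct config_uint_sketch
    {
        const char *name;
        uint32_t   *variable;   /* the C variable backing the GUC */
        uint32_t    boot_val;
        uint32_t    min;
        uint32_t    max;        /* can cover the full unsigned range, e.g. UINT32_MAX */
        bool        (*check_hook) (uint32_t *newval, void **extra);
        void        (*assign_hook) (uint32_t newval, void *extra);
    } config_uint_sketch;

The boilerplate then mirrors what guc.c already does for the int, real and
string types.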
|
|
This is the merge-base of PostgreSQL's master branch and REL_10_STABLE branch.
This should be the last merge from PG's master branch into XL 10 branch.
Subsequent merges must happen from the REL_10_STABLE branch.
|
|
|
|
The sole useful effect of this function, to check that no catcache
entries have positive refcounts at transaction end, has really been
obsolete since we introduced ResourceOwners in PG 8.1. We reduced the
checks to assertions years ago, so that the function was a complete
no-op in production builds. There have been previous discussions about
removing it entirely, but consensus up to now was that it had some small
value as a cross-check for bugs in the ResourceOwner logic.
However, it now emerges that it's possible to trigger these assertions
if you hit an assert-enabled backend with SIGTERM during a call to
SearchCatCacheList, because that function temporarily increases the
refcounts of entries it's intending to add to a catcache list construct.
In a normal ERROR scenario, the extra refcounts are cleaned up by
SearchCatCacheList's PG_CATCH block; but in a FATAL exit we do a
transaction abort and exit without ever executing PG_CATCH handlers.
There's a case to be made that this is a generic hazard and we should
consider restructuring elog(FATAL) handling so that pending PG_CATCH
handlers do get run. That's pretty scary though: it could easily create
more problems than it solves. Preliminary stress testing by Andreas
Seltenreich suggests that there are not many live problems of this ilk,
so we rejected that idea.
There are more-localized ways to fix the problem; the most principled
one would be to use PG_ENSURE_ERROR_CLEANUP instead of plain PG_TRY.
But adding cycles to SearchCatCacheList isn't very appealing. We could
also weaken the assertions in AtEOXact_CatCache in some more or less
ad-hoc way, but that just makes its raison d'etre even less compelling.
In the end, the most reasonable solution seems to be to just remove
AtEOXact_CatCache altogether, on the grounds that it's not worth trying
to fix it. It hasn't found any bugs for us in many years.
Per report from Jeevan Chalke. Back-patch to all supported branches.
Discussion: https://postgr.es/m/CAM2+6=VEE30YtRQCZX7_sCFsEpoUkFBV1gZazL70fqLn8rcvBA@mail.gmail.com
|
|
Various bugs can cause crashes, so don't use that function before ICU
53. It will fall back to the code path used for other encodings.
Since we now tie the function availability to an ICU version, we don't
need the configure test anymore. That also resolves the issue that the
test result was previously hardcoded for Windows.
researched by Daniel Verite <daniel@manitou-mail.org>, Peter Geoghegan
<pg@bowt.ie>, Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/f1438ec6-22aa-4029-9a3b-26f79d330e72%40manitou-mail.org
|
|
The previous message didn't mention the name of the table or the
bounds. Put the table name in the primary error message and the
bounds in the detail message.
Amit Langote, changed slightly by me. Suggestions on the exact
phrasing from Tom Lane, David G. Johnston, and Dean Rasheed.
Discussion: http://postgr.es/m/CA+Tgmoae6bpwVa-1BMaVcwvCCeOoJ5B9Q9-RHWo-1gJxfPBZ5Q@mail.gmail.com
|
|
Etsuro Fujita
Discussion: http://postgr.es/m/5f794b91-67df-1ac6-8a4f-069f8e8e169d@lab.ntt.co.jp
|
|
Similar to what was fixed in commit 9915de6c1cb2 for replication slots,
but this time it's related to replication origins: DROP SUBSCRIPTION
attempts to drop the replication origin, but that fails if the
replication worker process hasn't yet marked it unused. This causes
failures in the buildfarm:
ERROR: could not drop replication origin with OID 1, in use by PID 34069
Like the aforementioned commit, fix by having the process running DROP
SUBSCRIPTION sleep until the worker marks the replication origin
struct as free. This uses a condition variable on each replication
origin shmem state struct, so that the session trying to drop can sleep
and expect to be awakened by the process keeping the origin open.
Also fix a SGML markup in the previous commit.
Discussion: https://postgr.es/m/20170808001433.rozlseaf4m2wkw3n@alvherre.pgsql
|
|
In commit 9915de6c1cb2, we introduced a new wait point for replication
slots and incorrectly labelled it as wait event PG_WAIT_LOCK. That's
wrong, so invent an appropriate new wait event instead, and document it
properly.
While at it, fix numerous other problems in the vicinity:
- two different walreceiver wait events were being mixed up in a single
wait event (which wasn't documented either); split it out so that they
can be distinguished, and document the new events properly.
- ParallelBitmapPopulate was documented but didn't exist.
- ParallelBitmapScan was not documented (I think this should be called
"ParallelBitmapScanInit" instead.)
- Logical replication wait events weren't documented
- various symbols had been added in dartboard order in various places.
Put them in alphabetical order instead, as was originally intended.
Discussion: https://postgr.es/m/20170808181131.mu4fjepuh5m75cyq@alvherre.pgsql
|
|
|
|
This allows a transaction abort to avoid killing those workers.
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
|
|
Per a report from AP, it's not that hard to exhaust the supply of
bitmap pages if you create a table with a hash index and then insert a
few billion rows - and then you start getting errors when you try to
insert additional rows. In the particular case reported by AP,
there's another fix that we can make to improve recycling of overflow
pages, which is another way to avoid the error, but there may be other
cases where this problem happens and that fix won't help. So let's
buy ourselves as much headroom as we can without rearchitecting
anything.
The comments claim that the old limit was 64GB, but it was really
only 32GB, because we didn't use all the bits in the page for bitmap
bits - only the largest power of 2 that could fit after deducting
space for the page header and so forth. Thus, we have 4kB per page
for bitmap bits, not 8kB. The new limit is thus actually 8 times the
old *real* limit but only 4 times the old *purported* limit.
Since this breaks on-disk compatibility, bump HASH_VERSION. We've
already done this earlier in this release cycle, so this doesn't cause
any incremental inconvenience for people using pg_upgrade from
releases prior to v10. However, users who use pg_upgrade to reach
10beta3 or later from 10beta2 or earlier will need to REINDEX any hash
indexes again.
Amit Kapila and Robert Haas
Discussion: http://postgr.es/m/20170704105728.mwb72jebfmok2nm2@zip.com.au
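The arithmetic behind those numbers, as a small self-checking sketch; the 128
and 1024 bitmap-page counts are inferred from the limits quoted above rather
than taken from hash.h here:

    #include <stdio.h>

    int
    main(void)
    {
        long long block_size = 8192;        /* default BLCKSZ */
        long long bitmap_bytes = 4096;      /* largest power of 2 left after the page header */
        long long bits = bitmap_bytes * 8;  /* 32768 overflow pages tracked per bitmap page */
        long long per_bitmap_page = bits * block_size;  /* 256MB of overflow pages each */

        printf("old real limit (128 bitmap pages): %lld GB\n",
               (128 * per_bitmap_page) >> 30);      /* 32 GB, not the purported 64 GB */
        printf("new limit (8x, 1024 bitmap pages): %lld GB\n",
               (1024 * per_bitmap_page) >> 30);     /* 256 GB */
        return 0;
    }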
|
|
Otherwise, partitioned tables with RETURNING expressions or subject
to a WITH CHECK OPTION do not work properly.
Amit Langote, reviewed by Amit Khandekar and Etsuro Fujita. A few
comment changes by me.
Discussion: http://postgr.es/m/9a39df80-871e-6212-0684-f93c83be4097@lab.ntt.co.jp
|
|
Since a temporary table may be accessed by multiple backends on a datanode, XL
mostly treats such tables as regular tables. But the technique that was used to
distinguish temporary tables that may need shared storage from those which are
accessed only by a single backend wasn't foolproof. We were relying on global
session activation to make that distinction. This clearly fails when a
background process, such as the autovacuum process, tries to figure out whether
a table is using local or shared storage. This was leading to various problems,
such as the underlying file system objects for the table being cleaned up
without first discarding all references to the table from the shared buffers.
We now make all temp tables use shared storage on the datanodes and thus
simplify things. Only EXECUTE DIRECT does not set up a global session anyway,
so I don't think this will have any meaningful impact on performance.
This should fix the checkpoint failures during regression tests.
|
|
XLByteToSeg and XLByteToPrevSeg calculate only a segment number. The
definition of these macros was modified by commit
dfda6ebaec6763090fb78b458a979b558c50b39b but the comment remained
unchanged.
Patch by Yugo Nagata. Back patched to 9.3 and beyond.
|
|
1024 bits is considered weak these days, but OpenSSL always passes 1024 as
the key length to the tmp_dh callback. All the code to handle other key
lengths is, in fact, dead.
To remedy those issues:
* Only include hard-coded 2048-bit parameters.
* Set the parameters directly with SSL_CTX_set_tmp_dh(), without the
callback
* The name of the file containing the DH parameters is now a GUC. This
replaces the old hardcoded "dh1024.pem" filename. (The files for other
key lengths, dh512.pem, dh2048.pem, etc. were never actually used.)
This is not a new problem, but it doesn't seem worth the risk and churn to
backport. If you care enough about the strength of the DH parameters on
old versions, you can create custom DH parameters, with as many bits as you
wish, and put them in the "dh1024.pem" file.
Per report by Nicolas Guini and Damian Quiroga. Reviewed by Michael Paquier.
Discussion: https://www.postgresql.org/message-id/CAMxBoUyjOOautVozN6ofzym828aNrDjuCcOTcCquxjwS-L2hGQ@mail.gmail.com
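Conceptually the new startup path looks like this sketch; error handling is
trimmed, load_builtin_dh2048() is a hypothetical wrapper around the hard-coded
2048-bit parameters, and the file name comes from the new GUC:

    #include <stdio.h>
    #include <openssl/dh.h>
    #include <openssl/pem.h>
    #include <openssl/ssl.h>

    extern DH *load_builtin_dh2048(void);  /* hypothetical: hard-coded 2048-bit params */

    static int
    set_dh_parameters(SSL_CTX *ctx, const char *dh_params_file)
    {
        DH     *dh = NULL;

        if (dh_params_file != NULL && dh_params_file[0] != '\0')
        {
            /* the GUC points at a PEM file with custom parameters */
            FILE   *fp = fopen(dh_params_file, "r");

            if (fp != NULL)
            {
                dh = PEM_read_DHparams(fp, NULL, NULL, NULL);
                fclose(fp);
            }
        }

        if (dh == NULL)
            dh = load_builtin_dh2048();     /* fall back to the built-in parameters */

        /* set the parameters once, up front, instead of via the tmp_dh callback */
        if (SSL_CTX_set_tmp_dh(ctx, dh) != 1)
            return -1;

        DH_free(dh);
        return 0;
    }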
|
|
This allows us to add stack-depth checks the first time an executor
node is called, and skip that overhead on following
calls. Additionally it yields a nice speedup.
While it'd probably have been a good idea to have that check all
along, it has become more important after the new expression
evaluation framework in b8d7f053c5c2bf2a7e - there's no stack depth
check in common paths anymore now. We previously relied on
ExecEvalExpr() being executed somewhere.
We should move towards that model for further routines, but as this is
required for v10, it seems better to only do the necessary (which
already is quite large).
Author: Andres Freund, Tom Lane
Reported-By: Julien Rouhaud
Discussion:
https://postgr.es/m/22833.1490390175@sss.pgh.pa.us
https://postgr.es/m/b0af9eaa-130c-60d0-9e4e-7a135b1e0c76@dalibo.com
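A rough sketch of the "check once, then swap the function pointer" idea
against the PG 10 PlanState; the real code in execProcnode.c differs in
detail (it also deals with instrumentation, for instance):

    #include "postgres.h"
    #include "executor/executor.h"
    #include "miscadmin.h"

    static TupleTableSlot *
    ExecProcNodeFirstSketch(PlanState *node)
    {
        /* pay the stack-depth check only on the first call for this node ... */
        check_stack_depth();

        /* ... then point subsequent calls straight at the real function */
        node->ExecProcNode = node->ExecProcNodeReal;

        return node->ExecProcNode(node);
    }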
|
|
|
|
It is relatively easy to get a replication slot to look still active
while another process is in the middle of getting rid of it; when some
other process tries to "acquire" the slot, it would fail with an error
message of "replication slot XYZ is active for PID N".
The error message in itself is fine, except that when the intention is
to drop the slot, it is unhelpful: the useful behavior would be to wait
until the slot is no longer acquired, so that the drop can proceed. To
implement this, we use a condition variable so that slot acquisition can
be told to wait on that condition variable if the slot is already
acquired, and we make any change in active_pid broadcast a signal on the
condition variable. Thus, as soon as the slot is released, the drop
will proceed properly.
Reported by: Tom Lane
Discussion: https://postgr.es/m/11904.1499039688@sss.pgh.pa.us
Authors: Petr Jelínek, Álvaro Herrera
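In outline, the drop path now waits like this simplified sketch; the real
code lives in slot.c, and a follow-up commit (see above in this log) replaces
PG_WAIT_LOCK with a dedicated wait event:

    #include "postgres.h"
    #include "pgstat.h"
    #include "storage/condition_variable.h"
    #include "storage/spin.h"

    typedef struct SlotSketch
    {
        slock_t           mutex;
        int               active_pid;   /* 0 when the slot is free */
        ConditionVariable active_cv;
    } SlotSketch;

    static void
    wait_until_slot_released(SlotSketch *slot)
    {
        for (;;)
        {
            int     pid;

            SpinLockAcquire(&slot->mutex);
            pid = slot->active_pid;
            SpinLockRelease(&slot->mutex);

            if (pid == 0)
                break;              /* released: the drop can proceed */

            /* sleep until whoever holds the slot broadcasts on the CV */
            ConditionVariableSleep(&slot->active_cv, PG_WAIT_LOCK);
        }
        ConditionVariableCancelSleep();
    }

    static void
    release_slot(SlotSketch *slot)
    {
        SpinLockAcquire(&slot->mutex);
        slot->active_pid = 0;
        SpinLockRelease(&slot->mutex);

        /* every change of active_pid wakes anyone waiting to drop or acquire */
        ConditionVariableBroadcast(&slot->active_cv);
    }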
|
|
Previously, UNBOUNDED meant no lower bound when used in the FROM list,
and no upper bound when used in the TO list, which was OK for
single-column range partitioning, but problematic with multiple
columns. For example, an upper bound of (10.0, UNBOUNDED) would not be
collocated with a lower bound of (10.0, UNBOUNDED), thus making it
difficult or impossible to define contiguous multi-column range
partitions in some cases.
Fix this by using MINVALUE and MAXVALUE instead of UNBOUNDED to
represent a partition column that is unbounded below or above
respectively. This syntax removes any ambiguity, and ensures that if
one partition's lower bound equals another partition's upper bound,
then the partitions are contiguous.
Also drop the constraint prohibiting finite values after an unbounded
column, and just document the fact that any values after MINVALUE or
MAXVALUE are ignored. Previously it was necessary to repeat UNBOUNDED
multiple times, which was needlessly verbose.
Note: Forces a post-PG 10 beta2 initdb.
Report by Amul Sul, original patch by Amit Langote with some
additional hacking by me.
Discussion: https://postgr.es/m/CAAJ_b947mowpLdxL3jo3YLKngRjrq9+Ej4ymduQTfYR+8=YAYQ@mail.gmail.com
|
|
When pg_control was first designed, sizeof(ControlFileData) was small
enough that a comment seemed like plenty to document the assumption that
it'd fit into one disk sector. Now it's nearly 300 bytes, raising the
possibility that somebody would carelessly add enough stuff to create
a problem. Let's add a StaticAssertStmt() to ensure that the situation
doesn't pass unnoticed if it ever occurs.
While at it, rename PG_CONTROL_SIZE to PG_CONTROL_FILE_SIZE to make it
clearer what that symbol means, and convert the existing runtime
comparisons of sizeof(ControlFileData) vs. PG_CONTROL_FILE_SIZE to be
static asserts --- we didn't have that technology when this code was
first written.
Discussion: https://postgr.es/m/9192.1500490591@sss.pgh.pa.us
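A minimal sketch of such a compile-time check; the 512-byte sector assumption
and the message texts are illustrative, and the real asserts sit in the
control-file read/write code:

    #include "postgres.h"
    #include "catalog/pg_control.h"     /* ControlFileData, PG_CONTROL_FILE_SIZE */

    static void
    check_control_file_layout(void)
    {
        StaticAssertStmt(sizeof(ControlFileData) <= 512,
                         "ControlFileData must fit into one 512-byte disk sector");
        StaticAssertStmt(sizeof(ControlFileData) <= PG_CONTROL_FILE_SIZE,
                         "pg_control file size must be able to hold ControlFileData");
    }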
|
|
In an off-list followup to bug #14745, Bob Jones complained that
to_tsvector() on a 2MB jsonb value took an unreasonable amount of
time and space --- enough to draw the wrath of the OOM killer on
his machine. On my machine, his example proved to require upwards
of 18 seconds and 4GB, which seemed pretty bogus considering that
to_tsvector() on the same data treated as text took just a couple
hundred msec and 10 or so MB.
On investigation, the problem is that the implementation scans each
string element of the json(b) and converts it to tsvector separately,
then applies tsvector_concat() to join those separate tsvectors.
The unreasonable memory usage came from leaking every single one of
the transient tsvectors --- but even without that mistake, this is an
O(N^2) or worse algorithm, because tsvector_concat() has to repeatedly
process the words coming from earlier elements.
We can fix it by accumulating all the lexeme data and applying
make_tsvector() just once. As a side benefit, that also makes the
desired adjustment of lexeme positions far cheaper, because we can
just tweak the running "pos" counter between JSON elements.
In passing, try to make the explanation of that tweak more intelligible.
(I didn't think that a barely-readable comment far removed from the
actual code was helpful.) And do some minor other code beautification.
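In outline, the fix turns the per-element convert-and-concat loop into a
single accumulation followed by one make_tsvector() call, along the lines of
this simplified sketch; the JSON iteration and state handling in to_tsany.c
are more involved:

    #include "postgres.h"
    #include "tsearch/ts_utils.h"   /* ParsedText, parsetext(), make_tsvector() */

    static TSVector
    strings_to_tsvector(Oid cfgId, char **strings, int nstrings)
    {
        ParsedText  prs;
        int         i;

        prs.lenwords = 16;
        prs.curwords = 0;
        prs.pos = 0;
        prs.words = (ParsedWord *) palloc(sizeof(ParsedWord) * prs.lenwords);

        for (i = 0; i < nstrings; i++)
        {
            /* parse each string element into the shared lexeme array ... */
            parsetext(cfgId, &prs, strings[i], strlen(strings[i]));

            /*
             * ... and bump the running position counter so lexemes from the
             * next element don't appear adjacent to this one.
             */
            prs.pos += 1;
        }

        /* one tsvector built from all accumulated lexemes, instead of N concats */
        return make_tsvector(&prs);
    }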
|
|
Before, we always used a dummy value of 1, but that's not right when
the partitioned table being modified is inside of a WITH clause
rather than part of the main query.
Amit Langote, reported and reviewed by Etsuro Fujita, with a comment
change by me.
Discussion: http://postgr.es/m/ee12f648-8907-77b5-afc0-2980bcb0aa37@lab.ntt.co.jp
|
|
Add missing infrastructure for this node type, notably in ruleutils.c where
its lack could demonstrably cause EXPLAIN to fail. Add outfuncs/readfuncs
support. (outfuncs support is useful today for debugging purposes. The
readfuncs support may never be needed, since at present it would only
matter for parallel query and NextValueExpr should never appear in a
parallelizable query; but it seems like a bad idea to have a primnode type
that isn't fully supported here.) Teach planner infrastructure that
NextValueExpr is a volatile, parallel-unsafe, non-leaky expression node
with cost cpu_operator_cost. Given its limited scope of usage, there
*might* be no live bug today from the lack of that knowledge, but it's
certainly going to bite us on the rear someday. Teach pg_stat_statements
about the new node type, too.
While at it, also teach cost_qual_eval() that MinMaxExpr, SQLValueFunction,
XmlExpr, and CoerceToDomain should be charged as cpu_operator_cost.
Failing to do this for SQLValueFunction was an oversight in my commit
0bb51aa96. The others are longer-standing oversights, but no time like the
present to fix them. (In principle, CoerceToDomain could have cost much
higher than this, but it doesn't presently seem worth trying to examine the
domain's constraints here.)
Modify execExprInterp.c to execute NextValueExpr as an out-of-line
function; it seems quite unlikely to me that it's worth insisting that
it be inlined in all expression eval methods. Besides, providing the
out-of-line function doesn't stop anyone from inlining if they want to.
Adjust some places where NextValueExpr support had been inserted with the
aid of a dartboard rather than keeping it in the same order as elsewhere.
Discussion: https://postgr.es/m/23862.1499981661@sss.pgh.pa.us
|
|
This merge includes all commits up to bc2d716ad09fceeb391c755f78c256ddac9d3b9f
of PG 10.
|
|
|
|
The storm_catalog schema is supposed to contain the same catalogs and
views as pg_catalog, but filtered to the current database. The use case
for this is multi-tenant systems, which was a StormDB feature.
But on XL this is mostly irrelevant, and the schema was not populated
since commit 8096e3edf17b260de15472eb04567d1beec1e3e6 which disabled
this part of initdb.
So instead of fixing the regression failures in misc_sanity caused by
this (initdb-time schema with no pinned objects), just rip all the
remaining bits out, including the pgxc_catalog_remap GUC etc.
This also removes the setup_storm() call disabled by 8096e3edf1, as the
function got removed since then.
|
|
In WAL receiver and WAL sender, some accesses to their corresponding
shared memory control structs were done without holding any kind of
lock, which could lead to inconsistent and possibly insecure results.
In walsender, fix by clarifying the locking rules and following them
correctly, as documented in the new comment in walsender_private.h;
namely that some members can be read in walsender itself without a lock,
because the only writes occur in the same process. The rest of the
struct requires spinlock for accesses, as usual.
In walreceiver, fix by always holding spinlock while accessing the
struct.
While there is potentially a problem in all branches, it is minor in
stable ones. This only became a real problem in pg10 because of quorum
commit in synchronous replication (commit 3901fd70cc7c), and a potential
security problem in walreceiver because a superuser() check was removed
by default monitoring roles (commit 25fff40798fc). Thus, no backpatch.
In passing, clean up some leftover braces which were used to create
unconditional blocks. Once upon a time these were used for
volatile-izing accesses to those shmem structs, which is no longer
required. Many other occurrences of this pattern remain.
Author: Michaël Paquier
Reported-by: Michaël Paquier
Reviewed-by: Masahiko Sawada, Kyotaro Horiguchi, Thomas Munro,
Robert Haas
Discussion: https://postgr.es/m/CAB7nPqTWYqtzD=LN_oDaf9r-hAjUEPAy0B9yRkhcsLdRN8fzrw@mail.gmail.com
|
|
Author: Thomas Munro <thomas.munro@enterprisedb.com>
|
|
Traditionally, "pg_ctl start -w" has waited for the server to become
ready to accept connections by attempting a connection once per second.
That has the major problem that connection issues (for instance, a
kernel packet filter blocking traffic) can't be reliably told apart
from server startup issues, and the minor problem that if server startup
isn't quick, we accumulate "the database system is starting up" spam
in the server log. We've hacked around many of the possible connection
issues, but it resulted in ugly and complicated code in pg_ctl.c.
In commit c61559ec3, I changed the probe rate to every tenth of a second.
That prompted Jeff Janes to complain that the log-spam problem had become
much worse. In the ensuing discussion, Andres Freund pointed out that
we could dispense with connection attempts altogether if the postmaster
were changed to report its status in postmaster.pid, which "pg_ctl start"
already relies on being able to read. This patch implements that, teaching
postmaster.c to report a status string into the pidfile at the same
state-change points already identified as being of interest for systemd
status reporting (cf commit 7d17e683f). pg_ctl no longer needs to link
with libpq at all; all its functions now depend on reading server files.
In support of this, teach AddToDataDirLockFile() to allow addition of
postmaster.pid lines in not-necessarily-sequential order. This is needed
on Windows where the SHMEM_KEY line will never be written at all. We still
have the restriction that we don't want to truncate the pidfile; document
the reasons for that a bit better.
Also, fix the pg_ctl TAP tests so they'll notice if "start -w" mode
is broken --- before, they'd just wait out the sixty seconds until
the loop gives up, and then report success anyway. (Yes, I found that
out the hard way.)
While at it, arrange for pg_ctl to not need to #include miscadmin.h;
as a rather low-level backend header, requiring that to be compilable
client-side is pretty dubious. This requires moving the #define's
associated with the pidfile into a new header file, and moving
PG_BACKEND_VERSIONSTR someplace else. For lack of a clearly better
"someplace else", I put it into port.h, beside the declaration of
find_other_exec(), since most users of that macro are passing the value to
find_other_exec(). (initdb still depends on miscadmin.h, but at least
pg_ctl and pg_upgrade no longer do.)
In passing, fix main.c so that PG_BACKEND_VERSIONSTR actually defines the
output of "postgres -V", which remarkably it had never done before.
Discussion: https://postgr.es/m/CAMkU=1xJW8e+CTotojOMBd-yzUvD0e_JZu2xHo=MnuZ4__m7Pg@mail.gmail.com
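The wait loop in pg_ctl therefore reduces to something like the following
sketch; the real code reads a specific line of postmaster.pid and knows
several status strings, so the "ready" literal here is just a stand-in:

    #include <stdio.h>
    #include <string.h>

    static int
    postmaster_reports_ready(const char *pidfile_path)
    {
        char    line[128];
        char    status[128] = "";
        FILE   *fp = fopen(pidfile_path, "r");

        if (fp == NULL)
            return 0;               /* pidfile not there yet: keep waiting */

        /* the status string is one of the later lines of postmaster.pid */
        while (fgets(line, sizeof(line), fp) != NULL)
            strncpy(status, line, sizeof(status) - 1);
        fclose(fp);

        return strstr(status, "ready") != NULL;     /* no libpq connection needed */
    }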
|
|
We now disallow having triggers with both transition tables and ON
INSERT OR UPDATE (which was a PG extension to the spec anyway),
because in this case it's not at all clear how the transition tables
should work for an INSERT ... ON CONFLICT query. Separate ON INSERT
and ON UPDATE triggers with transition tables are allowed, and the
transition tables for these reflect only the inserted and only the
updated tuples respectively.
Patch by Thomas Munro
Discussion: https://postgr.es/m/CAEepm%3D11KHQ0JmETJQihSvhZB5mUZL2xrqHeXbCeLhDiqQ39%3Dw%40mail.gmail.com
|
|
The original coding didn't handle this case properly; each separate
DML substatement needs its own set of transitions.
Patch by Thomas Munro
Discussion: https://postgr.es/m/CAL9smLCDQ%3D2o024rBgtD4WihzX8B3C6u_oSQ2K3%2BR5grJrV0bg%40mail.gmail.com
|
|
We disallow row-level triggers with transition tables on child tables.
Transition tables for triggers on the parent table contain only those
columns present in the parent. (We can't mix tuple formats in a
single transition table.)
Patch by Thomas Munro
Discussion: https://postgr.es/m/CA%2BTgmoZzTBBAsEUh4MazAN7ga%3D8SsMC-Knp-6cetts9yNZUCcg%40mail.gmail.com
|
|
This merges the current master branch of XL with the XL 10 development branch.
Commits up to f72330316ea5796a2b11a05710b98eba4e706788 are included in this
merge.
|
|
This commit merges the PG10 branch up to commit
2710ccd782d0308a3fa1ab193531183148e9b626. Regression tests show no noteworthy
additional failures. This merge includes major pgindent work done with the
newer version of pgindent
|
|
pg_import_system_collations() refused to create any ICU collations if
the current database's encoding didn't support ICU. This is wrongheaded:
initdb must initialize pg_collation in an encoding-independent way
since it might be used in other databases with different encodings.
The reason for the restriction seems to be that get_icu_locale_comment()
used icu_from_uchar() to convert the UChar-format display name, and that
unsurprisingly doesn't know what to do in unsupported encodings.
But by the same token that the initial catalog contents must be
encoding-independent, we can't allow non-ASCII characters in the comment
strings. So we don't really need icu_from_uchar() here: just check for
Unicode codes outside the ASCII range, and if there are none, the format
conversion is trivial. If there are some, we can simply not install the
comment. (In my testing, this affects only Norwegian Bokmål, which has
given us trouble before.)
For paranoia's sake, also check for non-ASCII characters in ICU locale
names, and skip such locales, as we do for libc locales. I don't
currently have a reason to believe that this will ever reject anything,
but then again the libc maintainers should have known better too.
With just the import changes, ICU collations can be found in pg_collation
in databases with unsupported encodings. This resulted in more or less
clean failures at runtime, but that's not how things act for unsupported
encodings with libc collations. Make it work the same as our traditional
behavior for libc collations by having collation lookup take into account
whether is_encoding_supported_by_icu().
Adjust documentation to match. Also, expand Table 23.1 to show which
encodings are supported by ICU.
catversion bump because of likely change in pg_collation/pg_description
initial contents in ICU-enabled builds.
Discussion: https://postgr.es/m/20c74bc3-d6ca-243d-1bbc-12f17fa4fe9a@gmail.com
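The encoding-independence test for the comment boils down to checking the
UChar display name for anything outside ASCII, roughly as in this sketch;
the real check in pg_import_system_collations() may differ in detail:

    #include <stdbool.h>
    #include <unicode/utypes.h>     /* UChar, int32_t */

    static bool
    uchar_is_all_ascii(const UChar *str, int32_t len)
    {
        for (int32_t i = 0; i < len; i++)
        {
            if (str[i] > 127)
                return false;       /* non-ASCII: skip installing the comment */
        }
        return true;
    }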
|
|
Marco Atzeri reported that initdb would fail if "locale -a" reported
the same locale name more than once. All previous versions of Postgres
implicitly de-duplicated the results of "locale -a", but the rewrite
to move the collation import logic into C had lost that property.
It had also lost the property that locale names matching built-in
collation names were silently ignored.
The simplest way to fix this is to make initdb run the function in
if-not-exists mode, which means that there's no real use-case for
non if-not-exists mode; we might as well just drop the boolean argument
and simplify the function's definition to be "add any collations not
already known". This change also gets rid of some odd corner cases
caused by the fact that aliases were added in if-not-exists mode even
if the function argument said otherwise.
While at it, adjust the behavior so that pg_import_system_collations()
doesn't spew "collation foo already exists, skipping" messages during a
re-run; that's completely unhelpful, especially since there are often
hundreds of them. And make it return a count of the number of collations
it did add, which seems like it might be helpful.
Also, re-integrate the previous coding's property that it would make a
deterministic selection of which alias to use if there were conflicting
possibilities. This would only come into play if "locale -a" reports
multiple equivalent locale names, say "de_DE.utf8" and "de_DE.UTF-8",
but that hardly seems out of the question.
In passing, fix incorrect behavior in pg_import_system_collations()'s
ICU code path: it neglected CommandCounterIncrement, which would result
in failures if ICU returns duplicate names, and it would try to create
comments even if a new collation hadn't been created.
Also, reorder operations in initdb so that the 'ucs_basic' collation
is created before calling pg_import_system_collations() not after.
This prevents a failure if "locale -a" were to report a locale named
that. There's no reason to think that that ever happens in the wild,
but the old coding would have survived it, so let's be equally robust.
Discussion: https://postgr.es/m/20c74bc3-d6ca-243d-1bbc-12f17fa4fe9a@gmail.com
|
|
Callers of icu_to_uchar() neglected to pfree the result string when done
with it. This results in catastrophic memory leaks in varstr_cmp(),
because of our prevailing assumption that btree comparison functions don't
leak memory. For safety, make all the call sites clean up leaks, though
I suspect that we could get away without it in formatting.c. I audited
callers of icu_from_uchar() as well, but found no places that seemed to
have a comparable issue.
Add function API specifications for icu_to_uchar() and icu_from_uchar();
the lack of any thought-through specification is perhaps not unrelated
to the existence of this bug in the first place. Fix icu_to_uchar()
to guarantee a nul-terminated result; although no existing caller appears
to care, the fact that it would have been nul-terminated except in
extreme corner cases seems ideally designed to bite someone on the rear
someday. Fix ucnv_fromUChars() destCapacity argument --- in the worst
case, that could perhaps have led to a non-nul-terminated result, too.
Fix icu_from_uchar() to have a more reasonable definition of the function
result --- no callers are actually paying attention, so this isn't a live
bug, but it's certainly sloppily designed. Const-ify icu_from_uchar()'s
input string for consistency.
That is not the end of what needs to be done to these functions, but
it's as much as I have the patience for right now.
Discussion: https://postgr.es/m/1955.1498181798@sss.pgh.pa.us
|
|
Also add a comment on its new member PartitionRoot.
Reported-by: Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>
|