AgeCommit message (Collapse)Author
2017-06-10Disable aggregate paths with extra COMBINE phaseTomas Vondra
The planner was generating aggregate paths with an additional COMBINE step pushed to the remote side, like this:

                           QUERY PLAN
    ---------------------------------------------------------
     Finalize Aggregate
       ->  Remote Subquery Scan on all (datanode1,datanode2)
             ->  Partial Aggregate (Combine)
                   ->  Gather
                         ->  Partial Aggregate
                               ->  Parallel Seq Scan on public.t

This was done with the goal of reducing the amount of data transmitted over the network, and the amount of work to be done on a coordinator. Unfortunately, the upstream code seems not quite ready for such plans, leading to failures like this

    ERROR: variable not found in subplan target list

for large amounts of data and high max_parallel_workers_per_gather. Those plans would still be quite beneficial, improving the scalability of Postgres-XL clusters in analytics. But we can reintroduce them once the targetlist issue gets fixed.
2017-06-10Remove bogus http:// from a program listing in docsTomas Vondra
2017-06-10doc: Add Node.js and Go drivers to client interfacesPeter Eisentraut
Also, fix client interface JDBC language name to Java. Author: Sehrope Sarkuni <sehrope@jackdb.com>
2017-06-09doc: Improve types in examplePeter Eisentraut
Reported-by: Nikolaus Thiel <klt@fsfe.org>
2017-06-09doc: Document that subscriptions to same server might hangPeter Eisentraut
2017-06-09Silence warning about uninitialized 'ret' variable on some compilers.Heikki Linnakangas
If the compiler doesn't notice that the switch-statement handles all possible values of the enum, it might complain that 'ret' is being used without initialization. Jeff Janes reported that on gcc 4.4.7. Discussion: https://www.postgresql.org/message-id/CAMkU=1x31RvP+cpooFbmc8K8nt-gNO8woGFhXcgQYYZ5ozYpFA@mail.gmail.com
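The warning described above can be illustrated with a minimal sketch. The enum and function names here are hypothetical, and initializing the variable up front is only one common way to silence such a warning; the commit message does not say which approach was actually taken.

```c
/* Hypothetical enum and function, for illustration only. */
typedef enum { MODE_A, MODE_B, MODE_C } Mode;

int mode_weight(Mode mode)
{
    /*
     * Initialize up front, so that a compiler which cannot prove that the
     * switch covers every enum value does not warn about 'ret'.
     */
    int ret = -1;

    switch (mode)
    {
        case MODE_A:
            ret = 1;
            break;
        case MODE_B:
            ret = 2;
            break;
        case MODE_C:
            ret = 3;
            break;
    }
    return ret;
}
```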
2017-06-09Formatting improvements in config file samplesPeter Eisentraut
2017-06-09Update code commentsPeter Eisentraut
Author: Neha Khatri <nehakhatri5@gmail.com>
2017-06-09Fix typoPeter Eisentraut
Author: Masahiko Sawada <sawada.mshk@gmail.com>
2017-06-09Fix minor issues in the tpcb-like pgbench scriptTomas Vondra
The tpcb-like built-in script in pgbench contained two simple bugs. It was still using the old \setrandom command to generate the delta value, instead of the new

    \set delta random(-5000, 5000)

This is mostly an omission in 32d57848458595a487d251b37c2872d86de439ef. There was also a missing semicolon at the end of one of the commands, causing cryptic syntax errors.
2017-06-09psql: Update tab completion for ALTER SUBSCRIPTIONPeter Eisentraut
Author: Masahiko Sawada <sawada.mshk@gmail.com>
2017-06-09Improve tablesync behavior with concurrent changesPeter Eisentraut
When a table is removed from a subscription before the tablesync worker could start, this would previously result in an error when reading pg_subscription_rel. Now we just ignore this. Author: Masahiko Sawada <sawada.mshk@gmail.com>
2017-06-09Give a better error message on invalid hostaddr option.Heikki Linnakangas
If you accidentally pass a host name in the hostaddr option, e.g. hostaddr=localhost, you get an error like:

    psql: could not translate host name "localhost" to address: Name or service not known

That's a bit confusing, because it implies that we tried to look up "localhost" in DNS, but it failed. To make it more clear that we tried to parse "localhost" as a numeric network address, change the message to:

    psql: could not parse network address "localhost": Name or service not known

Discussion: https://www.postgresql.org/message-id/10badbc6-4d5a-a769-623a-f7ada43e14dd@iki.fi
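The difference between looking up a host name and parsing a numeric address can be sketched with the POSIX getaddrinfo() call and its AI_NUMERICHOST flag. This is a standalone illustration, not the libpq code; the function name parse_numeric_address is hypothetical.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Returns 0 if 'text' is a numeric IPv4/IPv6 address, or a non-zero
 * getaddrinfo() error code otherwise. AI_NUMERICHOST forbids any name
 * resolution, so a host name such as "localhost" is rejected without
 * a DNS lookup.
 */
int parse_numeric_address(const char *text)
{
    struct addrinfo hints;
    struct addrinfo *result = NULL;
    int rc;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST;

    rc = getaddrinfo(text, NULL, &hints, &result);
    if (rc == 0)
        freeaddrinfo(result);
    return rc;
}
```

A caller could pass the non-zero return value to gai_strerror() to obtain text such as "Name or service not known", which is why the old and new messages end the same way.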
2017-06-09Fix script name in README.Heikki Linnakangas
The script was rewritten in Perl, and renamed from regress.sh to regress.pl, back in 2012.
2017-06-08Use standard interrupt handling in logical replication launcher.Andres Freund
Previously the exit handling was only able to exit from within the main loop, and not from within the backend code it calls. Fix that by using the standard die() SIGTERM handler, and adding the necessary CHECK_FOR_INTERRUPTS() call. This requires adding yet another process-type-specific branch to ProcessInterrupts(), which hints that we probably should generalize that handling. But that's work for another day. Author: Petr Jelinek Reviewed-By: Andres Freund Discussion: https://postgr.es/m/fe072153-babd-3b5d-8052-73527a6eb657@2ndquadrant.com
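The general pattern, a signal handler that only sets a flag plus explicit checks inside long-running code, can be sketched as follows. This is a simplified model and not the PostgreSQL implementation: check_for_interrupts() is a hypothetical stand-in for CHECK_FOR_INTERRUPTS(), and the signal is raised from inside the loop purely to simulate it arriving mid-work.

```c
#include <signal.h>

/* Set by the signal handler, polled by the worker code. */
static volatile sig_atomic_t shutdown_requested = 0;

static void handle_sigterm(int signo)
{
    (void) signo;
    shutdown_requested = 1;     /* only set a flag; act on it later */
}

void install_handler(void)
{
    signal(SIGTERM, handle_sigterm);
}

/* Hypothetical stand-in for CHECK_FOR_INTERRUPTS(). */
int check_for_interrupts(void)
{
    return shutdown_requested != 0;
}

/*
 * Performs up to max_steps units of work and returns how many were
 * completed. A SIGTERM is raised at step 'interrupt_at' to simulate a
 * signal arriving while the worker is deep inside called code.
 */
int run_work(int max_steps, int interrupt_at)
{
    int done = 0;

    for (int i = 0; i < max_steps; i++)
    {
        if (i == interrupt_at)
            raise(SIGTERM);
        if (check_for_interrupts())
            break;              /* exit promptly, not only in the main loop */
        done++;
    }
    return done;
}
```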
2017-06-08Again report a useful error message when walreceiver's connection closes.Andres Freund
Since 7c4f52409a8c (merged in v10), a shutdown master is reported as

    FATAL: unexpected result after CommandComplete: server closed the connection unexpectedly

by walsender. It used to be

    LOG: replication terminated by primary server
    FATAL: could not send end-of-streaming message to primary: no COPY in progress

While the old message clearly is not perfect, it's definitely better than what's reported now. The change comes from the attempt to handle finished COPYs without erroring out, needed for the new logical replication, which wasn't needed before. There are probably better ways to handle this, but for now just explicitly check for a closed connection. Author: Petr Jelinek Reviewed-By: Andres Freund Discussion: https://postgr.es/m/f7c7dd08-855c-e4ed-41f4-d064a6c0665a@2ndquadrant.com Backpatch: -
2017-06-08Update key words table for version 10Peter Eisentraut
2017-06-08Mark to_tsvector(regconfig,json[b]) functions immutableAndrew Dunstan
This makes them consistent with the text function and means they can be used in functional indexes. Catalog version bumped. Per gripe from Josh Berkus.
2017-06-08Fix bit-rot in pg_upgrade's test.sh, and improve documentation.Tom Lane
Doing a cross-version upgrade test with test.sh evidently hasn't been tested since circa 9.2, because the script lacked case branches for old-version servers newer than 9.1. Future-proof that a bit, and clean up breakage induced by our recent drop of V0 function call protocol (namely that oldstyle_length() isn't in the regression suite anymore). (This isn't enough to make the test work perfectly cleanly across versions, but at least it finishes and provides dump files that you can diff manually. One issue I didn't touch is that we might want to execute the "reindex_hash.sql" file in the new DB before dumping it, so that the hash indexes don't vanish from the dump.) Improve the TESTING doc file: put the tl;dr version at the top not the bottom, and bring its explanation of how to run a cross-version test up to speed, since the installcheck target isn't there and won't be resurrected. Improve the comment in the Makefile about why not. In passing, teach .gitignore and "make clean" about a couple more junk output files. Discussion: https://postgr.es/m/14058.1496892482@sss.pgh.pa.us
2017-06-08Improve authentication error messages.Heikki Linnakangas
Most of the improvements were in the new SCRAM code:

* In SCRAM protocol violation messages, use errdetail to provide the details.
* If pg_backend_random() fails, throw an ERROR rather than just LOG. We shouldn't continue authentication if we can't generate a random nonce.
* Use ereport() rather than elog() for the "invalid SCRAM verifier" messages. They shouldn't happen, if everything works, but it's not inconceivable that someone would have invalid scram verifiers in pg_authid, e.g. if a broken client application was used to generate the verifier.

But this change applied to old code:

* Use ERROR rather than COMMERROR for protocol violation errors. There's no reason to not tell the client what they did wrong. The client might be confused already, so that it cannot read and display the error correctly, but let's at least try. In the "invalid password packet size" case, we used to actually continue with authentication anyway, but that is now a hard error.

Patch by Michael Paquier and me. Thanks to Daniel Varrazzo for spotting the typo in one of the messages that spurred the discussion and these larger changes.
Discussion: https://www.postgresql.org/message-id/CA%2Bmi_8aZYLhuyQi1Jo0hO19opNZ2OEATEOM5fKApH7P6zTOZGg%40mail.gmail.com
2017-06-08Fix warnings about uninitialized vars in pg_dump.cTomas Vondra
Three XL-specific fields in getTables() were initialized and used in two independent if blocks looking like this:

    if (fout->isPostgresXL)
    {
        i_pgxclocatortype = PQfnumber(res, "pgxclocatortype");
        i_pgxcattnum = PQfnumber(res, "pgxcattnum");
        i_pgxc_node_names = PQfnumber(res, "pgxc_node_names");
    }

    if (fout->isPostgresXL)
    {
        ... use the variables ...
    }

This, however, confuses the compiler (gcc 5.3.1), which then complains that the variables may be used uninitialized. The fix is simple: make the initialization unconditional. If there are no such columns then PQfnumber() will return -1, but we'll not use the value anyway.
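The shape of the fix can be sketched without libpq. In this illustration find_column() is a hypothetical stand-in for PQfnumber() (it returns the column index, or -1 when the column is absent), and locator_type_column() is a hypothetical caller; neither is the actual pg_dump code.

```c
#include <string.h>

/* Hypothetical stand-in for PQfnumber(): column index, or -1 if absent. */
static int find_column(const char *const *columns, int ncolumns,
                       const char *name)
{
    for (int i = 0; i < ncolumns; i++)
    {
        if (strcmp(columns[i], name) == 0)
            return i;
    }
    return -1;
}

int locator_type_column(const char *const *columns, int ncolumns,
                        int is_postgres_xl)
{
    /*
     * Unconditional lookup: the variable is always initialized. When the
     * column does not exist the value is simply -1 and is never used.
     */
    int i_pgxclocatortype = find_column(columns, ncolumns, "pgxclocatortype");

    if (is_postgres_xl)
        return i_pgxclocatortype;
    return -1;
}
```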
2017-06-08Put new command-line options in alphabetical orderPeter Eisentraut
2017-06-08Fix compiler warnings due to unused variablesTomas Vondra
Removes a few variables that were either entirely unused, or just set and never read again.
2017-06-08Add statistics subdirectory to Makefile.Robert Haas
Commit 7b504eb282ca2f5104b5c00b4f05a3ef6bb1385b overlooked this. Report and patch by Kyotaro Horiguchi Discussion: http://postgr.es/m/20170608.145852.54673832.horiguchi.kyotaro@lab.ntt.co.jp
2017-06-08Add remote subquery step to recurse_set_operationsTomas Vondra
During the initial phase of resolving 9.6 merge conflicts in the planner we have switched back to clean upstream code for some files (including prepunion.c). Then we reintroduced the XL-specific bits with necessary tweaks, caused particularly by upper-planner pathification. We have, however, forgotten about this bit in recurse_set_operations, so this commit fixes that by adding the redistribution again. This fixes failures in the collate, copyselect and union regression suites. Patch by senhu <senhu@tencent.com>, review and commit by me.
2017-06-08Switch connections after processing PGXLRemoteFetchSize rowsPavan Deolasee
Fast-query-shipping consumes all rows produced by one datanode (connection) before moving to the next connection. This leads to suboptimal performance when the datanodes can't produce tuples at a desired pace. Instead, we switch between connections after every PGXLRemoteFetchSize (pgx_remote_fetch_size) rows are fetched. This gives a datanode a chance to produce more tuples while the coordinator consumes tuples already produced and sent across by another datanode. This seems to improve performance for FQS-ed queries significantly when they are returning a large number of rows from more than one datanode. Report by Pilar de Teodoro <pteodoro@sciops.esa.int>, initial analysis and performance tests by Tomas Vondra, further analysis and patch by me. Backpatched to XL9_5_STABLE.
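The switching strategy can be modeled in a few lines. This is a toy simulation under stated assumptions, not the Postgres-XL executor: Conn, drain_round_robin() and the order[] trace are hypothetical, and each connection is reduced to a counter of rows it still has to deliver.

```c
/* Hypothetical model of a datanode connection: rows still to be read. */
typedef struct
{
    int rows_left;
} Conn;

/*
 * Drains all connections, moving on to the next connection after at most
 * fetch_size rows. The index of the connection that supplied each row is
 * recorded in order[] (capacity 'cap'). Returns the number of rows consumed.
 */
int drain_round_robin(Conn *conns, int nconns, int fetch_size,
                      int *order, int cap)
{
    int consumed = 0;
    int active = 1;

    if (fetch_size <= 0)
        return 0;               /* avoid looping forever on a bad setting */

    while (active)
    {
        active = 0;
        for (int c = 0; c < nconns; c++)
        {
            int batch = 0;

            while (conns[c].rows_left > 0 && batch < fetch_size &&
                   consumed < cap)
            {
                conns[c].rows_left--;
                order[consumed++] = c;
                batch++;
            }
            /* Come back to this connection on the next pass if needed. */
            if (conns[c].rows_left > 0 && consumed < cap)
                active = 1;
        }
    }
    return consumed;
}
```

With two connections holding 3 and 2 rows and a fetch size of 2, the rows are consumed in the order 0, 0, 1, 1, 0 rather than 0, 0, 0, 1, 1, so the second datanode is read before the first is exhausted.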
2017-06-08Fix contrib/sepgsql regr tests for tup-routing constraint check change.Joe Conway
Commit 15ce775 changed tuple-routing constraint checking logic. This affects the expected output for contrib/sepgsql, because there are no longer LOG entries reporting allowance of int4eq() execution. Per buildfarm.
2017-06-07Docs: improve CREATE TABLE ref page's discussion of partition bounds.Tom Lane
Clarify in the syntax synopsis that partition bound values must be exactly numeric literals or string literals; previously it said "bound_literal" which was defined nowhere. Replace confusing --- and, I think, incorrect in detail --- definition of how range bounds work with a reference to row-wise comparison plus a concrete example (which I stole from Robert Haas). Minor copy-editing in the same area. Discussion: https://postgr.es/m/30475.1496005465@sss.pgh.pa.us Discussion: https://postgr.es/m/28106.1496041449@sss.pgh.pa.us
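Row-wise comparison of multi-column range bounds can be sketched as a lexicographic comparison. This is an illustration only: the function names are hypothetical, the keys are plain integers, and the lower bound is assumed to be inclusive and the upper bound exclusive.

```c
/* Row-wise (lexicographic) comparison of two ncols-column integer keys. */
int rowwise_compare(const int *a, const int *b, int ncols)
{
    for (int i = 0; i < ncols; i++)
    {
        if (a[i] < b[i])
            return -1;
        if (a[i] > b[i])
            return 1;
    }
    return 0;
}

/*
 * Hypothetical membership test for a range partition, assuming an
 * inclusive lower bound and an exclusive upper bound.
 */
int in_range_partition(const int *key, const int *lower, const int *upper,
                       int ncols)
{
    return rowwise_compare(key, lower, ncols) >= 0 &&
           rowwise_compare(key, upper, ncols) < 0;
}
```

Under these assumptions, bounds of (1, 2) and (3, 4) on columns (x, y) accept x=1 with any y>=2, x=2 with any y, and x=3 with any y<4, because later columns only matter when the earlier ones are equal.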
2017-06-07postgres_fdw: Allow cancellation of transaction control commands.Robert Haas
Commit f039eaac7131ef2a4cf63a10cf98486f8bcd09d2, later back-patched with commit 1b812afb0eafe125b820cc3b95e7ca03821aa675, allowed many of the queries issued by postgres_fdw to fetch remote data to respond to cancel interrupts in a timely fashion. However, it didn't do anything about the transaction control commands, which remained noninterruptible. Improve the situation by changing do_sql_command() to retrieve query results using pgfdw_get_result(), which uses the asynchronous interface to libpq so that it can check for interrupts every time libpq returns control. Since this might result in a situation where we can no longer be sure that the remote transaction state matches the local transaction state, add a facility to force all levels of the local transaction to abort if we've lost track of the remote state; without this, an apparently-successful commit of the local transaction might fail to commit changes made on the remote side. Also, add a 60-second timeout for queries issued during transaction abort; if that expires, give up and mark the state of the connection as unknown. Drop all such connections when we exit the local transaction. Together, these changes mean that if we're aborting the local toplevel transaction anyway, we can just drop the remote connection in lieu of waiting (possibly for a very long time) for it to complete an abort. This still leaves quite a bit of room for improvement. PQcancel() has no asynchronous interface, so if we get stuck sending the cancel request we'll still hang. Also, PQsetnonblocking() is not used, which means we could block uninterruptibly when sending a query. There might be some other optimizations possible as well. Nonetheless, this allows us to escape a wait for an unresponsive remote server quickly in many more cases than previously. Report by Suraj Kharage. Patch by me and Rafia Sabih. Review and testing by Amit Kapila and Tushar Ahuja.
Discussion: http://postgr.es/m/CAF1DzPU8Kx+fMXEbFoP289xtm3bz3t+ZfxhmKavr98Bh-C0TqQ@mail.gmail.com
2017-06-07Fix updating of pg_subscription_rel from workersPeter Eisentraut
A logical replication worker should not insert new rows into pg_subscription_rel, only update existing rows, so that there are no races if a concurrent refresh removes rows. Adjust the API to be able to choose that behavior. Author: Masahiko Sawada <sawada.mshk@gmail.com> Reported-by: tushar <tushar.ahuja@enterprisedb.com>
2017-06-07Prevent BEFORE triggers from violating partitioning constraints.Robert Haas
Since tuple-routing implicitly checks the partitioning constraints at least for the levels of the partitioning hierarchy it traverses, there's normally no need to revalidate the partitioning constraint after performing tuple routing. However, if there's a BEFORE trigger on the target partition, it could modify the tuple, causing the partitioning constraint to be violated. Catch that case. Also, instead of checking the root table's partition constraint after tuple-routing, check it beforehand. Otherwise, the rules for when the partitioning constraint gets checked get too complicated, because you sometimes have to check part of the constraint but not all of it. This effectively reverts commit 39162b2030fb0a35a6bb28dc636b5a71b8df8d1c in favor of a different approach altogether. Report by me. Initial debugging by Jeevan Ladhe. Patch by Amit Langote, reviewed by me. Discussion: http://postgr.es/m/CA+Tgmoa9DTgeVOqopieV8d1QRpddmP65aCdxyjdYDoEO5pS5KA@mail.gmail.com
2017-06-07Clear auth context correctly when re-connecting after failed auth attempt.Heikki Linnakangas
If authentication over an SSL connection fails, with sslmode=prefer, libpq will reconnect without SSL and retry. However, we did not clear the variables related to GSS, SSPI, and SASL authentication state, when reconnecting. Because of that, the second authentication attempt would always fail with a "duplicate GSS/SASL authentication request" error. pg_SSPI_startup did not check for duplicate authentication requests like the corresponding GSS and SASL functions, so with SSPI, you would leak some memory instead. Another way this could manifest itself, on version 10, is if you list multiple hostnames in the "host" parameter. If the first server requests Kerberos or SCRAM authentication, but it fails, the attempts to connect to the other servers will also fail with "duplicate authentication request" errors. To fix, move the clearing of authentication state from closePGconn to pgDropConnection, so that it is cleared also when re-connecting. Patch by Michael Paquier, with some kibitzing by me. Backpatch down to 9.3. 9.2 has the same bug, but the code around closing the connection is somewhat different, so that this patch doesn't apply. To fix this in 9.2, I think we would need to back-port commit 210eb9b743 first, and then apply this patch. However, given that we only bumped into this in our own testing, we haven't heard any reports from users about this, and that 9.2 will be end-of-lifed in a couple of months anyway, it doesn't seem worth the risk and trouble. Discussion: https://www.postgresql.org/message-id/CAB7nPqRuOUm0MyJaUy9L3eXYJU3AKCZ-0-03=-aDTZJGV4GyWw@mail.gmail.com
2017-06-07Fix double-free bug in GSS authentication.Heikki Linnakangas
The logic to free the buffer after the gss_init_sec_context() call was always a bit wonky. Because gss_init_sec_context() sets the GSS context variable, conn->gctx, we would in fact always attempt to free the buffer. That only works, because previously conn->ginbuf.value was initialized to NULL, and free(NULL) is a no-op. Commit 61bf96cab0 refactored things so that the GSS input token buffer is allocated locally in pg_GSS_continue, and not held in the PGconn object. After that, the now-local ginbuf.value variable isn't initialized when it's not used, so we pass a bogus pointer to free(). To fix, only try to free the input buffer if we allocated it. That was the intention, certainly after the refactoring, and probably even before that. But because there's no live bug before the refactoring, I refrained from backpatching this. The bug was also independently reported by Graham Dutton, as bug #14690. Patch reviewed by Michael Paquier. Discussion: https://www.postgresql.org/message-id/6288d80e-a0bf-d4d3-4e12-7b79c77f1771%40iki.fi Discussion: https://www.postgresql.org/message-id/20170605130954.1438.90535%40wrigleys.postgresql.org
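The rule "only free the buffer if we allocated it" can be sketched as follows. This is a generic illustration, not the libpq code: process_token() is a hypothetical function, and the local pointer is initialized to NULL so that the free is well defined on every path.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical token handler. Copies the token into a locally allocated
 * buffer when there is one, and returns the number of bytes handled.
 */
size_t process_token(const char *token, size_t len)
{
    char   *inbuf = NULL;       /* NULL means "nothing allocated" */
    size_t  handled = 0;

    if (len > 0)
    {
        inbuf = malloc(len);
        if (inbuf == NULL)
            return 0;
        memcpy(inbuf, token, len);
        handled = len;
    }

    /* ... the buffer would be consumed here ... */

    /* Free only what this function allocated; never a stale pointer. */
    if (inbuf != NULL)
        free(inbuf);
    return handled;
}
```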
2017-06-07Consistently use subscription name as application namePeter Eisentraut
The logical replication apply worker uses the subscription name as application name, except for table sync. This was incorrectly set to use the replication slot name, which might be different, in one case. Also add a comment why the other case is different.
2017-06-06Clean up latch related code.Andres Freund
The larger part of this patch replaces usages of MyProc->procLatch with MyLatch. The latter works even early during backend startup, where MyProc->procLatch doesn't yet. While the affected code shouldn't run in cases where it's not initialized, it might get copied into places where it might. Using MyLatch is simpler and a bit faster to boot, so there's little point to stick with the previous coding. While doing so I noticed some weaknesses around newly introduced uses of latches that could lead to missed events, and an omitted CHECK_FOR_INTERRUPTS() call in worker_spi. As all the actual bugs are in v10 code, there doesn't seem to be sufficient reason to backpatch this. Author: Andres Freund Discussion: https://postgr.es/m/20170606195321.sjmenrfgl2nu6j63@alap3.anarazel.de https://postgr.es/m/20170606210405.sim3yl6vpudhmufo@alap3.anarazel.de Backpatch: -
2017-06-06Improve handover logic between sync and apply workersPeter Eisentraut
Make apply busy wait check the catalog instead of shmem state to ensure that next transaction will see the expected table synchronization state. Also make the handover always go through same set of steps to make the overall process easier to understand and debug. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Tested-by: Mark Kirkwood <mark.kirkwood@catalyst.net.nz> Tested-by: Erik Rijkers <er@xs4all.nl>
2017-06-06Fix some cases of "the the" split across two lines.Robert Haas
Kevin Grittner observed that 2186b608b3cb859fe0ec04015a5c4e4cbf69caed introduced a new occurrence of this by copying existing text, and I found a few more cases using grep. Discussion: http://postgr.es/m/CADAecHWfG-K+YvocHCkrXV-ycm+eUOaaUVfYZNOnwf0pSmuQCw@mail.gmail.com
2017-06-06Use NIL rather than NULL to represent an empty list.Robert Haas
Just to be tidy. Amit Langote Discussion: http://postgr.es/m/9297f80f-e4ab-7dda-33d4-8580bab6d634@lab.ntt.co.jp
2017-06-06Clean up partcollation handling for OID 0.Robert Haas
Consistent with what we do for indexes, we shouldn't try to record dependencies on collation OID 0 or the default collation OID (which is pinned). Also, the fact that indcollation and partcollation can contain zero OIDs when the data type is not collatable should be documented. Amit Langote, per a complaint from me. Discussion: http://postgr.es/m/CA+Tgmoba5mtPgM3NKfG06vv8na5gGbVOj0h4zvivXQwLw8wXXQ@mail.gmail.com
2017-06-06Fix docs to not claim ECPG's SET CONNECTION is not thread-aware.Michael Meskes
Changed by: Tsunakawa, Takayuki <tsunakawa.takay@jp.fujitsu.com>
2017-06-06Wire up query cancel interrupt for walsender backends.Andres Freund
This allows cancelling commands run over replication connections. While it might have some use before v10, it has become important now that normal SQL commands are allowed in database-connected walsender connections. Author: Petr Jelinek Reviewed-By: Andres Freund, Michael Paquier Discussion: https://postgr.es/m/7966f454-7cd7-2b0c-8b70-cdca9d5a8c97@2ndquadrant.com
2017-06-06Unify SIGHUP handling between normal and walsender backends.Andres Freund
Because walsender and normal backends share the same main loop it's problematic to have two different flag variables, set in signal handlers, indicating a pending configuration reload. Only certain walsender commands reach code paths checking for the variable (START_[LOGICAL_]REPLICATION, CREATE_REPLICATION_SLOT ... LOGICAL, notably not base backups). This is a bug present since the introduction of walsender, but has gotten worse in releases since then which allow walsender to do more. A later patch, not slated for v10, will similarly unify SIGHUP handling in other types of processes as well. Author: Petr Jelinek, Andres Freund Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20170423235941.qosiuoyqprq4nu7v@alap3.anarazel.de Backpatch: 9.2-, bug is present since 9.0
2017-06-06Prevent possibility of panics during shutdown checkpoint.Andres Freund
When the checkpointer writes the shutdown checkpoint, it checks afterwards whether any WAL has been written since it started and throws a PANIC if so. At that point, only walsenders are still active, so one might think this could not happen, but walsenders can also generate WAL, for instance in BASE_BACKUP and logical decoding related commands (e.g. via hint bits). So they can trigger this panic if such a command is run while the shutdown checkpoint is being written. To fix this, divide the walsender shutdown into two phases. First, checkpointer, itself triggered by postmaster, sends a PROCSIG_WALSND_INIT_STOPPING signal to all walsenders. If the backend is idle or runs an SQL query, this causes the backend to shut down; if logical replication is in progress, all existing WAL records are processed, followed by a shutdown. Otherwise this causes the walsender to switch to the "stopping" state. In this state, the walsender will reject any further replication commands. The checkpointer begins the shutdown checkpoint once all walsenders are confirmed as stopping. When the shutdown checkpoint finishes, the postmaster sends us SIGUSR2. This instructs walsender to send any outstanding WAL, including the shutdown checkpoint record, wait for it to be replicated to the standby, and then exit. Author: Andres Freund, based on an earlier patch by Michael Paquier Reported-By: Fujii Masao, Andres Freund Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20170602002912.tqlwn4gymzlxpvs2@alap3.anarazel.de Backpatch: 9.4, where logical decoding was introduced
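The two-phase shutdown can be modeled as a small state machine. This is a simplified sketch of the description above, not the walsender code: all type and function names are hypothetical, and the "process remaining WAL, then shut down" case for logical replication is collapsed into a direct transition to the exited state.

```c
/* What a walsender is doing when phase 1 starts (hypothetical names). */
typedef enum
{
    ACT_IDLE_OR_SQL,            /* idle, or running a plain SQL query */
    ACT_LOGICAL_REPLICATION,    /* logical replication in progress */
    ACT_OTHER                   /* anything else */
} Activity;

typedef enum { WS_RUNNING, WS_STOPPING, WS_EXITED } State;

typedef struct
{
    Activity activity;
    State    state;
} WalSender;

/* Phase 1: reaction to the "init stopping" signal from the checkpointer. */
void phase1_init_stopping(WalSender *ws)
{
    if (ws->state != WS_RUNNING)
        return;
    if (ws->activity == ACT_OTHER)
        ws->state = WS_STOPPING;    /* stays alive but rejects commands */
    else
        ws->state = WS_EXITED;      /* shuts down right away */
}

/* A stopping walsender rejects further replication commands. */
int accepts_replication_command(const WalSender *ws)
{
    return ws->state == WS_RUNNING;
}

/* The shutdown checkpoint may begin once no walsender is still running. */
int checkpoint_may_start(const WalSender *senders, int n)
{
    for (int i = 0; i < n; i++)
    {
        if (senders[i].state == WS_RUNNING)
            return 0;
    }
    return 1;
}

/* Phase 2: after the checkpoint, send outstanding WAL and then exit. */
void phase2_final_stop(WalSender *ws)
{
    if (ws->state == WS_STOPPING)
        ws->state = WS_EXITED;
}
```

The point of the intermediate state is that no walsender can generate new WAL between the start and the end of the shutdown checkpoint, while the checkpoint record itself can still be sent afterwards.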
2017-06-06Have walsenders participate in procsignal infrastructure.Andres Freund
The non-participation in procsignal was a problem for both changes in master, e.g. parallelism not working for normal statements run in walsender backends, and older branches, e.g. recovery conflicts and catchup interrupts not working for logical decoding walsenders. This commit thus replaces the previous WalSndXLogSendHandler with procsignal_sigusr1_handler. In branches since db0f6cad48 that can lead to additional SetLatch calls, but that only rarely seems to make a difference. Author: Andres Freund Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20170421014030.fdzvvvbrz4nckrow@alap3.anarazel.de Backpatch: 9.4, earlier commits don't seem to benefit sufficiently
2017-06-06Revert "Prevent panic during shutdown checkpoint"Andres Freund
This reverts commit 086221cf6b1727c2baed4703c582f657b7c5350e, which was made to master only. The approach implemented in the above commit has some issues. While those could easily be fixed incrementally, doing so would make backpatching considerably harder, so instead first revert this patch. Discussion: https://postgr.es/m/20170602002912.tqlwn4gymzlxpvs2@alap3.anarazel.de
2017-06-06Don't set application_name in logical replication workersPeter Eisentraut
This was bothering some people because it's not the intended use of application_name and it makes the default view of pg_stat_activity bulky.
2017-06-06Fix ALTER SUBSCRIPTION grammar ambiguityPeter Eisentraut
There was a grammar ambiguity between SET PUBLICATION name REFRESH and SET PUBLICATION SKIP REFRESH, because SKIP is not a reserved word. To resolve that, fold the refresh choice into the WITH options. Refreshing is the default now. Reported-by: tushar <tushar.ahuja@enterprisedb.com>
2017-06-06Ignore WL_POSTMASTER_DEATH latch event in single user modePeter Eisentraut
Otherwise code that uses this will abort with an assertion failure, because postmaster_alive_fds are not initialized. Reported-by: tushar <tushar.ahuja@enterprisedb.com>
2017-06-06Fix thinko in previous openssl changeAndrew Dunstan
2017-06-05Fix record length computation in pg_waldump/xlogdump.Andres Freund
The current method of computing the record length (excluding the length of full-page images) has been wrong since the WAL format was revamped in 2c03216d831160bedd72d45f712601b6f7d03f1c. Only the main record's length was counted, but that can be significantly too little if there's data associated with further blocks. Fix by computing the record length as total_length - fpi_length. Reported-By: Chen Huajun Bug: #14687 Reviewed-By: Heikki Linnakangas Discussion: https://postgr.es/m/20170603165939.1436.58887@wrigleys.postgresql.org Backpatch: 9.5-
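The corrected computation can be written out as a short sketch. The function name and the idea of passing the full-page image sizes as an array are assumptions made for illustration; only the formula, total length minus the combined length of the full-page images, comes from the commit message.

```c
#include <stdint.h>

/*
 * Hypothetical helper: record length excluding full-page images (FPIs).
 * 'fpi_lengths' holds the size of each full-page image in the record.
 */
uint32_t record_length_without_fpi(uint32_t total_length,
                                   const uint32_t *fpi_lengths, int nimages)
{
    uint32_t fpi_length = 0;

    for (int i = 0; i < nimages; i++)
        fpi_length += fpi_lengths[i];

    /* Everything that is not a full-page image counts as record data. */
    return total_length - fpi_length;
}
```

Because the result is derived from the total, data attached to additional blocks is included, which a count of the main record data alone would miss.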