summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-09-24Improve behavior of tsearch_readline(), and remove t_readline().Tom Lane
Commit fbeb9da22, which added the tsearch_readline APIs, left t_readline() in place as a compatibility measure. But that function has been unused and deprecated for twelve years now, so that seems like enough time to remove it. Doing so, and merging t_readline's code into tsearch_readline, aids in making several useful improvements: * The hard-wired 4K limit on line length in tsearch data files is removed, by using a StringInfo buffer instead of a fixed-size buffer. * We can buy back the per-line palloc/pfree added by 3ea7e9550 in the common case where encoding conversion is not required. * We no longer need a separate pg_verify_mbstr call, as that functionality was folded into encoding conversion some time ago. (We could have done some of this stuff while keeping t_readline as a separate API, but there seems little point, since there's no reason for anyone to still be using t_readline directly.) Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
2020-09-23Fix missing fsync of SLRU directories.Thomas Munro
Harmonize behavior by moving reponsibility for fsyncing directories down into slru.c. In 10 and later, only the multixact directories were missed (see commit 1b02be21), and in older branches all SLRUs were missed. Back-patch to all supported releases. Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CA%2BhUKGLtsTUOScnNoSMZ-2ZLv%2BwGh01J6kAo_DM8mTRq1sKdSQ%40mail.gmail.com
2020-09-23Improve error cursor positions for problems with partition bounds.Tom Lane
We failed to pass down the query string to check_new_partition_bound, so that its attempts to provide error cursor positions were for naught; one must have the query string to get parser_errposition to do anything. Adjust its API to require a ParseState to be passed down. Also, improve the logic inside check_new_partition_bound so that the cursor points at the partition bound for the specific column causing the issue, when one can be identified. That part is also for naught if we can't determine the query position of the column with the problem. Improve transformPartitionBoundValue so that it makes sure that const-simplified partition expressions will be properly labeled with positions. In passing, skip calling evaluate_expr if the value is already a Const, which is surely the most common case. Alexandra Wang, Ashwin Agrawal, Amit Langote; reviewed by Ashutosh Bapat Discussion: https://postgr.es/m/CACiyaSopZoqssfMzgHk6fAkp01cL6vnqBdmTw2C5_KJaFR_aMg@mail.gmail.com Discussion: https://postgr.es/m/CAJV4CdrZ5mKuaEsRSbLf2URQ3h6iMtKD=hik8MaF5WwdmC9uZw@mail.gmail.com
2020-09-23Avoid possible dangling-pointer access in tsearch_readline_callback.Tom Lane
tsearch_readline() saves the string pointer it returns to the caller for possible use in the associated error context callback. However, the caller will usually pfree that string sometime before it next calls tsearch_readline(), so that there is a window where an ereport will try to print an already-freed string. The built-in users of tsearch_readline() happen to all do that pfree at the bottoms of their loops, so that the window is effectively empty for them. However, this is not documented as a requirement, and contrib/dict_xsyn doesn't do it like that, so it seems likely that third-party dictionaries might have live bugs here. The practical consequences of this seem pretty limited in any case, since production builds wouldn't clobber the freed string immediately, besides which you'd not expect syntax errors in dictionary files being used in production. Still, it's clearly a bug waiting to bite somebody. Fix by pstrdup'ing the string to be saved for the error callback, and then pfree'ing it next time through. It's been like this for a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
2020-09-23Allow WaitLatch() to be used without a latch.Thomas Munro
Due to flaws in commit 3347c982bab, using WaitLatch() without WL_LATCH_SET could cause an assertion failure or crash. Repair. While here, also add a check that the latch we're switching to belongs to this backend, when changing from one latch to another. Discussion: https://postgr.es/m/CA%2BhUKGK1607VmtrDUHQXrsooU%3Dap4g4R2yaoByWOOA3m8xevUQ%40mail.gmail.com
2020-09-22Simplify SortTocFromFile() by removing fixed buffer-size limit.Tom Lane
pg_restore previously coped with overlength TOC-file lines using some complicated logic to ignore additional bufferloads. While this isn't wrong, since we don't expect that the interesting part of a line would run to more than a dozen or so bytes, it's more complex than it needs to be. Use a StringInfo instead of a fixed-size buffer so that we can process long lines as single entities and thus not need the extra logic. Daniel Gustafsson Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
2020-09-22Remove arbitrary line length limit for libpq service files.Tom Lane
Use a StringInfo instead of a fixed-size buffer in parseServiceInfo(). While we've not heard complaints about the existing 255-byte limit, it certainly seems possible that complex cases could run afoul of it. Daniel Gustafsson Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
2020-09-22Rethink API for pg_get_line.c, one more time.Tom Lane
Further experience says that the appending behavior offered by pg_get_line_append is useful to only a very small minority of callers. For most, the requirement to reset the buffer after each line is just an error-prone nuisance. Hence, invent another alternative call pg_get_line_buf, which takes care of that detail. Noted while reviewing a patch from Daniel Gustafsson. Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
2020-09-22Exclude fmgrprotos.h from pgindent processing.Tom Lane
pgindent messes up entries in this file if their names match typedef names. While there's reason to avoid choosing conflicting names, we have some historical exceptions, and there's no guarantee that more duplicates won't appear in future. Since this is a derived file anyway, there's little harm in just excluding it. I said yesterday that all our derived files are pgindent-clean, or else explicitly excluded from indentation, but I'd forgotten about this one. Now that project is really done, as confirmed by a test run. Discussion: https://postgr.es/m/79ed5348-be7a-b647-dd40-742207186a22@2ndquadrant.com
2020-09-22Improve the error message for an inappropriate column definition list.Tom Lane
The existing message about "a column definition list is only allowed for functions returning "record"" could be given in some cases where it was fairly confusing; in particular, a function with multiple OUT parameters *does* return record according to pg_proc. Break it down into a couple more cases to deliver a more on-point complaint. Per complaint from Bruce Momjian. Discussion: https://postgr.es/m/798909.1600562993@sss.pgh.pa.us
2020-09-21Fix a few more generator scripts to produce pgindent-clean output.Tom Lane
This completes the project of making all our derived files be pgindent-clean (or else explicitly excluded from indentation), so that no surprises result when running pgindent in a built-out development tree. Discussion: https://postgr.es/m/79ed5348-be7a-b647-dd40-742207186a22@2ndquadrant.com
2020-09-21Copy editing: fix a bunch of misspellings and poor wording.Tom Lane
99% of this is docs, but also a couple of comments. No code changes. Justin Pryzby Discussion: https://postgr.es/m/20200919175804.GE30557@telsasoft.com
2020-09-21Standardize order of use strict and use warnings in Perl codePeter Eisentraut
The standard order in PostgreSQL and other code is use strict first, but some code was uselessly inconsistent about this.
2020-09-21Fix checksum calculation in the new sorting GiST build.Heikki Linnakangas
Since we're bypassing the buffer manager, we need to call PageSetChecksumInplace() directly. As reported by Justin Pryzby. In the passing, add RelationOpenSmgr() calls before all smgrwrite() and smgrextend() calls. Tom added one before the first smgrextend() call in commit c2bb287025, which seems to be enough, but let's play it safe and do it before each one. That's how it's done in the similar code in nbtsort.c, too. Discussion: https://www.postgresql.org/message-id/20200920224446.GF30557@telsasoft.com
2020-09-20Fix new GIST build code to work under CLOBBER_CACHE_ALWAYS.Tom Lane
Can't say if this fixes *all* cases, but at least we get through the "point" regression test now, which hyrax's last run did not. Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hyrax&dt=2020-09-19%2021%3A27%3A23
2020-09-20Fix whitespacePeter Eisentraut
2020-09-19Remove precedence hacks no longer needed without postfix operators.Tom Lane
It's no longer necessary to assign explicit precedences to GENERATED, NULL_P, PRESERVE, or STRIP_P. Actually, we don't need to assign precedence to IDENT either; that was really just there to govern the behavior of target_el's "a_expr IDENT" production, which no longer ends with that terminal. However, it seems like a good idea to continue to do so, because it provides a reference point for a precedence level that we can assign to other unreserved keywords that lack a natural precedence level. Research by Peter Eisentraut and John Naylor; comment rewrite by me. Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com
2020-09-19Remove unused parametersPeter Eisentraut
Remove various unused parameters in pg_dump code. These have all become unused over time or were never used. Discussion: https://www.postgresql.org/message-id/flat/511bb100-f829-ba21-2f10-9f952ec06ead%402ndquadrant.com
2020-09-19Code review for dynahash change.Thomas Munro
Commit be0a6666 left behind a comment about the order of some tests that didn't make sense without the expensive division, and in fact we might as well change the order to one that fails more cheaply most of the time as a micro-optimization. Also, remove the "+ 1" applied to max_bucket, to drop an instruction and match the original behavior. Per review from Tom Lane. Discussion: https://postgr.es/m/VI1PR0701MB696044FC35013A96FECC7AC8F62D0%40VI1PR0701MB6960.eurprd07.prod.outlook.com
2020-09-18Remove large fill factor support from dynahash.c.Thomas Munro
Since ancient times we have had support for a fill factor (maximum load factor) to be set for a dynahash hash table, but: 1. It was an integer, whereas for in-memory hash tables interesting load factor targets are probably somewhere near the 0.75-1.0 range. 2. It was implemented in a way that performed an expensive division operation that regularly showed up in profiles. 3. We are not aware of anyone ever having used a non-default value. Therefore, remove support, effectively fixing it at 1. Author: Jakub Wartak <Jakub.Wartak@tomtom.com> Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/VI1PR0701MB696044FC35013A96FECC7AC8F62D0%40VI1PR0701MB6960.eurprd07.prod.outlook.com
2020-09-18Allow most keywords to be used as column labels without requiring AS.Tom Lane
Up to now, if you tried to omit "AS" before a column label in a SELECT list, it would only work if the column label was an IDENT, that is not any known keyword. This is rather unfriendly considering that we have so many keywords and are constantly growing more. In the wake of commit 1ed6b8956 it's possible to improve matters quite a bit. We'd originally tried to make this work by having some of the existing keyword categories be allowed without AS, but that didn't work too well, because each category contains a few special cases that don't work without AS. Instead, invent an entirely orthogonal keyword property "can be bare column label", and mark all keywords that way for which we don't get shift/reduce errors by doing so. It turns out that of our 450 current keywords, all but 39 can be made bare column labels, improving the situation by over 90%. This number might move around a little depending on future grammar work, but it's a pretty nice improvement. Mark Dilger, based on work by myself and Robert Haas; review by John Naylor Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com
2020-09-18pg_surgery: Try to stabilize regression tests.Robert Haas
According to buildfarm member sungazer, the behavior of VACUUM can be unstable in these tests even if we prevent autovacuum from running on the tables in question, apparently because even a manual vacuum can behave differently depending on whether anything else is running that holds back the global xmin. So use a temporary table instead, which as of commit a7212be8b9e0885ee769e8c55f99ef742cda487b enables vacuuming using a more aggressive cutoff. This approach can't be used for the regression test that involves a materialized view, but that test doesn't run vacuum, so it shouldn't be prone to this particular failure mode. Analysis by Tom Lane. Patch by Ashutosh Sharma and me. Discussion: http://postgr.es/m/665524.1599948007@sss.pgh.pa.us
2020-09-18Update file header comments for logical/relation.c.Amit Kapila
Author: Amit Langote Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CA+HiwqE20oZoix13JyCeALpTf_SmjarZWtBFe5sND6zz+iupAw@mail.gmail.com
2020-09-18Fix comments in heapam.c.Amit Kapila
After commits 85f6b49c2c and 3ba59ccc89, we can allow parallel inserts which was earlier not possible as parallel group members won't conflict for relation extension and page lock. In those commits, we forgot to update comments at few places. Author: Amit Kapila Reviewed-by: Robert Haas and Dilip Kumar Backpatch-through: 13 Discussion: https://postgr.es/m/CAFiTN-tMrQh5FFMPx5aWJ+1gi1H6JxktEhq5mDwCHgnEO5oBkA@mail.gmail.com
2020-09-18Try to stabilize output from rolenames regression test.Tom Lane
It's not quite clear why commit 45b980570 has resulted in some instability here, though interference from concurrent autovacuum runs seems like a reasonable guess. What is clear is that the output ordering of the test queries is underdetermined for no very good reason. Extend the ORDER BY keys in hopes of fixing the buildfarm. Discussion: https://postgr.es/m/147499.1600351924@sss.pgh.pa.us
2020-09-17Remove support for postfix (right-unary) operators.Tom Lane
This feature has been a thorn in our sides for a long time, causing many grammatical ambiguity problems. It doesn't seem worth the pain to continue to support it, so remove it. There are some follow-on improvements we can make in the grammar, but this commit only removes the bare minimum number of productions, plus assorted backend support code. Note that pg_dump and psql continue to have full support, since they may be used against older servers. However, pg_dump warns about postfix operators. There is also a check in pg_upgrade. Documentation-wise, I (tgl) largely removed the "left unary" terminology in favor of saying "prefix operator", which is a more standard and IMO less confusing term. I included a catversion bump, although no initial catalog data changes here, to mark the boundary at which oprkind = 'r' stopped being valid in pg_operator. Mark Dilger, based on work by myself and Robert Haas; review by John Naylor Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com
2020-09-17Remove factorial operators, leaving only the factorial() function.Tom Lane
The "!" operator is our only built-in postfix operator. Remove it, on the way to removal of grammar support for postfix operators. There is also a "!!" prefix operator, but since it's been marked deprecated for most of its existence, we might as well remove it too. Also zap the SQL alias function numeric_fac(), which seems to have equally little reason to live. Mark Dilger, based on work by myself and Robert Haas; review by John Naylor Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com
2020-09-17Further improve pgindent's list of file exclusions.Tom Lane
I despair of people keeping the README file's notes in sync with the actual exclusion list, so move the notes into the exclusion file. Adjust the pgindent script to explicitly ignore comments in the file, just in case (though it's hard to believe any would match filenames). Extend the list so that it doesn't process any files we'd not wish it to even in a fully-built-out development directory. (There are still a couple of derived files that it wants to reformat, but fixing the generator scripts for those seems like fit material for a separate patch.) Discussion: https://postgr.es/m/79ed5348-be7a-b647-dd40-742207186a22@2ndquadrant.com
2020-09-17Improve common/logging.c's support for multiple verbosity levels.Tom Lane
Instead of hard-wiring specific verbosity levels into the option processing of client applications, invent pg_logging_increase_verbosity() and encourage clients to implement --verbose by calling that. Then, the common convention that more -v's gets you more verbosity just works. In particular, this allows resurrection of the debug-grade messages that have long existed in pg_dump and its siblings. They were unreachable before this commit due to lack of a way to select PG_LOG_DEBUG logging level. (It appears that they may have been unreachable for some time before common/logging.c was introduced, too, so I'm not specifically blaming cc8d41511 for the oversight. One reason for thinking that is that it's now apparent that _allocAH()'s message needs a null-pointer guard. Testing might have failed to reveal that before 96bf88d52.) Discussion: https://postgr.es/m/1173106.1600116625@sss.pgh.pa.us
2020-09-17Update parallel BTree scan state when the scan keys can't be satisfied.Amit Kapila
For parallel btree scan to work for array of scan keys, it should reach BTPARALLEL_DONE state once for every distinct combination of array keys. This is required to ensure that the parallel workers don't try to seize blocks at the same time for different scan keys. We missed to update this state when we discovered that the scan keys can't be satisfied. Author: James Hunter Reviewed-by: Amit Kapila Tested-by: Justin Pryzby Backpatch-through: 10, where it was introduced Discussion: https://postgr.es/m/4248CABC-25E3-4809-B4D0-128E1BAABC3C@amazon.com
2020-09-17Allow CURRENT_ROLE where CURRENT_USER is acceptedPeter Eisentraut
In the particular case of GRANTED BY, this is specified in the SQL standard. Since in PostgreSQL, CURRENT_ROLE is equivalent to CURRENT_USER, and CURRENT_USER is already supported here, adding CURRENT_ROLE is trivial. The other cases are PostgreSQL extensions, but for the same reason it also makes sense there. Reviewed-by: Vik Fearing <vik@postgresfriends.org> Reviewed-by: Asif Rehman <asifr.rehman@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/f2feac44-b4c5-f38f-3699-2851d6a76dc9%402ndquadrant.com
2020-09-17Add support for building GiST index by sorting.Heikki Linnakangas
This adds a new optional support function to the GiST access method: sortsupport. If it is defined, the GiST index is built by sorting all data to the order defined by the sortsupport's comparator function, and packing the tuples in that order to GiST pages. This is similar to how B-tree index build works, and is much faster than inserting the tuples one by one. The resulting index is smaller too, because the pages are packed more tightly, upto 'fillfactor'. The normal build method works by splitting pages, which tends to lead to more wasted space. The quality of the resulting index depends on how good the opclass-defined sort order is. A good order preserves locality of the input data. As the first user of this facility, add 'sortsupport' function to the point_ops opclass. It sorts the points in Z-order (aka Morton Code), by interleaving the bits of the X and Y coordinates. Author: Andrey Borodin Reviewed-by: Pavel Borisov, Thomas Munro Discussion: https://www.postgresql.org/message-id/1A36620E-CAD8-4267-9067-FB31385E7C0D%40yandex-team.ru
2020-09-17doc: Apply more consistently <productname> markup for OpenSSLMichael Paquier
OpenSSL was quoted in inconsistent ways in many places of the docs, sometimes with <application>, <productname> or just nothing. Author: Daniel Gustafsson Discussion: https://postgr.es/m/DA91E5F0-5F9D-41A7-A7A6-B91CDE0F1D63@yesql.se
2020-09-17Improve tab completion of IMPORT FOREIGN SCHEMA in psqlMichael Paquier
It is not possible to get a list of foreign schemas as the server is not known, so this provides instead a list of local schemas, which is more useful than nothing if using a loopback server or having schema names matching in the local and remote servers. Author: Jeff Janes Reviewed-by: Tom Lane, Michael Paquier Discussion: https://postgr.es/m/CAMkU=1wr7Roj41q-XiJs=Uyc2xCmHhcGGy7J-peJQK-e+w=ghw@mail.gmail.com
2020-09-17Teach walsender to update its process title for replication commands.Tom Lane
Because the code path taken for SQL commands executed in a walsender will update the process title, we pretty much have to update the title for replication commands as well. Otherwise, the title shows "idle" for the rest of a logical walsender's lifetime once it's executed any SQL command. Playing with this, I confirm that a walsender now typically spends most of its life reporting walsender postgres [local] START_REPLICATION Considering this in isolation, it might be better to have it say walsender postgres [local] sending replication data However, consistency with the other cases seems to be a stronger argument. In passing, remove duplicative pgstat_report_activity call. Discussion: https://postgr.es/m/880181.1600026471@sss.pgh.pa.us
2020-09-17Improve formatting of create_help.pl and plperl_opmask.pl output.Tom Lane
Adjust the whitespace in the emitted files so that it matches what pgindent would do. This makes the generated files look like they match project style, and avoids confusion if someone does run pgindent on the generated files. Also, add probes.h to pgindent's exclusion list, because it can confuse pgindent, plus there's not much point in processing it. Daniel Gustafsson, additional fixes by me Discussion: https://postgr.es/m/79ed5348-be7a-b647-dd40-742207186a22@2ndquadrant.com
2020-09-17Fix bogus completion tag usage in walsenderAlvaro Herrera
Since commit fd5942c18f97 (2012, 9.3-era), walsender has been sending completion tags for certain replication commands twice -- and they're not even consistent. Apparently neither libpq nor JDBC have a problem with it, but it's not kosher. Fix by remove the EndCommand() call in the common code path for them all, and inserting specific calls to EndReplicationCommand() specifically in those places where it's needed. EndReplicationCommand() is a new simple function to send the completion tag for replication commands. Do this instead of sending a generic SELECT completion tag for them all, which was also pretty bogus (if innocuous). While at it, change StartReplication() to use EndReplicationCommand() instead of pg_puttextmessage(). In commit 2f9661311b83, I failed to realize that replication commands are not close-enough kin of regular SQL commands, so the DROP_REPLICATION_SLOT tag I added is undeserved and a type pun. Take it out. Backpatch to 13, where the latter commit appeared. The duplicate tag has been sent since 9.3, but since nothing is broken, it doesn't seem worth fixing. Per complaints from Tom Lane. Discussion: https://postgr.es/m/1347966.1600195735@sss.pgh.pa.us
2020-09-16Centralize setup of SIGQUIT handling for postmaster child processes.Tom Lane
We decided that the policy established in commit 7634bd4f6 for the bgwriter, checkpointer, walwriter, and walreceiver processes, namely that they should accept SIGQUIT at all times, really ought to apply uniformly to all postmaster children. Therefore, get rid of the duplicative and inconsistent per-process code for establishing that signal handler and removing SIGQUIT from BlockSig. Instead, make InitPostmasterChild do it. The handler set up by InitPostmasterChild is SignalHandlerForCrashExit, which just summarily does _exit(2). In interactive backends, we almost immediately replace that with quickdie, since we would prefer to try to tell the client that we're dying. However, this patch is changing the behavior of autovacuum (both launcher and workers), as well as walsenders. Those processes formerly also used quickdie, but AFAICS that was just mindless copy-and-paste: they don't have any interactive client that's likely to benefit from being told this. The stats collector continues to be an outlier, in that it thinks SIGQUIT means normal exit. That should probably be changed for consistency, but there's another patch set where that's being dealt with, so I didn't do so here. Discussion: https://postgr.es/m/644875.1599933441@sss.pgh.pa.us
2020-09-16Don't fetch partition check expression during InitResultRelInfo.Tom Lane
Since there is only one place that actually needs the partition check expression, namely ExecPartitionCheck, it's better to fetch it from the relcache there. In this way we will never fetch it at all if the query never has use for it, and we still fetch it just once when we do need it. The reason for taking an interest in this is that if the relcache doesn't already have the check expression cached, fetching it requires obtaining AccessShareLock on the partition root. That means that operations that look like they should only touch the partition itself will also take a lock on the root. In particular we observed that TRUNCATE on a partition may take a lock on the partition's root, contributing to a deadlock situation in parallel pg_restore. As written, this patch does have a small cost, which is that we are microscopically reducing efficiency for the case where a partition has an empty check expression. ExecPartitionCheck will be called, and will go through the motions of setting up and checking an empty qual, where before it would not have been called at all. We could avoid that by adding a separate boolean flag to track whether there is a partition expression to test. However, this case only arises for a default partition with no siblings, which surely is not an interesting case in practice. Hence adding complexity for it does not seem like a good trade-off. Amit Langote, per a suggestion by me Discussion: https://postgr.es/m/VI1PR03MB31670CA1BD9625C3A8C5DD05EB230@VI1PR03MB3167.eurprd03.prod.outlook.com
2020-09-16Fix amcheck child check pg_upgrade bug.Peter Geoghegan
Commit d114cc53 overlooked the fact that pg_upgrade'd B-Tree indexes have leaf page high keys whose offset numbers do not match the one from the copy of the tuple one level up (the copy stored with a downlink for leaf page's right sibling page). This led to false positive reports of corruption from bt_index_parent_check() when it was called to verify a pg_upgrade'd index. To fix, skip comparing the offset number on pg_upgrade'd B-Tree indexes. Author: Anastasia Lubennikova <a.lubennikova@postgrespro.ru> Author: Peter Geoghegan <pg@bowt.ie> Reported-By: Andrew Bille <andrewbille@gmail.com> Diagnosed-By: Anastasia Lubennikova <a.lubennikova@postgrespro.ru> Bug: #16619 Discussion: https://postgr.es/m/16619-aaba10f83fdc1c3c@postgresql.org Backpatch: 13-, where child check was enhanced.
2020-09-16Avoid unnecessary recursion to child tables in ALTER TABLE SET NOT NULL.Tom Lane
If a partitioned table's column is already marked NOT NULL, there is no need to examine its partitions, because we can rely on previous DDL to have enforced that the child columns are NOT NULL as well. (Unfortunately, the same cannot be said for traditional inheritance, so for now we have to restrict the optimization to partitioned tables.) Hence, we may skip recursing to child tables in this situation. The reason this case is worth worrying about is that when pg_dump dumps a partitioned table having a primary key, it will include the requisite NOT NULL markings in the CREATE TABLE commands, and then add the primary key as a separate step. The primary key addition generates a SET NOT NULL as a subcommand, just to be sure. So the situation where a SET NOT NULL is redundant does arise in the real world. Skipping the recursion does more than just save a few cycles: it means that a command such as "ALTER TABLE ONLY partition_parent ADD PRIMARY KEY" will take locks only on the partition parent table, not on the partitions. It turns out that parallel pg_restore is effectively assuming that that's true, and has little choice but to do so because the dependencies listed for such a TOC entry don't include the partitions. pg_restore could thus issue this ALTER while data restores on the partitions are still in progress. Taking unnecessary locks on the partitions not only hurts concurrency, but can lead to actual deadlock failures, as reported by Domagoj Smoljanovic. (A contributing factor in the deadlock is that TRUNCATE on a child partition wants a non-exclusive lock on the parent. This seems likewise unnecessary, but the fix for it is more invasive so we won't consider back-patching it. Fortunately, getting rid of one of these two poor behaviors is enough to remove the deadlock.) Although support for partitioned primary keys came in with v11, this patch is dependent on the SET NOT NULL refactoring done by commit f4a3fdfbd, so we can only patch back to v12. Patch by me; thanks to Alvaro Herrera and Amit Langote for review. Discussion: https://postgr.es/m/VI1PR03MB31670CA1BD9625C3A8C5DD05EB230@VI1PR03MB3167.eurprd03.prod.outlook.com
2020-09-16Fix bogus cache-invalidation logic in logical replication worker.Tom Lane
The code recorded cache invalidation events by zeroing the "localreloid" field of affected cache entries. However, it's possible for an inval event to occur even while we have the entry open and locked. So an ill-timed inval could result in "cache lookup failed for relation 0" errors, if the worker's code tried to use the cleared field. We can fix that by creating a separate bool field to record whether the entry needs to be revalidated. (In the back branches, cram the bool into what had been padding space, to avoid an ABI break in the somewhat unlikely event that any extension is looking at this struct.) Also, rearrange the logic in logicalrep_rel_open so that it does the right thing in cases where table_open would fail. We should retry the lookup by name in that case, but we didn't. The real-world impact of this is probably small. In the first place, the error conditions are very low probability, and in the second place, the worker would just exit and get restarted. We only noticed because in a CLOBBER_CACHE_ALWAYS build, the failure can occur repeatedly, preventing the worker from making progress. Nonetheless, it's clearly a bug, and it impedes a useful type of testing; so back-patch to v10 where this code was introduced. Discussion: https://postgr.es/m/1032727.1600096803@sss.pgh.pa.us
2020-09-16Add leader_pid field into the example of file_fdw for csvlog.Fujii Masao
Commit b8fdee7d0c added leader_pid field into csvlog, but forgot to update the example of file_fdw for csvlog. Author: Yuta Katsuragi
2020-09-16Avoid retrieval of CHECK constraints and DEFAULT exprs in data-only dumpMichael Paquier
Those extra queries are not necessary when doing a data-only dump. With this change, this means that the dependencies between CHECK/DEFAULT and the parent table are not tracked anymore for a data-only dump. However, these dependencies are only used for the schema generation and we have never guaranteed that a dump can be reloaded if a CHECK constraint uses a custom function whose behavior changes when loading the data, like when using cross-table references in the CHECK function. Author: Julien Rouhaud Reviewed-by: Daniel Gustafsson, Michael Paquier Discussion: https://postgr.es/m/20200712054850.GA92357@nol
2020-09-16Change LogicalTapeSetBlocks() to use nBlocksWritten.Jeff Davis
Previously, it was based on nBlocksAllocated to account for tapes with open write buffers that may not have made it to the BufFile yet. That was unnecessary, because callers do not need to get the number of blocks while a tape has an open write buffer; and it also conflicted with the preallocation logic added for HashAgg. Reviewed-by: Peter Geoghegan Discussion: https://postgr.es/m/ce5af05900fdbd0e9185747825a7423c48501964.camel@j-davis.com Backpatch-through: 13
2020-09-16HashAgg: release write buffers sooner by rewinding tape.Jeff Davis
This was an oversight. The purpose of 7fdd919ae7 was to avoid keeping tape buffers around unnecessisarily, but HashAgg didn't rewind early enough. Reviewed-by: Peter Geoghegan Discussion: https://postgr.es/m/1fb1151c2cddf8747d14e0532da283c3f97e2685.camel@j-davis.com Backpatch-through: 13
2020-09-16Fix initialization of RelationSyncEntry for streaming transactions.Amit Kapila
In commit 464824323e, for each RelationSyncEntry we maintained the list of xids (streamed_txns) for which we have already sent the schema. This helps us to track when to send the schema to the downstream node for replication of streaming transactions. Before this list got initialized, we were processing invalidation messages which access this list and led to an assertion failure. In passing, clean up the nearby code: * Initialize the list of xids with NIL instead of NULL which is our usual coding practice. * Remove the MemoryContext switch for creating a RelationSyncEntry in dynahash. Diagnosed-by: Amit Kapila and Tom Lane Author: Amit Kapila Reviewed-by: Tom Lane and Dilip Kumar Discussion: https://postgr.es/m/904373.1600033123@sss.pgh.pa.us
2020-09-16Optimize compactify_tuples functionDavid Rowley
This function could often be seen in profiles of vacuum and could often be a significant bottleneck during recovery. The problem was that a qsort was performed in order to sort an array of item pointers in reverse offset order so that we could use that to safely move tuples up to the end of the page without overwriting the memory of yet-to-be-moved tuples. i.e. we used to compact the page starting at the back of the page and move towards the front. The qsort that this required could be expensive for pages with a large number of tuples. In this commit, we take another approach to tuple compactification. Now, instead of sorting the remaining item pointers array we first check if the array is presorted and only memmove() the tuples that need to be moved. This presorted check can be done very cheaply in the calling functions when the array is being populated. This presorted case is very fast. When the item pointer array is not presorted we must copy tuples that need to be moved into a temp buffer before copying them back into the page again. This differs from what we used to do here as we're now copying the tuples back into the page in reverse line pointer order. Previously we left the existing order alone. Reordering the tuples results in an increased likelihood of hitting the pre-sorted case the next time around. Any newly added tuple which consumes a new line pointer will also maintain the correct sort order of tuples in the page which will also result in the presorted case being hit the next time. Only consuming an unused line pointer can cause the order of tuples to go out again, but that will be corrected next time the function is called for the page. Benchmarks have shown that the non-presorted case is at least equally as fast as the original qsort method even when the page just has a few tuples. As the number of tuples becomes larger the new method maintains its performance whereas the original qsort method became much slower when the number of tuples on the page became large. Author: David Rowley Reviewed-by: Thomas Munro Tested-by: Jakub Wartak Discussion: https://postgr.es/m/CA+hUKGKMQFVpjr106gRhwk6R-nXv0qOcTreZuQzxgpHESAL6dw@mail.gmail.com
2020-09-16Fix use-after-free bug with event triggers in an extension scriptAlvaro Herrera
ALTER TABLE commands in an extension script are added to an event trigger command list; but starting with commit b5810de3f4 they do so in a memory context that's too short-lived, so when execution ends and time comes to use the entries, they've already been freed. (This would also be a problem with ALTER TABLE commands in a multi-command query string, but these serendipitously end in PortalContext -- which probably explains why it took so long for this to be reported.) Fix by using the memory context specifically set for that, instead. Backpatch to 13, where the aforementioned commit appeared. Reported-by: Philippe Beaudoin Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> Discussion: https://postgr.es/m/20200902193715.6e0269d4@firost
2020-09-15Report resource usage at the end of recoveryDavid Rowley
Reporting this has been rather useful in some recent recovery speedup work. It also seems like something that will be useful to the average DBA too. Author: David Rowley Reviewed-by: Thomas Munro Discussion: https://postgr.es/m/CAApHDvqYVORiZxq2xPvP6_ndmmsTkvr6jSYv4UTNaFa5i1kd%3DQ%40mail.gmail.com