summaryrefslogtreecommitdiff
path: root/contrib
AgeCommit message (Collapse)Author
2025-02-02Move CompareType to separate header filePeter Eisentraut
We'll want to make use of it in more places, and we'd prefer to not have to include all of primnodes.h everywhere. Author: Mark Dilger <mark.dilger@enterprisedb.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-02-01Add get_opfamily_name() functionPeter Eisentraut
This refactors and simplifies various existing code to make use of the new function. Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-01-29Handle default NULL insertion a little better.Tom Lane
If a column is omitted in an INSERT, and there's no column default, the code in preptlist.c generates a NULL Const to be inserted. Furthermore, if the column is of a domain type, we wrap the Const in CoerceToDomain, so as to throw a run-time error if the domain has a NOT NULL constraint. That's fine as far as it goes, but there are two problems: 1. We're being sloppy about the type/typmod that the Const is labeled with. It really should have the domain's base type/typmod, since it's the input to CoerceToDomain not the output. This can result in coerce_to_domain inserting a useless length-coercion function (useless because it's being applied to a null). The coercion would typically get const-folded away later, but it'd be better not to create it in the first place. 2. We're not applying expression preprocessing (specifically, eval_const_expressions) to the resulting expression tree. The planner's primary expression-preprocessing pass already happened, so that means the length coercion step and CoerceToDomain node miss preprocessing altogether. This is at the least inefficient, since it means the length coercion and CoerceToDomain will actually be executed for each inserted row, though they could be const-folded away in most cases. Worse, it seems possible that missing preprocessing for the length coercion could result in an invalid plan (for example, due to failing to perform default-function-argument insertion). I'm not aware of any live bug of that sort with core datatypes, and it might be unreachable for extension types as well because of restrictions of CREATE CAST, but I'm not entirely convinced that it's unreachable. Hence, it seems worth back-patching the fix (although I only went back to v14, as the patch doesn't apply cleanly at all in v13). There are several places in the rewriter that are building null domain constants the same way as preptlist.c. While those are before the planner and hence don't have any reachable bug, they're still applying a length coercion that will be const-folded away later, uselessly wasting cycles. Hence, make a utility routine that all of these places can call to do it right. Making this code more careful about the typmod assigned to the generated NULL constant has visible but cosmetic effects on some of the plans shown in contrib/postgres_fdw's regression tests. Discussion: https://postgr.es/m/1865579.1738113656@sss.pgh.pa.us Backpatch-through: 14
2025-01-29Fix grammatical typos around possessive "its"John Naylor
Some places spelled it "it's", which is short for "it is". In passing, fix a couple other nearby grammatical errors. Author: Jacob Brazeal <jacob.brazeal@gmail.com> Discussion: https://postgr.es/m/CA+COZaAO8g1KJCV0T48=CkJMjAnnfTGLWOATz+2aCh40c2Nm+g@mail.gmail.com
2025-01-25Merge copies of converting an XID to a FullTransactionId.Noah Misch
Assume twophase.c is the performance-sensitive caller, and preserve its choice of unlikely() branch hint. Add some retrospective rationale for that choice. Back-patch to v17, for the next commit to use it. Reviewed (in earlier versions) by Michael Paquier. Discussion: https://postgr.es/m/17821-dd8c334263399284@postgresql.org Discussion: https://postgr.es/m/20250116010051.f3.nmisch@google.com
2025-01-24Fix copy-and-paste typoPeter Eisentraut
2025-01-24pgcrypto: Make it possible to disable built-in cryptoDaniel Gustafsson
When using OpenSSL and/or the underlying operating system in FIPS mode no non-FIPS certified crypto implementations should be used. While that is already possible by just not invoking the built-in crypto in pgcrypto, this adds a GUC which prohibit the code from being called. This doesn't change the FIPS status of PostgreSQL but can make it easier for sites which target FIPS compliance to ensure that violations cannot occur. Author: Daniel Gustafsson <daniel@yesql.se> Author: Joe Conway <mail@joeconway.com> Reviewed-by: Joe Conway <mail@joeconway.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Discussion: https://postgr.es/m/16b4a157-9ea1-44d0-b7b3-4c85df5de97b@joeconway.com
2025-01-24pgcrypto: Add function to check FIPS modeDaniel Gustafsson
This adds a SQL callable function for reading and returning the status of FIPS configuration of OpenSSL. If OpenSSL is operating with FIPS enabled it will return true, otherwise false. As this adds a function to the SQL file, bump the extension version to 1.4. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Joe Conway <mail@joeconway.com> Discussion: https://postgr.es/m/8f979145-e206-475a-a31b-73c977a4134c@joeconway.com
2025-01-24Convert sepgsql tests to TAPPeter Eisentraut
Add a TAP test for sepgsql. This automates the previously required manual setup before the test. The actual tests are still run by pg_regress, as before, but now called from within the TAP Perl script. The previous manual test script (test_sepgsql) is left in place, since its purpose is (also) to test whether a running instance was properly initialized for sepgsql. But it has been changed to call pg_regress directly and no longer require make. Reviewed-by: Andreas Karlsson <andreas@proxel.se> Discussion: https://www.postgresql.org/message-id/flat/651a5baf-5c45-4a5a-a202-0c8453a4ebf8@eisentraut.org
2025-01-24meson: Fix sepgsql installationPeter Eisentraut
The sepgsql.sql file should be installed under share/contrib/, not share/extension/, since it is not an extension. This makes it match what make install does. Discussion: https://www.postgresql.org/message-id/flat/651a5baf-5c45-4a5a-a202-0c8453a4ebf8@eisentraut.org
2025-01-23Convert macros to static inline functions (htup_details.h, itup.h)Peter Eisentraut
Discussion: https://www.postgresql.org/message-id/flat/5b558da8-99fb-0a99-83dd-f72f05388517@enterprisedb.com
2025-01-22Repair incorrect handling of AfterTriggerSharedData.ats_modifiedcols.Tom Lane
This patch fixes two distinct errors that both ultimately trace to commit 71d60e2aa, which added the ats_modifiedcols field. The more severe error is that ats_modifiedcols wasn't accounted for in afterTriggerAddEvent's scanning loop that looks for a pre-existing duplicate AfterTriggerSharedData. Thus, a new event could be incorrectly matched to an AfterTriggerSharedData that has a different value of ats_modifiedcols, resulting in the wrong tg_updatedcols bitmap getting passed to the trigger whenever it finally gets fired. We'd not noticed because (a) few triggers consult tg_updatedcols, and (b) we had no tests exercising a case where such a trigger was called as an AFTER trigger. In the test case added by this commit, contrib/lo's trigger fails to remove a large object when expected because (without this fix) it thinks the LO OID column hasn't changed. The other problem was introduced by commit ce5aaea8c, which copied the modified-columns bitmap into trigger-related storage. It made a copy for every trigger event, whereas what we really want is to make a new copy only when we make a new AfterTriggerSharedData entry. (We could imagine adding extra logic to reduce the number of bitmap copies still more, but it doesn't look worthwhile at the moment.) In a simple test of an UPDATE of 10000000 rows with a single AFTER trigger, this thinko roughly tripled the amount of memory consumed by the pending-triggers data structures, from 160446744 to 480443440 bytes. Fixing the first problem requires introducing a bms_equal() call into afterTriggerAddEvent's scanning loop, which is slightly annoying from a speed perspective. However, getting rid of the excessive bms_copy() calls from the second problem balances that out; overall speed of trigger operations is the same or slightly better, in my tests. Discussion: https://postgr.es/m/3496294.1737501591@sss.pgh.pa.us Backpatch-through: 13
2025-01-22Improve grammar of options for command arrays in TAP testsMichael Paquier
This commit rewrites a good chunk of the command arrays in TAP tests with a grammar based on the following rules: - Fat commas are used between option names and their values, making it clear to both humans and perltidy that values and names are bound together. This is particularly useful for the readability of multi-line command arrays, and there are plenty of them in the TAP tests. Most of the test code is updated to use this style. Some commands used parenthesis to show the link, or attached values and options in a single string. These are updated to use fat commas instead. - Option names are switched to use their long names, making them more self-documented. Based on a suggestion by Andrew Dunstan. - Add some trailing commas after the last item in multi-line arrays, which is a common perl style. Not all the places are taken care of, but this covers a very good chunk of them. Author: Dagfinn Ilmari Mannsåker Reviewed-by: Michael Paquier, Peter Smith, Euler Taveira Discussion: https://postgr.es/m/87jzc46d8u.fsf@wibble.ilmari.org
2025-01-16Add OLD/NEW support to RETURNING in DML queries.Dean Rasheed
This allows the RETURNING list of INSERT/UPDATE/DELETE/MERGE queries to explicitly return old and new values by using the special aliases "old" and "new", which are automatically added to the query (if not already defined) while parsing its RETURNING list, allowing things like: RETURNING old.colname, new.colname, ... RETURNING old.*, new.* Additionally, a new syntax is supported, allowing the names "old" and "new" to be changed to user-supplied alias names, e.g.: RETURNING WITH (OLD AS o, NEW AS n) o.colname, n.colname, ... This is useful when the names "old" and "new" are already defined, such as inside trigger functions, allowing backwards compatibility to be maintained -- the interpretation of any existing queries that happen to already refer to relations called "old" or "new", or use those as aliases for other relations, is not changed. For an INSERT, old values will generally be NULL, and for a DELETE, new values will generally be NULL, but that may change for an INSERT with an ON CONFLICT ... DO UPDATE clause, or if a query rewrite rule changes the command type. Therefore, we put no restrictions on the use of old and new in any DML queries. Dean Rasheed, reviewed by Jian He and Jeff Davis. Discussion: https://postgr.es/m/CAEZATCWx0J0-v=Qjc6gXzR=KtsdvAE7Ow=D=mu50AgOe+pvisQ@mail.gmail.com
2025-01-16Check return of pg_b64_encode() for errorPeter Eisentraut
Forgotten in commit 761c79508e7. Author: Ranier Vilela <ranier.vf@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAEudQAq-3yHsSdWoOOaw%2BgAQYgPMpMGuB5pt2yCXgv-YuxG2Hg%40mail.gmail.com
2025-01-15postgres_fdw: SCRAM authentication pass-throughPeter Eisentraut
This enables SCRAM authentication for postgres_fdw when connecting to a foreign server without having to store a plain-text password on user mapping options. This is done by saving the SCRAM ClientKey and ServeryKey from the client authentication and using those instead of the plain-text password for the server-side SCRAM exchange. The new foreign-server or user-mapping option "use_scram_passthrough" enables this. Co-authored-by: Matheus Alcantara <mths.dev@pm.me> Co-authored-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/27b29a35-9b96-46a9-bc1a-914140869dac@gmail.com
2025-01-15Change gist stratnum function to use CompareTypePeter Eisentraut
This changes commit 7406ab623fe in that the gist strategy number mapping support function is changed to use the CompareType enum as input, instead of the "well-known" RT*StrategyNumber strategy numbers. This is a bit cleaner, since you are not dealing with two sets of strategy numbers. Also, this will enable us to subsume this system into a more general system of using CompareType to define operator semantics across index methods. Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-01-09Use @extschema:name@ notation in contrib transform modules.Tom Lane
Harden hstore_plperl, hstore_plpython, and ltree_plpython against search-path-based attacks by using @extschema:name@ notation to refer to the underlying hstore or ltree data type. This allows removal of the previous documentation warning suggesting that they must be installed in the same schema as the underlying data type. In passing, also improve a para in extend.sgml to suggest using @extschema:name@ for such purposes. Discussion: https://postgr.es/m/692480.1736021695@sss.pgh.pa.us
2025-01-08pg_freespacemap: Fix declaration of pg_freespace(regclass)Michael Paquier
This function called generate_series() without enforcing its input argument types, making possible for an attacker to catch this call, by defining for example a generate_series(int,bigint). The internals of pg_freespace(regclass) are changed to force the use of bigint for the inputs of generate_series(). A more consistent style is applied for all its hardcoded values, while on it. Issue introduced in 3f323eba89fb. Reported-by: Noah Misch Reviewed-by: Noah Misch Discussion: https://postgr.es/m/20250106190428.ec.nmisch@google.com
2025-01-07Add passwordcheck.min_password_length.Nathan Bossart
This new parameter can be used to change the minimum allowed password length (in bytes). Note that it has no effect if a user supplies a pre-encrypted password. Author: Emanuele Musella, Maurizio Boriani Reviewed-by: Tomas Vondra, Bertrand Drouvot, Japin Li Discussion: https://postgr.es/m/CA%2BugDNyYtHOtWCqVD3YkSVYDWD_1fO8Jm_ahsDGA5dXhbDPwrQ%40mail.gmail.com
2025-01-01Update copyright for 2025Bruce Momjian
Backpatch-through: 13
2024-12-29contrib/pageinspect: Use SQL-standard function bodies.Tom Lane
In the same spirit as 969bbd0fa, 13e3796c9, 3f323eba8. Tom Lane and Ronan Dunklau Discussion: https://postgr.es/m/3316564.aeNJFYEL58@aivenlaptop
2024-12-29contrib/xml2: Use SQL-standard function bodies.Tom Lane
In the same spirit as 969bbd0fa, 13e3796c9, 3f323eba8. Tom Lane and Ronan Dunklau Discussion: https://postgr.es/m/3316564.aeNJFYEL58@aivenlaptop
2024-12-29contrib/citext: Use SQL-standard function bodies.Tom Lane
In the same spirit as 969bbd0fa, 13e3796c9, 3f323eba8. Tom Lane and Ronan Dunklau Discussion: https://postgr.es/m/3316564.aeNJFYEL58@aivenlaptop
2024-12-25Partial pgindent of .l and .y filesPeter Eisentraut
Trying to clean up the code a bit while we're working on these files for the reentrant scanner/pure parser patches. This cleanup only touches the code sections after the second '%%' in each file, via a manually-supervised and locally hacked up pgindent.
2024-12-23postgres_fdw: re-issue cancel requests a few times if necessary.Tom Lane
Despite the best efforts of commit 0e5c82380, we're still seeing occasional failures of postgres_fdw's query_cancel test in the buildfarm. Investigation suggests that its 100ms timeout is still not enough to reliably ensure that the remote side starts the query before receiving the cancel request --- and if it hasn't, it will just discard the request because it's idle. We discussed allowing a cancel request to kill the next-received query, but that would have wide and perhaps unpleasant side-effects. What seems safer is to make postgres_fdw do what a human user would likely do, which is issue another cancel request if the first one didn't seem to do anything. We'll keep the same overall 30 second grace period before concluding things are broken, but issue additional cancel requests after 1 second, then 2 more seconds, then 4, then 8. (The next one in series is 16 seconds, but we'll hit the 30 second timeout before that.) Having done that, revert the timeout in query_cancel.sql to 10 ms. That will still be enough on most machines, most of the time, for the remote query to start; but now we're intentionally risking the race condition occurring sometimes in the buildfarm, so that the repeat-cancel code path will get some testing. As before, back-patch to v17. We might eventually contemplate back-patching this further, and/or adding similar logic to dblink. But given the lack of field complaints to date, this feels like mostly an exercise in test case stabilization, so v17 is enough. Discussion: https://postgr.es/m/colnv3lzzmc53iu5qoawynr6qq7etn47lmggqr65ddtpjliq5d@glkveb4m6nop
2024-12-23Fix some comments related to library unloadingMichael Paquier
Library unloading has never been supported with its code removed in ab02d702ef08, and there were some comments still mentioning that it was a possible operation. ChangAo has noticed the incorrect references in dfmgr.c, while I have noticed the other ones while scanning the rest of the tree for similar mistakes. Author: ChangAo Chen, Michael Paquier Reviewed-by: Tom Lane Discussion: https://postgr.es/m/tencent_1D09840A1632D406A610C8C4E2491D74DB0A@qq.com
2024-12-20Optimize alignment calculations in tuple form/deformDavid Rowley
Here we convert CompactAttribute.attalign from a char, which is directly derived from pg_attribute.attalign into a uint8, which stores the number of bytes to align the column's value by in the tuple. This allows tuple deformation and tuple size calculations to move away from using the inefficient att_align_nominal() macro, which manually checks each TYPALIGN_* char to translate that into the alignment bytes for the given type. Effectively, this commit changes those to TYPEALIGN calls, which are branchless and only perform some simple arithmetic with some bit-twiddling. The removed branches were often mispredicted by CPUs, especially so in real-world tables which often contain a mishmash of different types with different alignment requirements. Author: David Rowley Reviewed-by: Andres Freund, Victor Yegorov Discussion: https://postgr.es/m/CAApHDvrBztXP3yx=NKNmo3xwFAFhEdyPnvrDg3=M0RhDs+4vYw@mail.gmail.com
2024-12-20Fix corruption when relation truncation fails.Thomas Munro
RelationTruncate() does three things, while holding an AccessExclusiveLock and preventing checkpoints: 1. Logs the truncation. 2. Drops buffers, even if they're dirty. 3. Truncates some number of files. Step 2 could previously be canceled if it had to wait for I/O, and step 3 could and still can fail in file APIs. All orderings of these operations have data corruption hazards if interrupted, so we can't give up until the whole operation is done. When dirty pages were discarded but the corresponding blocks were left on disk due to ERROR, old page versions could come back from disk, reviving deleted data (see pgsql-bugs #18146 and several like it). When primary and standby were allowed to disagree on relation size, standbys could panic (see pgsql-bugs #18426) or revive data unknown to visibility management on the primary (theorized). Changes: * WAL is now unconditionally flushed first * smgrtruncate() is now called in a critical section, preventing interrupts and causing PANIC on file API failure * smgrtruncate() has a new parameter for existing fork sizes, because it can't call smgrnblocks() itself inside a critical section The changes apply to RelationTruncate(), smgr_redo() and pg_truncate_visibility_map(). That last is also brought up to date with other evolutions of the truncation protocol. The VACUUM FileTruncate() failure mode had been discussed in older reports than the ones referenced below, with independent analysis from many people, but earlier theories on how to fix it were too complicated to back-patch. The more recently invented cancellation bug was diagnosed by Alexander Lakhin. Other corruption scenarios were spotted by me while iterating on this patch and earlier commit 75818b3a. Back-patch to all supported releases. Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reported-by: rootcause000@gmail.com Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/18146-04e908c662113ad5%40postgresql.org Discussion: https://postgr.es/m/18426-2d18da6586f152d6%40postgresql.org
2024-12-20Introduce CompactAttribute array in TupleDesc, take 2David Rowley
The new compact_attrs array stores a few select fields from FormData_pg_attribute in a more compact way, using only 16 bytes per column instead of the 104 bytes that FormData_pg_attribute uses. Using CompactAttribute allows performance-critical operations such as tuple deformation to be performed without looking at the FormData_pg_attribute element in TupleDesc which means fewer cacheline accesses. For some workloads, tuple deformation can be the most CPU intensive part of processing the query. Some testing with 16 columns on a table where the first column is variable length showed around a 10% increase in transactions per second for an OLAP type query performing aggregation on the 16th column. However, in certain cases, the increases were much higher, up to ~25% on one AMD Zen4 machine. This also makes pg_attribute.attcacheoff redundant. A follow-on commit will remove it, thus shrinking the FormData_pg_attribute struct by 4 bytes. Author: David Rowley Reviewed-by: Andres Freund, Victor Yegorov Discussion: https://postgr.es/m/CAApHDvrBztXP3yx=NKNmo3xwFAFhEdyPnvrDg3=M0RhDs+4vYw@mail.gmail.com
2024-12-19Prevent redeclaration of typedef yyscan_tPeter Eisentraut
Fix for 1f0de66ea2a: We need to prevent redeclaration of typedef yyscan_t. (This will work with C11 but not currently with C99.) The generated scanner files provide their own typedef, but we also need to provide one for the interfaces that we expose. So we need to add some preprocessor guards to avoid a redefinition. (This is how the generated scanner files do it internally as well.) This way everything now works independent of the order in which things are included. Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
2024-12-18seg: pure parser and reentrant scannerPeter Eisentraut
Use the flex %option reentrant and the bison option %pure-parser to make the generated scanner and parser pure, reentrant, and thread-safe. Make the generated scanner use palloc() etc. instead of malloc() etc. Previously, we only used palloc() for the buffer, but flex would still use malloc() for its internal structures. As a result, there could be some small memory leaks in case of uncaught errors. (We do catch normal syntax errors as soft errors.) Now, all the memory is under palloc() control, so there are no more such issues. Simplify flex scan buffer management: Instead of constructing the buffer from pieces and then using yy_scan_buffer(), we can just use yy_scan_string(), which does the same thing internally. The previous code was necessary because we allocated the buffer with palloc() and the rest of the state was handled by malloc(). But this is no longer the case; everything is under palloc() now. (We could even get rid of the yylex_destroy() call and just let the memory context cleanup handle everything. But for now, we preserve the existing behavior.) Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Andreas Karlsson <andreas@proxel.se> Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
2024-12-18cube: pure parser and reentrant scannerPeter Eisentraut
Use the flex %option reentrant and the bison option %pure-parser to make the generated scanner and parser pure, reentrant, and thread-safe. Make the generated scanner use palloc() etc. instead of malloc() etc. Previously, we only used palloc() for the buffer, but flex would still use malloc() for its internal structures. As a result, there could be some small memory leaks in case of uncaught errors. (We do catch normal syntax errors as soft errors.) Now, all the memory is under palloc() control, so there are no more such issues. Simplify flex scan buffer management: Instead of constructing the buffer from pieces and then using yy_scan_buffer(), we can just use yy_scan_string(), which does the same thing internally. (Actually, we use yy_scan_bytes() here because we already have the length.) The previous code was necessary because we allocated the buffer with palloc() and the rest of the state was handled by malloc(). But this is no longer the case; everything is under palloc() now. (We could even get rid of the yylex_destroy() call and just let the memory context cleanup handle everything. But for now, we preserve the existing behavior.) Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Andreas Karlsson <andreas@proxel.se> Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
2024-12-17Detect version mismatch in brin_page_itemsTomas Vondra
Commit dae761a87ed modified brin_page_items() to return the new "empty" flag for each BRIN range. But the new output parameter was added in the middle, which may cause crashes when using the new binary with old function definition. The ideal solution would be to introduce API versioning similar to what pg_stat_statements does, but it's too late for that as PG17 was already released (so we can't introduce a new extension version). We could do something similar in brin_page_items() by checking the number of output columns (and ignoring the new flag), but it doesn't seem very nice. Instead, simply error out and suggest updating the extension to the latest version. pageinspect is a superuser-only extension, and there's not much reason to run an older version. Moreover, there's a precedent for this approach in 691e8b2e18. Reported by Ľuboslav Špilák, investigation and patch by me. Backpatch to 17, same as dae761a87ed. Reported-by: Ľuboslav Špilák Reviewed-by: Michael Paquier, Hayato Kuroda, Peter Geoghegan Backpatch-through: 17 Discussion: https://postgr.es/m/VI1PR02MB63331C3D90E2104FD12399D38A5D2@VI1PR02MB6333.eurprd02.prod.outlook.com Discussion: https://postgr.es/m/flat/3385a58f-5484-49d0-b790-9a198a0bf236@vondra.me
2024-12-17Remove ts_locale.c's lowerstr()Peter Eisentraut
lowerstr() and lowerstr_with_len() in ts_locale.c do the same thing as str_tolower() that the rest of the system uses, except that the former don't use the common locale provider framework but instead use the global libc locale settings. This patch replaces uses of lowerstr*() with str_tolower(..., DEFAULT_COLLATION_OID). For instances that use a libc locale globally, this will result in exactly the same behavior. For instances that use other locale providers, you now get consistent behavior and are no longer dependent on the libc locale settings (for this case; there are others). Most uses of these functions are for processing dictionary and configuration files. In those cases, using the default collation seems appropriate. At least we don't have a more specific collation available. But the code in contrib/pg_trgm should really depend on the collation of the columns being processed. This is not done here, this can be done in a separate patch. (You can probably construct some edge cases where this change would create some locale-related upgrade incompatibility, for example if before you used a combination of ICU and a differently-behaving libc locale. We can document this in the release notes, but I don't think there is anything more we can do about this.) Reviewed-by: Jeff Davis <pgsql@j-davis.com> Discussion: https://www.postgresql.org/message-id/flat/653f3b84-fc87-45a7-9a0c-bfb4fcab3e7d%40eisentraut.org
2024-12-17Remove ts_locale.c's t_isdigit(), t_isspace(), t_isprint()Peter Eisentraut
These do the same thing as the standard isdigit(), isspace(), and isprint() but with multibyte and encoding support. But all the callers are only interested in analyzing single-byte ASCII characters. So this extra layer is overkill and we can replace the uses with the standard functions. All the t_is*() functions in ts_locale.c are under scrutiny because they don't use the common locale provider framework but instead use the global libc locale settings. For the functions being touched by this patch, we don't need all that anyway, as mentioned above, so the simplest solution is to just remove them. The few remaining t_is*() functions will need a different treatment in a separate patch. pg_trgm has some compile-time options with macros such as KEEPONLYALNUM. These are not documented, and the non-default variant is not supported by any test cases. As part of this undertaking, I'm removing the non-default variant, as it is in the way of cleanup. So in this case, the not-KEEPONLYALNUM code path is gone. Reviewed-by: Jeff Davis <pgsql@j-davis.com> Discussion: https://www.postgresql.org/message-id/flat/653f3b84-fc87-45a7-9a0c-bfb4fcab3e7d%40eisentraut.org
2024-12-14contrib/earthdistance: Use SQL-standard function bodies.Tom Lane
The @extschema:name@ feature added by 72a5b1fc8 allows us to make earthdistance's references to the cube extension fully search-path-secure, so long as all those references are resolved at extension installation time not runtime. To do that, we must convert earthdistance's SQL functions to the new SQL-standard style; but we wanted to do that anyway. The functions can be updated in our customary style by running CREATE OR REPLACE FUNCTION in an extension update script. However, there's still problems in the "CREATE DOMAIN earth" command: its references to cube functions could be captured by hostile objects in earthdistance's installation schema, if that's not where the cube extension is. Worse, the reference to the cube type itself as the domain's base could be captured, and that's not something we could fix after-the-fact in the update script. What I've done about that is to change the "CREATE DOMAIN earth" command in the base script earthdistance--1.1.sql. Ordinarily, changing a released extension script is forbidden; but I think it's okay here since the results of successful (non-trojaned) script execution will be identical to before. A good deal of care is still needed to make the extension's scripts proof against search-path-based attacks. We have to make sure that all the function and operator invocations have exact argument-type matches, to forestall attacks based on supplying a better match. Fortunately earthdistance isn't very big, so I've just gone through it and inspected each call to be sure of that. The only actual code changes needed were to spell all floating-point constants in the style '-1'::float8, rather than depending on runtime type conversions and/or negations. (I'm not sure that the shortcuts previously used were attackable, but removing run-time effort is a good thing anyway.) I believe that this fixes earthdistance enough that we could mark it trusted and remove the warnings about it that were added by 7eeb1d986; but I've not done that here. The primary reason for dealing with this now is that we've received reports of pg_upgrade failing for databases that use earthdistance functions in contexts like generated columns. That's a consequence of 2af07e2f7 having restricted the search_path used while evaluating such expressions. The only way to fix that is to make the earthdistance functions independent of run-time search_path. This patch is very much nicer than the alternative of attaching "SET search_path" clauses to earthdistance's functions: it is more secure and doesn't create a run-time penalty. Therefore, I've chosen to back-patch this to v16 where @extschema:name@ was added. It won't help unless users update to 16.7 and issue "ALTER EXTENSION earthdistance UPDATE" before upgrading to 17, but at least there's now a way to deal with the problem without manual intervention in the dump/restore process. Tom Lane and Ronan Dunklau Discussion: https://postgr.es/m/3316564.aeNJFYEL58@aivenlaptop Discussion: https://postgr.es/m/6a6439f1-8039-44e2-8fb9-59028f7f2014@mailbox.org
2024-12-11Fix further fallout from EXPLAIN ANALYZE BUFFERS changeDavid Rowley
c2a4078eb adjusted EXPLAIN ANALYZE to default the BUFFERS to ON. This (hopefully) fixes the last remaining issue with regression test failures with -D RELCACHE_FORCE_RELEASE -D CATCACHE_FORCE_RELEASE builds, where the planner accesses more buffers due to the cold caches. Discussion: https://postgr.es/m/CAApHDvqLdzgz77JsE-yTki3w9UiKQ-uTMLRctazcu+99-ips3g@mail.gmail.com
2024-12-11Enable BUFFERS with EXPLAIN ANALYZE by defaultDavid Rowley
The topic of turning EXPLAIN's BUFFERS option on with the ANALYZE option has come up a few times over the past few years. In many ways, doing this seems like a good idea as it may be more obvious to users why a given query is running more slowly than they might expect. Also, from my own (David's) personal experience, I've seen users posting to the mailing lists with two identical plans, one slow and one fast asking why their query is sometimes slow. In many cases, this is due to additional reads. Having BUFFERS on by default may help reduce some of these questions, and if not, make it more obvious to the user before they post, or save a round-trip to the mailing list when additional I/O effort is the cause of the slowness. The general consensus is that we want BUFFERS on by default with ANALYZE. However, there were more than zero concerns raised with doing so. The primary reason against is the additional verbosity, making it harder to read large plans. Another concern was that buffer information isn't always useful so may not make sense to have it on by default. It's currently December, so let's commit this to see if anyone comes forward with a strong objection against making this change. We have over half a year remaining in the v18 cycle where we could still easily consider reverting this if someone were to come forward with a convincing enough reason as to why doing this is a bad idea. There were two patches independently submitted to achieve this goal, one by me and the other by Guillaume. This commit is a mix of both of these patches with some additional work done by me to adjust various additional places in the documentation which include EXPLAIN ANALYZE output. Author: Guillaume Lelarge, David Rowley Reviewed-by: Robert Haas, Greg Sabino Mullane, Michael Christofides Discussion: https://postgr.es/m/CANNMO++W7MM8T0KyXN3ZheXXt-uLVM3aEtZd+WNfZ=obxffUiA@mail.gmail.com
2024-12-09Simplify executor's determination of whether to use parallelism.Tom Lane
Our parallel-mode code only works when we are executing a query in full, so ExecutePlan must disable parallel mode when it is asked to do partial execution. The previous logic for this involved passing down a flag (variously named execute_once or run_once) from callers of ExecutorRun or PortalRun. This is overcomplicated, and unsurprisingly some of the callers didn't get it right, since it requires keeping state that not all of them have handy; not to mention that the requirements for it were undocumented. That led to assertion failures in some corner cases. The only state we really need for this is the existing QueryDesc.already_executed flag, so let's just put all the responsibility in ExecutePlan. (It could have been done in ExecutorRun too, leading to a slightly shorter patch -- but if there's ever more than one caller of ExecutePlan, it seems better to have this logic in the subroutine than the callers.) This makes those ExecutorRun/PortalRun parameters unnecessary. In master it seems okay to just remove them, returning the API for those functions to what it was before parallelism. Such an API break is clearly not okay in stable branches, but for them we can just leave the parameters in place after documenting that they do nothing. Per report from Yugo Nagata, who also reviewed and tested this patch. Back-patch to all supported branches. Discussion: https://postgr.es/m/20241206062549.710dc01cf91224809dd6c0e1@sraoss.co.jp
2024-12-09Remove remants of "snapshot too old"Heikki Linnakangas
Remove the 'whenTaken' and 'lsn' fields from SnapshotData. After the removal of the "snapshot too old" feature, they were never set to a non-zero value. This largely reverts commit 3e2f3c2e423, which added the OldestActiveSnapshot tracking, and the init_toast_snapshot() function. That was only required for setting the 'whenTaken' and 'lsn' fields. SnapshotToast is now a constant again, like SnapshotSelf and SnapshotAny. I kept a thin get_toast_snapshot() wrapper around SnapshotToast though, to check that you have a registered or active snapshot. That's still a useful sanity check. Reviewed-by: Nathan Bossart, Andres Freund, Tom Lane Discussion: https://www.postgresql.org/message-id/cd4b4f8c-e63a-41c0-95f6-6e6cd9b83f6d@iki.fi
2024-12-09Fix invalidation of local pgstats references for entry reinitializationMichael Paquier
818119afccd3 has introduced the "generation" concept in pgstats entries, incremented a counter when a pgstats entry is reinitialized, but it did not count on the fact that backends still holding local references to such entries need to be refreshed if the cache age is outdated. The previous logic only updated local references when an entry was dropped, but it needs also to consider entries that are reinitialized. This matters for replication slot stats (as well as custom pgstats kinds in 18~), where concurrent drops and creates of a slot could cause incorrect stats to be locally referenced. This would lead to an assertion failure at shutdown when writing out the stats file, as the backend holding an outdated local reference would not be able to drop during its shutdown sequence the stats entry that should be dropped, as the last process holding a reference to the stats entry. The checkpointer was then complaining about such an entry late in the shutdown sequence, after the shutdown checkpoint is finished with the control file updated, causing the stats file to not be generated. In non-assert builds, the entry would just be skipped with the stats file written. Note that only logical replication slots use statistics. A test case based on TAP is added to test_decoding, where a persistent connection peeking at a slot's data is kept with concurrent drops and creates of the same slot. This is based on the isolation test case that Anton has sent. As it requires a node shutdown with a check to make sure that the stats file is written with this specific sequence of events, TAP is used instead. Reported-by: Anton A. Melnikov Reviewed-by: Bertrand Drouvot Discussion: https://postgr.es/m/56bf8ff9-dd8c-47b2-872a-748ede82af99@postgrespro.ru Backpatch-through: 15
2024-12-02Deprecate MD5 passwords.Nathan Bossart
MD5 has been considered to be unsuitable for use as a cryptographic hash algorithm for some time. Furthermore, MD5 password hashes in PostgreSQL are vulnerable to pass-the-hash attacks, i.e., knowing the username and hashed password is sufficient to authenticate. The SCRAM-SHA-256 method added in v10 is not subject to these problems and is considered to be superior to MD5. This commit marks MD5 password support in PostgreSQL as deprecated and to be removed in a future release. The documentation now contains several deprecation notices, and CREATE ROLE and ALTER ROLE now emit deprecation warnings when setting MD5 passwords. The warnings can be disabled by setting the md5_password_warnings parameter to "off". Reviewed-by: Greg Sabino Mullane, Jim Nasby Discussion: https://postgr.es/m/ZwbfpJJol7lDWajL%40nathan
2024-11-28Remove useless casts to (void *)Peter Eisentraut
Many of them just seem to have been copied around for no real reason. Their presence causes (small) risks of hiding actual type mismatches or silently discarding qualifiers Discussion: https://www.postgresql.org/message-id/flat/461ea37c-8b58-43b4-9736-52884e862820@eisentraut.org
2024-11-27file_fdw: Add regression tests for ON_ERROR and other options.Fujii Masao
This commit introduces regression tests to validate incorrect settings for the ON_ERROR, LOG_VERBOSITY, and REJECT_LIMIT options in file_fdw. Author: Atsushi Torikoshi Reviewed-by: Fujii Masao Suggested-by: Yugo Nagata Discussion: https://postgr.es/m/20241113231706.09e5b5ea9640289312835be8@sraoss.co.jp
2024-11-26Clean up newlines following left parenthesesÁlvaro Herrera
Most came in during the 17 cycle, so backpatch there. Some (particularly reorderbuffer.h) are very old, but backpatching doesn't seem useful. Like commits c9d297751959, c4f113e8fef9.
2024-11-22Add INT64_HEX_FORMAT and UINT64_HEX_FORMAT to c.h.Nathan Bossart
Like INT64_FORMAT and UINT64_FORMAT, these macros produce format strings for 64-bit integers. However, INT64_HEX_FORMAT and UINT64_HEX_FORMAT generate the output in hexadecimal instead of decimal. Besides introducing these macros, this commit makes use of them in several places. This was originally intended to be part of commit 5d6187d2a2, but I left it out because I felt there was a nonzero chance that back-patching these new macros into c.h could cause problems with third-party code. We tend to be less cautious with such changes in new major versions. Note that UINT64_HEX_FORMAT was originally added in commit ee1b30f128, but it was placed in test_radixtree.c, so it wasn't widely available. This commit moves UINT64_HEX_FORMAT to c.h. Discussion: https://postgr.es/m/ZwQvtUbPKaaRQezd%40nathan
2024-11-20file_fdw: Add REJECT_LIMIT option to file_fdw.Fujii Masao
Commit 4ac2a9bece introduced the REJECT_LIMIT option for the COPY command. This commit extends the support for this option to file_fdw. As well as REJECT_LIMIT option for COPY, this option limits the maximum number of erroneous rows that can be skipped. If the number of data type conversion errors exceeds this limit, accessing the file_fdw foreign table will fail with an error, even when on_error = 'ignore' is specified. Since the CREATE/ALTER FOREIGN TABLE commands require foreign table options to be single-quoted, this commit updates defGetCopyRejectLimitOption() to handle also string value for them, in addition to int64 value for COPY command option. Author: Atsushi Torikoshi Reviewed-by: Fujii Masao, Yugo Nagata, Kirill Reshke Discussion: https://postgr.es/m/bab68a9fc502b12693f0755b6f35f327@oss.nttdata.com
2024-11-14contrib/lo: Use SQL-standard function bodiesMichael Paquier
Author: Ronan Dunklau Discussion: https://postgr.es/m/3316564.aeNJFYEL58@aivenlaptop
2024-11-14xml2: Add tests for functions xpath_nodeset() and xpath_list()Michael Paquier
These two functions with their different argument lists have never been tested in this module, so let's add something. Author: Ronan Dunklau Discussion: https://postgr.es/m/ZzMSJkiNZhimjXWx@paquier.xyz