summaryrefslogtreecommitdiff
path: root/contrib/bloom
AgeCommit message (Collapse)Author
2017-08-30Accept plan changes in bloom contrib moduleTomas Vondra
The changes are fairly simple and generally expected due to distributing upstream queries, so adding either Remote Fast Query Execution or Remote Subquery Scan nodes.
2017-06-21Phase 3 of pgindent updates.Tom Lane
Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21Phase 2 of pgindent updates.Tom Lane
Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21Initial pgindent run with pg_bsd_indent version 2.0.Tom Lane
The new indent version includes numerous fixes thanks to Piotr Stefaniak. The main changes visible in this commit are: * Nicer formatting of function-pointer declarations. * No longer unexpectedly removes spaces in expressions using casts, sizeof, or offsetof. * No longer wants to add a space in "struct structname *varname", as well as some similar cases for const- or volatile-qualified pointers. * Declarations using PG_USED_FOR_ASSERTS_ONLY are formatted more nicely. * Fixes bug where comments following declarations were sometimes placed with no space separating them from the code. * Fixes some odd decisions for comments following case labels. * Fixes some cases where comments following code were indented to less than the expected column 33. On the less good side, it now tends to put more whitespace around typedef names that are not listed in typedefs.list. This might encourage us to put more effort into typedef name collection; it's not really a bug in indent itself. There are more changes coming after this round, having to do with comment indentation and alignment of lines appearing within parentheses. I wanted to limit the size of the diffs to something that could be reviewed without one's eyes completely glazing over, so it seemed better to split up the changes as much as practical. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-05-17Post-PG 10 beta1 pgindent runBruce Momjian
perltidy run not included.
2017-05-11Rename WAL-related functions and views to use "lsn" not "location".Tom Lane
Per discussion, "location" is a rather vague term that could refer to multiple concepts. "LSN" is an unambiguous term for WAL locations and should be preferred. Some function names, view column names, and function output argument names used "lsn" already, but others used "location", as well as yet other terms such as "wal_position". Since we've already renamed a lot of things in this area from "xlog" to "wal" for v10, we may as well incur a bit more compatibility pain and make these names all consistent. David Rowley, minor additional docs hacking by me Discussion: https://postgr.es/m/CAKJS1f8O0njDKe8ePFQ-LK5-EjwThsDws6ohJ-+c6nWK+oUxtg@mail.gmail.com
2017-02-15Add optimizer and executor support for parallel index scans.Robert Haas
In combination with 569174f1be92be93f5366212cc46960d28a5c5cd, which taught the btree AM how to perform parallel index scans, this allows parallel index scan plans on btree indexes. This infrastructure should be general enough to support parallel index scans for other index AMs as well, if someone updates them to support parallel scans. Amit Kapila, reviewed and tested by Anastasia Lubennikova, Tushar Ahuja, and Haribabu Kommi, and me.
2017-02-09Remove all references to "xlog" from SQL-callable functions in pg_proc.Robert Haas
Commit f82ec32ac30ae7e3ec7c84067192535b2ff8ec0e renamed the pg_xlog directory to pg_wal. To make things consistent, and because "xlog" is terrible terminology for either "transaction log" or "write-ahead log" rename all SQL-callable functions that contain "xlog" in the name to instead contain "wal". (Note that this may pose an upgrade hazard for some users.) Similarly, rename the xlog_position argument of the functions that create slots to be called wal_position. Discussion: https://www.postgresql.org/message-id/CA+Tgmob=YmA=H3DbW1YuOXnFVgBheRmyDkWcD9M8f=5bGWYEoQ@mail.gmail.com
2017-02-09Allow index AMs to cache data across aminsert calls within a SQL command.Tom Lane
It's always been possible for index AMs to cache data across successive amgettuple calls within a single SQL command: the IndexScanDesc.opaque field is meant for precisely that. However, no comparable facility exists for amortizing setup work across successive aminsert calls. This patch adds such a feature and teaches GIN, GIST, and BRIN to use it to amortize catalog lookups they'd previously been doing on every call. (The other standard index AMs keep everything they need in the relcache, so there's little to improve there.) For GIN, the overall improvement in a statement that inserts many rows can be as much as 10%, though it seems a bit less for the other two. In addition, this makes a really significant difference in runtime for CLOBBER_CACHE_ALWAYS tests, since in those builds the repeated catalog lookups are vastly more expensive. The reason this has been hard up to now is that the aminsert function is not passed any useful place to cache per-statement data. What I chose to do is to add suitable fields to struct IndexInfo and pass that to aminsert. That's not widening the index AM API very much because IndexInfo is already within the ken of ambuild; in fact, by passing the same info to aminsert as to ambuild, this is really removing an inconsistency in the AM API. Discussion: https://postgr.es/m/27568.1486508680@sss.pgh.pa.us
2017-02-06Fix typos in comments.Heikki Linnakangas
Backpatch to all supported versions, where applicable, to make backpatching of future fixes go more smoothly. Josh Soref Discussion: https://www.postgresql.org/message-id/CACZqfqCf+5qRztLPgmmosr-B0Ye4srWzzw_mo4c_8_B_mtjmJQ@mail.gmail.com
2017-01-24Extend index AM API for parallel index scans.Robert Haas
This patch doesn't actually make any index AM parallel-aware, but it provides the necessary functions at the AM layer to do so. Rahila Syed, Amit Kapila, Robert Haas
2017-01-21Move some things from builtins.h to new header filesPeter Eisentraut
This avoids that builtins.h has to include additional header files.
2017-01-03Update copyright via script for 2017Bruce Momjian
2016-12-08Log the creation of an init fork unconditionally.Robert Haas
Previously, it was thought that this only needed to be done for the benefit of possible standbys, so wal_level = minimal skipped it. But that's not safe, because during crash recovery we might replay XLOG_DBASE_CREATE or XLOG_TBLSPC_CREATE record which recursively removes the directory that contains the new init fork. So log it always. The user-visible effect of this bug is that if you create a database or tablespace, then create an unlogged table, then crash without checkpointing, then restart, accessing the table will fail, because the it won't have been properly reset. This commit fixes that. Michael Paquier, per a report from Konstantin Knizhnik. Wording of the comments per a suggestion from me.
2016-09-30Remove unnecessary prototypesPeter Eisentraut
Prototypes for functions implementing V1-callable functions are no longer necessary. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Thomas Munro <thomas.munro@enterprisedb.com>
2016-09-30Fix use of offsetof()Peter Eisentraut
Using offsetof() with a run-time computed argument is not allowed in either C or C++. Apparently, gcc allows it, but g++ doesn't. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Thomas Munro <thomas.munro@enterprisedb.com>
2016-08-27Add macros to make AllocSetContextCreate() calls simpler and safer.Tom Lane
I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls had typos in the context-sizing parameters. While none of these led to especially significant problems, they did create minor inefficiencies, and it's now clear that expecting people to copy-and-paste those calls accurately is not a great idea. Let's reduce the risk of future errors by introducing single macros that encapsulate the common use-cases. Three such macros are enough to cover all but two special-purpose contexts; those two calls can be left as-is, I think. While this patch doesn't in itself improve matters for third-party extensions, it doesn't break anything for them either, and they can gradually adopt the simplified notation over time. In passing, change TopMemoryContext to use the default allocation parameters. Formerly it could only be extended 8K at a time. That was probably reasonable when this code was written; but nowadays we create many more contexts than we did then, so that it's not unusual to have a couple hundred K in TopMemoryContext, even without considering various dubious code that sticks other things there. There seems no good reason not to let it use growing blocks like most other contexts. Back-patch to 9.6, mostly because that's still close enough to HEAD that it's easy to do so, and keeping the branches in sync can be expected to avoid some future back-patching pain. The bugs fixed by these changes don't seem to be significant enough to justify fixing them further back. Discussion: <21072.1472321324@sss.pgh.pa.us>
2016-08-14Fix assorted bugs in contrib/bloom.Tom Lane
In blinsert(), cope with the possibility that a page we pull from the notFullPage list is marked BLOOM_DELETED. This could happen if VACUUM recently marked it deleted but hasn't (yet) updated the metapage. We can re-use such a page safely, but we *must* reinitialize it so that it's no longer marked deleted. Fix blvacuum() so that it updates the notFullPage list even if it's going to update it to empty. The previous "optimization" of skipping the update seems pretty dubious, since it means that the next blinsert() will uselessly visit whatever pages we left in the list. Uniformly treat PageIsNew pages the same as deleted pages. This should allow proper recovery if a crash occurs just after relation extension. Properly use vacuum_delay_point, not assorted ad-hoc CHECK_FOR_INTERRUPTS calls, in the blvacuum() main loop. Fix broken tuple-counting logic: blvacuum.c counted the number of live index tuples over again in each scan, leading to VACUUM VERBOSE reporting some multiple of the actual number of surviving index tuples after any vacuum that removed any tuples (since they'd be counted in blvacuum, maybe more than once, and then again in blvacuumcleanup, without ever zeroing the counter). It's sufficient to count them in blvacuumcleanup. stats->estimated_count is a boolean, not a counter, and we don't want to set it true, so don't add tuple counts to it. Add a couple of Asserts that we don't overrun available space on a bloom page. I don't think there's any bug there today, but the way the FreeBlockNumberArray size calculation is set up is scarily fragile, and BloomPageGetFreeSpace isn't much better. The Asserts should help catch any future mistakes. Per investigation of a report from Jeff Janes. I think the first item above may explain his report; the other changes were things I noticed while casting about for an explanation. Report: <CAMkU=1xEUuBphDwDmB1WjN4+td4kpnEniFaTBxnk1xzHCw8_OQ@mail.gmail.com>
2016-08-13Add SQL-accessible functions for inspecting index AM properties.Tom Lane
Per discussion, we should provide such functions to replace the lost ability to discover AM properties by inspecting pg_am (cf commit 65c5fcd35). The added functionality is also meant to displace any code that was looking directly at pg_index.indoption, since we'd rather not believe that the bit meanings in that field are part of any client API contract. As future-proofing, define the SQL API to not assume that properties that are currently AM-wide or index-wide will remain so unless they logically must be; instead, expose them only when inquiring about a specific index or even specific index column. Also provide the ability for an index AM to override the behavior. In passing, document pg_am.amtype, overlooked in commit 473b93287. Andrew Gierth, with kibitzing by me and others Discussion: <87mvl5on7n.fsf@news-spur.riddles.org.uk>
2016-08-11Trivial cosmetic cleanup in bloom/blutils.c.Tom Lane
Don't spell "InvalidOid" as "0". Initialize method fields in the same order as amapi.h declares them (and every other AM handler initializes them).
2016-06-14Minor fixes in contrib installation scripts.Tom Lane
Extension scripts should never use CREATE OR REPLACE for initial object creation. If there is a collision with a pre-existing (probably user-created) object, we want extension installation to fail, not silently overwrite the user's object. Bloom and sslinfo both violated this precept. Also fix a number of scripts that had no standard header (the file name comment and the \echo...\quit guard). Probably the \echo...\quit hack is less important now than it was in 9.1 days, but that doesn't mean that individual extensions get to choose whether to use it or not. And fix a couple of evident copy-and-pasteos in file name comments. No need for back-patch: the REPLACE bugs are both new in 9.6, and the rest of this is pretty much cosmetic. Andreas Karlsson and Tom Lane
2016-06-12Finish pgindent run for 9.6: Perl files.Noah Misch
2016-06-09pgindent run for 9.6Robert Haas
2016-06-07Fix loose ends for SQL ACCESS METHOD objectsAlvaro Herrera
COMMENT ON ACCESS METHOD was missing; add it, along psql tab-completion support for it. psql was also missing a way to list existing access methods; the new \dA command does that. Also add tab-completion support for DROP ACCESS METHOD. Author: Michael Paquier Discussion: https://www.postgresql.org/message-id/CAB7nPqTzdZdu8J7EF8SXr_R2U5bSUUYNOT3oAWBZdEoggnwhGA@mail.gmail.com
2016-06-03Measure Bloom index signature-length reloption in bits, not words.Tom Lane
Per discussion, this is a more understandable and future-proof way of exposing the setting to users. On-disk, we can still store it in words, so as to not break on-disk compatibility with beta1. Along the way, clean up the code associated with Bloom reloptions. Provide explicit macros for default and maximum lengths rather than having magic numbers buried in multiple places in the code. Drop the adjustBloomOptions() code altogether: it was useless in view of the fact that reloptions.c already performed default-substitution and range checking for the options. Rename a couple of macros and types for more clarity. Discussion: <23767.1464926580@sss.pgh.pa.us>
2016-05-25Fix contrib/bloom to work for unlogged indexes.Tom Lane
blbuildempty did not do even approximately the right thing: it tried to add a metapage to the relation's regular data fork, which already has one at that point. It should look like the ambuildempty methods for all the standard index types, ie, initialize a metapage image in some transient storage and then write it directly to the init fork. To support that, refactor BloomInitMetapage into two functions. In passing, fix BloomInitMetapage so it doesn't leave the rd_options field of the index's relcache entry pointing at transient storage. I'm not sure this had any visible consequence, since nothing much else is likely to look at a bloom index's rd_options, but it's certainly poor practice. Per bug #14155 from Zhou Digoal. Report: <20160524144146.22598.42558@wrigleys.postgresql.org>
2016-05-17Avoid possible crash in contrib/bloom's blendscan().Tom Lane
It's possible to begin and end an indexscan without ever calling amrescan. contrib/bloom, unlike every other index AM, allocated its "scan->opaque" storage at amrescan time, and thus would crash in amendscan if amrescan hadn't been called. We could fix this by putting in a null-pointer check in blendscan, but I see no very good reason why contrib/bloom should march to its own drummer in this respect. Let's move that initialization to blbeginscan instead. Per report from Jeff Janes.
2016-04-28Prevent to use magic constantsTeodor Sigaev
Use macroses for definition amstrategies/amsupport fields instead of hardcoded values. Author: Nikolay Shaplov with addition for contrib/bloom
2016-04-21PGDLLIMPORT-ify old_snapshot_threshold.Tom Lane
Revert commit 7cb1db1d9599f0a09d6920d2149d956ef6d88b0e, which represented a misunderstanding of the problem (if snapmgr.h weren't already included in bufmgr.h, things wouldn't compile anywhere). Instead install what I think is the real fix.
2016-04-21Include snapmgr.h in blscan.cKevin Grittner
Windows builds on buildfarm are failing because old_snapshot_threshold is not found in the bloom filter contrib module.
2016-04-20Revert no-op changes to BufferGetPage()Kevin Grittner
The reverted changes were intended to force a choice of whether any newly-added BufferGetPage() calls needed to be accompanied by a test of the snapshot age, to support the "snapshot too old" feature. Such an accompanying test is needed in about 7% of the cases, where the page is being used as part of a scan rather than positioning for other purposes (such as DML or vacuuming). The additional effort required for back-patching, and the doubt whether the intended benefit would really be there, have indicated it is best just to rely on developers to do the right thing based on comments and existing usage, as we do with many other conventions. This change should have little or no effect on generated executable code. Motivated by the back-patching pain of Tom Lane and Robert Haas
2016-04-12Improve API of GenericXLogRegister().Tom Lane
Rename this function to GenericXLogRegisterBuffer() to make it clearer what it does, and leave room for other sorts of "register" actions in future. Also, replace its "bool isNew" argument with an integer flags argument, so as to allow adding more flags in future without an API break. Alexander Korotkov, adjusted slightly by me
2016-04-12Add page id to bloom indexTeodor Sigaev
Added to ensure that bloom index pages can be distinguished from other pages by pg_filedump. Because there wasn't any public/production versions before, it doesn't pay attention to any compatibility issues. Per notice from Tom Lane
2016-04-10Improve contrib/bloom regression test using code coverage info.Tom Lane
Originally, this test created a 100000-row test table, which made it run rather slowly compared to other contrib tests. Investigation with gcov showed that we got no further improvement in code coverage after the first 700 or so rows, making the large table 99% a waste of time. Cut it back to 2000 rows to fix the runtime problem and still leave some headroom for testing behaviors that may appear later. A closer look at the gcov results showed that the main coverage omissions in contrib/bloom occurred because the test never filled more than one entry in the notFullPage array; which is unsurprising because it exercised index cleanup only in the scenario of complete table deletion, allowing every page in the index to become deleted rather than not-full. Add testing that allows the not-full path to be exercised as well. Also, test the amvalidate function, because blvalidate.c had zero coverage without that, and besides it's a good idea to check for mistakes in the bloom opclass definitions.
2016-04-09Get rid of blinsert()'s use of GenericXLogUnregister().Tom Lane
That routine is dangerous, and unnecessary once we get rid of this one caller. In passing, fix failure to clean up temp memory context, or switch back to caller's context, during slowest exit path.
2016-04-08Add the "snapshot too old" featureKevin Grittner
This feature is controlled by a new old_snapshot_threshold GUC. A value of -1 disables the feature, and that is the default. The value of 0 is just intended for testing. Above that it is the number of minutes a snapshot can reach before pruning and vacuum are allowed to remove dead tuples which the snapshot would otherwise protect. The xmin associated with a transaction ID does still protect dead tuples. A connection which is using an "old" snapshot does not get an error unless it accesses a page modified recently enough that it might not be able to produce accurate results. This is similar to the Oracle feature, and we use the same SQLSTATE and error message for compatibility.
2016-04-08Modify BufferGetPage() to prepare for "snapshot too old" featureKevin Grittner
This patch is a no-op patch which is intended to reduce the chances of failures of omission once the functional part of the "snapshot too old" patch goes in. It adds parameters for snapshot, relation, and an enum to specify whether the snapshot age check needs to be done for the page at this point. This initial patch passes NULL for the first two new parameters and BGP_NO_SNAPSHOT_TEST for the third. The follow-on patch will change the places where the test needs to be made.
2016-04-04Fix typoTeodor Sigaev
Michael Paquier
2016-04-03Fix contrib/bloom to not fail under CLOBBER_CACHE_ALWAYS.Tom Lane
The code was supposing that rd_amcache wouldn't disappear from under it during a scan; which is wrong. Copy the data out of the relcache rather than trying to reference it there.
2016-04-03Clean up some stuff in new contrib/bloom module.Tom Lane
Coverity complained about implicit sign-extension in the BloomPageGetFreeSpace macro, probably because sizeOfBloomTuple isn't wide enough for size calculations. No overflow is really possible as long as maxoff and sizeOfBloomTuple are small enough to represent a realistic situation, but it seems like a good idea to declare sizeOfBloomTuple as Size not int32. Add missing check on BloomPageAddItem() result, again from Coverity. Avoid core dump due to not allocating so->sign array when scan->numberOfKeys is zero. Also thanks to Coverity. Use FLEXIBLE_ARRAY_MEMBER rather than declaring an array as size 1 when it isn't necessarily. Very minor beautification of related code. Unfortunately, none of the Coverity-detected mistakes look like they could account for the remaining buildfarm unhappiness with this module. It's barely possible that the FLEXIBLE_ARRAY_MEMBER mistake does account for that, if it's enabling bogus compiler optimizations; but I'm not terribly optimistic. We probably still have bugs to find here.
2016-04-02Add missing "static".Tom Lane
Per buildfarm member pademelon.
2016-04-02Fix condition in e9e441c9fac6cbc0510cded6abb9d0e6b646ecafTeodor Sigaev
Comment is right, but if - not.
2016-04-02Prevent mark as deleted and as 'has free space' page in bloom moduleTeodor Sigaev
Vacuum might put page into list of pages with some free space and mark as deleted at the same time.
2016-04-02Fixes in bloom contrib moduleTeodor Sigaev
Looking at result of buildfarm member jaguarundi it seems to me that BloomOptions isn't inited sometime, but I don't see yet how it's possible. Nevertheless, check of signature length's is missed, so, add a limit of it. Also add missed GenericXLogAbort() in case of already deleted page in vacuum + minor code refactoring.
2016-04-01Fixes in bloom contrib module missed during reviewTeodor Sigaev
- macroses llike (var & FLAG) are changed to ((var & FLAG) != 0) - do not copy uninitialized part of notFullPage array to page
2016-04-01Bloom index contrib moduleTeodor Sigaev
Module provides new access method. It is actually a simple Bloom filter implemented as pgsql's index. It could give some benefits on search with large number of columns. Module is a single way to test generic WAL interface committed earlier. Author: Teodor Sigaev, Alexander Korotkov Reviewers: Aleksander Alekseev, Michael Paquier, Jim Nasby