summaryrefslogtreecommitdiff
path: root/contrib/fuzzystrmatch
AgeCommit message (Collapse)Author
2017-06-21Phase 2 of pgindent updates.Tom Lane
Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21Initial pgindent run with pg_bsd_indent version 2.0.Tom Lane
The new indent version includes numerous fixes thanks to Piotr Stefaniak. The main changes visible in this commit are: * Nicer formatting of function-pointer declarations. * No longer unexpectedly removes spaces in expressions using casts, sizeof, or offsetof. * No longer wants to add a space in "struct structname *varname", as well as some similar cases for const- or volatile-qualified pointers. * Declarations using PG_USED_FOR_ASSERTS_ONLY are formatted more nicely. * Fixes bug where comments following declarations were sometimes placed with no space separating them from the code. * Fixes some odd decisions for comments following case labels. * Fixes some cases where comments following code were indented to less than the expected column 33. On the less good side, it now tends to put more whitespace around typedef names that are not listed in typedefs.list. This might encourage us to put more effort into typedef name collection; it's not really a bug in indent itself. There are more changes coming after this round, having to do with comment indentation and alignment of lines appearing within parentheses. I wanted to limit the size of the diffs to something that could be reviewed without one's eyes completely glazing over, so it seemed better to split up the changes as much as practical. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-03-12Use wrappers of PG_DETOAST_DATUM_PACKED() more.Noah Misch
This makes almost all core code follow the policy introduced in the previous commit. Specific decisions: - Text search support functions with char* and length arguments, such as prsstart and lexize, may receive unaligned strings. I doubt maintainers of non-core text search code will notice. - Use plain VARDATA() on values detoasted or synthesized earlier in the same function. Use VARDATA_ANY() on varlenas sourced outside the function, even if they happen to always have four-byte headers. As an exception, retain the universal practice of using VARDATA() on return values of SendFunctionCall(). - Retain PG_GETARG_BYTEA_P() in pageinspect. (Page images are too large for a one-byte header, so this misses no optimization.) Sites that do not call get_page_from_raw() typically need the four-byte alignment. - For now, do not change btree_gist. Its use of four-byte headers in memory is partly entangled with storage of 4-byte headers inside GBT_VARKEY, on disk. - For now, do not change gtrgm_consistent() or gtrgm_distance(). They incorporate the varlena header into a cache, and there are multiple credible implementation strategies to consider.
2017-02-25Remove useless duplicate inclusions of system header files.Tom Lane
c.h #includes a number of core libc header files, such as <stdio.h>. There's no point in re-including these after having read postgres.h, postgres_fe.h, or c.h; so remove code that did so. While at it, also fix some places that were ignoring our standard pattern of "include postgres[_fe].h, then system header files, then other Postgres header files". While there's not any great magic in doing it that way rather than system headers last, it's silly to have just a few files deviating from the general pattern. (But I didn't attempt to enforce this globally, only in files I was touching anyway.) I'd be the first to say that this is mostly compulsive neatnik-ism, but over time it might save enough compile cycles to be useful.
2017-01-21Move some things from builtins.h to new header filesPeter Eisentraut
This avoids that builtins.h has to include additional header files.
2017-01-03Update copyright via script for 2017Bruce Momjian
2016-06-07Update fuzzystrmatch extension for parallel query.Robert Haas
All functions provided by this extension are PARALLEL SAFE. Andreas Karlsson
2016-01-22Remove new coupling between NAMEDATALEN and MAX_LEVENSHTEIN_STRLEN.Tom Lane
Commit e529cd4ffa605c6f introduced an Assert requiring NAMEDATALEN to be less than MAX_LEVENSHTEIN_STRLEN, which has been 255 for a long time. Since up to that instant we had always allowed NAMEDATALEN to be substantially more than that, this was ill-advised. It's debatable whether we need MAX_LEVENSHTEIN_STRLEN at all (versus putting a CHECK_FOR_INTERRUPTS into the loop), or whether it has to be so tight; but this patch takes the narrower approach of just not applying the MAX_LEVENSHTEIN_STRLEN limit to calls from the parser. Trusting the parser for this seems reasonable, first because the strings are limited to NAMEDATALEN which is unlikely to be hugely more than 256, and second because the maximum distance is tightly constrained by MAX_FUZZY_DISTANCE (though we'd forgotten to make use of that limit in one place). That means the cost is not really O(mn) but more like O(max(m,n)). Relaxing the limit for user-supplied calls is left for future research; given the lack of complaints to date, it doesn't seem very high priority. In passing, fix confusion between lengths-in-bytes and lengths-in-chars in comments and error messages. Per gripe from Kevin Day; solution suggested by Robert Haas. Back-patch to 9.5 where the unwanted restriction was introduced.
2016-01-02Update copyright for 2016Bruce Momjian
Backpatch certain files through 9.1
2015-05-24pgindent run for 9.5Bruce Momjian
2015-02-03Remove dead code.Heikki Linnakangas
Commit 13629df changed metaphone() function to return an empty string on empty input, but it left the old error message in place. It's now dead code. Michael Paquier, per Coverity warning.
2015-01-24Replace a bunch more uses of strncpy() with safer coding.Tom Lane
strncpy() has a well-deserved reputation for being unsafe, so make an effort to get rid of nearly all occurrences in HEAD. A large fraction of the remaining uses were passing length less than or equal to the known strlen() of the source, in which case no null-padding can occur and the behavior is equivalent to memcpy(), though doubtless slower and certainly harder to reason about. So just use memcpy() in these cases. In other cases, use either StrNCpy() or strlcpy() as appropriate (depending on whether padding to the full length of the destination buffer seems useful). I left a few strncpy() calls alone in the src/timezone/ code, to keep it in sync with upstream (the IANA tzcode distribution). There are also a few such calls in ecpg that could possibly do with more analysis. AFAICT, none of these changes are more than cosmetic, except for the four occurrences in fe-secure-openssl.c, which are in fact buggy: an overlength source leads to a non-null-terminated destination buffer and ensuing misbehavior. These don't seem like security issues, first because no stack clobber is possible and second because if your values of sslcert etc are coming from untrusted sources then you've got problems way worse than this. Still, it's undesirable to have unpredictable behavior for overlength inputs, so back-patch those four changes to all active branches.
2015-01-06Update copyright for 2015Bruce Momjian
Backpatch certain files through 9.0
2015-01-04Add missing va_end() call to a early exit in dmetaphone.c's StringAt().Andres Freund
Pointed out by Coverity. Backpatch to all supported branches, the code has been that way for a long while.
2014-11-13Move the guts of our Levenshtein implementation into core.Robert Haas
The hope is that we can use this to produce better diagnostics in some cases. Peter Geoghegan, reviewed by Michael Paquier, with some further changes by me.
2014-09-26Define META_FREE in a way that doesn't cause -Wempty-body warnings.Andres Freund
That get rids of the only -Wempty-body warning when compiling postgres with gcc 4.8/9. As 6550b901f shows, it's useful to be able to use that option routinely. Without asserts there's many more warnings, but that's food for another commit.
2014-08-25Fix typos in some error messages thrown by extension scripts when fed to psql.Andres Freund
Some of the many error messages introduced in 458857cc missed 'FROM unpackaged'. Also e016b724 and 45ffeb7e forgot to quote extension version numbers. Backpatch to 9.1, just like 458857cc which introduced the messages. Do so because the error messages thrown when the wrong command is copy & pasted aren't easy to understand.
2014-07-14Add file version information to most installed Windows binaries.Noah Misch
Prominent binaries already had this metadata. A handful of minor binaries, such as pg_regress.exe, still lack it; efforts to eliminate such exceptions are welcome. Michael Paquier, reviewed by MauMau.
2014-05-06pgindent run for 9.4Bruce Momjian
This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.
2014-04-18Create function prototype as part of PG_FUNCTION_INFO_V1 macroPeter Eisentraut
Because of gcc -Wmissing-prototypes, all functions in dynamically loadable modules must have a separate prototype declaration. This is meant to detect global functions that are not declared in header files, but in cases where the function is called via dfmgr, this is redundant. Besides filling up space with boilerplate, this is a frequent source of compiler warnings in extension modules. We can fix that by creating the function prototype as part of the PG_FUNCTION_INFO_V1 macro, which such modules have to use anyway. That makes the code of modules cleaner, because there is one less place where the entry points have to be listed, and creates an additional check that functions have the right prototype. Remove now redundant prototypes from contrib and other modules.
2014-01-07Update copyright for 2014Bruce Momjian
Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.
2013-09-11fuzzystrmatch: replace broken link in C commentBruce Momjian
Albe Laurenz
2013-01-01Update copyrights for 2013Bruce Momjian
Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.
2012-05-02Even more duplicate word removal, in the spirit of the seasonPeter Eisentraut
2012-04-24Lots of doc corrections.Robert Haas
Josh Kupershmidt
2012-01-01Update copyright notices for year 2012.Bruce Momjian
2011-12-27Standardize treatment of strcmp() return valuePeter Eisentraut
Always compare the return value to 0, don't use cute tricks like if (!strcmp(...)).
2011-10-12Throw a useful error message if an extension script file is fed to psql.Tom Lane
We have seen one too many reports of people trying to use 9.1 extension files in the old-fashioned way of sourcing them in psql. Not only does that usually not work (due to failure to substitute for MODULE_PATHNAME and/or @extschema@), but if it did work they'd get a collection of loose objects not an extension. To prevent this, insert an \echo ... \quit line that prints a suitable error message into each extension script file, and teach commands/extension.c to ignore lines starting with \echo. That should not only prevent any adverse consequences of loading a script file the wrong way, but make it crystal clear to users that they need to do it differently now. Tom Lane, following an idea of Andrew Dunstan's. Back-patch into 9.1 ... there is not going to be much value in this if we wait till 9.2.
2011-09-01Remove unnecessary #include references, per pgrminclude script.Bruce Momjian
2011-07-19Put inline declaration before return typePeter Eisentraut
gcc -Wextra complains that the other way around is obsolescent, and this was the only place where it was written in this order.
2011-04-10pgindent run before PG 9.1 beta 1.Bruce Momjian
2011-02-20Minor logic fix for new levenshtein implementation.Tom Lane
Alexander Korotkov
2011-02-14Assorted fixups for "unpackaged" conversion scripts.Tom Lane
From first pass of testing. Notably, there seems to be no need for adminpack--unpackaged--1.0.sql because none of the objects that the old module creates would ever be dumped by pg_dump anyway (they are all in pg_catalog).
2011-02-14Avoid use of CREATE OR REPLACE FUNCTION in extension installation files.Tom Lane
It was never terribly consistent to use OR REPLACE (because of the lack of comparable functionality for data types, operators, etc), and experimentation shows that it's now positively pernicious in the extension world. We really want a failure to occur if there are any conflicts, else it's unclear what the extension-ownership state of the conflicted object ought to be. Most of the time, CREATE EXTENSION will fail anyway because of conflicts on other object types, but an extension defining only functions can succeed, with bad results.
2011-02-14Convert contrib modules to use the extension facility.Tom Lane
This isn't fully tested as yet, in particular I'm not sure that the "foo--unpackaged--1.0.sql" scripts are OK. But it's time to get some buildfarm cycles on it. sepgsql is not converted to an extension, mainly because it seems to require a very nonstandard installation process. Dimitri Fontaine and Tom Lane
2011-01-01Stamp copyrights for year 2011.Bruce Momjian
2010-11-23Remove useless whitespace at end of linesPeter Eisentraut
2010-10-22Correct a mistake in levenshtein_less_equal() multibyte character handling.Robert Haas
Spotted by Alexander Korotkov. Along the way, remove a misleading comment line.
2010-10-19Add levenshtein_less_equal, optimized version for small distances.Robert Haas
Alexander Korotkov, heavily revised by me.
2010-09-24In levenshtein_internal(), describe algorithm a bit more clearly.Robert Haas
2010-09-22Convert cvsignore to gitignore, and add .gitignore for build targets.Magnus Hagander
2010-09-20Remove cvs keywords from all files.Magnus Hagander
2010-08-02Teach levenshtein() about multi-byte characters.Robert Haas
Based on a patch by, and further ideas from, Alexander Korotkov.
2010-07-29Avoid using text_to_cstring() in levenshtein functions.Robert Haas
Operating directly on the underlying varlena saves palloc and memcpy overhead, which testing shows to be significant. Extracted from a larger patch by Alexander Korotkov.
2010-07-06pgindent run for 9.0, second runBruce Momjian
2010-04-05Make dmetaphone.c safe for pgindent and fussy compilers. Still to do: make ↵Andrew Dunstan
it properly encoding aware w.r.t. chars U+00C7 and U+00D1.
2010-01-02Update copyright for the year 2010.Bruce Momjian
2009-12-10Fix levenshtein with costs. The previous code multiplied by the cost in onlyRobert Haas
3 of the 7 relevant locations. Marcin Mank, slightly adjusted by me.
2009-06-118.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef listBruce Momjian
provided by Andrew.
2009-04-07Defend against non-ASCII letters in fuzzystrmatch code. The functionsTom Lane
still don't behave very sanely for multibyte encodings, but at least they won't be indexing off the ends of static arrays.