From fba8113c1b74b9508cf2e6b7a18b0fb3637d9ba0 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Thu, 29 Mar 2007 00:15:39 +0000 Subject: Teach CLUSTER to skip writing WAL if not needed (ie, not using archiving) --- Simon. Also, code review and cleanup for the previous COPY-no-WAL patches --- Tom. --- doc/src/sgml/perform.sgml | 116 +++++++++++++++++++++++++++++++++------------- 1 file changed, 85 insertions(+), 31 deletions(-) (limited to 'doc/src') diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml index 5c2a9599455..dcd0d1d2d33 100644 --- a/doc/src/sgml/perform.sgml +++ b/doc/src/sgml/perform.sgml @@ -1,4 +1,4 @@ - + Performance Tips @@ -801,7 +801,7 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse; EXECUTE as many times as required. This avoids some of the overhead of repeatedly parsing and planning INSERT. Different interfaces provide this facility - in different ways; look for Prepared Statements in the interface + in different ways; look for prepared statements in the interface documentation. @@ -815,14 +815,12 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse; COPY is fastest when used within the same transaction as an earlier CREATE TABLE or - TRUNCATE command. In those cases, no WAL - needs to be written because in case of an error, the files - containing the newly loaded data will be removed automatically. - CREATE TABLE AS SELECT is also optimized - to avoid writing WAL. COPY and - CREATE TABLE AS SELECT will write WAL - when is set and will not - therefore be optimized in that case. + TRUNCATE command. In such cases no WAL + needs to be written, because in case of an error, the files + containing the newly loaded data will be removed anyway. + However, this consideration does not apply when + is set, as all commands + must write WAL in that case. @@ -897,23 +895,51 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse; Turn off <varname>archive_command</varname> - When loading large amounts of data you might want to unset the - before loading. It might be - faster to take a new base backup once the load has completed - than to allow a large archive to accumulate. + When loading large amounts of data into an installation that uses + WAL archiving, you might want to disable archiving (unset the + configuration variable) + while loading. It might be + faster to take a new base backup after the load has completed + than to process a large amount of incremental WAL data. - This is particularly important advice because certain commands - will perform more slowly when archive_command - is set, as a result of their needing to write large amounts of WAL. + Aside from avoiding the time for the archiver to process the WAL data, + doing this will actually make certain commands faster, because they + are designed not to write WAL at all if archive_command + is unset. (They can guarantee crash safety more cheaply by doing an + fsync at the end than by writing WAL.) This applies to the following commands: - CREATE TABLE AS SELECT, - CREATE INDEX and also COPY, when - it is executed in the same transaction as a prior - CREATE TABLE or TRUNCATE command. + + + + CREATE TABLE AS SELECT + + + + + CREATE INDEX (and variants such as + ALTER TABLE ADD PRIMARY KEY) + + + + + ALTER TABLE SET TABLESPACE + + + + + CLUSTER + + + + + COPY FROM, when the target table has been + created or truncated earlier in the same transaction + + + - @@ -950,15 +976,43 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse; By default, pg_dump uses COPY, and when it is generating a complete schema-and-data dump, it is careful to load data before creating indexes and foreign keys. So in this case - the first several guidelines are handled automatically. What is left - for you to do is to set appropriate (i.e., larger than normal) values - for maintenance_work_mem and - checkpoint_segments, as well as unsetting - archive_command before loading the dump script, - and then to run ANALYZE afterwards and resetting - archive_command if required. All of the - parameters can be reset once the load has completed without needing - to restart the server, as described in . + several guidelines are handled automatically. What is left + for you to do is to: + + + + Set appropriate (i.e., larger than normal) values for + maintenance_work_mem and + checkpoint_segments. + + + + + If using WAL archiving, consider disabling it during the restore. + To do that, unset archive_command before loading the + dump script, and afterwards restore archive_command + and take a fresh base backup. + + + + + Consider whether the whole dump should be restored as a single + transaction. To do that, pass the + + + + Run ANALYZE afterwards. + + + -- cgit v1.2.3