1 files changed, 34 insertions, 7 deletions
diff --git a/src/man/cluster.l b/src/man/cluster.l
index 73a2fde926a..e76796e61f8 100644
--- a/src/man/cluster.l
+++ b/src/man/cluster.l
@@ -1,6 +1,6 @@
 .\" This is -*-nroff-*-
 .\" XXX standard disclaimer belongs here....
-.\" $Header: /cvsroot/pgsql/src/man/Attic/cluster.l,v 1.4 1998/01/11 22:17:10 momjian Exp $
+.\" $Header: /cvsroot/pgsql/src/man/Attic/cluster.l,v 1.5 1998/03/14 21:57:56 momjian Exp $
 .TH CLUSTER SQL 01/23/93 PostgreSQL PostgreSQL
 .SH NAME
 cluster - give storage clustering advice to Postgres
@@ -17,14 +17,42 @@ The index must already have been defined on
 .IR classname.
 .PP
 When a class is clustered, it is physically reordered based on the index
-information.  The clustering is static.  In other words, if the class is
-updated, it may become unclustered.  No attempt is made to keep new
+information.  The clustering is static.  In other words, as the class is
+updated, the changes are not clusterd.  No attempt is made to keep new
 instances or updated tuples clustered.  If desired, the user can
 recluster manually by issuing the command again.
 .PP
-The table is actually copied to temporary table in index order,
-then renamed back to the original name.  For this reason, all
-grant permissions and other indexes are lost when cluster is performed.
+The table is actually copied to temporary table in index order, then
+renamed back to the original name.  For this reason, all grant
+permissions and other indexes are lost when cluster is performed.
+.PP
+In cases where you are accessing single rows randomly within a table,
+the actual order of the data in the heap table unimportant.  However, if
+you tend to access some data more than others, and there is an index
+that groups them together, you will benefit from using the CLUSTER
+command.
+.PP
+Another place CLUSTER is good is in cases where you use an index to pull
+out several rows from a table.  If you are requesting a range of indexed
+values from a table, or a single indexed value that has multiple rows
+that match, CLUSTER will help because once the index identifies the heap
+page for the first row that matches, all other rows that match are
+probably already on the same heap page, saving disk accesses and speeding up
+the query.
+.PP
+There are two ways to cluster data.  The first is with the CLUSTER
+command, which reoreders the original table with the ordering of the
+index you specify.  This can be slow on large tables because the rows
+are fetched from the heap in index order, and if the heap table is
+unordered, the entries are on random pages, so there is one disk page
+retrieved for every row moved.  PostgreSQL has a cache, but the majority
+of a big table will not fit in the cache.
+.PP
+Another way is to use SELECT ... INTO TABLE temp FROM ...  This uses the
+PostgreSQL sorting code, and is much faster for unordered data.  You
+then drop the old table, use ALTER TABLE RENAME to rename 'temp' to the
+old name, and recreate the indexes.  From then on, CLUSTER should be
+fast because most of the heap data is ordered.
 .SH EXAMPLE
 .nf
 /*
@@ -34,4 +62,3 @@ create index emp_ind on emp using btree (salary int4_ops);
 
 cluster emp_ind on emp
 .fi
-