Commit ff98a5e

hash: Immediately after a bucket split, try to clean the old bucket.

If it works, then we won't be storing two copies of all the tuples that were just moved. If not, VACUUM will still take care of it eventually. Per a report from AP and analysis from Amit Kapila, it seems that a bulk load can cause splits fast enough that VACUUM won't deal with the problem in time to prevent bloat.

Amit Kapila; I rewrote the comment.

Discussion: http://postgr.es/m/20170704105728.mwb72jebfmok2nm2@zip.com.au
1 parent 03378c4 commit ff98a5e
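
The idea in the commit message can be sketched with a toy model. This is not PostgreSQL code: `ToyBucket`, `toy_split`, and `toy_cleanup` are hypothetical names, the bucket is a flat in-memory array rather than a chain of pages, and a single `mask` stands in for the real `maxbucket`/`highmask`/`lowmask` mapping. It only illustrates the policy: after copying tuples to the new bucket, remove the stale copies right away if nobody else holds a pin, otherwise leave a flag for a later pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TUPLES 16

/* Toy bucket: NOT PostgreSQL's on-disk layout. */
typedef struct ToyBucket {
    unsigned keys[MAX_TUPLES];
    size_t   ntuples;
    int      pins;                /* how many holders reference this bucket */
    bool     needs_split_cleanup; /* stands in for LH_BUCKET_NEEDS_SPLIT_CLEANUP */
} ToyBucket;

/* Remove every tuple that no longer maps to bucket number `bucketno`. */
static void toy_cleanup(ToyBucket *b, unsigned bucketno, unsigned mask)
{
    size_t kept = 0;

    for (size_t i = 0; i < b->ntuples; i++)
        if ((b->keys[i] & mask) == bucketno)
            b->keys[kept++] = b->keys[i];
    b->ntuples = kept;
    b->needs_split_cleanup = false;
}

/*
 * Copy tuples that now map to `newno` into `newb`, then try to clean `oldb`.
 * Returns true if the stale copies were removed immediately.
 */
static bool toy_split(ToyBucket *oldb, ToyBucket *newb,
                      unsigned oldno, unsigned newno, unsigned mask)
{
    assert(oldb->pins >= 1);            /* the splitter itself holds one pin */

    for (size_t i = 0; i < oldb->ntuples; i++)
        if ((oldb->keys[i] & mask) == newno) {
            assert(newb->ntuples < MAX_TUPLES);
            newb->keys[newb->ntuples++] = oldb->keys[i];
        }
    oldb->needs_split_cleanup = true;   /* old bucket now holds duplicates */

    if (oldb->pins == 1) {              /* nobody else is looking: clean now */
        toy_cleanup(oldb, oldno, mask);
        return true;
    }
    return false;                       /* deferred to a later cleanup pass */
}
```

With `mask = 1`, splitting bucket 0 into bucket 1 moves the odd keys. When the old bucket has a single pin, `toy_split` returns true and the old bucket shrinks at once; with a second pin it returns false and the duplicates stay until `toy_cleanup` is called later, which is the role VACUUM plays in the commit message.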

1 file changed, +34 -11 lines changed

src/backend/access/hash/hashpage.c

@@ -956,9 +956,9 @@ _hash_expandtable(Relation rel, Buffer metabuf)
                       buf_oblkno, buf_nblkno, NULL,
                       maxbucket, highmask, lowmask);

-    /* all done, now release the locks and pins on primary buckets. */
-    _hash_relbuf(rel, buf_oblkno);
-    _hash_relbuf(rel, buf_nblkno);
+    /* all done, now release the pins on primary buckets. */
+    _hash_dropbuf(rel, buf_oblkno);
+    _hash_dropbuf(rel, buf_nblkno);

     return;

@@ -1068,10 +1068,11 @@ _hash_alloc_buckets(Relation rel, BlockNumber firstblock, uint32 nblocks)
  * while a split is in progress.
  *
  * In addition, the caller must have created the new bucket's base page,
- * which is passed in buffer nbuf, pinned and write-locked. That lock and
- * pin are released here. (The API is set up this way because we must do
- * _hash_getnewbuf() before releasing the metapage write lock. So instead of
- * passing the new bucket's start block number, we pass an actual buffer.)
+ * which is passed in buffer nbuf, pinned and write-locked. The lock will be
+ * released here and pin must be released by the caller. (The API is set up
+ * this way because we must do _hash_getnewbuf() before releasing the metapage
+ * write lock. So instead of passing the new bucket's start block number, we
+ * pass an actual buffer.)
  */
 static void
 _hash_splitbucket(Relation rel,
@@ -1280,8 +1281,9 @@ _hash_splitbucket(Relation rel,

     /*
      * After the split is finished, mark the old bucket to indicate that it
-     * contains deletable tuples. Vacuum will clear split-cleanup flag after
-     * deleting such tuples.
+     * contains deletable tuples. We will clear split-cleanup flag after
+     * deleting such tuples either at the end of split or at the next split
+     * from old bucket or at the time of vacuum.
      */
     oopaque->hasho_flag |= LH_BUCKET_NEEDS_SPLIT_CLEANUP;

@@ -1314,6 +1316,28 @@ _hash_splitbucket(Relation rel,
     }

     END_CRIT_SECTION();
+
+    /*
+     * If possible, clean up the old bucket. We might not be able to do this
+     * if someone else has a pin on it, but if not then we can go ahead. This
+     * isn't absolutely necessary, but it reduces bloat; if we don't do it now,
+     * VACUUM will do it eventually, but maybe not until new overflow pages
+     * have been allocated. Note that there's no need to clean up the new
+     * bucket.
+     */
+    if (IsBufferCleanupOK(bucket_obuf))
+    {
+        LockBuffer(bucket_nbuf, BUFFER_LOCK_UNLOCK);
+        hashbucketcleanup(rel, obucket, bucket_obuf,
+                          BufferGetBlockNumber(bucket_obuf), NULL,
+                          maxbucket, highmask, lowmask, NULL, NULL, true,
+                          NULL, NULL);
+    }
+    else
+    {
+        LockBuffer(bucket_nbuf, BUFFER_LOCK_UNLOCK);
+        LockBuffer(bucket_obuf, BUFFER_LOCK_UNLOCK);
+    }
 }

 /*
@@ -1434,8 +1458,7 @@ _hash_finish_split(Relation rel, Buffer metabuf, Buffer obuf, Bucket obucket,
                       nbucket, obuf, bucket_nbuf, tidhtab,
                       maxbucket, highmask, lowmask);

-    _hash_relbuf(rel, bucket_nbuf);
-    LockBuffer(obuf, BUFFER_LOCK_UNLOCK);
+    _hash_dropbuf(rel, bucket_nbuf);
     hash_destroy(tidhtab);
 }

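
The diff shifts responsibility for locks and pins: `_hash_splitbucket` now releases both buffer locks itself, and its callers drop only the pins (`_hash_dropbuf` in place of `_hash_relbuf`). The sketch below models that contract with a hypothetical `ToyBuffer`; none of the `toy_*` functions are PostgreSQL APIs. It assumes that the cleanup routine returns with the old bucket unlocked but still pinned, which is what the caller-side `_hash_dropbuf` calls in the diff imply.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy buffer: a pin count plus one exclusive-lock bit. Not PostgreSQL's bufmgr. */
typedef struct ToyBuffer {
    int  pins;
    bool locked;
} ToyBuffer;

/* Release the content lock only; the pin stays. */
static void toy_unlock(ToyBuffer *b)
{
    assert(b->locked);
    b->locked = false;
}

/* Release one pin only (the role _hash_dropbuf plays in the diff). */
static void toy_dropbuf(ToyBuffer *b)
{
    assert(b->pins > 0);
    b->pins--;
}

/* Release lock and pin together (the role _hash_relbuf played before). */
static void toy_relbuf(ToyBuffer *b)
{
    toy_unlock(b);
    toy_dropbuf(b);
}

/* Cleanup is safe only if we hold the lock and ours is the only pin. */
static bool toy_cleanup_ok(const ToyBuffer *b)
{
    return b->locked && b->pins == 1;
}

/*
 * Tail of the split: both buffers arrive pinned and locked, and both leave
 * pinned but unlocked. Returns true if the old bucket was cleaned.
 */
static bool toy_end_of_split(ToyBuffer *obuf, ToyBuffer *nbuf)
{
    bool cleaned = false;

    if (toy_cleanup_ok(obuf)) {
        toy_unlock(nbuf);
        /* The cleanup pass would run here; in this model it ends by unlocking obuf. */
        toy_unlock(obuf);
        cleaned = true;
    } else {
        toy_unlock(nbuf);
        toy_unlock(obuf);
    }
    return cleaned;
}
```

In both branches the caller gets back two unlocked, still-pinned buffers, so it finishes with two `toy_dropbuf` calls. Calling `toy_relbuf` at that point would trip the assertion in `toy_unlock`, which mirrors why the callers in the diff switch from `_hash_relbuf` to `_hash_dropbuf`.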