Don't bother to lock bufmgr partitions in pg_buffercache.
author    Heikki Linnakangas <heikki.linnakangas@iki.fi>
          Thu, 29 Sep 2016 10:16:30 +0000 (13:16 +0300)
committer Heikki Linnakangas <heikki.linnakangas@iki.fi>
          Thu, 29 Sep 2016 10:16:30 +0000 (13:16 +0300)
That makes the view a lot less disruptive to use on a production system.
Without the locks, you don't get a consistent snapshot across all buffers,
but that's OK. It wasn't a very useful guarantee in practice.
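
For illustration, the per-buffer pattern the scan now relies on looks roughly
like this (a minimal sketch, not the exact committed code; it uses the 9.6-era
buffer-descriptor API and fills in only a few of the view's columns):

    #include "storage/buf_internals.h"

    /*
     * Scan all buffers without taking any bufmgr partition locks.
     * Each buffer header is still locked (a spinlock, not an LWLock)
     * while its fields are copied, so every individual row is
     * self-consistent even though the view as a whole is no longer
     * an atomic snapshot.
     */
    for (i = 0; i < NBuffers; i++)
    {
        BufferDesc *bufHdr = GetBufferDescriptor(i);
        uint32      buf_state = LockBufHdr(bufHdr);

        fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
        fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
        fctx->record[i].isvalid = (buf_state & BM_VALID) != 0;
        fctx->record[i].isdirty = (buf_state & BM_DIRTY) != 0;
        fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);

        UnlockBufHdr(bufHdr, buf_state);    /* clears the header lock bit */
    }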

Ivan Kartyshov, reviewed by Tomas Vondra and Robert Haas.

Discussion: <f9d6cab2-73a7-7a84-55a8-07dcb8516ae5@postgrespro.ru>

contrib/pg_buffercache/pg_buffercache_pages.c

index da13bde3595c79ff03c99daf38c8f04a4d320d5d..8bebf2384dffdb33d1acc046c68eae682baea35b 100644
@@ -135,18 +135,13 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
                /* Return to original context when allocating transient memory */
                MemoryContextSwitchTo(oldcontext);
 
-               /*
-                * To get a consistent picture of the buffer state, we must lock all
-                * partitions of the buffer map.  Needless to say, this is horrible
-                * for concurrency.  Must grab locks in increasing order to avoid
-                * possible deadlocks.
-                */
-               for (i = 0; i < NUM_BUFFER_PARTITIONS; i++)
-                       LWLockAcquire(BufMappingPartitionLockByIndex(i), LW_SHARED);
-
                /*
                 * Scan through all the buffers, saving the relevant fields in the
                 * fctx->record structure.
+                *
+                * We don't hold the partition locks, so we don't get a consistent
+                * snapshot across all buffers, but we do grab the buffer header
+                * locks, so the information of each buffer is self-consistent.
                 */
                for (i = 0; i < NBuffers; i++)
                {
@@ -179,16 +174,6 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
 
                        UnlockBufHdr(bufHdr, buf_state);
                }
-
-               /*
-                * And release locks.  We do this in reverse order for two reasons:
-                * (1) Anyone else who needs more than one of the locks will be trying
-                * to lock them in increasing order; we don't want to release the
-                * other process until it can get all the locks it needs. (2) This
-                * avoids O(N^2) behavior inside LWLockRelease.
-                */
-               for (i = NUM_BUFFER_PARTITIONS; --i >= 0;)
-                       LWLockRelease(BufMappingPartitionLockByIndex(i));
        }
 
        funcctx = SRF_PERCALL_SETUP();