author Tatsuo Ishii 2017-09-22 02:50:28 +0000
committer Tatsuo Ishii 2017-09-22 02:50:28 +0000
commit be3712a31628a4378f0844c9181463c42fac5dd3
tree e9f89cc2b63baded2c73c73436f80602506a13e8
parent e18632ebed7069243d919293cc4e3aa760c3d91a
Fix bug mistakenly overriding global backend status right after failover.
In [pgpool-general: 5728] it was reported that even after failover disconnects a backend, its status can change from "down" back to "up" under certain timing. After debugging I found that the backend status in pgpool_status was changed to down, then changed again by the first connection from a client after the failover.

This happened in new_connection(), which is in charge of creating a new connection to a backend. It checks the locally cached status of the backend and, if it is up, tries to connect to the backend. In this particular case the failover was triggered by failover_if_affected_tuples_mismatch, so the backend is actually alive and new_connection() succeeds in establishing a connection to the disconnected backend. It then overrides the global status and the pgpool_status file.

The fix is to check whether the local backend status is obsolete: if the global status does not agree with the local status, skip the attempt to establish the connection. In this report the user was using native replication mode, but I think a similar situation can happen in other modes.
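The race described above can be replayed in isolation. The following is only a minimal sketch, not pgpool-II code: the status enum, the two arrays standing in for the shared-memory (global) and per-process (local) status, and the helper names try_connect() and simulate_failover_race() are all hypothetical. It assumes, as the message states, that the backend is physically alive and that a successful connection marks it as up globally.

```c
#include <assert.h>

/* Hypothetical, simplified status values; not pgpool-II's real definitions. */
typedef enum { ST_UNUSED, ST_CONNECT_WAIT, ST_UP, ST_DOWN } status_t;

#define NUM_BACKENDS 2

static status_t global_status[NUM_BACKENDS]; /* stands in for shared memory */
static status_t local_status[NUM_BACKENDS];  /* per-process cached copy */

static int usable(status_t s)
{
    return s == ST_UP || s == ST_CONNECT_WAIT;
}

/*
 * Try to "connect" to backend i. The backend is assumed to be physically
 * alive, so an attempted connection always succeeds, and success marks the
 * backend as up globally. Returns 1 if connected, 0 if the slot was skipped.
 */
static int try_connect(int i, int check_global)
{
    assert(i >= 0 && i < NUM_BACKENDS);

    if (!usable(local_status[i]))
        return 0;               /* local cache says down: skip */

    if (check_global && !usable(global_status[i]))
        return 0;               /* local cache is stale: skip */

    global_status[i] = ST_UP;   /* successful connect overwrites global status */
    return 1;
}

/* Replays the race: both backends up, then failover detaches backend 1. */
static status_t simulate_failover_race(int check_global)
{
    for (int i = 0; i < NUM_BACKENDS; i++)
        global_status[i] = local_status[i] = ST_UP;

    global_status[1] = ST_DOWN; /* failover updates only the global status */

    for (int i = 0; i < NUM_BACKENDS; i++)
        try_connect(i, check_global);

    return global_status[1];    /* final global status of the detached backend */
}
```

In this sketch, simulate_failover_race(0) leaves backend 1 marked ST_UP again (the reported bug), while simulate_failover_race(1) leaves it ST_DOWN because the stale local status is no longer trusted.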
-rw-r--r-- src/protocol/pool_connection_pool.c 16
1 files changed, 16 insertions, 0 deletions
diff --git a/src/protocol/pool_connection_pool.c b/src/protocol/pool_connection_pool.c
index a84134cd1..320f76d17 100644
--- a/src/protocol/pool_connection_pool.c
+++ b/src/protocol/pool_connection_pool.c
@@ -845,6 +845,22 @@ static POOL_CONNECTION_POOL *new_connection(POOL_CONNECTION_POOL *p)
 			continue;
 		}
+		/*
+		 * Make sure that the global backend status in the shared memory
+		 * agrees the local status checked by VALID_BACKEND. It is possible
+		 * that the local status is up, while the global status has been
+		 * changed to down by failover.
+		 */
+		if (BACKEND_INFO(i).backend_status != CON_UP &&
+			BACKEND_INFO(i).backend_status != CON_CONNECT_WAIT)
+		{
+			ereport(DEBUG1,
+					(errmsg("creating new connection to backend"),
+					 errdetail("skipping backend slot %d because global backend_status = %d",
+							   i, BACKEND_INFO(i).backend_status)));
+			continue;
+		}
+
 		s = palloc(sizeof(POOL_CONNECTION_POOL_SLOT));
 		if (create_cp(s, i) == NULL)