Tatsuo Ishii [Fri, 18 Jul 2025 05:45:42 +0000 (14:45 +0900)]
Fix coding issue regarding shift operation.
Per Coverity.
Tatsuo Ishii [Fri, 18 Jul 2025 05:40:14 +0000 (14:40 +0900)]
Fix memory leak.
Fix resource leak in pool_push_pending_data pointed out by Coverity.
Backpatch-through: v4.2
Tatsuo Ishii [Thu, 17 Jul 2025 10:22:49 +0000 (19:22 +0900)]
Add .git-blame-ignore-revs.
.git-blame-ignore-revs lists commits to be ignored by git blame
command. Any indentation fix commit using pgindent should be added to
the file. How to add an entry to the file is explained in the beginning
of the file.
Also add 468573ad3 to the file as the first entry.
Tatsuo Ishii [Thu, 17 Jul 2025 10:15:48 +0000 (19:15 +0900)]
Run pgindent.
Tatsuo Ishii [Thu, 17 Jul 2025 09:54:13 +0000 (18:54 +0900)]
Fix method to run pgindent.
Commit fd190f7ea imported pgindent but the method explained in
README.pgpool was wrong. typedefs.list can be generated by using
PostgreSQL's find_typedef. So import find_typedef and remove
unnecessary files. Proper way to run pgindent is explained in
README.pgpool.
Bo Peng [Thu, 17 Jul 2025 05:31:02 +0000 (14:31 +0900)]
Feature: Make online recovery database configurable
Prior to version 4.6, the online recovery database was hardcoded to "template1".
This commit introduces a new configuration parameter, "recovery_database",
which allows users to specify the database used for online recovery.
The default value is "postgres".
Taiki Koshino [Wed, 16 Jul 2025 05:27:47 +0000 (14:27 +0900)]
Doc: Fix example script link at master.
Modified the sample script in the section "8.2. Pgpool-II + Watchdog Setup Example"
Tatsuo Ishii [Tue, 15 Jul 2025 05:30:47 +0000 (14:30 +0900)]
Feature: implement protocol version 3.2 BackendKeyData and query cancel message.
Starting from PostgreSQL 18, frontend/backend protocol has been
changed to 3.2. In the changes the BackendKeyData and query cancel
message are modified to allow variable length cancel key.
This commit implements the changes and now we can connect to
PostgreSQL frontend and backend using 3.2 protocol.
Example session is:
PGMAXPROTOCOLVERSION="3.2" psql -p 11000 test
Author: Tatsuo Ishii <ishii@postgresql.org>
Discussion: https://www.postgresql.org/message-id/20250714.155710.1706961744888449986.ishii%40postgresql.org
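A minimal sketch of what a variable length cancel key means for parsing
BackendKeyData. It assumes the message body is laid out as Int32 length
(counting itself), Int32 process ID, then the key bytes, and it assumes an
upper bound of 256 bytes for the key. CancelKeySketch, read_be32 and
parse_backend_key_data are hypothetical names for illustration, not
Pgpool-II's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_CANCEL_KEY_LEN 256  /* assumed upper bound for the key */

/* Hypothetical container; not Pgpool-II's actual struct. */
typedef struct
{
    int32_t       pid;
    size_t        key_len;
    unsigned char key[MAX_CANCEL_KEY_LEN];
} CancelKeySketch;

/* Read a 4-byte big-endian (network order) integer. */
static uint32_t
read_be32(const unsigned char *p)
{
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
           ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/*
 * Parse the body of a BackendKeyData message, i.e. everything after the
 * 'K' type byte. Returns 0 on success, -1 if the message is malformed.
 */
static int
parse_backend_key_data(const unsigned char *buf, size_t buflen,
                       CancelKeySketch *out)
{
    uint32_t len;

    assert(buf != NULL && out != NULL);
    if (buflen < 8)
        return -1;              /* too short for length + pid */
    len = read_be32(buf);
    if (len < 8 || len > buflen)
        return -1;
    out->key_len = len - 8;     /* whatever follows the pid is the key */
    if (out->key_len == 0 || out->key_len > MAX_CANCEL_KEY_LEN)
        return -1;
    out->pid = (int32_t) read_be32(buf + 4);
    memcpy(out->key, buf + 8, out->key_len);
    return 0;
}
```

With a fixed 4-byte key (the 3.0 format) the length field is 12. Anything
larger carries a longer key, which is why the key length has to be derived
from the message length rather than hardcoded.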
Tatsuo Ishii [Tue, 15 Jul 2025 02:01:11 +0000 (11:01 +0900)]
Fix bug with pcp_proc_info.
When pcp_proc_info was invoked without the "-v" option, it did
not print the "statement" field. This was due to an oversight in the
frontend side of the command: one more format string "%s" was
missing.
This bug existed only in the master branch, introduced when some new
fields were added to pcp_proc_info.
Reported-by: Bo Peng <pengbo@sraoss.co.jp>
Author: Tatsuo Ishii <ishii@postgresql.org>
Taiki Koshino [Tue, 15 Jul 2025 00:16:37 +0000 (09:16 +0900)]
Doc: fix documentation for enum parameters reported as strings
Fix documentations for 6 parameters.
Japanese docs too.
"log_standby_delay"
"log_backend_messages"
"wd_lifecheck_method"
"memqcache_method"
"disable_load_balance_on_write"
"backend_clustering_mode"
Taiki Koshino [Tue, 8 Jul 2025 06:06:36 +0000 (15:06 +0900)]
Doc: fix documentation for parameters that are not reflected by reload.
"authentication_timeout" and "memqcache_oiddir" are not reflected by reload.
The documentation is changed to "This parameter can only be set at server start.".
Japanese doc too.
Tatsuo Ishii [Thu, 10 Jul 2025 11:04:06 +0000 (20:04 +0900)]
Import pgindent.
Import PostgreSQL's pgindent.
This commit not only imports PostgreSQL's pgindent, but also generates
the important file typedefs.list. For this purpose the following are added:
- README.pgpool: How to generate typedefs.list.
- doxygen.list: Pgpool-II's typedefs extracted by doxygen. Plus
manually added typedefs that were not detected by doxygen.
- enums.list: Pgpool-II's enums manually extracted from source code.
- exclude_files: files that should not be touched by pgindent.
- run_pgindent: handy script to run pgindent. Should be run at src
directory.
- typedefs.list.PostgreSQL: PostgreSQL's typedefs, included in case
doxygen misses some typedefs.
- make_typedefs.list: handy script to generate typedefs.list.
Tatsuo Ishii [Wed, 9 Jul 2025 06:58:16 +0000 (15:58 +0900)]
Feature: implement NegotiateProtocolVersion message.
Implementing the message is necessary when frontend requests the
protocol version 3.2 (i.e. PostgreSQL 18+ or compatible clients),
while backend still only supports 3.0 (i.e. backend is PostgreSQL 17
or before).
This commit handles the message so that the message is forwarded from
backend to frontend when no connection cache exists.
If connection cache exists, pgpool sends the message, which has been
saved at the time when the connection cache was created, to frontend.
Note that the frontend/backend protocol 3.2 changes the BackendKeyData
message format, but it's not implemented in this commit yet. This
means that pgpool still cannot handle the 3.2 protocol.
Discussion: https://www.postgresql.org/message-id/20250708.112133.1324153277751075866.ishii%40postgresql.org
Bo Peng [Mon, 30 Jun 2025 02:52:28 +0000 (11:52 +0900)]
Fix broken scram-sha-256 authentication on big-endian machines.
When scram-sha-256 authentication is performed, a hash function
pg_sha_256_final is used. It was imported from PostgreSQL and it uses
preprocessor define WORDS_BIGENDIAN to judge host machine's
endianness. Although WORDS_BIGENDIAN should be defined by
configure, this part was missed when pg_sha_256_final (and others) were
imported from PostgreSQL. As a result, scram-sha-256 worked only on
little-endian machines. This commit fixes the issue by adding the
AC_C_BIGENDIAN macro to configure.ac.
Author: Tatsuo Ishii
Reported-by: Christoph Berg
Reviewed-by: pranavkaruvally
Discussion: https://github.com/pgpool/pgpool2/issues/106
Backpatch-through: v4.2
Tatsuo Ishii [Fri, 27 Jun 2025 06:50:33 +0000 (15:50 +0900)]
Test: more fix to 038.pcp_commands regression test.
Commit 04e09df17 was not enough of a fix. The test calls pcp_proc_info()
pgpool_adm function along with user name and password (in the test
password is the same string as user name). Problem is, the user name
is obtained from a user name that runs the test, and we use psql -a to
submit the SQL, which prints the user name. Of course the user name
can vary depending on the environment, and it makes the test fail. To
fix the issue, run psql without -a option.
Tatsuo Ishii [Wed, 25 Jun 2025 10:32:32 +0000 (19:32 +0900)]
Test: fix 038.pcp_commands regression test.
The result of the test showed local host IP. Although the IP can be
either IPv4 or IPv6, the test script hadn't considered it. To fix
this, now test.sh converts IPv4 and IPv6 IP to "localhost".
Tatsuo Ishii [Tue, 24 Jun 2025 10:16:15 +0000 (19:16 +0900)]
Feature: add pgpool_adm_pcp_proc_info.
This commit adds new pgpool_adm extension function:
pcp_proc_info. Also add new fields: client_host, client_port and SQL
statement to pcp_proc_info and "show pool_pools". With these additions
now it is possible to track the relationship among clients of pgpool,
pgpool itself and PostgreSQL.
Moreover the commit makes it possible to see which commands (statements)
were last executed by using pcp_proc_info. Previously this was not
possible without looking into the pgpool log.
libpcp.so version is bumped from 2.0 to 2.1.
Bo Peng [Mon, 23 Jun 2025 02:54:30 +0000 (11:54 +0900)]
Fix source code typos.
Tatsuo Ishii [Wed, 18 Jun 2025 08:04:13 +0000 (17:04 +0900)]
Doc: fix load balance explanation missed logical replication mode.
Backpatch-through: v4.2
Bo Peng [Wed, 18 Jun 2025 09:12:13 +0000 (18:12 +0900)]
Doc: enhance pcp_node_info document.
Clarify that each backend_application_nameX must match the value specified
in the application_name of primary_conninfo to correctly display
"replication_state" and "replication_sync_state".
Tatsuo Ishii [Sat, 14 Jun 2025 11:54:13 +0000 (20:54 +0900)]
Enhance lifecheck log.
Previously when wd_lifecheck_method = 'query', life checking printed
SQL without an application name when "%a" was specified in
log_line_prefix. This commit adds application_name "lifecheck_ping" to
make the log look better. Since this changes user visible behavior,
I do not apply this to stable branches.
Discussion: [pgpool-hackers: 4603] life check log is not nice
https://www.pgpool.net/pipermail/pgpool-hackers/2025-June/004604.html
Tatsuo Ishii [Sat, 14 Jun 2025 11:12:57 +0000 (20:12 +0900)]
Fix heartbeat device treatment.
wd_create_hb_recv_socket() and wd_create_hb_send_socket() called
setsockopt(2) with wrong argument.
struct ifreq i;
strlcpy(i.ifr_name, hb_if->if_name, sizeof(i.ifr_name));
if (setsockopt(sock, SOL_SOCKET, SO_BINDTODEVICE, &i, sizeof(i)) == -1)
:
This is not quite correct since the 4th argument should be
just a null terminated string (device name), not struct ifreq.
Discussion: [pgpool-hackers: 4602] heartbeat and SO_BINDTODEVICE
https://www.pgpool.net/pipermail/pgpool-hackers/2025-May/004603.html
Backpatch-through: v4.6
Tatsuo Ishii [Sat, 14 Jun 2025 07:15:59 +0000 (16:15 +0900)]
Fix resource leak in heartbeat receiver process.
Pointed out by Coverity.
Backpatch-through: v4.6
Tatsuo Ishii [Sat, 14 Jun 2025 06:13:30 +0000 (15:13 +0900)]
Remove or downgrade inappropriate log messages at pgpool startup.
- Log regarding total shared memory allocation size was redundant.
- Other logs were too verbose and downgraded to DEBUG1.
Tatsuo Ishii [Fri, 13 Jun 2025 01:08:12 +0000 (10:08 +0900)]
Enhance connecting process to backend.
In certain environments (especially k8s), DNS lookup is unstable and
connecting to a backend process fails. This occurs in the call to
getaddrinfo() in connect_inet_domain_socket_by_port(). To improve the
situation, retry up to 5 times (sleeping 1 second at each retry) if
getaddrinfo() fails with EAI_AGAIN. Note that if
connect_inet_domain_socket_by_port() is called with the "retry" argument
set to false, the retry will not happen. Health check calls
connect_inet_domain_socket_by_port() with the retry flag set to false so
that retrying is controlled by health check's own parameters.
Since no similar issue has been reported up to now, back patch only to
4.6 to keep backpatching minimal.
Discussion: https://github.com/pgpool/pgpool2/issues/104
Backpatch-through: v4.6
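The retry policy described above can be sketched roughly as follows.
resolve_with_retry and MAX_LOOKUP_RETRIES are hypothetical names; the
actual logic lives in connect_inet_domain_socket_by_port() and may differ
in detail.

```c
#include <assert.h>
#include <netdb.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_LOOKUP_RETRIES 5    /* retry count from the commit message */

/*
 * Resolve host/port, retrying on EAI_AGAIN when "retry" is true.
 * Returns getaddrinfo()'s result code; on 0 the caller owns *res
 * and must release it with freeaddrinfo().
 */
static int
resolve_with_retry(const char *host, const char *port, bool retry,
                   struct addrinfo **res)
{
    struct addrinfo hints;
    int             ret;
    int             attempt = 0;

    assert(res != NULL);
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;

    for (;;)
    {
        ret = getaddrinfo(host, port, &hints, res);
        /* Only a temporary failure is worth retrying. */
        if (ret != EAI_AGAIN || !retry || attempt >= MAX_LOOKUP_RETRIES)
            return ret;
        attempt++;
        sleep(1);               /* wait before the next attempt */
    }
}
```

A caller that has its own retry parameters, as health check does, would
pass false for "retry" and handle failures itself.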
Tatsuo Ishii [Mon, 9 Jun 2025 03:49:36 +0000 (12:49 +0900)]
Fix heartbeat_device treatment.
While processing pgpool.conf, heartbeat_device was handled incorrectly
and the first device was ignored. For example:
heartbeat_device0 = 'eth0'
the configuration process disregarded 'eth0' and acted as if no device
was set. Another example:
heartbeat_device0 = 'eth0;eth1'
"eth0" was simply ignored.
Reviewed-by: Bo Peng <pengbo@sraoss.co.jp>
Backpatch-through: v4.2
Tatsuo Ishii [Sun, 8 Jun 2025 11:25:48 +0000 (20:25 +0900)]
Test: stabilize 029.cert_passphrase regression test.
When ssl_passphrase_command is not valid, the error message is
typically "bad decrypt" but it seems it is sometimes "wrong tag".
Tatsuo Ishii [Sun, 8 Jun 2025 06:32:00 +0000 (15:32 +0900)]
Doc: add section of kernel resources.
Pgpool-II uses System V shared memory and semaphores. It's better to
describe the requirements in the docs.
Backpatch-through: v4.2
Tatsuo Ishii [Sat, 7 Jun 2025 07:18:33 +0000 (16:18 +0900)]
Doc: add description for --with-ldap option of configure.
It was missed when LDAP support was introduced in v4.2
Backpatch-through: v4.2
Tatsuo Ishii [Thu, 5 Jun 2025 09:59:25 +0000 (18:59 +0900)]
Replace random() with pg_prng random function.
Previously we used random() for choosing load balancing node. However
PostgreSQL has a better random number generator: pg_prng.c. This commit
imports the file and uses pg_prng_double() to generate random number in
range [0.0, 1.0). The seed is generated using pg_strong_random().
Other notes regarding the port:
- Some of functions in the file were not ported because they require
additional library: pg_bitutils.c. In the future we may revisit and
import pg_bitutils.c.
- All conditional compilation regarding "sun" or "_sun" is removed. It
seems the platform is not used for running pgpool anymore.
- Since srandom() is no longer necessary, related code is removed
from pgpool_main.c, child.c and pcp_worker.c.
Author: Martijn van Duren <pgpool@list.imperialat.at>, Tatsuo Ishii <ishii@postgresql.org>
Discussion: [pgpool-hackers: 4588] Shuffle random functions and use better random numbers
https://www.pgpool.net/pipermail/pgpool-hackers/2025-May/004589.html
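To illustrate how a random number in [0.0, 1.0) can drive the choice of a
load balancing node, here is a minimal sketch of weighted selection.
choose_node is a hypothetical function, not Pgpool-II's actual selection
code. The value r stands in for what pg_prng_double() would return and is
passed as a parameter so that the sketch does not depend on pg_prng.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Pick an index from "weights" (non-negative values) using a random
 * value r in [0.0, 1.0). Returns -1 if no node has a positive weight.
 */
static int
choose_node(const double *weights, int n, double r)
{
    double total = 0.0;
    double upper = 0.0;
    int    last = -1;
    int    i;

    assert(weights != NULL && r >= 0.0 && r < 1.0);
    for (i = 0; i < n; i++)
        total += weights[i];
    if (total <= 0.0)
        return -1;

    for (i = 0; i < n; i++)
    {
        if (weights[i] <= 0.0)
            continue;           /* a node with weight 0 is never chosen */
        upper += weights[i] / total;
        last = i;
        if (r < upper)
            return i;
    }
    return last;                /* guard against floating-point rounding */
}
```

Because r never reaches 1.0, the half-open range maps cleanly onto the
cumulative weights without a special case for the last node.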
Tatsuo Ishii [Thu, 5 Jun 2025 10:42:40 +0000 (19:42 +0900)]
Fix heartbeat receiver not working.
65dbbe7a0 added IPv6 support for heartbeat in 4.6. However it
mistakenly bound to only loopback addresses in heartbeat receive
process. Thus heartbeat messages from other watchdog heartbeat sender
were never received. To fix this add AI_PASSIVE flag to hints argument
to getaddrinfo(), which results in binding all network
interfaces. Note that before 4.6, heartbeat receive process uses
INADDR_ANY for bind(), which resulted in binding all network
interfaces too. So there's no big difference between 4.6 and pre-4.6.
Reviewed-by: Bo Peng <pengbo@sraoss.co.jp>
Backpatch-through: v4.6
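The effect of AI_PASSIVE can be checked with a small sketch.
lookup_is_wildcard is a hypothetical helper and the port string used with
it is arbitrary. With a NULL node, getaddrinfo() returns loopback
addresses unless AI_PASSIVE is set, in which case it returns the wildcard
addresses suitable for bind().

```c
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Return true if every address getaddrinfo() yields for a NULL node
 * with the given flags is the wildcard address (0.0.0.0 or ::).
 */
static bool
lookup_is_wildcard(const char *port, int flags)
{
    struct addrinfo  hints;
    struct addrinfo *res;
    struct addrinfo *ai;
    bool             all_wildcard = true;

    assert(port != NULL);
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_DGRAM;
    hints.ai_flags = flags;

    if (getaddrinfo(NULL, port, &hints, &res) != 0)
        return false;

    for (ai = res; ai != NULL; ai = ai->ai_next)
    {
        if (ai->ai_family == AF_INET)
        {
            const struct sockaddr_in *in4 =
                (const struct sockaddr_in *) ai->ai_addr;

            if (in4->sin_addr.s_addr != htonl(INADDR_ANY))
                all_wildcard = false;
        }
        else if (ai->ai_family == AF_INET6)
        {
            const struct sockaddr_in6 *in6 =
                (const struct sockaddr_in6 *) ai->ai_addr;

            if (!IN6_IS_ADDR_UNSPECIFIED(&in6->sin6_addr))
                all_wildcard = false;
        }
    }
    freeaddrinfo(res);
    return all_wildcard;
}
```

A receiver bound to the loopback results would only ever see packets sent
from the same host, which matches the symptom described in the commit.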
Bo Peng [Thu, 5 Jun 2025 03:32:08 +0000 (12:32 +0900)]
Doc: fix command in "8.2. Pgpool-II + Watchdog Setup Example" to escape $PGDATA.
Tatsuo Ishii [Wed, 4 Jun 2025 11:10:34 +0000 (20:10 +0900)]
Doc: clarify supported platforms for Pgpool-II.
Backpatch-through: v4.2
Tatsuo Ishii [Wed, 4 Jun 2025 07:32:30 +0000 (16:32 +0900)]
Enhance log message in creating watchdog receive socket.
This is a follow up commit to:
cea80281d Retry bind on watchdog receive socket.
Use getnameinfo() so that log messages contain hostname, rather than just
"TCP".
Tatsuo Ishii [Tue, 3 Jun 2025 10:33:09 +0000 (19:33 +0900)]
Doc: enhance child_life_time document.
Backpatch-through: v4.2
Tatsuo Ishii [Tue, 3 Jun 2025 03:40:42 +0000 (12:40 +0900)]
Retry bind on watchdog receive socket.
Occasionally 028.watchdog_enable_consensus_with_half_votes times out
due to failure on binding watchdog receive socket. This commit tries
to mitigate the issue by retrying bind. Currently the retry is
performed up to 5 times and each retry is with 1 second sleep.
Tatsuo Ishii [Mon, 2 Jun 2025 10:37:39 +0000 (19:37 +0900)]
Fix typo in pgpool.conf.
Backpatch-through: v4.3
Bo Peng [Thu, 29 May 2025 00:28:04 +0000 (09:28 +0900)]
Doc: add release note.
Tatsuo Ishii [Wed, 28 May 2025 12:19:25 +0000 (21:19 +0900)]
Import likely/unlikely from PostgreSQL.
These macros are not only useful to enhance performance (if correctly
used) but also make porting code from PostgreSQL to pgpool easier since
the macros are occasionally used in that code.
Discussion: [pgpool-hackers: 4599] Porting likely/unlikely
https://www.pgpool.net/pipermail/pgpool-hackers/2025-May/004600.html
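For reference, a sketch of how such macros are typically defined on top of
GCC's __builtin_expect, with a plain fallback for other compilers. The
exact definitions imported into pgpool may differ. sum_values is a
hypothetical function showing a typical use on a rare error path.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) && __GNUC__ >= 3
#define likely(x)   __builtin_expect((x) != 0, 1)
#define unlikely(x) __builtin_expect((x) != 0, 0)
#else
#define likely(x)   ((x) != 0)
#define unlikely(x) ((x) != 0)
#endif

/* Sum an array; the NULL check is marked as the rare path. */
static long
sum_values(const int *values, size_t n)
{
    long   total = 0;
    size_t i;

    if (unlikely(values == NULL))
        return 0;
    for (i = 0; i < n; i++)
        total += values[i];
    return total;
}
```

The macros only hint at the expected branch. They do not change the
result of the condition, so a wrong hint costs performance, not
correctness.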
Tatsuo Ishii [Tue, 27 May 2025 10:19:57 +0000 (19:19 +0900)]
Revert "Replace random() with pg_prng random function."
This reverts commit 66fcd561d74c8f00326bad94300053bd7ea13566.
It was accidentally committed.
Tatsuo Ishii [Tue, 27 May 2025 10:15:54 +0000 (19:15 +0900)]
Fix watchdog receive socket creation without IPv6.
When IPv6 network is not available, it was possible that the watchdog
process would not start. Previously wd_create_recv_socket() issued
elog(ERROR) if creation or handling IPv6 socket failed. Unfortunately
at the time when wd_create_recv_socket() is called, the exception
stack is not established, and elog happily converts ERROR to FATAL,
which causes exiting watchdog process, thus exiting pgpool process.
To fix this, the elog(ERROR) calls are changed to elog(LOG).
Reported-by: Bo Peng (pengbo@sraoss.co.jp)
Discussion: https://github.com/pgpool/pgpool2/issues/99
Backpatch-through: v4.6
Tatsuo Ishii [Wed, 21 May 2025 06:24:52 +0000 (15:24 +0900)]
Replace random() with pg_prng random function.
Previously we used random() for choosing load balancing node. However
PostgreSQL has a better random number generator: pg_prng.c. This commit
imports the file and uses pg_prng_double() to generate random number in
range [0.0, 1.0).
Other notes regarding the port:
- pg_prng needs to be initialized using pg_prng_strong_seed() per
process. Currently the only caller is child.c (per session
process). If other process needs to use pg_prng, it needs the same
initialization as child.c.
- Some of functions in the file were not ported because they require
additional library: pg_bitutils.c. In the future we may revisit and
import pg_bitutils.c.
- likely/unlikely are ignored. In the future we may revisit importing
them.
- All conditional compilation regarding "sun" or "_sun" is removed. It
seems the platform is not used for running pgpool anymore.
- Since srandom() is no longer necessary, related code is removed
from pgpool_main.c, child.c and pcp_worker.c.
Discussion: [pgpool-hackers: 4588] Shuffle random functions and use better random numbers
https://www.pgpool.net/pipermail/pgpool-hackers/2025-May/004589.html
Tatsuo Ishii [Tue, 20 May 2025 06:04:28 +0000 (15:04 +0900)]
Fix oversight in pg_strong_random commit.
In the commit I forgot to test the case without SSL, which requires
including <errno.h>.
Author: Bo Peng <pengbo@sraoss.co.jp>
Tatsuo Ishii [Tue, 20 May 2025 00:47:02 +0000 (09:47 +0900)]
Replace PostmasterRandom() with pg_strong_random().
Our PostmasterRandom() was imported from PostgreSQL a long time ago (in
2016). In the same year PostgreSQL replaced PostmasterRandom() with
pg_strong_random() (src/port/pg_strong_random.c). This commit follows
it.
pg_strong_random() looks better than PostmasterRandom(), since it's
more secure and portable. Moreover no initialization is necessary.
Reviewed-by: Martijn van Duren <pgpool@list.imperialat.at>
Discussion: [pgpool-hackers: 4588] Shuffle random functions and use better random numbers
https://www.pgpool.net/pipermail/pgpool-hackers/2025-May/004589.html
Tatsuo Ishii [Sat, 17 May 2025 06:24:23 +0000 (15:24 +0900)]
Suppress unnecessary information upon authentication failure.
Previously a message "password size does not match" was displayed when
client authentication failed. This could help an attacker to guess
password. Replace it with just "password does not match".
Backpatch-through: v4.2
Tatsuo Ishii [Thu, 15 May 2025 09:03:50 +0000 (18:03 +0900)]
Allow pcp clients to connect to IPv6 addresses.
We have already allowed pcp server to connect to IPv6 addresses, but
pcp clients were not allowed to connect to them until today. This
commit allows pcp clients to connect to IPv6 addresses.
Discussion: [pgpool-general: 9481] Does pgpool 4.6.0 support pure ipv6 configuration?
https://www.pgpool.net/pipermail/pgpool-general/2025-May/009484.html
Backpatch-through: v4.6
Bo Peng [Thu, 15 May 2025 07:07:26 +0000 (16:07 +0900)]
Doc: Update release notes to include details of the vulnerability fix.
Bo Peng [Thu, 15 May 2025 02:28:29 +0000 (11:28 +0900)]
This commit is a follow-up to commit d92a7e2.
Bo Peng [Tue, 13 May 2025 09:29:54 +0000 (18:29 +0900)]
Doc: update release note.
Bo Peng [Tue, 13 May 2025 09:06:36 +0000 (18:06 +0900)]
Doc: update release note.
Bo Peng [Tue, 13 May 2025 08:37:06 +0000 (17:37 +0900)]
Fix incorrect client authentication in some cases.
If enable_pool_hba = on, its auth method is "password", no password
is registered in pool_passwd, and the auth method in pg_hba.conf is
"scram-sha-256" or "md5", then the first time a client connects to
pgpool, authentication is performed as expected. But if a client
connects to the cached connection, any password from the client is
accepted.
authenticate_frontend() asks the client for a password and stores it in
frontend->password. When pgpool authenticates the backend,
authenticate_frontend_SCRAM() or authenticate_frontend_md5() is called
depending on the pg_hba.conf setting. authenticate_frontend_*() calls
get_auth_password() to get the backend cached password but it mistakenly
returned frontend->password if pool_passwd does not have an entry for
the user. Then authenticate_frontend_*() tries to challenge based on
frontend->password. As a result, they compared frontend->password
with itself, which always succeeds. To fix this, when get_auth_password()
is called with the reauth parameter being non 0, return backend->password.
Also if enable_pool_hba = off, in some cases a client is not asked for
a password the first time, or when a client connects to a cached
connection, even if it should be.
If pool_hba.conf is disabled, get_backend_connection() does not call
Client_authentication(), thus frontend->password is not set. Then
pool_do_reauth() calls do_clear_text_password(). It should have called
authenticate_frontend_clear_text() to get a password from the client,
but a mistake in an if statement prevented it. The mistake was fixed in
this commit.
Pgpool-II versions affected: v4.0 or later.
Also this commit does the following:
- Remove single PostgreSQL code path to simplify the authentication
code. As a result, following cases are no more Ok.
- Remove crypt authentication support for frontend and backend. The
feature had not been documented and never tested. Moreover crypt
authentication was removed long time ago in PostgreSQL (8.4, 2009).
- Add new regression test "040.client_auth". The test performs
exhaustive client authentication tests using a test specification
file formatted in CSV.
The csv files have 7 fields:
username: the username used for the test case
pool_hba.conf: takes "scram", "md5", "password", "pam", "ldap" or
"off". If "scram", "md5", "password", "pam" or "ldap", the user
will have an entry in pool_hba.conf accordingly. If "off",
enable_pool_hba will be off.
allow_clear_text_frontend_auth: takes "on" or "off".
pool_passwd: takes "AES", "md5" or "off". If "AES" or "md5" the
user's password will be stored in pool_passwd using AES256 or md5
encryption method accordingly. If "off" is specified, no entry will
be created.
pg_hba.conf: almost same as pool_hba.conf except this is for
pg_hba.conf.
expected: takes "ok" or "fail". If "ok", the authentication is
expected to succeed; if it fails, the test is regarded as
failed. "fail" is the opposite: the authentication is expected to
fail; if it succeeds, the test is regarded as failed.
comment: arbitrary comment
By changing these fields, we can easily modify or add test
cases. The merit of this method is potentially higher test
coverage. For humans, it is easier to find uncovered test cases in a
table than in program code.
Backpatch-through: v4.2
The patch was created by Tatsuo Ishii.
Taiki Koshino [Tue, 13 May 2025 06:00:43 +0000 (15:00 +0900)]
Doc: add release notes.
Tatsuo Ishii [Fri, 9 May 2025 01:55:38 +0000 (10:55 +0900)]
Doc: enhance query cache doc.
Pgpool refuses to cache a query calling functions returning TIMESTAMP
WITH TIMEZONE, TIME WITH TIMEZONE. If there are multiple functions
having same name and one of them returns TIMESTAMP WITH TIMEZONE, TIME
WITH TIMEZONE, pgpool refuses to cache even if one of them does not
return the data types. So add a note on this along with workaround.
Tatsuo Ishii [Thu, 8 May 2025 10:49:10 +0000 (19:49 +0900)]
Fix long standing bind bug with query cache.
When a named statement is prepared, it is possible to bind then
execute without a parse message. Problem is, table oids which are
necessary to invalidate query cache at execute or COMMIT were collected
only while processing parse messages (Parse()). Thus if bind is executed
without parse after previous execute, no table oids were collected,
and pgpool failed to invalidate query cache.
Fix is collecting table oids at bind time too.
Add regression test to 006.memqcache.
Problem reported by and test program provided by Achilleas Mantzios
<a.mantzios@cloud.gatewaynet.com>.
Discussion: [pgpool-general: 9427] Clarification on query results cache visibility
https://www.pgpool.net/pipermail/pgpool-general/2025-April/009430.html
Backpatch-through: v4.2
Bo Peng [Thu, 8 May 2025 06:13:10 +0000 (15:13 +0900)]
Fall back to prompting for password if reading from .pcppass file fails.
If reading password from .pcppass file fails, it should fall back to prompting the user for input,
similar to how PostgreSQL handles .pgpass.
This commit also changes the following messages to be displayed without requiring the -d option:
WARNING: password file \"%s\" is not a plain file
WARNING: password file \"%s\" has group or world access; permissions should be u=rw (0600) or less
Discussion: [pgpool-hackers: 4589] If reading password from .pcppass file fails, try to read it from prompt.
https://www.pgpool.net/pipermail/pgpool-hackers/2025-May/004590.html
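A rough sketch of the kind of checks implied by the two warnings above.
password_file_usable is a hypothetical helper, not the function in the pcp
client. It assumes that a file failing either check is skipped, so that
the caller falls back to prompting for the password.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Return true if "path" is a regular file with no group/world access.
 * On false the caller would fall back to prompting for the password.
 */
static bool
password_file_usable(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return false;           /* missing or inaccessible */
    if (!S_ISREG(st.st_mode))
    {
        fprintf(stderr,
                "WARNING: password file \"%s\" is not a plain file\n", path);
        return false;
    }
    if (st.st_mode & (S_IRWXG | S_IRWXO))
    {
        fprintf(stderr,
                "WARNING: password file \"%s\" has group or world access; "
                "permissions should be u=rw (0600) or less\n", path);
        return false;
    }
    return true;
}
```

Printing the warnings unconditionally, as the commit does, tells the user
why the prompt appeared even though a password file exists.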
Tatsuo Ishii [Thu, 1 May 2025 23:35:33 +0000 (08:35 +0900)]
Fix query cache invalidation bug.
When an execute message is received, pgpool checks its max number of
rows parameter. If it's not zero, pgpool sets "partial_fetch" flag to
instruct pool_handle_query_cache() to not create query cache. Problem
is, commit 2a99aa5d1 missed that even INSERT/UPDATE/DELETE sets the
execute message parameter to non 0 (mostly 1) and pgpool set the flag
even for non-SELECTs. This resulted in failing to invalidate query
cache because if the flag is true, subsequent code in
pool_handle_query_cache() skips cache invalidation. It was an
oversight in this commit (my fault):
https://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=2a99aa5d1910f1fd4855c8eb6751a26cbaa5e48d
To fix this, change Execute() to check if the query is a read only SELECT
before setting the flag.
Also add test to 006.memqcache.
Problem reported by and a test program provided by Achilleas Mantzios <a.mantzios@cloud.gatewaynet.com>.
Discussion: [pgpool-general: 9427] Clarification on query results cache visibility
https://www.pgpool.net/pipermail/pgpool-general/2025-April/009430.html
Backpatch-through: v4.2
Tatsuo Ishii [Mon, 5 May 2025 03:40:56 +0000 (12:40 +0900)]
Fix portability to OpenBSD.
- va_list is defined in stdarg.h[0]
- pthread_t is defined in pthread.h / sys/types.h[1]
On OpenBSD sys/types.h doesn't suffice, so include pthread.h.
- LibreSSL has removed HMAC_CTX_init(), and has support for HMAC_CTX_new
since 2018. I've talked to Theo Buehler of LibreSSL and he said that he'd
prefer to simply remove the LIBRESSL_VERSION_NUMBER, but if desired by
upstream the LIBRESSL_VERSION_NUMBER should be 0x2070100fL.
- WIFEXITED is defined in sys/wait.h[2]
Author: Martijn van Duren (pgpool@list.imperialat.at)
Discussion: https://www.pgpool.net/pipermail/pgpool-hackers/2025-May/004583.html
Backpatch-through: v4.2
Bo Peng [Fri, 2 May 2025 07:49:53 +0000 (16:49 +0900)]
Add major version information to the configuration file.
Bo Peng [Thu, 1 May 2025 02:36:55 +0000 (11:36 +0900)]
Fix json_writer did not properly encode special characters.
Pgpool would crash when the watchdog was enabled if wd_authkey contained special characters (e.g., a backslash).
The patch was originally created by Martijn van Duren and revised by Bo Peng.
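To show what encoding special characters involves, here is a minimal
sketch of JSON string escaping. json_escape is a hypothetical function,
not the json_writer code in Pgpool-II, and it covers only quotes,
backslashes and control characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "src" into "dst" as JSON string content (without the enclosing
 * quotes). Returns the number of bytes written, or -1 if dst is too small.
 */
static int
json_escape(const char *src, char *dst, size_t dstsize)
{
    size_t used = 0;

    assert(src != NULL && dst != NULL);
    if (dstsize == 0)
        return -1;

    for (; *src != '\0'; src++)
    {
        unsigned char c = (unsigned char) *src;
        char          tmp[7];   /* longest escape is \uXXXX plus NUL */
        int           n;

        if (c == '"' || c == '\\')
            n = snprintf(tmp, sizeof(tmp), "\\%c", c);
        else if (c == '\n')
            n = snprintf(tmp, sizeof(tmp), "\\n");
        else if (c == '\t')
            n = snprintf(tmp, sizeof(tmp), "\\t");
        else if (c < 0x20)
            n = snprintf(tmp, sizeof(tmp), "\\u%04x", (unsigned int) c);
        else
            n = snprintf(tmp, sizeof(tmp), "%c", c);

        if (used + (size_t) n >= dstsize)
            return -1;          /* keep room for the terminating NUL */
        memcpy(dst + used, tmp, (size_t) n);
        used += (size_t) n;
    }
    dst[used] = '\0';
    return (int) used;
}
```

Without such escaping, a backslash in a value like wd_authkey produces
JSON that the receiving side cannot parse correctly.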
Tatsuo Ishii [Sun, 27 Apr 2025 13:11:20 +0000 (22:11 +0900)]
Fix IPv6 in heartbeat process.
From Pgpool-II 4.6.0, heartbeat process can handle IPv6 receiver
sockets. However, the process does not work normally if IPv6 is
disabled in the system. Like Pgpool-II main process and PostgreSQL, I
think it should work normally if IPv4 is available.
Discussion: https://www.pgpool.net/pipermail/pgpool-hackers/2025-April/004579.html
Backpatch-through: 4.6
Tatsuo Ishii [Thu, 24 Apr 2025 10:11:43 +0000 (19:11 +0900)]
Doc: enhance the description on connection_life_time
connection_life_time is a config value to determine the life time of
cached connections to PostgreSQL backend. Current document lacks a
description that the expiration calculation is actually done at the
time when the client disconnects from the process which holds the cached
connections.
Discussion: [pgpool-hackers: 4577] Doc: enhance the description on connection_life_time
https://www.pgpool.net/pipermail/pgpool-hackers/2025-April/004578.html
Backpatch-through: v4.2
Tatsuo Ishii [Tue, 1 Apr 2025 06:45:40 +0000 (15:45 +0900)]
Test: skip inaccessible Unix socket directories.
Commit 182b65bfc allows to use multiple Unix socket directories: /tmp
and /var/run/postgresql. However if the system does not have
accessible /var/run/postgresql, pgpool_setup fails unless
$PGSOCKET_DIR is explicitly set. Instead of failing, this commit
allows pgpool_setup to skip inaccessible directories.
Backpatch-through: v4.5
Taiki Koshino [Thu, 27 Mar 2025 05:43:53 +0000 (14:43 +0900)]
Allow regression tests to use multiple socket directories.
Author: Bo Peng
Tested by Taiki Koshino
Backpatch-through: V4.5
Tatsuo Ishii [Wed, 5 Mar 2025 10:55:11 +0000 (19:55 +0900)]
Doc: enhance the explanation on sr_check_user.
It must be a superuser or in the pg_monitor group.
Backpatch-through: v4.2.
Tatsuo Ishii [Tue, 4 Mar 2025 12:27:34 +0000 (21:27 +0900)]
Fix sr check and health check to reopen pool_passwd upon reload.
The streaming replication check and health check processes forgot to
reopen pool_passwd upon reload. If sr_check_passwd or
health_check_passwd is empty string, the password is obtained from
pool_passwd. Thus those processes read outdated content of pool_passwd
upon reload.
Backpatch-through: v4.2
Bo Peng [Tue, 4 Mar 2025 02:37:53 +0000 (11:37 +0900)]
Start 4.7 development.
Bo Peng [Thu, 27 Feb 2025 07:09:05 +0000 (16:09 +0900)]
Update sample script comment.
Bo Peng [Thu, 27 Feb 2025 06:04:25 +0000 (15:04 +0900)]
Doc: update release date
Bo Peng [Thu, 27 Feb 2025 04:27:44 +0000 (13:27 +0900)]
Doc: add release notes.
Bo Peng [Wed, 26 Feb 2025 12:59:18 +0000 (21:59 +0900)]
Remove pg_basebackup from the sample follow primary script.
If pg_rewind fails, the safest way for users is to recover manually.
Bo Peng [Wed, 19 Feb 2025 09:58:43 +0000 (18:58 +0900)]
Doc: update copyright
Bo Peng [Wed, 19 Feb 2025 08:57:05 +0000 (17:57 +0900)]
Doc: update installation document to 4.6
Bo Peng [Tue, 11 Feb 2025 04:39:32 +0000 (13:39 +0900)]
Enable AM_MAINTAINER_MODE on master branch.
Bo Peng [Tue, 11 Feb 2025 04:35:39 +0000 (13:35 +0900)]
Disable AM_MAINTAINER_MODE.
Tatsuo Ishii [Mon, 10 Feb 2025 09:28:51 +0000 (18:28 +0900)]
Fix too many log lines produced by streaming replication check.
The process started to call
get_pg_backend_status_from_leader_wd_node() which unconditionally emits
log message:
LOG: received the get data request from local pgpool-II on IPC interface
LOG: get data request from local pgpool-II node received on IPC interface is forwarded to leader watchdog node
every sr_check_period seconds, which is annoying. To fix this, an elog
line in process_IPC_data_request_from_leader() is downgraded from LOG
to DEBUG1.
Reported-by: Bo Peng.
Tatsuo Ishii [Mon, 10 Feb 2025 09:24:49 +0000 (18:24 +0900)]
Fix bug in heartbeat.
Following error message was recorded every wd_heartbeat_deadtime since
65dbbe7a0 was committed.
2025-02-10 10:50:37.990: heart_beat_receiver pid 1060625: ERROR: failed to get socket data from heartbeat receive socket list
2025-02-10 10:50:37.990: heart_beat_receiver pid 1060625: DETAIL: select() got timeout, exceed 30 sec(s)
The heartbeat receiver waits in select(2) for a heartbeat packet to
arrive until wd_heartbeat_deadtime expires. I believe the logic
is wrong: it should wait forever until the packet arrives. In v4.5 or
earlier, the heartbeat receiver waits in recvfrom() without
timeout. So give NULL to select's timeout parameter so that it waits
forever. Since 65dbbe7a0 is only in master branch, no backpatch is
made.
Reported by: Peng Bo
Bo Peng [Mon, 10 Feb 2025 09:12:56 +0000 (18:12 +0900)]
Update sample scripts.
This commit includes:
- update sample scripts to PostgreSQL 17
- remove archive settings to disable archive mode
Bo Peng [Mon, 10 Feb 2025 09:12:03 +0000 (18:12 +0900)]
Doc: Update configuration example to 4.6 and PostgreSQL 17.
This commit includes:
- update configuration example to 4.6 and PostgreSQL 17
- update OS to Rocky Linux 9
Tatsuo Ishii [Mon, 3 Feb 2025 05:02:52 +0000 (14:02 +0900)]
Doc: the first cut of v4.6 release notes.
Bo Peng [Fri, 31 Jan 2025 00:43:57 +0000 (09:43 +0900)]
Fix per_node_error_log() error message that is printed with two colons.
Patch is created by Umar Hayat.
Tatsuo Ishii [Fri, 17 Jan 2025 05:22:05 +0000 (14:22 +0900)]
Doc: enhance client authentication chapter.
Add an introduction to pool_passwd. Previously the overview page only
described pool_hba.conf. A general guide to pool_passwd
will help users understand this chapter.
Tatsuo Ishii [Tue, 14 Jan 2025 13:44:19 +0000 (22:44 +0900)]
Test: stabilize 032.dml_adaptive_loadbalance
Occasionally the test failed due to:
ERROR: relation "t2" does not exist
LINE 1: SELECT i, 'QUERY ID T1-1' FROM t2;
It seems the cause is that the newly created table t2 takes some time to
get replicated to the standby. So insert "sleep 1" after the table
creation.
Backpatch-through: v4.2
Tatsuo Ishii [Sun, 12 Jan 2025 05:22:37 +0000 (14:22 +0900)]
Fix pool_signal.
Previously pool_signal did not set the SA_RESTART flag. Thus any system
call interrupted by a signal was not restarted. Some of our code is
prepared to restart a system call that is interrupted by a
signal, but it is not certain that all places are. So add the
flag. Note that PostgreSQL always uses the flag.
Bo Peng [Sun, 5 Jan 2025 12:53:37 +0000 (21:53 +0900)]
Update pgpool.spec.
Bo Peng [Sun, 5 Jan 2025 12:49:39 +0000 (21:49 +0900)]
Fix compiler warning:
warning: ‘delete_all_cache_on_memcached’ declared ‘static’ but never defined[-Wunused-function]
Bo Peng [Thu, 2 Jan 2025 07:56:56 +0000 (16:56 +0900)]
Update src/tools/pcp/.gitignore
Bo Peng [Mon, 16 Dec 2024 08:03:55 +0000 (17:03 +0900)]
Feature: Allow logging_collector related parameters to be changed by reloading the Pgpool-II configurations.
The following logging_collector related parameters can now be changed by reloading:
- log_truncate_on_rotation
- log_directory
- log_filename
- log_rotation_age
- log_rotation_size
- log_file_mode
Tatsuo Ishii [Wed, 11 Dec 2024 09:31:02 +0000 (18:31 +0900)]
Fix yet another query cache bug in streaming replication mode.
If query cache is enabled and query is operated in extended query mode
and pgpool is running in streaming replication mode, an execute
message could return incorrect results.
This could happen when an execute message comes with a non-zero row
number parameter. In this case it fetches up to the specified number of
rows and returns a "PortalSuspended" message. Pgpool-II does not create
query cache for this. But if another execute message with a row
number parameter of 0 comes in, it fetches the rest of the rows (if any)
and creates query cache containing only the rows which that execute
message fetched.
Obviously this causes unwanted results later on: another execute
message returns a result from query cache which contains only the rows
left over after the earlier execute message with a limited number of
rows.
Another trouble is when multiple execute messages are sent
consecutively. In this case Pgpool-II returned exactly the same
results from query cache for each execute message. This is wrong since
the second or subsequent executes should return 0 rows.
To fix this, new boolean fields "atEnd" and "partial_fetch" are
introduced in the query context. They are initialized to false when a
query context is created (also initialized when bind message is
received). If an execute message with 0 row number is executed, atEnd
is set to true upon receiving CommandComplete message. If an execute
message with non 0 row number is executed, partial_fetch is set to
true and never uses the cache result, nor creates query cache.
When atEnd is true, pgpool will return CommandComplete message with
"SELECT 0" as a result of the execute message.
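The flag handling described above can be modeled in a few lines. The
following Python sketch is a simplified illustration, not the Pgpool-II
implementation: QueryContext, execute, the dict-based cache and the
cache key are hypothetical names. It also assumes that a cache hit sets
atEnd as well, and that a row-limited execute always reaches its limit.

```python
class QueryContext:
    """Hypothetical stand-in for the query context; recreated on each bind."""

    def __init__(self, backend_rows):
        self.pending = list(backend_rows)  # rows the backend has not sent yet
        self.at_end = False                # models "atEnd"
        self.partial_fetch = False         # models "partial_fetch"


def execute(ctx, max_rows, cache, key):
    """Return (rows, completion message) for one execute message."""
    if ctx.at_end:
        # Portal already ran to completion: nothing more to return.
        return [], "SELECT 0"
    if max_rows != 0:
        # Row-limited execute: never read from or write to the cache.
        # Simplification: assumes the row limit is always reached.
        ctx.partial_fetch = True
        rows, ctx.pending = ctx.pending[:max_rows], ctx.pending[max_rows:]
        return rows, "PortalSuspended"
    if not ctx.partial_fetch and key in cache:
        rows = list(cache[key])            # safe to serve from the cache
    else:
        rows, ctx.pending = ctx.pending, []
        if not ctx.partial_fetch:
            cache[key] = list(rows)        # complete result, safe to cache
    ctx.at_end = True                      # CommandComplete was seen
    return rows, f"SELECT {len(rows)}"


cache = {}
first = QueryContext([1, 2, 3])
step1 = execute(first, 2, cache, "q")   # partial fetch, portal suspended
step2 = execute(first, 0, cache, "q")   # remaining row, must not be cached
step3 = execute(first, 0, cache, "q")   # portal is at end
cache_after_partial = dict(cache)

second = QueryContext([1, 2, 3])
full = execute(second, 0, cache, "q")   # full fetch, result is cached
again = execute(second, 0, cache, "q")  # consecutive execute returns 0 rows

third = QueryContext([])
hit = execute(third, 0, cache, "q")     # new bind, served from the cache
```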
Also, tests for this case are added to the 006.memqcache regression
test.
Backpatch-through: v4.2
Discussion: [pgpool-hackers: 4547] Bug in query cache
https://www.pgpool.net/pipermail/pgpool-hackers/2024-December/004548.html
Bo Peng [Mon, 9 Dec 2024 08:37:38 +0000 (17:37 +0900)]
Doc: fix the documentation typos.
Bo Peng [Mon, 9 Dec 2024 07:56:13 +0000 (16:56 +0900)]
Fixed an issue where pg_md5 and pg_enc would not update the password file if a file other than the default value was specified in the pool_passwd parameter.
This issue is reported by Sadhuprasad Patro.
Tatsuo Ishii [Thu, 5 Dec 2024 09:04:44 +0000 (18:04 +0900)]
Test: fix 006.memqcache regression test.
4dd7371c2 added test cases. SQL syntax used in the test was not
compatible with PostgreSQL 15 or earlier.
Backpatch-through: v4.2
Tatsuo Ishii [Wed, 4 Dec 2024 12:38:23 +0000 (21:38 +0900)]
Fix query cache bug in streaming replication mode.
When query cache is enabled and an execute message is sent from
frontend, pgpool injects query cache data into backend message buffer
if query cache data is available. inject_cached_message() is
responsible for the task. But it had an oversight when the message
stream from frontend includes more than one set of bind or describe
messages before a sync message. It tried to determine the frontend
message end by finding a bind complete or a row description message
from backend. But in that case, it is possible that these messages do
not indicate the message stream end because more bind complete or row
description messages may follow. As a result the cached message was
inserted at an inappropriate position and pgpool mistakenly raised a
"kind mismatch" error.
This commit changes the algorithm to detect the message stream end:
compare the number of messages from backend with the pending message
queue length. When a message is read from backend, the message
counter is incremented if the message is one of parse
complete, bind complete, close complete, command complete, portal
suspended or row description. For other message types the counter is
not incremented. When the counter reaches the pending message queue
length, we are at the end of the message stream and inject the cached
messages.
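The counting rule can be sketched as below. This is an illustrative
Python model only: find_injection_point and the string message names are
hypothetical, and the real code works on protocol message kinds inside
inject_cached_message().

```python
# Backend message kinds that increment the counter, per the description above.
COUNTED_KINDS = {
    "ParseComplete",
    "BindComplete",
    "CloseComplete",
    "CommandComplete",
    "PortalSuspended",
    "RowDescription",
}


def find_injection_point(backend_messages, pending_queue_length):
    """Return how many backend messages to pass through before injecting
    the cached messages, or None if the stream end is not reached yet."""
    if pending_queue_length <= 0:
        return 0
    counted = 0
    for index, kind in enumerate(backend_messages):
        if kind in COUNTED_KINDS:
            counted += 1
        if counted == pending_queue_length:
            return index + 1
    return None


# Two bind/describe sets before the sync: four pending messages.
two_sets = ["BindComplete", "RowDescription", "BindComplete", "RowDescription"]
end_of_two_sets = find_injection_point(two_sets, 4)

# Messages of other kinds pass through without being counted.
with_notice = ["ParseComplete", "NoticeResponse", "BindComplete"]
end_with_notice = find_injection_point(with_notice, 2)

# Not enough counted messages yet: keep reading from backend.
incomplete = find_injection_point(["BindComplete"], 2)
```

Stopping at the first bind complete or row description, as the old logic
did, would have chosen position 1 in the first example instead of 4.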
Test cases for 006.memqcache are added.
Backpatch-through: v4.2.
Tatsuo Ishii [Mon, 2 Dec 2024 05:49:08 +0000 (14:49 +0900)]
Test: add check using netstat.
Sometimes we see regression errors like:
2024-12-01 13:55:55.508: watchdog pid 27340: FATAL: failed to create watchdog receive socket
2024-12-01 13:55:55.508: watchdog pid 27340: DETAIL: bind on "TCP:50002" failed with reason: "Address already in use"
Before starting each regression test, we use the "clean_all" script to
kill all remaining processes. I suspect that this is not enough to
release bound ports. So I add a netstat command to check whether some
ports remain bound.
For now this commit is master branch only.
Tatsuo Ishii [Sun, 1 Dec 2024 07:53:28 +0000 (16:53 +0900)]
Test: fix 039.log_backend_messages.
Commit 6d4106f9c forgot to add pgproto data which is necessary in the
test.
Tatsuo Ishii [Mon, 25 Nov 2024 09:09:59 +0000 (18:09 +0900)]
Feature: add log_backend_messages.
When enabled, log protocol messages from each backend. Possible
options are "none", "terse" and "verbose". "none" disables the feature
and is the default. "verbose" prints the log each time pgpool receives
a message from backend. "terse" is similar to verbose except it does
not print logs for repeated messages, to save log lines. When a
different kind of message is received, pgpool prints a log message
including the number of repeated messages. One downside of "terse" is
that the repeated messages will not be reported if the pgpool child
process is killed before a different kind of message arrives.
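The "terse" behavior can be sketched as follows. This Python model is
only an illustration: terse_log is a hypothetical function and the
wording of the summary line is invented here, not the actual Pgpool-II
log format.

```python
def terse_log(kinds):
    """Collapse runs of identical message kinds.

    The first message of a run is logged; the repeat count is only logged
    once a different kind arrives, so a trailing run is never summarized.
    """
    lines = []
    last = None
    repeats = 0
    for kind in kinds:
        if kind == last:
            repeats += 1          # same kind again: count it, log nothing
            continue
        if repeats:
            # Hypothetical summary wording, for illustration only.
            lines.append(f"last {last} message repeated {repeats} times")
        lines.append(kind)
        last, repeats = kind, 0
    return lines


complete = terse_log(
    ["RowDescription", "DataRow", "DataRow", "DataRow", "CommandComplete"]
)

# Stream cut off mid-run (e.g. the child process is killed): the two
# repeated DataRow messages are never reported.
cut_off = terse_log(["RowDescription", "DataRow", "DataRow", "DataRow"])
```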
For testing, 039.log_backend_messages is added.
Discussion: [pgpool-hackers: 4535] New feature: log_backend_messages
https://www.pgpool.net/pipermail/pgpool-hackers/2024-November/004536.html
Bo Peng [Wed, 27 Nov 2024 05:01:54 +0000 (14:01 +0900)]
Doc: add release notes.
Tatsuo Ishii [Mon, 18 Nov 2024 06:40:53 +0000 (15:40 +0900)]
Abort SSL negotiation if backend sends an error message.
In the client side implementation of SSL negotiation
(pool_ssl_negotiate_clientserver()), it was possible for a
man-in-the-middle attacker to send a long error message to confuse
Pgpool-II or client while in the SSL negotiation phase. This commit
rejects the negotiation immediately (issue a FATAL error) and exits
the session to prevent such an attack.
This resembles PostgreSQL's CVE-2024-10977.
Backpatch-through: v4.1
Tatsuo Ishii [Mon, 25 Nov 2024 09:01:34 +0000 (18:01 +0900)]
Test: adapt 024.cert_auth test to OpenSSL 3.2.
In the test we check the error message when the target certificate is
revoked. Unfortunately the error message from OpenSSL seems to have
changed between v3.0 and v3.2.
v3.0 or before: "sslv3 alert certificate revoked"
v3.2: "ssl/tls alert certificate revoked"
So the fix is to check only the "alert certificate revoked" part.
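The idea of matching only the common part of the message can be
illustrated like this. The regression test itself is a shell script;
this Python fragment with the hypothetical helper is_revoked_alert only
shows why the shorter pattern works for both OpenSSL versions.

```python
def is_revoked_alert(error_message):
    """Match only the part of the message shared by both OpenSSL versions."""
    return "alert certificate revoked" in error_message


old_style = is_revoked_alert("sslv3 alert certificate revoked")
new_style = is_revoked_alert("ssl/tls alert certificate revoked")
unrelated = is_revoked_alert("certificate verify failed")
```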
Bo Peng [Mon, 25 Nov 2024 07:53:54 +0000 (16:53 +0900)]
Fix the watchdog process not reloading configurations.
The reload_config() function in Pgpool-II should send a SIGHUP signal to the watchdog process.
Tatsuo Ishii [Sun, 24 Nov 2024 12:02:02 +0000 (21:02 +0900)]
Test: another attempt to fix 024.cert_auth failure on RockyLinux9.
Renew cert.sh using examples in PostgreSQL docs.