In Memory Query Cache
In memory query cache can be used with all modes of
Pgpool-II.
The in memory query cache keeps the SELECT results and the results are reused.
When the underlying table gets updated, corresponding cache entries are deleted (in case is on. This parameter is on by default), thus restarting Pgpool-II is not necessary.
In memory cache saves the pair of user name, database name, SELECT statement
and its result
(along with the Bind parameters, if the SELECT is an
extended query). If the same user name, database name and SELECTs comes in,
Pgpool-II returns the value from
cache. Since no SQL parsing nor access
to PostgreSQL are involved, the serving
of results from the in memory cache is extremely fast.
First of all if the query starts with SQL comment:
/*FORCE QUERY CACHE*/
is checked (case insensitive). If so, the result of the query is
cached unconditionally as long as it is not SELECT or WITH + SELECT.
However you must be very careful to use this feature. For example,
/*FORCE QUERY CACHE*/SELECT now();
will return the same timestamp until pgpool restarts, once the
query is cached. The query cache will not be discarded even with a
query:
/*NO QUERY CACHE*/SELECT now();
Because it just prevents to create a cache entry for the query, and
does not affect the query using the FORCE QUERY CACHE comment. You
can delete such a query cache by
using command,
or command.
Note that for following query:
/*FORCE QUERY CACHE*/SELECT * FROM t1;
usual cache validation due to an update to the table (in this case
t1) works.
If the query does not start with FORCE QUERY CACHE comment,
following checks are performed. If one of followings is satisfied,
SELECT will not be cached.
SELECT including non immutable functions
SELECT including temp tables, unlogged tables
SELECT including TIMESTAMP WITH TIMEZONE or TIME WITH TIMEZONE
SELECT including CAST to TIMESTAMP WITH TIMEZONE or TIME WITH TIMEZONE
SELECT including SQLValueFunction (CURRENT_TIME, CURRENT_USER etc.)
SELECT result is too large (memqcache_maxcache)
SELECT FOR SHARE/UPDATE
SELECT starting with "/*NO QUERY CACHE*/" comment (case insensitive)
SELECT including system catalogs
SELECT using TABLESAMPLE
SELECT uses row security enabled tables
However, VIEWs and SELECTs accessing unlogged tables can be
cached by specifying in
the .
Since consistency between the query cache and database content can
be lost by following commands, the query cache is all deleted if
they are executed:
ALTER DATABASE
ALTER ROLE or USER (except WITH CONNECTION LIMIT and WITH [ENCRYPTED] PASSWORD)
ALTER TABLE
REVOKE
Moreover, if SET ROLE or SET SESSION AUTHORIZATION are executed,
the query cache will not be used and new cache will not be created
in the session.
Pgpool-II assumes that two query cache
are different if their session users are different. On the other
hand, PostgreSQL manages access control
depending on current user. Thus these commands may break the
consistency between these two access controls. However, cache
invalidation by update commands are performed in the same way in
that these commands are not used.
If multiple functions having the same name are registered and one
of them returns TIMESTAMP WITH TIMEZONE or TIME WITH
TIMEZONE, Pgpool-II thinks that the
query cannot be cached even if it uses a version of function which
does not return TIMESTAMP WITH TIMEZONE or TIME WITH TIMEZONE. In
this case you can create a wrapper function returning TIMESTAMP
WITHOUT TIMEZONE or TIME WITHOUT TIMEZONE, or use /*FORCE QUERY
CACHE*/ comment so that you can use the query cache.
On the other hand, it might be slower than the normal path
in some cases, because it adds some overhead to store cache.
Moreover when a table is updated, Pgpool-II
automatically deletes all the caches related to the
table. Therefore, the performance will be degraded by a system with
a lot of updates. If the query cache hit ratio (it can be checked
by using ) is lower than 70%,
you might want to disable in memory cache.
A SELECT result is basically registered when
SELECT normally finishes. If an explicit transaction is executing,
SELECT result is not registered until the
transaction is committed. Also in extended query mode, the timing
of cache registration varies depending on clustering mode. In
streaming
replication mode and logical
replication mode, cache is registered when
Sync message is sent from frontend, and the
response (Ready for query message) is returned
from backend. Thus even if commands are sent from frontend, the
second Execute (SELECT 1) will not use query cache.
Parse (SELECT 1)
Bind (SELECT 1)
Execute (SELECT 1)
Parse (SELECT 1)
Bind (SELECT 1)
Execute (SELECT 1)
Sync
On the other hand, in other clustering mode, since the result of
first Execute (SELECT 1) is registered, the second Execute (SELECT
1) will use the query cache.
Enabling in memory query cachememory_cache_enabled (boolean)
memory_cache_enabled configuration parameter
Setting to on enables the memory cache.
Default is off.
This parameter can only be set at server start.
The query cache will also be used by shared relation cache if
is set to on. Moreover the
query cache is used even if
parameter is set to off. See for more details to relation cache.
Choosing cache storagememqcache_method (enum)
memqcache_method configuration parameter
Specifies the storage type to be used for the cache.
Below table contains the list of all valid values for the parameter.
In general shmem is faster
than memcached because the former does not
involve any network access. However, the upper limit size of
memqcache_total_size is limited by the total
shared memory size defined by the system. The size is large
enough in modern Linux systems, but in other systems there maybe
smaller upper limit. In this case you need to change the system
settings regarding the maximum shared memory size. There's a
good explanation about shared memory
in PostgreSQL
documentation.
If you are not sure which memqcache_method to be used, start with shmem.
Default is shmem.
This parameter can only be set at server start.
Common configurations
These below parameter are valid for both shmem
and memcached type query cache.
memqcache_expire (integer)
memqcache_expire configuration parameter
Specifies the life time of query cache in seconds.
Default is 0. which means no cache expiration and cache remains
valid until the table is updated.
This parameter can only be set at server start.
memqcache_expire and
are orthogonal to each other.
memqcache_auto_cache_invalidation (boolean)
memqcache_auto_cache_invalidation configuration parameter
Setting to on, automatically deletes the cache related to the updated tables.
When off, cache is not deleted.
Default is on.
This parameters
and are orthogonal to each other.
This parameter can be changed by reloading the Pgpool-II configurations.
memqcache_maxcache (integer)
memqcache_maxcache configuration parameter
Specifies the maximum size in bytes of the SELECT query result to be cached.
The result with data size larger than this value will not be cached by
Pgpool-II.
When the caching of data is rejected because of the size constraint the following
message is shown.
LOG: pid 13756: pool_add_temp_query_cache: data size exceeds memqcache_maxcache. current:4095 requested:111 memq_maxcache:4096
For the shared memory query('shmem') cache the
memqcache_maxcache must be set lower than
and for 'memcached'
it must be lower than the size of slab (default is 1 MB).
This parameter can only be set at server start.
cache_safe_memqcache_table_list (string)
cache_safe_memqcache_table_list configuration parameter
Specifies a comma separated list of table names whose
SELECT results should be cached by
Pgpool-II. This parameter only
applies to VIEWs and SELECTs accessing unlogged tables.
Regular tables can be cached unless specified
by .
You can use regular expression into the list to match table name
(to which ^ and $ are automatically added).
If the queries can refer the table with and without the schema
qualification then you must add both entries(with and without
schema name) in the list.
#For example:
#If the queries sometime use "table1" and other times "public.table1"
#to refer the table1 then the cache_safe_memqcache_table_list
#would be configured as follows.
cache_safe_memqcache_table_list = "table1,public.table1"
This parameter can be changed by reloading the Pgpool-II configurations.
cache_unsafe_memqcache_table_list (string)
cache_unsafe_memqcache_table_list configuration parameter
Specifies a comma separated list of table names whose SELECT
results should NOT be cached by the
Pgpool-II.
You can use regular expression into the list to match table name
(to which ^ and $ are automatically added),
If the queries can refer the table with and without the schema
qualification then you must add both entries(with and without
schema name) in the list.
#For example:
#If the queries sometime use "table1" and other times "public.table1"
#to refer the table1 then the cache_unsafe_memqcache_table_list
#would be configured as follows.
cache_unsafe_memqcache_table_list = "table1,public.table1"
This parameter can be changed by reloading the Pgpool-II configurations.
cache_unsafe_memqcache_table_list
precedence over memqcache_oiddir (string)
memqcache_oiddir configuration parameter
Specifies the full path to the directory for storing the
oids of tables used by SELECT queries.
memqcache_oiddir directory contains the sub directories
for the databases. The directory name is the OID of the database. In addition, each
database directory contains the files for each table used by SELECT statement.
Again the name of the file is the OID of the table.
These files contains the pointers to query cache which are used as key for
deleting the caches.
Normal restart of Pgpool-II does not clear the
contents of memqcache_oiddir.
This parameter can only be set at server start.
Configurations to use shared memory
These are the parameters used with shared memory as the cache storage.
memqcache_total_size (integer)
memqcache_total_size configuration parameter
Specifies the shared memory cache size.
The default is 64MB.
Each cache is stored in fixed-size blocks specified
by . The number
of blocks can be calculated
by
/ . Any decimal
places are truncated so that the number of blocks is an
integer. If the number becomes 0, an error is raised. That
is, memqcache_total_size must be greater
than . Query
results and their management data are not stored across multiple
blocks, so if the query result data length + 64 bytes is greater
than , it cannot
be stored in a block and will not be cached.
When one block is filled with cache, the next block is used.
When all blocks are full, the oldest block is flushed and reused
(Note that the oldest block does not necessarily holds the
oldest cache data because new cache data is registered whenever
the first fitting space is found). While
smaller memqcache_total_size does not raise
an error, performance decreases because the cache hit ratio
decreases. The cache hit ratio can be checked by
consulting cache_hit_ratio
of .
This parameter can only be set at server start.
memqcache_max_num_cache (integer)
memqcache_max_num_cache configuration parameter
Specifies the number of cache entries. The default is 1,000,000.
If you attempt to register more
than memqcache_max_num_cache caches, a block
containing cached data is flushed and reused. While
smaller memqcache_max_num_cache does not
raise an error, performance decreases because the cache hit
ratio decreases. The cache hit ratio can be checked by
consulting cache_hit_ratio
of .
The cache is managed by a hash table in shared memory for fast
access. The hash table space size can be calculated by:
memqcache_max_num_cache * 64 bytes. Number
of hash entries can be found
in used_hash_entries
of . Number of the hash
table entries usually
matches memqcache_max_num_cache, but may not
match if memqcache_max_num_cache is not a
power of 2. In this case it is rounded to a power of 2 greater
than memqcache_max_num_cache. Number of
hash entries in use can be found
at used_hash_entries.
This parameter can only be set at server start.
memqcache_cache_block_size (integer)
memqcache_cache_block_size configuration parameter
Specifies the cache block size.
The default is 1MB.
Query results and their management data are not stored across
multiple blocks, so if the query result data length + 64 bytes
is greater than ,
it cannot be stored in a block and will not be cached.
memqcache_cache_block_size must be set to at least 512.
This parameter can only be set at server start.
Configurations to use memcached
These are the parameters used with memcached as the cache storage.
memqcache_memcached_host (string)
memqcache_memcached_host configuration parameter
Specifies the host name or the IP address on which memcached
works. You can use 'localhost' if memcached
and Pgpool-II resides on same server.
This parameter can only be set at server start.
memqcache_memcached_port (integer)
memqcache_memcached_port configuration parameter
Specifies the port number of memcached.
Default is 11211.
This parameter can only be set at server start.