In Memory Query Cache

In Memory Query Cache In memory query cache can be used with all modes of Pgpool-II. The in memory query cache keeps the SELECT results and the results are reused. When the underlying table gets updated, corresponding cache entries are deleted (in case is on. This parameter is on by default), thus restarting Pgpool-II is not necessary. In memory cache saves the pair of user name, database name, SELECT statement and its result (along with the Bind parameters, if the SELECT is an extended query). If the same user name, database name and SELECTs comes in, Pgpool-II returns the value from cache. Since no SQL parsing nor access to PostgreSQL are involved, the serving of results from the in memory cache is extremely fast. First of all if the query starts with SQL comment: /*FORCE QUERY CACHE*/ is checked (case insensitive). If so, the result of the query is cached unconditionally as long as it is not SELECT or WITH + SELECT. However you must be very careful to use this feature. For example, /*FORCE QUERY CACHE*/SELECT now(); will return the same timestamp until pgpool restarts, once the query is cached. The query cache will not be discarded even with a query: /*NO QUERY CACHE*/SELECT now(); Because it just prevents to create a cache entry for the query, and does not affect the query using the FORCE QUERY CACHE comment. You can delete such a query cache by using command, or command. Note that for following query: /*FORCE QUERY CACHE*/SELECT * FROM t1; usual cache validation due to an update to the table (in this case t1) works. If the query does not start with FORCE QUERY CACHE comment, following checks are performed. If one of followings is satisfied, SELECT will not be cached. SELECT including non immutable functions SELECT including temp tables, unlogged tables SELECT including TIMESTAMP WITH TIMEZONE or TIME WITH TIMEZONE SELECT including CAST to TIMESTAMP WITH TIMEZONE or TIME WITH TIMEZONE SELECT including SQLValueFunction (CURRENT_TIME, CURRENT_USER etc.) SELECT result is too large (memqcache_maxcache) SELECT FOR SHARE/UPDATE SELECT starting with "/*NO QUERY CACHE*/" comment (case insensitive) SELECT including system catalogs SELECT using TABLESAMPLE SELECT uses row security enabled tables However, VIEWs and SELECTs accessing unlogged tables can be cached by specifying in the . Since consistency between the query cache and database content can be lost by following commands, the query cache is all deleted if they are executed: ALTER DATABASE ALTER ROLE or USER (except WITH CONNECTION LIMIT and WITH [ENCRYPTED] PASSWORD) ALTER TABLE REVOKE Moreover, if SET ROLE or SET SESSION AUTHORIZATION are executed, the query cache will not be used and new cache will not be created in the session. Pgpool-II assumes that two query cache are different if their session users are different. On the other hand, PostgreSQL manages access control depending on current user. Thus these commands may break the consistency between these two access controls. However, cache invalidation by update commands are performed in the same way in that these commands are not used. If multiple functions having the same name are registered and one of them returns TIMESTAMP WITH TIMEZONE or TIME WITH TIMEZONE, Pgpool-II thinks that the query cannot be cached even if it uses a version of function which does not return TIMESTAMP WITH TIMEZONE or TIME WITH TIMEZONE. In this case you can create a wrapper function returning TIMESTAMP WITHOUT TIMEZONE or TIME WITHOUT TIMEZONE, or use /*FORCE QUERY CACHE*/ comment so that you can use the query cache. On the other hand, it might be slower than the normal path in some cases, because it adds some overhead to store cache. Moreover when a table is updated, Pgpool-II automatically deletes all the caches related to the table. Therefore, the performance will be degraded by a system with a lot of updates. If the query cache hit ratio (it can be checked by using ) is lower than 70%, you might want to disable in memory cache. A SELECT result is basically registered when SELECT normally finishes. If an explicit transaction is executing, SELECT result is not registered until the transaction is committed. Also in extended query mode, the timing of cache registration varies depending on clustering mode. In streaming replication mode and logical replication mode, cache is registered when Sync message is sent from frontend, and the response (Ready for query message) is returned from backend. Thus even if commands are sent from frontend, the second Execute (SELECT 1) will not use query cache. Parse (SELECT 1) Bind (SELECT 1) Execute (SELECT 1) Parse (SELECT 1) Bind (SELECT 1) Execute (SELECT 1) Sync On the other hand, in other clustering mode, since the result of first Execute (SELECT 1) is registered, the second Execute (SELECT 1) will use the query cache. Enabling in memory query cache memory_cache_enabled (boolean) memory_cache_enabled configuration parameter Setting to on enables the memory cache. Default is off. This parameter can only be set at server start. The query cache will also be used by shared relation cache if is set to on. Moreover the query cache is used even if parameter is set to off. See for more details to relation cache. Choosing cache storage memqcache_method (enum) memqcache_method configuration parameter Specifies the storage type to be used for the cache. Below table contains the list of all valid values for the parameter. Memqcache method options Value Description shmem Use shared memory memcached Use memcached

In general shmem is faster than memcached because the former does not involve any network access. However, the upper limit size of memqcache_total_size is limited by the total shared memory size defined by the system. The size is large enough in modern Linux systems, but in other systems there maybe smaller upper limit. In this case you need to change the system settings regarding the maximum shared memory size. There's a good explanation about shared memory in PostgreSQL documentation. If you are not sure which memqcache_method to be used, start with shmem. Default is shmem. This parameter can only be set at server start. Common configurations These below parameter are valid for both shmem and memcached type query cache. memqcache_expire (integer) memqcache_expire configuration parameter Specifies the life time of query cache in seconds. Default is 0. which means no cache expiration and cache remains valid until the table is updated. This parameter can only be set at server start. memqcache_expire and are orthogonal to each other. memqcache_auto_cache_invalidation (boolean) memqcache_auto_cache_invalidation configuration parameter Setting to on, automatically deletes the cache related to the updated tables. When off, cache is not deleted. Default is on. This parameters and are orthogonal to each other. This parameter can be changed by reloading the Pgpool-II configurations. memqcache_maxcache (integer) memqcache_maxcache configuration parameter Specifies the maximum size in bytes of the SELECT query result to be cached. The result with data size larger than this value will not be cached by Pgpool-II. When the caching of data is rejected because of the size constraint the following message is shown. LOG: pid 13756: pool_add_temp_query_cache: data size exceeds memqcache_maxcache. current:4095 requested:111 memq_maxcache:4096 For the shared memory query('shmem') cache the memqcache_maxcache must be set lower than and for 'memcached' it must be lower than the size of slab (default is 1 MB). This parameter can only be set at server start. cache_safe_memqcache_table_list (string) cache_safe_memqcache_table_list configuration parameter Specifies a comma separated list of table names whose SELECT results should be cached by Pgpool-II. This parameter only applies to VIEWs and SELECTs accessing unlogged tables. Regular tables can be cached unless specified by . You can use regular expression into the list to match table name (to which ^ and $ are automatically added). If the queries can refer the table with and without the schema qualification then you must add both entries(with and without schema name) in the list. #For example: #If the queries sometime use "table1" and other times "public.table1" #to refer the table1 then the cache_safe_memqcache_table_list #would be configured as follows. cache_safe_memqcache_table_list = "table1,public.table1" This parameter can be changed by reloading the Pgpool-II configurations. cache_unsafe_memqcache_table_list (string) cache_unsafe_memqcache_table_list configuration parameter Specifies a comma separated list of table names whose SELECT results should NOT be cached by the Pgpool-II. You can use regular expression into the list to match table name (to which ^ and $ are automatically added), If the queries can refer the table with and without the schema qualification then you must add both entries(with and without schema name) in the list. #For example: #If the queries sometime use "table1" and other times "public.table1" #to refer the table1 then the cache_unsafe_memqcache_table_list #would be configured as follows. cache_unsafe_memqcache_table_list = "table1,public.table1" This parameter can be changed by reloading the Pgpool-II configurations. cache_unsafe_memqcache_table_list precedence over memqcache_oiddir (string) memqcache_oiddir configuration parameter Specifies the full path to the directory for storing the oids of tables used by SELECT queries. memqcache_oiddir directory contains the sub directories for the databases. The directory name is the OID of the database. In addition, each database directory contains the files for each table used by SELECT statement. Again the name of the file is the OID of the table. These files contains the pointers to query cache which are used as key for deleting the caches. Normal restart of Pgpool-II does not clear the contents of memqcache_oiddir. This parameter can only be set at server start. Configurations to use shared memory These are the parameters used with shared memory as the cache storage. memqcache_total_size (integer) memqcache_total_size configuration parameter Specifies the shared memory cache size. The default is 64MB. Each cache is stored in fixed-size blocks specified by . The number of blocks can be calculated by / . Any decimal places are truncated so that the number of blocks is an integer. If the number becomes 0, an error is raised. That is, memqcache_total_size must be greater than . Query results and their management data are not stored across multiple blocks, so if the query result data length + 64 bytes is greater than , it cannot be stored in a block and will not be cached. When one block is filled with cache, the next block is used. When all blocks are full, the oldest block is flushed and reused (Note that the oldest block does not necessarily holds the oldest cache data because new cache data is registered whenever the first fitting space is found). While smaller memqcache_total_size does not raise an error, performance decreases because the cache hit ratio decreases. The cache hit ratio can be checked by consulting cache_hit_ratio of . This parameter can only be set at server start. memqcache_max_num_cache (integer) memqcache_max_num_cache configuration parameter Specifies the number of cache entries. The default is 1,000,000. If you attempt to register more than memqcache_max_num_cache caches, a block containing cached data is flushed and reused. While smaller memqcache_max_num_cache does not raise an error, performance decreases because the cache hit ratio decreases. The cache hit ratio can be checked by consulting cache_hit_ratio of . The cache is managed by a hash table in shared memory for fast access. The hash table space size can be calculated by: memqcache_max_num_cache * 64 bytes. Number of hash entries can be found in used_hash_entries of . Number of the hash table entries usually matches memqcache_max_num_cache, but may not match if memqcache_max_num_cache is not a power of 2. In this case it is rounded to a power of 2 greater than memqcache_max_num_cache. Number of hash entries in use can be found at used_hash_entries. This parameter can only be set at server start. memqcache_cache_block_size (integer) memqcache_cache_block_size configuration parameter Specifies the cache block size. The default is 1MB. Query results and their management data are not stored across multiple blocks, so if the query result data length + 64 bytes is greater than , it cannot be stored in a block and will not be cached. memqcache_cache_block_size must be set to at least 512. This parameter can only be set at server start. Configurations to use memcached These are the parameters used with memcached as the cache storage. memqcache_memcached_host (string) memqcache_memcached_host configuration parameter Specifies the host name or the IP address on which memcached works. You can use 'localhost' if memcached and Pgpool-II resides on same server. This parameter can only be set at server start. memqcache_memcached_port (integer) memqcache_memcached_port configuration parameter Specifies the port number of memcached. Default is 11211. This parameter can only be set at server start.