author | Greg Smith | 2010-03-19 02:03:21 +0000 |
---|---|---|
committer | Greg Smith | 2010-03-19 02:03:21 +0000 |
commit | 49631992d2c9ee88b21f8cc0cb4ea011a6a0f8d6 (patch) | |
tree | 2029153d94ea45ed7db2b8ca9327f185f7cb6452 | |
parent | 7f0eeb86d7a33eb77a77b257282e8eced0536cd9 (diff) |
Add documentation on multi-worker feature
-rw-r--r-- | README | 81 |
1 files changed, 74 insertions, 7 deletions
@@ -1,4 +1,4 @@
-pgbench-tools Setup
+pgbench-tools setup
 ===================

 * Create databases for your test and for the results::
@@ -10,7 +10,9 @@ pgbench-tools Setup
   cache churn in that case. Some amount of cache disruption
   is unavoidable unless the result database is remote, because
   of the OS cache. The recommended and default configuration
-  is to have a pgbench database and a results database.
+  is to have a pgbench database and a results database. This also
+  keeps the size of the result dataset from being included in the
+  total database size figure recorded by the test.

 * Initialize the results database by executing::

@@ -18,7 +20,9 @@ pgbench-tools Setup

   Make sure to reference the correct database. This will create
   a default test set entry with a blank description.
-  You may want to rename this.
+  You may want to rename that using something like this::
+
+    psql -c "UPDATE testset SET info='better name' WHERE set=1" -d results

 Running tests
 =============
@@ -38,7 +42,11 @@ Results

     psql -d results -f report.sql

+  This is unlikely to disrupt the test results very much unless you've
+  run an enormous number of tests already.
+
 * Other useful reports you can run include:
+  * fastest.sql
   * summary.sql
   * bufreport.sql
   * bufsummary.sql
@@ -47,15 +55,15 @@ Results
   a HTML subdirectory for each test giving its results,
   in addition to the summary information in the results database.

-* The results directory will also include its own index file that
+* The results directory will also include its own index HTML file that
   shows summary information and plots for all the tests.

 * If you manually adjust the test result database, you can
-  manually regenerate the summary graphs by running::
+  then manually regenerate the summary graphs by running::

     ./webreport

-Version Compatibility
+Version compatibility
 =====================

 The default configuration now aims to support the pgbench that ships with
@@ -68,7 +76,56 @@ Support for PostgreSQL versions before 8.3 is not possible, because a
 change was made to the pgbench client in that version that is needed
 by the program to work properly. It is possible to use the PostgreSQL 8.3
 pgbench client against a newer database server, or to copy the pgbench.c
-program from 8.3 into a 8.2 source code build and use it instead.
+program from 8.3 into a 8.2 source code build and use it instead (with
+some fixes--it won't compile unless you comment out code that refers to
+optional newer features added in 8.3).
+
+Multiple worker support
+-----------------------
+
+Starting in PostgreSQL 9.0, pgbench allows splitting up the work pgbench
+does into multiple worker threads or processes (which depends on whether
+the database client libraries have been compiled with thread-safe
+behavior or not).
+
+This feature is extremely valuable, as it's likely to give at least
+a 15% speedup on common hardware. And it can more than double throughput
+on operating systems that are particularly hostile to running the
+pgbench client. One known source of this problem is Linux kernels
+using the Completely Fair Scheduler introduced in 2.6.23,
+which does not schedule the pgbench program very well when it's connecting
+to the database using the default method, Unix-domain sockets.
+
+(Note that pgbench-tools doesn't suffer greatly from this problem itself, as
+it connects over TCP/IP using the "-H" parameter. Manual pgbench runs that
+do not specify a host, and therefore connect via a local socket can be
+extremely slow on recent Linux kernels.)
+
+Taking advantage of this feature is done in pgbench-tools by increasing the
+MAX_WORKERS setting in the configuration file. It defaults to blank, which
+avoids using this feature altogether--therefore remaining
+compatible with PostgreSQL/pgbench versions before this capability was added.
+
+When using multiple workers, each must be allocated an equal number of
+clients. That means that client counts that are not a multiple of the
+worker count will result in pgbench not running at all.
+
+Accordingly, if you set MAX_WORKERS to a number to enable this capability,
+pgbench-tools picks the maximum integer of that value or lower that the
+client count is evenly divisible by. For example, if MAX_WORKERS is 4,
+running with 8 clients will use 4 workers, while 9 clients will shift
+downward to 3 workers as the best option.
+
+A reasonable setting for MAX_WORKERS is the number of physical cores
+on the server, typically giving best performance. And when using this feature,
+it's better to tweak test client counts toward ones that are divisible by as
+many factors as possible. For example, if you wanted approximately 15
+clients, it would be best to use 16, allowing worker counts of 2, 4, or 8,
+all likely to match common core counts. Second choice would be 14,
+compatible with 2 workers. Third is 15, which would allow 3 workers--not
+improving upon a single worker on common dual-core systems. The worst
+choices would be 13 or 17 clients, which are prime and therefore cannot
+be usefully allocated more than one worker on common hardware.

 Known issues
 ============
@@ -78,3 +135,13 @@ Known issues

 * On Solaris, where the benchwarmer script calls tail it may need
   to use /usr/xpg4/bin/tail instead
+
+Planned features
+================
+
+* Currently none of the graphs break their display down based on the
+  test set. Each set could be mapped into a separate data set, and
+  therefore the graph used to compare sets.
+
+* The client+scale data table used to generate the 3D report would be
+  useful to generate in tabular text format as well.
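
The worker selection rule described in the "Multiple worker support" text of this patch (pick the largest value at or below MAX_WORKERS that evenly divides the client count) can be illustrated with a short sketch. This is not the code pgbench-tools itself uses; the function name `pick_workers` and its argument names are hypothetical, and the sketch assumes MAX_WORKERS has been set to a positive integer rather than left blank.

```python
def pick_workers(clients: int, max_workers: int) -> int:
    """Return the largest worker count <= max_workers that divides clients.

    Hypothetical helper illustrating the rule from the README text;
    it is not taken from the pgbench-tools scripts.
    """
    if clients < 1 or max_workers < 1:
        raise ValueError("clients and max_workers must be positive integers")
    # Search downward so the first even divisor found is the largest one.
    for workers in range(min(max_workers, clients), 0, -1):
        if clients % workers == 0:
            return workers
    return 1  # not reached: 1 divides every positive integer


if __name__ == "__main__":
    # Client counts discussed in the README, with MAX_WORKERS assumed to be 4.
    for clients in (8, 9, 13, 14, 15, 16, 17):
        print(f"{clients:2d} clients -> {pick_workers(clients, 4)} workers")
```

With MAX_WORKERS at 4, this reproduces the README's example: 8 clients map to 4 workers and 9 clients fall back to 3. Prime client counts such as 13 or 17 always fall back to a single worker, which is why the text recommends client counts with many factors.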