= Setting up Londiste3 replication to partitions =
== Introduction ==
This sample shows how to use the Londiste `part` handler module to split
one big table between two databases.
The target databases will have a `partconf` schema, which is usually
used to drive PL/Proxy. Here it is used simply to provide
configuration to the `part` handler.
== Prerequisites ==
Obviously, Skytools must be installed; in addition we need pghashlib and
pgbench.
== Setting up the Root Database ==
=== Create databases ===
Create the root database, which will contain all the data, and two shard databases.
Run the following commands:
----
psql -c "CREATE DATABASE rootdb;"
psql -c "CREATE DATABASE sharddb_0;"
psql -c "CREATE DATABASE sharddb_1;"
----
Deploy the hash function everywhere. This is needed because the internal
`hashtext` function changed between PostgreSQL 8.3 and 8.4 and may change
again in the future without consideration for its users.
----
psql rootdb < /usr/share/postgresql/8.4/contrib/hashlib.sql
psql sharddb_0 < /usr/share/postgresql/8.4/contrib/hashlib.sql
psql sharddb_1 < /usr/share/postgresql/8.4/contrib/hashlib.sql
----
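To make sure all three databases hash identically, you can compare the
output of `hash_string` on each of them (a suggested sanity check, not
part of the original procedure):
----
for db in rootdb sharddb_0 sharddb_1; do
    psql $db -t -c "SELECT hash_string('42', 'lookup2');"
done
----
All three commands should print the same value.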
=== Set up pgbench schema ===
In this HowTo we use pgbench to set up the schema, populate it with
sample data, and later run SQL loads to be replicated.
This command will create the pgbench tables and fill them with data:
----
/usr/lib/postgresql/8.4/bin/pgbench -i -s 2 -F 80 rootdb
----
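As a quick sanity check you can look at the row counts; with scale factor
2, pgbench should create 200000 rows in `pgbench_accounts` and 2 rows in
`pgbench_branches`:
----
psql rootdb -c "SELECT count(*) FROM pgbench_accounts;"
psql rootdb -c "SELECT count(*) FROM pgbench_branches;"
----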
Write a `partconf.sql` file that will be deployed to all databases:
----
CREATE SCHEMA partconf;
CREATE TABLE partconf.conf (
    part_nr integer,
    max_part integer,
    db_code bigint,
    is_primary boolean,
    max_slot integer,
    cluster_name text
);
CREATE FUNCTION partconf.get_hash_raw
( i_input integer
) RETURNS integer
AS
$_$
-- used to wrap hashtext so that we can replace it in 8.4 with
-- older implementation to keep compatibility
select hash_string($1::text, 'lookup2');
$_$
LANGUAGE sql;
----
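To see where a given key would land, you can evaluate the hash by hand.
Roughly, the `part` handler keeps a row on a partition when the hash of
its key, masked with `max_part`, equals that partition's `part_nr` (the
expression below illustrates the idea and is not the handler's exact
internal code):
----
psql rootdb -c "SELECT partconf.get_hash_raw(42) & 1;"
----
With `max_part = 1` this yields 0 or 1: the partition the key belongs to.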
Populate the shard configuration tables. These values are used inside part.py.
----
psql rootdb < partconf.sql
psql sharddb_0 < partconf.sql
psql sharddb_1 < partconf.sql
psql sharddb_0 -c "insert into partconf.conf(part_nr, max_part) values(0,1);"
psql sharddb_1 -c "insert into partconf.conf(part_nr, max_part) values(1,1);"
----
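You can verify the shard configuration afterwards (a suggested check, not
part of the original steps):
----
psql sharddb_0 -c "SELECT part_nr, max_part FROM partconf.conf;"
psql sharddb_1 -c "SELECT part_nr, max_part FROM partconf.conf;"
----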
Next, create configuration files for the root node and both partitions.
st3partsplit/st3_rootdb.ini
----
[londiste3]
job_name = st3_rootdb
db = dbname=rootdb
queue_name = replika
logfile = st3partsplit/log/st3_rootdb.log
pidfile = st3partsplit/pid/st3_rootdb.pid
----
st3partsplit/st3_sharddb_0.ini
----
[londiste3]
job_name = st3_sharddb_0
db = dbname=sharddb_0
queue_name = replika
logfile = st3partsplit/log/st3_sharddb_0.log
pidfile = st3partsplit/pid/st3_sharddb_0.pid
----
st3partsplit/st3_sharddb_1.ini
----
[londiste3]
job_name = st3_sharddb_1
db = dbname=sharddb_1
queue_name = replika
logfile = st3partsplit/log/st3_sharddb_1.log
pidfile = st3partsplit/pid/st3_sharddb_1.pid
----
Then create the root node:
----
londiste3 st3partsplit/st3_rootdb.ini create-root node1 dbname=rootdb
----
And start the worker on the root node:
----
londiste3 -d st3partsplit/st3_rootdb.ini worker
----
Then create the leaf nodes and start the workers on the partitions. First node:
----
londiste3 st3partsplit/st3_sharddb_0.ini create-leaf node2_0 dbname=sharddb_0 --provider=dbname=rootdb
londiste3 -d st3partsplit/st3_sharddb_0.ini worker
----
Second node:
----
londiste3 st3partsplit/st3_sharddb_1.ini create-leaf node2_1 dbname=sharddb_1 --provider=dbname=rootdb
londiste3 -d st3partsplit/st3_sharddb_1.ini worker
----
Create config file st3partsplit/pgqd.ini for `pgqd` ("the ticker"):
----
[pgqd]
logfile = st3partsplit/log/pgqd.log
pidfile = st3partsplit/pid/pgqd.pid
----
Start the ticker process:
----
pgqd -d st3partsplit/pgqd.ini
----
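Once the ticker is running, you can verify that all three nodes see each
other with the `status` command, run against any node's ini file:
----
londiste3 st3partsplit/st3_rootdb.ini status
----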
Now add the replicated tables to the root and the partitions.
Here we use the `--create` switch to add them to the partitions,
which means Londiste takes the table structure from the root node and
creates the tables on the target nodes automatically.
The `--handler=part` switch tells Londiste to use the `part` handler for
replication, and `--handler-arg=key=*id` specifies which key field to
partition on.
Run the following commands:
----
londiste3 st3partsplit/st3_rootdb.ini add-table pgbench_accounts --handler=part --handler-arg=key=aid
londiste3 st3partsplit/st3_sharddb_0.ini add-table pgbench_accounts --create --handler=part --handler-arg=key=aid
londiste3 st3partsplit/st3_sharddb_1.ini add-table pgbench_accounts --create --handler=part --handler-arg=key=aid
londiste3 st3partsplit/st3_rootdb.ini add-table pgbench_branches --handler=part --handler-arg=key=bid
londiste3 st3partsplit/st3_sharddb_0.ini add-table pgbench_branches --create --handler=part --handler-arg=key=bid
londiste3 st3partsplit/st3_sharddb_1.ini add-table pgbench_branches --create --handler=part --handler-arg=key=bid
londiste3 st3partsplit/st3_rootdb.ini add-table pgbench_tellers --handler=part --handler-arg=key=tid
londiste3 st3partsplit/st3_sharddb_0.ini add-table pgbench_tellers --create --handler=part --handler-arg=key=tid
londiste3 st3partsplit/st3_sharddb_1.ini add-table pgbench_tellers --create --handler=part --handler-arg=key=tid
----
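If you want to block until the initial copy of the tables has finished on
a partition, you can use the `wait-sync` command:
----
londiste3 st3partsplit/st3_sharddb_0.ini wait-sync
londiste3 st3partsplit/st3_sharddb_1.ini wait-sync
----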
The following command will run pgbench at full speed with 5 parallel
database connections for 10 seconds.
----
/usr/lib/postgresql/8.4/bin/pgbench -T 10 -c 5 rootdb
----
After this is done, you can check that the tables on both sides have the same data:
----
londiste3 st3partsplit/st3_sharddb_0.ini compare
londiste3 st3partsplit/st3_sharddb_1.ini compare
----
Except of course that they don't - each partition will only have roughly half
the data from the root. But the row counts and checksums of the two partitions
should add up to the numbers on the root.