ToDo List for Slony-I
-----------------------------------------



Short Term Items
---------------------------

- Improve the script that tries to run UPDATE FUNCTIONS across
  versions to verify that upgrades work properly.

- Clone Node - use pg_dump/PITR to populate a new subscriber node

  Jan is working on this.

- UPDATE FUNCTIONS needs to be able to reload version-specific
  functions in v2.0, so that if we do an upgrade via:
    "pg_dump -p $OLDVERPORT dbname | psql -p $NEWVERPORT -d dbname"
  we may then run "UPDATE FUNCTIONS" to make the instance aware of
  the new PostgreSQL version.

  This probably involves refactoring the code that loads the
  version-specific SQL into a function that is called by both STORE
  NODE and UPDATE FUNCTIONS.
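
  A rough sketch of the intended upgrade flow, driven from Python for
  illustration; the cluster name, node id, ports, and database name
  below are hypothetical placeholders:

      import subprocess

      OLD_PORT, NEW_PORT, DBNAME = 5432, 5433, "dbname"  # placeholders

      # Step 1: move the database (Slony-I schema included) onto the
      # new PostgreSQL version via dump/restore.
      subprocess.run(
          "pg_dump -p %d %s | psql -p %d -d %s"
          % (OLD_PORT, DBNAME, NEW_PORT, DBNAME),
          shell=True, check=True)

      # Step 2: have slonik reload the version-specific support
      # functions on the restored node.
      script = """
      cluster name = testcluster;
      node 1 admin conninfo = 'dbname=%s port=%d';
      update functions (id = 1);
      """ % (DBNAME, NEW_PORT)
      subprocess.run(["slonik"], input=script.encode(), check=True)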

- Need to draw some "ducttape" tests into NG tests

   - Need to add a MERGE SET test; should do a pretty mean torture of
     this!

   - Duplicate duct tape test #6 - create 6 nodes:
          - #2 and #3 subscribe to #1
          - #4 subscribes to #3
          - #5 and #6 subscribe to #4

   - Have a test that does a bunch of subtransactions

- Need upgrade path

Longer Term Items
---------------------------

- Windows-compatible version of tools/slony1_dump.sh

- Consider pulling the lexer from psql

  http://developer.postgresql.org/cvsweb.cgi/pgsql/src/bin/psql/psqlscan.l?rev=1.21;content-type=text%2Fx-cvsweb-markup

Wishful Thinking
----------------------------

SYNC pipelining

  - the notion here is to open two connections to the source DB, and
    to start running the queries to generate the next LOG cursor while
    the previous request is pushing INSERT/UPDATE/DELETE requests to
    the subscriber.
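
    A minimal sketch of this ping-pong, assuming psycopg2 connections
    and a deliberately simplified view of the log table (the query,
    column names, and the idea that log rows hold ready-to-run
    statements are illustrative, not the real sl_log_1 layout):

      from concurrent.futures import ThreadPoolExecutor

      def fetch_sync(src_conn, lo, hi):
          # Run the log-selection query for one SYNC interval and
          # materialize the resulting statements.
          with src_conn.cursor() as cur:
              cur.execute(
                  "SELECT log_cmddata FROM sl_log_1"   # illustrative
                  " WHERE log_txid > %s AND log_txid <= %s"
                  " ORDER BY log_actionseq", (lo, hi))
              return cur.fetchall()

      def apply_sync(sub_conn, rows):
          # Replay the captured statements on the subscriber.
          with sub_conn.cursor() as cur:
              for (stmt,) in rows:
                  cur.execute(stmt)
          sub_conn.commit()

      def pipelined_syncs(src_a, src_b, sub_conn, intervals):
          # Two connections to the source alternate: while interval i
          # is being applied on the subscriber, interval i+1 is
          # already being fetched on the other source connection.
          sources = [src_a, src_b]
          with ThreadPoolExecutor(max_workers=1) as pool:
              fut = pool.submit(fetch_sync, sources[0], *intervals[0])
              for i in range(len(intervals)):
                  rows = fut.result()
                  if i + 1 < len(intervals):
                      fut = pool.submit(fetch_sync,
                                        sources[(i + 1) % 2],
                                        *intervals[i + 1])
                  apply_sync(sub_conn, rows)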

COPY pipelining

  - the notion here is to try to parallelize the data load at
    SUBSCRIBE time.  Suppose we decide we can process 4 tables at a
    time; we then set up 4 threads and iterate thus:

    For each table:
       - acquire a thread (waiting as needed)
       - submit COPY TO stdout on the provider, and feed it to
         COPY FROM stdin on the subscriber
       - submit the REINDEX request on the subscriber

    Even with a fairly small number of threads, we should be able to
    process the whole subscription in roughly the time it takes to
    process the single largest table (see the sketch after this item).

    This introduces a risk of locking problems that does not exist at
    present (alas): today, the subscription process is able to demand
    exclusive locks on all tables up front, which is no longer
    possible once the copying is split across multiple connections.
    In addition, the loaded tables will COMMIT across some period of
    time on the subscriber rather than appearing at one instant.

    The timing improvement is probably still worthwhile.

    http://lists.slony.info/pipermail/slony1-hackers/2007-April/000000.html
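
    A minimal sketch of the per-table worker described above, again
    assuming psycopg2; buffering each table in memory and the exact
    COPY/REINDEX statements are simplifications, and table names are
    assumed trusted:

      import io
      from concurrent.futures import ThreadPoolExecutor
      import psycopg2

      def copy_one_table(provider_dsn, subscriber_dsn, table):
          # Each worker uses its own pair of connections; a connection
          # is never shared between threads.
          src = psycopg2.connect(provider_dsn)
          dst = psycopg2.connect(subscriber_dsn)
          try:
              buf = io.StringIO()  # real code would stream, not buffer
              with src.cursor() as cur:
                  cur.copy_expert("COPY %s TO STDOUT" % table, buf)
              buf.seek(0)
              with dst.cursor() as cur:
                  cur.copy_expert("COPY %s FROM STDIN" % table, buf)
                  cur.execute("REINDEX TABLE %s" % table)
              dst.commit()
          finally:
              src.close()
              dst.close()

      def parallel_subscribe(provider_dsn, subscriber_dsn, tables,
                             workers=4):
          # With 4 workers, total wall time approaches the time needed
          # for the single largest table.
          with ThreadPoolExecutor(max_workers=workers) as pool:
              futs = [pool.submit(copy_one_table, provider_dsn,
                                  subscriber_dsn, t) for t in tables]
              for f in futs:
                  f.result()  # surface any per-table failure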

Slonik ALTER TABLE event

    This would permit passing through changes targeted at a single
    table, and require much less extensive locking than traditional
    EXECUTE SCRIPT.

Compress DELETE/UPDATE/INSERT requests

    Some performance benefit could be gained by compressing sets of
    DELETEs on the same table into a single DELETE statement.  This
    doesn't help the time it takes to fire triggers on the origin, but
    it can speed up "mass" deletion of records on subscribers (a
    sketch follows below).

    <http://lists.slony.info/pipermail/slony1-general/2007-July/006249.html>
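
    For illustration, a sketch of the compression step, assuming rows
    are identified by a single-column integer primary key (all names
    here are hypothetical):

      def compress_deletes(table, pk_col, pk_values, batch_size=1000):
          # Collapse N single-row DELETEs into ceil(N / batch_size)
          # IN-list DELETEs for replay on the subscriber.
          stmts = []
          for i in range(0, len(pk_values), batch_size):
              chunk = pk_values[i:i + batch_size]
              stmts.append("DELETE FROM %s WHERE %s IN (%s);"
                           % (table, pk_col,
                              ", ".join("%d" % v for v in chunk)))
          return stmts

      # compress_deletes("mytable", "id", range(1, 5001)) yields five
      # statements in place of 5000 single-row DELETEs.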

    Unfortunately, this would complicate the application code, which
    people agreed would be a net loss...

    <http://lists.slony.info/pipermail/slony1-general/2007-July/006267.html>

Data Transformations on Subscriber

    Have an alternative "logtrigger()" scheme which permits creating a
    custom logtrigger function that can read both OLD.* and NEW.* and
    selectively:

    - Omit columns on a subscriber
    - Omit tuples
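
    The filter itself might look like the following sketch (Python for
    illustration; the real implementation would live in the C trigger
    function, and all names here are hypothetical):

      def transform_row(row, omit_columns=(), keep=lambda row: True):
          # row: dict of column name -> NEW.* value for one change.
          # Returns None to omit the tuple entirely, or a copy of the
          # row with the unwanted columns dropped.
          if not keep(row):
              return None
          return {col: val for col, val in row.items()
                  if col not in omit_columns}

      # Example: replicate only non-archived rows, and never ship the
      # "password" column to the subscriber.
      filtered = transform_row(
          {"id": 1, "password": "x", "archived": False},
          omit_columns=("password",),
          keep=lambda r: not r["archived"])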

SL-Set

- Could it carry some policy indicating preferred failover targets?