=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
*         Things left to do for the PostgreSQL                      *
=           Genetic Query Optimization (GEQO)                       =
*              module implementation                                *
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
* Martin Utesch		      * Institute of Automatic Control      *
=                             = University of Mining and Technology =
* utesch@aut.tu-freiberg.de   * Freiberg, Germany                   *
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=


1.) Basic Improvements
===============================================================

a) improve freeing of memory when query is already processed:
-------------------------------------------------------------
with large JOIN queries the computing time spent for the genetic query
optimization seems to be a mere *fraction* of the time Postgres
needs for freeing memory via routine 'MemoryContextFree',
file 'backend/utils/mmgr/mcxt.c';
debugging showed that it gets stuck in a loop of routine
'OrderedElemPop', file 'backend/utils/mmgr/oset.c';
the same problems arise with long queries when using the normal
Postgres query optimization algorithm;

b) improve genetic algorithm parameter settings:
------------------------------------------------
file 'backend/optimizer/geqo/geqo_params.c', routines
'gimme_pool_size' and 'gimme_number_generations';
we have to find a compromise for the parameter settings
to satisfy two competing demands:
1.  optimality of the query plan
2.  computing time
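
A minimal sketch of how such a compromise could be expressed is given
here. Everything in it is an assumption for illustration: the names
'sketch_pool_size' and 'sketch_number_generations', the growth rule and
the bounds 128 and 1024 are hypothetical and are not taken from
'geqo_params.c'. The idea is only that the pool grows with the number of
relations (better plans) but is capped (bounded computing time).

```c
#include <assert.h>

/* Hypothetical bounds, chosen only for illustration. */
enum { SKETCH_MIN_POOL = 128, SKETCH_MAX_POOL = 1024 };

/*
 * Pool size grows as 2^(n+1) with the number of relations n,
 * then is clamped into [SKETCH_MIN_POOL, SKETCH_MAX_POOL].
 */
static int
sketch_pool_size(int n_relations)
{
    int size;

    assert(n_relations >= 2);

    /* 2^(n+1) already exceeds the cap here; also avoids a huge shift. */
    if (n_relations >= 10)
        return SKETCH_MAX_POOL;

    size = 1 << (n_relations + 1);
    if (size < SKETCH_MIN_POOL)
        size = SKETCH_MIN_POOL;
    return size;
}

/*
 * Assumed rule: run as many generations as there are pool members,
 * so the computing time is bounded by the same cap as the pool.
 */
static int
sketch_number_generations(int pool_size)
{
    assert(pool_size > 0);
    return pool_size;
}
```

With these assumed bounds a 2-relation query gets the minimum pool and
anything from 10 relations upward gets the maximum; tuning would mean
moving the two bounds and the growth rule.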

c) find better solution for integer overflow:
---------------------------------------------
file 'backend/optimizer/geqo/geqo_eval.c', routine
'geqo_joinrel_size';
the present hack for MAXINT overflow is to set the Postgres integer
value of 'rel->size' to its logarithm;
modifications of 'struct Rel' in 'backend/nodes/relation.h' will
surely have severe impacts on the whole PostgreSQL implementation.

d) find solution for exhausted memory:
--------------------------------------
that may occur with more than 10 relations involved in a query,
file 'backend/optimizer/geqo/geqo_eval.c', routine
'gimme_tree' which is recursively called;
maybe I forgot to free something correctly, but I don't know what;
of course the 'rel' data structure of the JOIN keeps growing and
growing the more relations are packed into it;
suggestions are welcome :-(


2.) Further Improvements
===============================================================
Enable bushy query tree processing within PostgreSQL;
that may improve the quality of query plans.