1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
1248
1249
1250
1251
1252
1253
1254
1255
1256
1257
1258
1259
1260
1261
1262
1263
1264
1265
1266
1267
1268
1269
1270
1271
1272
1273
1274
1275
1276
1277
1278
1279
1280
1281
1282
1283
1284
1285
1286
1287
1288
1289
1290
1291
1292
1293
1294
1295
1296
1297
1298
1299
1300
1301
1302
1303
1304
1305
1306
1307
1308
1309
1310
1311
1312
1313
1314
1315
1316
1317
1318
1319
1320
1321
1322
1323
1324
1325
1326
1327
1328
1329
1330
1331
1332
1333
1334
1335
1336
1337
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375
1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
1386
1387
1388
1389
1390
1391
1392
1393
1394
1395
1396
1397
1398
1399
1400
1401
1402
1403
1404
1405
1406
1407
1408
1409
1410
1411
1412
1413
1414
1415
1416
1417
1418
1419
1420
1421
1422
1423
1424
1425
1426
1427
1428
1429
1430
1431
1432
1433
1434
1435
1436
1437
1438
1439
1440
1441
1442
1443
1444
1445
1446
1447
1448
1449
1450
1451
1452
1453
1454
1455
1456
1457
1458
1459
1460
1461
1462
1463
1464
1465
1466
1467
1468
1469
1470
1471
1472
1473
1474
1475
1476
1477
1478
1479
1480
1481
1482
1483
1484
1485
1486
1487
1488
1489
1490
1491
1492
1493
1494
1495
1496
1497
1498
1499
1500
1501
1502
1503
1504
1505
1506
1507
1508
1509
1510
1511
1512
1513
1514
1515
1516
1517
1518
1519
1520
1521
1522
1523
1524
1525
1526
1527
1528
1529
1530
1531
1532
1533
1534
1535
1536
1537
1538
1539
1540
1541
1542
1543
1544
1545
1546
1547
1548
1549
1550
1551
1552
1553
1554
1555
1556
1557
1558
1559
1560
1561
1562
1563
1564
1565
1566
1567
1568
1569
1570
1571
1572
1573
1574
1575
1576
1577
1578
1579
1580
1581
1582
1583
1584
1585
1586
1587
1588
1589
1590
1591
1592
1593
1594
1595
1596
1597
1598
1599
1600
1601
1602
1603
1604
1605
1606
1607
1608
1609
1610
1611
1612
1613
1614
1615
1616
1617
1618
1619
1620
1621
1622
1623
1624
1625
1626
1627
1628
1629
1630
1631
1632
1633
1634
1635
1636
1637
1638
1639
1640
1641
1642
1643
1644
1645
1646
1647
1648
1649
1650
1651
1652
1653
1654
1655
1656
1657
1658
1659
1660
1661
1662
1663
1664
1665
1666
1667
1668
1669
1670
1671
1672
1673
1674
1675
1676
1677
1678
1679
1680
1681
1682
1683
1684
1685
1686
1687
1688
1689
1690
1691
1692
1693
1694
1695
1696
1697
1698
1699
1700
1701
1702
1703
1704
1705
1706
1707
1708
1709
1710
1711
1712
1713
1714
1715
1716
1717
1718
1719
1720
1721
1722
1723
1724
1725
1726
1727
1728
1729
1730
1731
1732
1733
1734
1735
1736
1737
1738
1739
1740
1741
1742
1743
1744
1745
1746
1747
1748
1749
1750
1751
1752
1753
1754
1755
1756
1757
1758
1759
1760
1761
1762
1763
1764
1765
1766
1767
1768
1769
1770
1771
1772
1773
1774
1775
1776
1777
1778
1779
1780
1781
1782
1783
1784
1785
1786
1787
1788
1789
1790
1791
1792
1793
1794
1795
1796
1797
1798
1799
1800
1801
1802
1803
1804
1805
1806
1807
1808
1809
1810
1811
1812
1813
1814
1815
1816
1817
1818
1819
1820
1821
1822
1823
1824
1825
1826
1827
1828
1829
1830
1831
1832
1833
1834
1835
1836
1837
1838
1839
1840
1841
1842
1843
1844
1845
1846
1847
1848
1849
1850
1851
1852
1853
1854
1855
1856
1857
1858
1859
1860
1861
1862
1863
1864
1865
1866
1867
1868
1869
1870
1871
1872
1873
1874
1875
1876
1877
1878
1879
1880
1881
1882
1883
1884
1885
1886
1887
1888
1889
1890
1891
1892
1893
1894
1895
1896
1897
1898
1899
1900
1901
1902
1903
1904
1905
1906
1907
1908
1909
1910
1911
1912
1913
1914
1915
1916
1917
1918
1919
1920
1921
1922
1923
1924
1925
1926
1927
1928
1929
1930
1931
1932
1933
1934
1935
1936
1937
1938
1939
1940
1941
1942
1943
1944
1945
1946
1947
1948
1949
1950
1951
1952
1953
1954
1955
1956
1957
1958
1959
1960
1961
1962
1963
1964
1965
1966
1967
1968
1969
1970
1971
1972
1973
1974
1975
1976
1977
1978
1979
1980
1981
1982
1983
1984
1985
1986
1987
1988
1989
1990
1991
1992
1993
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
2026
2027
2028
2029
2030
2031
2032
2033
2034
2035
2036
2037
2038
2039
2040
2041
2042
2043
2044
2045
2046
2047
2048
2049
2050
2051
2052
2053
2054
2055
2056
2057
2058
2059
2060
2061
2062
2063
2064
2065
2066
2067
2068
2069
2070
2071
2072
2073
2074
2075
2076
2077
2078
2079
2080
2081
2082
2083
2084
2085
2086
2087
2088
2089
2090
2091
2092
2093
2094
2095
2096
2097
2098
2099
2100
2101
2102
2103
2104
2105
2106
2107
2108
2109
2110
2111
2112
2113
2114
2115
2116
2117
2118
2119
2120
2121
2122
2123
2124
2125
2126
2127
2128
2129
2130
2131
2132
2133
2134
2135
2136
2137
2138
2139
2140
2141
2142
2143
2144
2145
2146
2147
2148
2149
2150
2151
2152
2153
2154
2155
2156
2157
2158
2159
2160
2161
2162
2163
2164
2165
2166
2167
2168
2169
2170
2171
2172
2173
2174
2175
2176
2177
2178
2179
2180
2181
2182
2183
2184
2185
2186
2187
2188
2189
2190
2191
2192
2193
2194
2195
2196
2197
2198
2199
2200
2201
2202
2203
2204
2205
2206
2207
2208
2209
2210
2211
2212
2213
2214
2215
2216
2217
2218
2219
2220
2221
2222
2223
2224
2225
2226
2227
2228
2229
2230
2231
2232
2233
2234
2235
2236
2237
2238
2239
2240
2241
2242
2243
2244
2245
2246
2247
2248
2249
2250
2251
2252
2253
2254
2255
2256
2257
2258
2259
2260
2261
2262
2263
2264
2265
2266
2267
2268
2269
2270
2271
2272
2273
2274
2275
2276
2277
2278
2279
2280
2281
2282
2283
2284
2285
2286
2287
2288
2289
2290
2291
2292
2293
2294
2295
2296
2297
2298
2299
2300
2301
2302
2303
2304
2305
2306
2307
2308
2309
2310
2311
2312
2313
2314
2315
2316
2317
2318
2319
2320
2321
2322
2323
2324
2325
2326
2327
2328
2329
2330
2331
2332
2333
2334
2335
2336
2337
2338
2339
2340
2341
2342
2343
2344
2345
2346
2347
2348
2349
2350
2351
2352
2353
2354
2355
2356
2357
2358
2359
2360
2361
2362
2363
2364
2365
2366
2367
2368
2369
2370
2371
2372
2373
2374
2375
2376
2377
2378
2379
2380
2381
2382
2383
2384
2385
2386
2387
2388
2389
2390
2391
2392
2393
2394
2395
2396
2397
2398
2399
2400
2401
2402
2403
2404
2405
2406
2407
2408
2409
2410
2411
2412
2413
2414
2415
2416
2417
2418
2419
2420
2421
2422
2423
2424
2425
2426
2427
2428
2429
2430
2431
2432
2433
2434
2435
2436
2437
2438
2439
2440
2441
2442
2443
2444
2445
2446
2447
2448
2449
2450
2451
2452
2453
2454
2455
2456
2457
2458
2459
2460
2461
2462
2463
2464
2465
2466
2467
2468
2469
2470
2471
2472
2473
2474
2475
2476
2477
2478
2479
2480
2481
2482
2483
2484
2485
2486
2487
2488
2489
2490
2491
2492
2493
2494
2495
2496
2497
2498
2499
2500
2501
2502
2503
2504
2505
2506
2507
2508
2509
2510
2511
2512
2513
2514
2515
2516
2517
2518
2519
2520
2521
2522
2523
2524
2525
2526
2527
2528
2529
2530
2531
2532
2533
2534
2535
2536
2537
2538
2539
2540
2541
2542
2543
2544
2545
2546
2547
2548
2549
2550
2551
2552
2553
2554
2555
2556
2557
2558
2559
2560
2561
2562
2563
2564
2565
2566
2567
2568
2569
2570
2571
2572
2573
2574
2575
2576
2577
2578
2579
2580
2581
2582
2583
2584
2585
2586
2587
2588
2589
2590
2591
2592
2593
2594
2595
2596
2597
2598
2599
2600
2601
2602
2603
2604
2605
2606
2607
2608
2609
2610
2611
2612
2613
2614
2615
2616
2617
2618
2619
2620
2621
2622
2623
2624
2625
2626
2627
2628
2629
2630
2631
2632
2633
2634
2635
2636
2637
2638
2639
2640
2641
2642
2643
2644
2645
2646
2647
2648
2649
2650
2651
2652
2653
2654
2655
2656
2657
2658
2659
2660
2661
2662
2663
2664
2665
2666
2667
2668
2669
2670
2671
2672
2673
2674
2675
2676
2677
2678
2679
2680
2681
2682
2683
2684
2685
2686
2687
2688
2689
2690
2691
2692
2693
2694
2695
2696
2697
2698
2699
2700
2701
2702
2703
2704
2705
2706
2707
2708
2709
2710
2711
2712
2713
2714
2715
2716
2717
2718
2719
2720
2721
2722
2723
2724
2725
2726
2727
2728
2729
2730
2731
2732
2733
2734
2735
2736
2737
2738
2739
2740
2741
2742
2743
2744
2745
2746
2747
2748
2749
2750
2751
2752
2753
2754
2755
2756
2757
2758
2759
2760
2761
2762
2763
2764
2765
2766
2767
2768
2769
2770
2771
2772
2773
2774
2775
2776
2777
2778
2779
2780
2781
2782
2783
2784
2785
2786
2787
2788
2789
2790
2791
2792
2793
2794
2795
2796
2797
2798
2799
2800
2801
2802
2803
2804
2805
2806
2807
2808
2809
2810
2811
2812
2813
2814
2815
2816
2817
2818
2819
2820
2821
2822
2823
2824
2825
2826
2827
2828
2829
2830
2831
2832
2833
2834
2835
2836
2837
2838
2839
2840
2841
2842
2843
2844
2845
2846
2847
2848
2849
2850
2851
2852
2853
2854
2855
2856
2857
2858
2859
2860
2861
2862
2863
2864
2865
2866
2867
2868
2869
2870
2871
2872
2873
2874
2875
2876
2877
2878
2879
2880
2881
2882
2883
2884
2885
2886
2887
2888
2889
2890
2891
2892
2893
2894
2895
2896
2897
2898
2899
2900
2901
2902
2903
2904
2905
2906
2907
2908
2909
2910
2911
2912
2913
2914
2915
2916
2917
2918
2919
2920
2921
2922
2923
2924
2925
2926
2927
2928
2929
2930
2931
2932
2933
2934
2935
2936
2937
2938
2939
2940
2941
2942
2943
2944
2945
2946
2947
2948
2949
2950
2951
2952
2953
2954
2955
2956
2957
2958
2959
2960
2961
2962
2963
2964
2965
2966
2967
2968
2969
2970
2971
2972
2973
2974
2975
2976
2977
2978
2979
2980
2981
2982
2983
2984
2985
2986
2987
2988
2989
2990
2991
2992
2993
2994
2995
2996
2997
2998
2999
3000
3001
3002
3003
3004
3005
3006
3007
3008
3009
3010
3011
3012
3013
3014
3015
3016
3017
3018
3019
3020
3021
3022
3023
3024
3025
3026
3027
3028
3029
3030
3031
3032
3033
3034
3035
3036
3037
3038
3039
3040
3041
3042
3043
3044
3045
3046
3047
3048
3049
3050
3051
3052
3053
3054
3055
3056
3057
3058
3059
3060
3061
3062
3063
3064
3065
3066
3067
3068
3069
3070
3071
3072
3073
3074
3075
3076
3077
3078
3079
3080
3081
3082
3083
3084
3085
3086
3087
3088
3089
3090
3091
3092
3093
3094
3095
3096
3097
3098
3099
3100
3101
3102
3103
3104
3105
3106
3107
3108
3109
3110
3111
3112
3113
3114
3115
3116
3117
3118
3119
3120
3121
3122
3123
3124
3125
3126
3127
3128
3129
3130
3131
3132
3133
3134
3135
3136
3137
3138
3139
3140
3141
3142
3143
3144
3145
3146
3147
3148
3149
3150
3151
3152
3153
3154
3155
3156
3157
3158
3159
3160
3161
3162
3163
3164
3165
3166
3167
3168
3169
3170
3171
3172
3173
3174
3175
3176
3177
3178
3179
3180
3181
3182
3183
3184
3185
3186
3187
3188
3189
3190
3191
3192
3193
3194
3195
3196
3197
3198
3199
3200
3201
3202
3203
3204
3205
3206
3207
3208
3209
3210
3211
3212
3213
3214
3215
3216
3217
3218
3219
3220
3221
3222
3223
3224
3225
3226
3227
3228
3229
3230
3231
3232
3233
3234
3235
3236
3237
3238
3239
3240
3241
3242
3243
3244
3245
3246
3247
3248
3249
3250
3251
3252
3253
3254
3255
3256
3257
3258
3259
3260
3261
3262
3263
3264
3265
3266
3267
3268
3269
3270
3271
3272
3273
3274
3275
3276
3277
3278
3279
3280
3281
3282
3283
3284
3285
3286
3287
3288
3289
3290
3291
3292
3293
3294
3295
3296
3297
3298
3299
3300
3301
3302
3303
3304
3305
3306
3307
3308
3309
3310
3311
3312
3313
3314
3315
3316
3317
3318
3319
3320
3321
3322
3323
3324
3325
3326
3327
3328
3329
3330
3331
3332
3333
3334
3335
3336
3337
3338
3339
3340
3341
3342
3343
3344
3345
3346
3347
3348
3349
3350
3351
3352
3353
3354
3355
3356
3357
3358
3359
3360
3361
3362
3363
3364
3365
3366
3367
3368
3369
3370
3371
3372
3373
3374
3375
3376
3377
3378
3379
|
From pgsql-hackers-owner+M77861=pgman=candle.pha.pa.us@postgresql.org Fri Dec 23 05:19:20 2005
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
From: Simon Riggs <simon@2ndquadrant.com>
To: Stephen Frost <sfrost@snowman.net>
cc: Martijn van Oosterhout <kleptog@svana.org>,
Jim C. Nasby <jnasby@pervasive.com>,
bizgres-general <bizgres-general@pgfoundry.org>,
pgsql-hackers@postgresql.org
In-Reply-To: <20051222223625.GC6026@ns.snowman.net>
References: <1135261893.2964.502.camel@localhost.localdomain>
<20051222183751.GG72143@pervasive.com> <20051222201826.GH21783@svana.org>
<1135289583.2964.536.camel@localhost.localdomain>
<20051222223625.GC6026@ns.snowman.net>
Date: Fri, 23 Dec 2005 10:18:43 +0000
Message-ID: <1135333123.2964.589.camel@localhost.localdomain>
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.1 required=5 tests=[AWL=0.100]
X-Spam-Score: 0.1
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 4728
On Thu, 2005-12-22 at 17:36 -0500, Stephen Frost wrote:
> * Simon Riggs (simon@2ndquadrant.com) wrote:
> > On Thu, 2005-12-22 at 21:18 +0100, Martijn van Oosterhout wrote:
> > > Considering "WAL bypass" is code for "breaks PITR"
> >
> > No it isn't. All of the WAL bypass logic does *not* operate when PITR is
> > active. The WAL bypass logic is aimed at Data Warehouses, which
> > typically never operate in PITR mode for performance reasons, however
> > the choice is yours.
OK, thanks for saying all of that; you probably speak for many in
raising these concerns. I'll answer each bit as we come to it. Suffice
to say, your concerns are good and so are the answers:
> Eh? PITR mode is bad for performance? Maybe I missed something but I
> wouldn't have thought PITR would degrade regular performance all that
> badly.
PITR mode is *not* bad for performance. On a very heavily loaded
write-intensive test system, the general PITR overhead on regular
performance was around 1% - so almost negligible.
We have been discussing a number of optimizations to specific commands
that would allow them to avoid writing WAL and thus speed up their
performance. If archive_command is set then WAL will always be written;
if it is not set then these commands will (or could) go faster:
- CREATE TABLE AS SELECT (in 8.1)
- COPY LOCK (patch submitted)
- COPY in same transaction as CREATE TABLE (patch submitted)
- INSERT SELECT in same transaction as CREATE TABLE (this discussion)
(There are a number of other conditions also, such as there must be no
indexes on a table. All of which now documented with the patch)
> So long as it doesn't take 15 minutes or some such to move the
> WAL to somewhere else (and I'm not sure that'd even slow things down..).
> For a Data Warehouse, have you got a better way of doing backups such
> that you don't lose at minimum most of a day's work?
Yes. Don't just use the backup facilities on their own. Think about how
the architecture of your systems will work and see if there is a better
way when you look at very large systems.
> I'm not exactly a
> big fan do doing a pg_dump every night either given that the database is
> 360GB. Much nicer to take a weekly dump of the database and then do
> PITR for a week or two before taking another dump of the db.
e.g. Keep your reference data (low volume) in an Operational Data Store
(ODS) database, protected by archiving. Keep your main fact data (high
volume) in the Data Warehouse, but save the data in slices as you load
it, so that a recovery is simply a reload of the database: no PITR or
pg_dump required, so high performance data transformation and load work
is possible. This is a commonly used architectural design pattern.
> I like the idea of making COPY go faster, but please don't break my
> backup system while you're at it.
On a personal note, I would only add that I spent a long time working on
PITR and I would never design anything that would intentionally break it
(nor would patches be accepted that did that). That probably gives me
the confidence to approach designs that might look like I'm doing that,
but without actually straying over the edge.
> I'm honestly kind of nervous about
> what you mean by checking it PITR is active- how is that done, exactly?
> Check if you have a script set to rotate the logs elsewhere? Or is it
> checking if you're in the taking-a-full-database-backup stage? Or what?
Internally, we use XLogArchivingActive(). Externally this will be set
when the admin sets archive_command to a particular value.
My original preference was for a parameter called archive_mode= ON | OFF
which would allow us to more easily discuss this, but this does not
currently exist.
> What's the performance decrease when using PITR, and what's it from? Is
> it just that COPY isn't as fast? Honestly, I could live with COPY being
> not as fast as it could be if my backups work. :)
These commands will not be optimized for speed when archive_command is set:
- CREATE TABLE AS SELECT (in 8.1)
- COPY LOCK (patch submitted)
> Sorry for sounding concerned but, well, backups are very important and
> so is performance and I'm afraid either I've not read all the
> documentation about the issues being discussed here or there isn't
> enough out there to make sense of it all yet. :)
If you choose PITR, then you are safe. If you do not, the crash recovery
of the database is not endangered by these optimizations.
Hope that covers all of your concerns?
I'm just writing a course that explains many of these techniques,
available in the New Year.
Best Regards, Simon Riggs
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
From pgsql-hackers-owner+M78004=pgman=candle.pha.pa.us@postgresql.org Wed Dec 28 20:59:03 2005
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-ID: <200512290158.jBT1wEK28785@candle.pha.pa.us>
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
In-Reply-To: <20051226122206.GA12934@svana.org>
To: Martijn van Oosterhout <kleptog@svana.org>
Date: Wed, 28 Dec 2005 20:58:14 -0500 (EST)
cc: Simon Riggs <simon@2ndquadrant.com>, Tom Lane <tgl@sss.pgh.pa.us>,
Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>,
Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.122 required=5 tests=[AWL=0.122]
X-Spam-Score: 0.122
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 3461
Having read through this thread, I would like to propose a
syntax/behavior.
I think we all now agree that the logging is more part of the table than
the command itself. Right now we have a COPY LOCK patch, but people are
going to want to control logging for INSERT INTO ... SELECT, and UPDATE,
and all sorts of other things, so I think we are best adding an ALTER
TABLE capability. I am thinking of this syntax:
ALTER TABLE name RELIABILITY option
where "option" is:
DROP [ TABLE ON CRASH ]
DELETE [ ROWS ON CRASH ]
EXCLUSIVE
SHARE
Let me explain each option. DROP would drop the table on a restart
after a non-clean shutdown. It would do _no_ logging on the table and
allow concurrent access, plus index access. DELETE is the same as DROP,
but it just truncates the table (perhaps TRUNCATE is a better word).
EXCLUSIVE would allow only a single session to modify the table, and
would do all changes by appending to the table, similar to COPY LOCK.
EXCLUSIVE would also not allow indexes because those can not be isolated
like appending to the heap. EXCLUSIVE would write all dirty shared
buffers for the table and fsync them before committing. SHARE is the
functionality we have now, with full logging.
Does this get us any closer to a TODO item? It isn't great, but I think
it is pretty clear, and I assume pg_dump would use ALTER to load each
table. The advanage is that the COPY statements themselves are
unchanged so they would work in loading into older versions of
PostgreSQL.
---------------------------------------------------------------------------
Martijn van Oosterhout wrote:
-- Start of PGP signed section.
> On Mon, Dec 26, 2005 at 12:03:27PM +0000, Simon Riggs wrote:
> > I would not be against such a table-level switch, but the exact
> > behaviour would need to be specified more closely before this became a
> > TODO item, IMHO.
>
> Well, I think at a per table level is the only sensible level. If a
> table isn't logged, neither are the indexes. After an unclean shutdown
> the data could be anywhere between OK and rubbish, with no way of
> finding out which way.
>
> > If someone has a 100 GB table, they would not appreciate the table being
> > truncated if a transaction to load 1 GB of data aborts, forcing recovery
> > of the 100 GB table.
>
> Ah, but wouldn't such a large table be partitioned in such a way that
> you could have the most recent partition having the loaded data.
> Personally, I think these "shared temp tables" have more applications
> than meet the eye. I've had systems with cache tables which could be
> wiped on boot. Though I think my preference would be to TRUNCATE rather
> than DROP on unclean shutdown.
>
> Have a nice day,
> --
> Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> > tool for doing 5% of the work and then sitting around waiting for someone
> > else to do the other 95% so you can sue them.
-- End of PGP section, PGP failed!
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
From pgsql-hackers-owner+M78007=pgman=candle.pha.pa.us@postgresql.org Wed Dec 28 22:06:13 2005
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
Message-ID: <43B3527A.4040709@commandprompt.com>
Date: Wed, 28 Dec 2005 19:05:30 -0800
From: Joshua D. Drake <jd@commandprompt.com>
Organization: Command Prompt, Inc.
User-Agent: Mozilla Thunderbird 1.0.2 (Windows/20050317)
X-Accept-Language: en-us, en
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Martijn van Oosterhout <kleptog@svana.org>,
Simon Riggs <simon@2ndquadrant.com>, Tom Lane <tgl@sss.pgh.pa.us>,
Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>,
Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
In-Reply-To: <200512290158.jBT1wEK28785@candle.pha.pa.us>
X-Greylist: Sender succeded SMTP AUTH authentication, not delayed by milter-greylist-1.6 (hosting.commandprompt.com [192.168.1.101]); Wed, 28 Dec 2005 18:57:25 -0800 (PST)
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.05 required=5 tests=[AWL=0.050, UPPERCASE_25_50=0]
X-Spam-Score: 0.05
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 725
now agree that the logging is more part of the table than
> the command itself. Right now we have a COPY LOCK patch, but people are
> going to want to control logging for INSERT INTO ... SELECT, and UPDATE,
> and all sorts of other things, so I think we are best adding an ALTER
> TABLE capability. I am thinking of this syntax:
>
> ALTER TABLE name RELIABILITY option
>
> where "option" is:
>
> DROP [ TABLE ON CRASH ]
> DELETE [ ROWS ON CRASH ]
> EXCLUSIVE
> SHARE
I would say ON FAILURE (Crash just seems way to scary :))
Joshua D. Drake
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
From pgsql-hackers-owner+M78008=pgman=candle.pha.pa.us@postgresql.org Wed Dec 28 23:09:58 2005
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-ID: <200512290409.jBT49LD13611@candle.pha.pa.us>
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
In-Reply-To: <43B3527A.4040709@commandprompt.com>
To: Joshua D. Drake <jd@commandprompt.com>
Date: Wed, 28 Dec 2005 23:09:21 -0500 (EST)
cc: Martijn van Oosterhout <kleptog@svana.org>,
Simon Riggs <simon@2ndquadrant.com>, Tom Lane <tgl@sss.pgh.pa.us>,
Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>,
Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.122 required=5 tests=[AWL=0.122, UPPERCASE_25_50=0]
X-Spam-Score: 0.122
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 1111
Joshua D. Drake wrote:
> now agree that the logging is more part of the table than
> > the command itself. Right now we have a COPY LOCK patch, but people are
> > going to want to control logging for INSERT INTO ... SELECT, and UPDATE,
> > and all sorts of other things, so I think we are best adding an ALTER
> > TABLE capability. I am thinking of this syntax:
> >
> > ALTER TABLE name RELIABILITY option
> >
> > where "option" is:
> >
> > DROP [ TABLE ON CRASH ]
> > DELETE [ ROWS ON CRASH ]
> > EXCLUSIVE
> > SHARE
>
> I would say ON FAILURE (Crash just seems way to scary :))
Agreed, maybe ON RECOVERY.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
From simon@2ndquadrant.com Thu Dec 29 08:19:47 2005
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
From: Simon Riggs <simon@2ndquadrant.com>
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Martijn van Oosterhout <kleptog@svana.org>, Tom Lane <tgl@sss.pgh.pa.us>,
Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>,
Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
In-Reply-To: <200512290158.jBT1wEK28785@candle.pha.pa.us>
References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
Date: Thu, 29 Dec 2005 13:19:45 +0000
Message-ID: <1135862385.2964.804.camel@localhost.localdomain>
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
Content-Length: 7026
On Wed, 2005-12-28 at 20:58 -0500, Bruce Momjian wrote:
> Having read through this thread, I would like to propose a
> syntax/behavior.
>
> I think we all now agree that the logging is more part of the table than
> the command itself. Right now we have a COPY LOCK patch, but people are
> going to want to control logging for INSERT INTO ... SELECT, and UPDATE,
> and all sorts of other things, so I think we are best adding an ALTER
> TABLE capability. I am thinking of this syntax:
>
> ALTER TABLE name RELIABILITY option
>
> where "option" is:
>
> DROP [ TABLE ON CRASH ]
> DELETE [ ROWS ON CRASH ]
> EXCLUSIVE
> SHARE
>
> Let me explain each option. DROP would drop the table on a restart
> after a non-clean shutdown. It would do _no_ logging on the table and
> allow concurrent access, plus index access. DELETE is the same as DROP,
> but it just truncates the table (perhaps TRUNCATE is a better word).
>
> EXCLUSIVE would allow only a single session to modify the table, and
> would do all changes by appending to the table, similar to COPY LOCK.
> EXCLUSIVE would also not allow indexes because those can not be isolated
> like appending to the heap. EXCLUSIVE would write all dirty shared
> buffers for the table and fsync them before committing. SHARE is the
> functionality we have now, with full logging.
>
> Does this get us any closer to a TODO item? It isn't great, but I think
> it is pretty clear, and I assume pg_dump would use ALTER to load each
> table. The advanage is that the COPY statements themselves are
> unchanged so they would work in loading into older versions of
> PostgreSQL.
First off, thanks for summarising a complex thread.
My view would be that this thread has been complex because everybody has
expressed a somewhat different requirement, which could be broken down
as:
1. The need for a multi-user-accessible yet temporary table
2. Loading data into a table immediately after it is created (i.e. in
same transaction), including but not limited to a reload from pg_dump
3. How to load data quickly into an existing table (COPY)
4. How to add/modify data quickly in an existing table (INSERT SELECT,
UPDATE)
I can see the need for all of those individually; my existing patch
submission covers (2) and (3) only. I very much like your thought to
coalesce these various requirements into a single coherent model.
For requirement (1), table level options make sense. We would:
- CREATE TABLE ALLTHINGS
- ALTER TABLE ALLTHINGS RELIABILITY DELETE ROWS ON RECOVERY
- lots of SQL, all fast because not logged
(2) is catered for adequately by the existing COPY patch i.e. it will
detect whether a table has just been created and then avoid writing WAL.
In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
pg_dump *will* work with any other version of PostgreSQL, which *would
not* be the case if we added ALTER TABLE ... RELIABILITY statements into
it. Also, a pg_dump created at an earlier version could also be loaded
faster using the patch. The only requirement is to issue all SQL as part
of the same transaction - which is catered for by the
--single-transaction option on pg_restore and psql. So (2) is catered
for fully without the need for an ALTER TABLE ... RELIABILITY statement
or COPY LOCK.
For requirement (3), I would use table level options like this:
(the table already exists and is reasonably big; we should not assume
that everybody can and does use partitioning)
- ALTER TABLE RELIABILITY ALLTHINGS2 EXCLUSIVE
- COPY
- ALTER TABLE RELIABILITY ALLTHINGS2 SHARE
For a load into an existing table I would always do all three actions
together. COPY LOCK does exactly that *and* does it atomically.
The two ways of doing (3) have a few pros/cons either way:
Pro for ALTER TABLE:
- same syntax as req (1)
- doesn't need the keyword LOCK
- allows INSERT SELECT, UPDATE operations also (req 4)
Cons:
- existing programs have to add additional statements to take advantage
of this; with COPY LOCK we would add just a single keyword
- operation is not atomic, which might lead to some operations waiting
for a lock to operate as unlogged, since they would execute before the
second ALTER TABLE gets there
- operation will be understood by some, but not others. They will forget
to switch the RELIABILITY back on and then lose their whole table when
the database crashes. (watch...)
...but would it be a problem to have both?
So, my thinking would be to separate things into two:
a) Add a TODO item "shared temp tables" that caters for (1) and (4)
ALTER TABLE name RELIABILITY
{DELETE ROWS AT RECOVERY | FULL RECOVERY}
(syntax TBD)
which would
- truncate all rows and remove all index entries during recovery
- use shared_buffers, not temp_buffers
- never write xlog records, even when in PITR mode
- would avoid writing WAL for both heap *and* index tuples
b) Leave the COPY patch as is, since it caters for reqs (2) and (3) as
*separate* optimizations (but using a common infrastructure in code).
[This work was based upon discussions on -hackers only 6 months ago, so
its not like its been snuck in or anything
http://archives.postgresql.org/pgsql-hackers/2005-06/msg00069.php
http://archives.postgresql.org/pgsql-hackers/2005-06/msg00075.php ]
These two thoughts are separable. There is no need to
have-both-or-neither within PostgreSQL.
Eventually, I'd like all of these options, as a database designer.
Best Regards, Simon Riggs
> --------------------------------------------------------------------------
>
> Martijn van Oosterhout wrote:
> -- Start of PGP signed section.
> > On Mon, Dec 26, 2005 at 12:03:27PM +0000, Simon Riggs wrote:
> > > I would not be against such a table-level switch, but the exact
> > > behaviour would need to be specified more closely before this became a
> > > TODO item, IMHO.
> >
> > Well, I think at a per table level is the only sensible level. If a
> > table isn't logged, neither are the indexes. After an unclean shutdown
> > the data could be anywhere between OK and rubbish, with no way of
> > finding out which way.
> >
> > > If someone has a 100 GB table, they would not appreciate the table being
> > > truncated if a transaction to load 1 GB of data aborts, forcing recovery
> > > of the 100 GB table.
> >
> > Ah, but wouldn't such a large table be partitioned in such a way that
> > you could have the most recent partition having the loaded data.
> > Personally, I think these "shared temp tables" have more applications
> > than meet the eye. I've had systems with cache tables which could be
> > wiped on boot. Though I think my preference would be to TRUNCATE rather
> > than DROP on unclean shutdown.
> >
> > Have a nice day,
> > --
> > Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> > > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> > > tool for doing 5% of the work and then sitting around waiting for someone
> > > else to do the other 95% so you can sue them.
> -- End of PGP section, PGP failed!
>
From pgsql-hackers-owner+M78019=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 08:20:11 2005
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
From: Simon Riggs <simon@2ndquadrant.com>
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Martijn van Oosterhout <kleptog@svana.org>, Tom Lane <tgl@sss.pgh.pa.us>,
Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>,
Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
In-Reply-To: <200512290158.jBT1wEK28785@candle.pha.pa.us>
References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
Date: Thu, 29 Dec 2005 13:19:45 +0000
Message-ID: <1135862385.2964.804.camel@localhost.localdomain>
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.112 required=5 tests=[AWL=0.112]
X-Spam-Score: 0.112
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 7139
On Wed, 2005-12-28 at 20:58 -0500, Bruce Momjian wrote:
> Having read through this thread, I would like to propose a
> syntax/behavior.
>
> I think we all now agree that the logging is more part of the table than
> the command itself. Right now we have a COPY LOCK patch, but people are
> going to want to control logging for INSERT INTO ... SELECT, and UPDATE,
> and all sorts of other things, so I think we are best adding an ALTER
> TABLE capability. I am thinking of this syntax:
>
> ALTER TABLE name RELIABILITY option
>
> where "option" is:
>
> DROP [ TABLE ON CRASH ]
> DELETE [ ROWS ON CRASH ]
> EXCLUSIVE
> SHARE
>
> Let me explain each option. DROP would drop the table on a restart
> after a non-clean shutdown. It would do _no_ logging on the table and
> allow concurrent access, plus index access. DELETE is the same as DROP,
> but it just truncates the table (perhaps TRUNCATE is a better word).
>
> EXCLUSIVE would allow only a single session to modify the table, and
> would do all changes by appending to the table, similar to COPY LOCK.
> EXCLUSIVE would also not allow indexes because those can not be isolated
> like appending to the heap. EXCLUSIVE would write all dirty shared
> buffers for the table and fsync them before committing. SHARE is the
> functionality we have now, with full logging.
>
> Does this get us any closer to a TODO item? It isn't great, but I think
> it is pretty clear, and I assume pg_dump would use ALTER to load each
> table. The advanage is that the COPY statements themselves are
> unchanged so they would work in loading into older versions of
> PostgreSQL.
First off, thanks for summarising a complex thread.
My view would be that this thread has been complex because everybody has
expressed a somewhat different requirement, which could be broken down
as:
1. The need for a multi-user-accessible yet temporary table
2. Loading data into a table immediately after it is created (i.e. in
same transaction), including but not limited to a reload from pg_dump
3. How to load data quickly into an existing table (COPY)
4. How to add/modify data quickly in an existing table (INSERT SELECT,
UPDATE)
I can see the need for all of those individually; my existing patch
submission covers (2) and (3) only. I very much like your thought to
coalesce these various requirements into a single coherent model.
For requirement (1), table level options make sense. We would:
- CREATE TABLE ALLTHINGS
- ALTER TABLE ALLTHINGS RELIABILITY DELETE ROWS ON RECOVERY
- lots of SQL, all fast because not logged
(2) is catered for adequately by the existing COPY patch i.e. it will
detect whether a table has just been created and then avoid writing WAL.
In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
pg_dump *will* work with any other version of PostgreSQL, which *would
not* be the case if we added ALTER TABLE ... RELIABILITY statements into
it. Also, a pg_dump created at an earlier version could also be loaded
faster using the patch. The only requirement is to issue all SQL as part
of the same transaction - which is catered for by the
--single-transaction option on pg_restore and psql. So (2) is catered
for fully without the need for an ALTER TABLE ... RELIABILITY statement
or COPY LOCK.
For requirement (3), I would use table level options like this:
(the table already exists and is reasonably big; we should not assume
that everybody can and does use partitioning)
- ALTER TABLE RELIABILITY ALLTHINGS2 EXCLUSIVE
- COPY
- ALTER TABLE RELIABILITY ALLTHINGS2 SHARE
For a load into an existing table I would always do all three actions
together. COPY LOCK does exactly that *and* does it atomically.
The two ways of doing (3) have a few pros/cons either way:
Pro for ALTER TABLE:
- same syntax as req (1)
- doesn't need the keyword LOCK
- allows INSERT SELECT, UPDATE operations also (req 4)
Cons:
- existing programs have to add additional statements to take advantage
of this; with COPY LOCK we would add just a single keyword
- operation is not atomic, which might lead to some operations waiting
for a lock to operate as unlogged, since they would execute before the
second ALTER TABLE gets there
- operation will be understood by some, but not others. They will forget
to switch the RELIABILITY back on and then lose their whole table when
the database crashes. (watch...)
...but would it be a problem to have both?
So, my thinking would be to separate things into two:
a) Add a TODO item "shared temp tables" that caters for (1) and (4)
ALTER TABLE name RELIABILITY
{DELETE ROWS AT RECOVERY | FULL RECOVERY}
(syntax TBD)
which would
- truncate all rows and remove all index entries during recovery
- use shared_buffers, not temp_buffers
- never write xlog records, even when in PITR mode
- would avoid writing WAL for both heap *and* index tuples
b) Leave the COPY patch as is, since it caters for reqs (2) and (3) as
*separate* optimizations (but using a common infrastructure in code).
[This work was based upon discussions on -hackers only 6 months ago, so
its not like its been snuck in or anything
http://archives.postgresql.org/pgsql-hackers/2005-06/msg00069.php
http://archives.postgresql.org/pgsql-hackers/2005-06/msg00075.php ]
These two thoughts are separable. There is no need to
have-both-or-neither within PostgreSQL.
Eventually, I'd like all of these options, as a database designer.
Best Regards, Simon Riggs
> --------------------------------------------------------------------------
>
> Martijn van Oosterhout wrote:
> -- Start of PGP signed section.
> > On Mon, Dec 26, 2005 at 12:03:27PM +0000, Simon Riggs wrote:
> > > I would not be against such a table-level switch, but the exact
> > > behaviour would need to be specified more closely before this became a
> > > TODO item, IMHO.
> >
> > Well, I think at a per table level is the only sensible level. If a
> > table isn't logged, neither are the indexes. After an unclean shutdown
> > the data could be anywhere between OK and rubbish, with no way of
> > finding out which way.
> >
> > > If someone has a 100 GB table, they would not appreciate the table being
> > > truncated if a transaction to load 1 GB of data aborts, forcing recovery
> > > of the 100 GB table.
> >
> > Ah, but wouldn't such a large table be partitioned in such a way that
> > you could have the most recent partition having the loaded data.
> > Personally, I think these "shared temp tables" have more applications
> > than meet the eye. I've had systems with cache tables which could be
> > wiped on boot. Though I think my preference would be to TRUNCATE rather
> > than DROP on unclean shutdown.
> >
> > Have a nice day,
> > --
> > Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> > > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> > > tool for doing 5% of the work and then sitting around waiting for someone
> > > else to do the other 95% so you can sue them.
> -- End of PGP section, PGP failed!
>
---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend
From pgsql-hackers-owner+M78021=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 09:35:58 2005
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
From: Rod Taylor <pg@rbt.ca>
To: Simon Riggs <simon@2ndquadrant.com>
cc: Bruce Momjian <pgman@candle.pha.pa.us>,
Martijn van Oosterhout <kleptog@svana.org>, Tom Lane <tgl@sss.pgh.pa.us>,
Greg Stark <gsstark@mit.edu>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
pgsql-hackers@postgresql.org
In-Reply-To: <1135862385.2964.804.camel@localhost.localdomain>
References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
<1135862385.2964.804.camel@localhost.localdomain>
Date: Thu, 29 Dec 2005 09:35:27 -0500
Message-ID: <1135866927.61038.13.camel@home>
X-Mailer: Evolution 2.4.2.1 FreeBSD GNOME Team Port
X-SA-Exim-Mail-From: pg@rbt.ca
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
X-SA-Exim-Version: 3.1 (built Tue Feb 24 05:09:27 GMT 2004)
X-SA-Exim-Scanned: Yes
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.024 required=5 tests=[AWL=0.024, UPPERCASE_25_50=0]
X-Spam-Score: 0.024
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 506
> So, my thinking would be to separate things into two:
> a) Add a TODO item "shared temp tables" that caters for (1) and (4)
>
> ALTER TABLE name RELIABILITY
> {DELETE ROWS AT RECOVERY | FULL RECOVERY}
> (syntax TBD)
DELETE ROWS AT RECOVERY would need to be careful or disallowed when
referenced via a foreign key to ensure the database is not restored in
an inconsistent state.
--
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
From pg@rbt.ca Thu Dec 29 09:35:35 2005
From: Rod Taylor <pg@rbt.ca>
To: Simon Riggs <simon@2ndquadrant.com>
cc: Bruce Momjian <pgman@candle.pha.pa.us>,
Martijn van Oosterhout <kleptog@svana.org>, Tom Lane <tgl@sss.pgh.pa.us>,
Greg Stark <gsstark@mit.edu>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
pgsql-hackers@postgresql.org
In-Reply-To: <1135862385.2964.804.camel@localhost.localdomain>
References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
<1135862385.2964.804.camel@localhost.localdomain>
Date: Thu, 29 Dec 2005 09:35:27 -0500
Message-ID: <1135866927.61038.13.camel@home>
X-Mailer: Evolution 2.4.2.1 FreeBSD GNOME Team Port
X-SA-Exim-Mail-From: pg@rbt.ca
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
X-Spam-Checker-Version: SpamAssassin 2.63 (2004-01-11) on psi.look.ca
X-Spam-Level:
X-Spam-Status: No, hits=0.7 required=9.0 tests=UPPERCASE_25_50 autolearn=no
version=2.63
X-SA-Exim-Version: 3.1 (built Tue Feb 24 05:09:27 GMT 2004)
X-SA-Exim-Scanned: Yes
Content-Length: 393
> So, my thinking would be to separate things into two:
> a) Add a TODO item "shared temp tables" that caters for (1) and (4)
>
> ALTER TABLE name RELIABILITY
> {DELETE ROWS AT RECOVERY | FULL RECOVERY}
> (syntax TBD)
DELETE ROWS AT RECOVERY would need to be careful or disallowed when
referenced via a foreign key to ensure the database is not restored in
an inconsistent state.
--
From pgsql-hackers-owner+M78022=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 10:10:57 2005
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
From: Simon Riggs <simon@2ndquadrant.com>
To: Rod Taylor <pg@rbt.ca>
cc: Bruce Momjian <pgman@candle.pha.pa.us>,
Martijn van Oosterhout <kleptog@svana.org>, Tom Lane <tgl@sss.pgh.pa.us>,
Greg Stark <gsstark@mit.edu>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
pgsql-hackers@postgresql.org
In-Reply-To: <1135866927.61038.13.camel@home>
References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
<1135862385.2964.804.camel@localhost.localdomain>
<1135866927.61038.13.camel@home>
Date: Thu, 29 Dec 2005 15:10:40 +0000
Message-ID: <1135869040.2964.824.camel@localhost.localdomain>
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.113 required=5 tests=[AWL=0.113]
X-Spam-Score: 0.113
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 888
On Thu, 2005-12-29 at 09:35 -0500, Rod Taylor wrote:
> > So, my thinking would be to separate things into two:
> > a) Add a TODO item "shared temp tables" that caters for (1) and (4)
> >
> > ALTER TABLE name RELIABILITY
> > {DELETE ROWS AT RECOVERY | FULL RECOVERY}
> > (syntax TBD)
>
> DELETE ROWS AT RECOVERY would need to be careful or disallowed when
> referenced via a foreign key to ensure the database is not restored in
> an inconsistent state.
I think we'd need to apply the same rule as we do for temp tables: they
cannot be referenced by a permanent table.
There are possibly some other restrictions also. Anyone?
Best Regards, Simon Riggs
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
From tgl@sss.pgh.pa.us Thu Dec 29 11:12:13 2005
To: Simon Riggs <simon@2ndquadrant.com>
cc: Bruce Momjian <pgman@candle.pha.pa.us>,
Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>,
Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
In-Reply-To: <1135862385.2964.804.camel@localhost.localdomain>
References: <200512290158.jBT1wEK28785@candle.pha.pa.us> <1135862385.2964.804.camel@localhost.localdomain>
Comments: In-reply-to Simon Riggs <simon@2ndquadrant.com>
message dated "Thu, 29 Dec 2005 13:19:45 +0000"
Date: Thu, 29 Dec 2005 11:12:11 -0500
Message-ID: <7273.1135872731@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Content-Length: 1963
Simon Riggs <simon@2ndquadrant.com> writes:
> My view would be that this thread has been complex because everybody has
> expressed a somewhat different requirement, which could be broken down
> as:
> 1. The need for a multi-user-accessible yet temporary table
> 2. Loading data into a table immediately after it is created (i.e. in
> same transaction), including but not limited to a reload from pg_dump
> 3. How to load data quickly into an existing table (COPY)
> 4. How to add/modify data quickly in an existing table (INSERT SELECT,
> UPDATE)
> I can see the need for all of those individually; my existing patch
> submission covers (2) and (3) only. I very much like your thought to
> coalesce these various requirements into a single coherent model.
However, you then seem to be arguing for still using the COPY LOCK
syntax, which I think Bruce intended would go away in favor of using
these ALTER commands. Certainly that's what I'd prefer --- COPY has
got too darn many options already.
> In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
> pg_dump *will* work with any other version of PostgreSQL, which *would
> not* be the case if we added ALTER TABLE ... RELIABILITY statements into
> it.
Wrong --- the good thing about ALTER TABLE is that an old version of
Postgres would simply reject it and keep going. Therefore we could get
the speedup in dumps without losing compatibility, which is not true
of COPY LOCK.
BTW, this is a perfect example of the use-case for not abandoning a
dump-file load simply because one command fails. (We have relied on
this sort of reasoning many times before, too, for example by using
"SET default_with_oids" in preference to CREATE TABLE WITH/WITHOUT OIDS.)
I don't think that "wrap the whole load into begin/end" is really a very
workable answer, because there are far too many scenarios where you
can't do that. Another one where it doesn't help is a data-only dump.
regards, tom lane
From pgsql-hackers-owner+M78028=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 11:12:41 2005
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
To: Simon Riggs <simon@2ndquadrant.com>
cc: Bruce Momjian <pgman@candle.pha.pa.us>,
Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>,
Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
In-Reply-To: <1135862385.2964.804.camel@localhost.localdomain>
References: <200512290158.jBT1wEK28785@candle.pha.pa.us> <1135862385.2964.804.camel@localhost.localdomain>
Comments: In-reply-to Simon Riggs <simon@2ndquadrant.com>
message dated "Thu, 29 Dec 2005 13:19:45 +0000"
Date: Thu, 29 Dec 2005 11:12:11 -0500
Message-ID: <7273.1135872731@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.053 required=5 tests=[AWL=0.053]
X-Spam-Score: 0.053
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 2075
Simon Riggs <simon@2ndquadrant.com> writes:
> My view would be that this thread has been complex because everybody has
> expressed a somewhat different requirement, which could be broken down
> as:
> 1. The need for a multi-user-accessible yet temporary table
> 2. Loading data into a table immediately after it is created (i.e. in
> same transaction), including but not limited to a reload from pg_dump
> 3. How to load data quickly into an existing table (COPY)
> 4. How to add/modify data quickly in an existing table (INSERT SELECT,
> UPDATE)
> I can see the need for all of those individually; my existing patch
> submission covers (2) and (3) only. I very much like your thought to
> coalesce these various requirements into a single coherent model.
However, you then seem to be arguing for still using the COPY LOCK
syntax, which I think Bruce intended would go away in favor of using
these ALTER commands. Certainly that's what I'd prefer --- COPY has
got too darn many options already.
> In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
> pg_dump *will* work with any other version of PostgreSQL, which *would
> not* be the case if we added ALTER TABLE ... RELIABILITY statements into
> it.
Wrong --- the good thing about ALTER TABLE is that an old version of
Postgres would simply reject it and keep going. Therefore we could get
the speedup in dumps without losing compatibility, which is not true
of COPY LOCK.
BTW, this is a perfect example of the use-case for not abandoning a
dump-file load simply because one command fails. (We have relied on
this sort of reasoning many times before, too, for example by using
"SET default_with_oids" in preference to CREATE TABLE WITH/WITHOUT OIDS.)
I don't think that "wrap the whole load into begin/end" is really a very
workable answer, because there are far too many scenarios where you
can't do that. Another one where it doesn't help is a data-only dump.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
From pgsql-hackers-owner+M78025=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 10:57:46 2005
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
Message-ID: <51082.68.143.134.146.1135872877.squirrel@www.dunslane.net>
Date: Thu, 29 Dec 2005 10:14:37 -0600 (CST)
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
From: Andrew Dunstan <andrew@dunslane.net>
To: <pgman@candle.pha.pa.us>
In-Reply-To: <200512290158.jBT1wEK28785@candle.pha.pa.us>
References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
X-Priority: 3
Importance: Normal
X-MSMail-Priority: Normal
cc: <kleptog@svana.org>, <simon@2ndquadrant.com>, <tgl@sss.pgh.pa.us>,
<gsstark@mit.edu>, <pg@rbt.ca>, <zhouqq@cs.toronto.edu>,
<pgsql-hackers@postgresql.org>
X-Mailer: SquirrelMail (version 1.2.5)
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.082 required=5 tests=[AWL=0.082]
X-Spam-Score: 0.082
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 1185
Bruce Momjian said:
> DROP would drop the table on a restart
> after a non-clean shutdown. It would do _no_ logging on the table and
> allow concurrent access, plus index access. DELETE is the same as
> DROP, but it just truncates the table (perhaps TRUNCATE is a better
> word).
>
> EXCLUSIVE would allow only a single session to modify the table, and
> would do all changes by appending to the table, similar to COPY LOCK.
> EXCLUSIVE would also not allow indexes because those can not be
> isolated like appending to the heap. EXCLUSIVE would write all dirty
> shared buffers for the table and fsync them before committing. SHARE
> is the functionality we have now, with full logging.
I an horribly scared that this will be used as a "performance boost" for
normal use. I would at least like to see some restrictions that make it
harder to mis-use. Perhaps restrict to superuser?
cheers
andrew
---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly
From tgl@sss.pgh.pa.us Thu Dec 29 11:24:30 2005
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org,
simon@2ndquadrant.com, gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu,
pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
In-Reply-To: <200512291605.jBTG5gi00396@candle.pha.pa.us>
References: <200512291605.jBTG5gi00396@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
message dated "Thu, 29 Dec 2005 11:05:42 -0500"
Date: Thu, 29 Dec 2005 11:24:28 -0500
Message-ID: <7966.1135873468@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Content-Length: 612
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Andrew Dunstan wrote:
>> I an horribly scared that this will be used as a "performance boost" for
>> normal use. I would at least like to see some restrictions that make it
>> harder to mis-use. Perhaps restrict to superuser?
> Certainly restrict to table owner.
I can see the argument for superuser-only: decisions about data
integrity tradeoffs should be reserved to the DBA, who is the one who
will get blamed if the database loses data, no matter how stupid his
users are.
But I'm not wedded to that. I could live with table-owner.
regards, tom lane
From pgsql-hackers-owner+M78031=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 11:38:17 2005
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-ID: <200512291637.jBTGbdC03848@candle.pha.pa.us>
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
In-Reply-To: <7273.1135872731@sss.pgh.pa.us>
To: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, 29 Dec 2005 11:37:39 -0500 (EST)
cc: Simon Riggs <simon@2ndquadrant.com>,
Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>,
Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
pgsql-hackers@postgresql.org
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.122 required=5 tests=[AWL=0.122]
X-Spam-Score: 0.122
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 3932
Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > My view would be that this thread has been complex because everybody has
> > expressed a somewhat different requirement, which could be broken down
> > as:
> > 1. The need for a multi-user-accessible yet temporary table
> > 2. Loading data into a table immediately after it is created (i.e. in
> > same transaction), including but not limited to a reload from pg_dump
> > 3. How to load data quickly into an existing table (COPY)
> > 4. How to add/modify data quickly in an existing table (INSERT SELECT,
> > UPDATE)
>
> > I can see the need for all of those individually; my existing patch
> > submission covers (2) and (3) only. I very much like your thought to
> > coalesce these various requirements into a single coherent model.
>
> However, you then seem to be arguing for still using the COPY LOCK
> syntax, which I think Bruce intended would go away in favor of using
> these ALTER commands. Certainly that's what I'd prefer --- COPY has
> got too darn many options already.
>
> > In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
> > pg_dump *will* work with any other version of PostgreSQL, which *would
> > not* be the case if we added ALTER TABLE ... RELIABILITY statements into
> > it.
>
> Wrong --- the good thing about ALTER TABLE is that an old version of
> Postgres would simply reject it and keep going. Therefore we could get
> the speedup in dumps without losing compatibility, which is not true
> of COPY LOCK.
>
> BTW, this is a perfect example of the use-case for not abandoning a
> dump-file load simply because one command fails. (We have relied on
> this sort of reasoning many times before, too, for example by using
> "SET default_with_oids" in preference to CREATE TABLE WITH/WITHOUT OIDS.)
> I don't think that "wrap the whole load into begin/end" is really a very
> workable answer, because there are far too many scenarios where you
> can't do that. Another one where it doesn't help is a data-only dump.
Yep, Tom is echoing my reaction. There is a temptation to add things up
onto existing commands, e.g. LOCK, and while it works, it makes for some
very complex user API's. Having COPY behave differently because it is
in a transaction is fine as long as it is user-invisible, but once you
require users to do that to get the speedup, it isn't user-invisible
anymore.
(I can see it now, "Why is pg_dump putting things in transactions?",
"Because it prevents it from being logged." "Oh, should I be doing that
in my code?" "Perhaps, if you want ..." You can see where that
discussion is going. Having them see "ATER TABLE ... RELIBILITY
TRUNCATE" is very clear, and very clear on how it can be used in user
code.)
I think there is great utility in giving users one API, namely
RELIABILITY (or some other keyword), and telling them that is where they
control logging. I realize adding one keyword, LOCK, to an existing
command isn't a big deal, but once you decentralize your API enough
times, you end up with a terribly complex database system. It is this
design rigidity that helps make PostgreSQL so much easier to use than
other database systems.
I do think it is valid concern about someone use the table between the
CREATE and the ALTER TABLE RELIABILITY. One solution would be to allow
the RELIABILITY as part of the CREATE TABLE, another is to tell users to
create the table inside a transaction.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
From pgsql-hackers-owner+M78036=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 12:21:12 2005
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
To: Andrew Dunstan <andrew@dunslane.net>
cc: <pgman@candle.pha.pa.us>, <kleptog@svana.org>, <simon@2ndquadrant.com>,
<tgl@sss.pgh.pa.us>, <gsstark@mit.edu>, <pg@rbt.ca>,
<zhouqq@cs.toronto.edu>, <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
<51082.68.143.134.146.1135872877.squirrel@www.dunslane.net>
In-Reply-To: <51082.68.143.134.146.1135872877.squirrel@www.dunslane.net>
From: Greg Stark <gsstark@mit.edu>
Organization: The Emacs Conspiracy; member since 1992
Date: 29 Dec 2005 12:20:32 -0500
Message-ID: <87vex74y73.fsf@stark.xeocode.com>
Lines: 42
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.4
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.112 required=5 tests=[AWL=0.112]
X-Spam-Score: 0.112
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 1983
"Andrew Dunstan" <andrew@dunslane.net> writes:
> Bruce Momjian said:
> > DROP would drop the table on a restart
> > after a non-clean shutdown. It would do _no_ logging on the table and
> > allow concurrent access, plus index access. DELETE is the same as
> > DROP, but it just truncates the table (perhaps TRUNCATE is a better
> > word).
> >
> > EXCLUSIVE would allow only a single session to modify the table, and
> > would do all changes by appending to the table, similar to COPY LOCK.
> > EXCLUSIVE would also not allow indexes because those can not be
> > isolated like appending to the heap. EXCLUSIVE would write all dirty
> > shared buffers for the table and fsync them before committing. SHARE
> > is the functionality we have now, with full logging.
>
> I an horribly scared that this will be used as a "performance boost" for
> normal use. I would at least like to see some restrictions that make it
> harder to mis-use. Perhaps restrict to superuser?
Well that's its whole purpose. At least you can hardly argue that you didn't
realize the consequences of "DELETE ROWS ON RECOVERY"... :)
Some thoughts:
a) I'm not sure I understand the purpose of EXCLUSIVE. When would I ever want to
use it instead of DELETE ROWS?
b) It seems like the other feature people were talking about of not logging
for a table created within the same transaction should be handled by
having this flag implicitly set for any such newly created table.
Ie, the test for whether to log would look like:
if (!table->logged && table->xid != myxid) ...
c) Every option in ALTER TABLE should be in CREATE TABLE as well.
d) Yes as someone else mentioned, this should only be allowable on a table
with no foreign keys referencing it.
--
greg
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
From pgsql-hackers-owner+M78037=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 12:31:40 2005
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-ID: <200512291730.jBTHUnn09840@candle.pha.pa.us>
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
In-Reply-To: <87vex74y73.fsf@stark.xeocode.com>
To: Greg Stark <gsstark@mit.edu>
Date: Thu, 29 Dec 2005 12:30:49 -0500 (EST)
cc: Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org,
simon@2ndquadrant.com, tgl@sss.pgh.pa.us, pg@rbt.ca, zhouqq@cs.toronto.edu,
pgsql-hackers@postgresql.org
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.122 required=5 tests=[AWL=0.122]
X-Spam-Score: 0.122
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 3304
Greg Stark wrote:
> "Andrew Dunstan" <andrew@dunslane.net> writes:
>
> > Bruce Momjian said:
> > > DROP would drop the table on a restart
> > > after a non-clean shutdown. It would do _no_ logging on the table and
> > > allow concurrent access, plus index access. DELETE is the same as
> > > DROP, but it just truncates the table (perhaps TRUNCATE is a better
> > > word).
> > >
> > > EXCLUSIVE would allow only a single session to modify the table, and
> > > would do all changes by appending to the table, similar to COPY LOCK.
> > > EXCLUSIVE would also not allow indexes because those can not be
> > > isolated like appending to the heap. EXCLUSIVE would write all dirty
> > > shared buffers for the table and fsync them before committing. SHARE
> > > is the functionality we have now, with full logging.
> >
> > I an horribly scared that this will be used as a "performance boost" for
> > normal use. I would at least like to see some restrictions that make it
> > harder to mis-use. Perhaps restrict to superuser?
>
> Well that's its whole purpose. At least you can hardly argue that you didn't
> realize the consequences of "DELETE ROWS ON RECOVERY"... :)
True. I think we are worried about non-owners using it, but the owner
had to grant permissions for others to modify it, so we might be OK.
> Some thoughts:
>
> a) I'm not sure I understand the purpose of EXCLUSIVE. When would I ever want to
> use it instead of DELETE ROWS?
Good question. The use case is doing COPY into a table that already had
data. EXCLUSIVE allows additions to the table but preserves the
existing data on a crash.
> b) It seems like the other feature people were talking about of not logging
> for a table created within the same transaction should be handled by
> having this flag implicitly set for any such newly created table.
> Ie, the test for whether to log would look like:
>
> if (!table->logged && table->xid != myxid) ...
Yes, the question is whether we want to limit users to having this
optimization _only_ when they have created the table in the same
transaction, and the short answer is we don't.
> c) Every option in ALTER TABLE should be in CREATE TABLE as well.
I looked into that and see that things like:
ALTER [ COLUMN ] column SET STATISTICS integer
ALTER [ COLUMN ] column SET STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN }
are not supported by CREATE TABLE, and probably shouldn't be because the
value can be changed after the table is created. I think the only
things we usually support in CREATE TABLE are those that cannot be
altered.
> d) Yes as someone else mentioned, this should only be allowable on a table
> with no foreign keys referencing it.
Right, and EXCLUSIVE can not have an index either.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly
From simon@2ndquadrant.com Fri Dec 30 08:10:53 2005
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
From: Simon Riggs <simon@2ndquadrant.com>
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Andrew Dunstan <andrew@dunslane.net>, Tom Lane <tgl@sss.pgh.pa.us>,
Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>,
Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
pgsql-hackers@postgresql.org
In-Reply-To: <200512291637.jBTGbdC03848@candle.pha.pa.us>
References: <200512291637.jBTGbdC03848@candle.pha.pa.us>
Date: Fri, 30 Dec 2005 13:09:12 +0000
Message-ID: <1135948152.2862.113.camel@localhost.localdomain>
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
Content-Length: 6343
On Thu, 2005-12-29 at 11:37 -0500, Bruce Momjian wrote:
> Tom Lane wrote:
> > Simon Riggs <simon@2ndquadrant.com> writes:
> > > My view would be that this thread has been complex because everybody has
> > > expressed a somewhat different requirement, which could be broken down
> > > as:
> > > 1. The need for a multi-user-accessible yet temporary table
> > > 2. Loading data into a table immediately after it is created (i.e. in
> > > same transaction), including but not limited to a reload from pg_dump
> > > 3. How to load data quickly into an existing table (COPY)
> > > 4. How to add/modify data quickly in an existing table (INSERT SELECT,
> > > UPDATE)
> > However, you then seem to be arguing for still using the COPY LOCK
> > syntax, which I think Bruce intended would go away in favor of using
> > these ALTER commands. Certainly that's what I'd prefer --- COPY has
> > got too darn many options already.
COPY LOCK was Tom's suggestion at the end of a long discussion thread on
this precise issue. Nobody objected to it at that point; I implemented
it *exactly* that way because I wanted to very visibly follow the
consensus of the community, after informed debate.
http://archives.postgresql.org/pgsql-hackers/2005-06/msg00068.php
Please re-read the links to previous discussions.
http://archives.postgresql.org/pgsql-hackers/2005-06/msg00069.php
There are points there, not made by me, that still apply and need to be
considered here, yet have not been.
Just to restate my current thinking:
- agree we should have ALTER TABLE ... RELIABILITY DELETE ROWS
- we should have COPY LOCK rather than
ALTER TABLE .... RELIABILITY EXCLUSIVE
(Though I welcome better wording and syntax in either case; it is the
behaviour only that I discuss).
It seems now that we have agreed approaches for (1), (2) and (4). Please
note that I have listened to the needs of others with regard to
requirement (1), as espoused by earlier by Hannu and again now by
Martijn. Some of the points about requirement (3) I made in my previous
post have not yet been addressed, IMHO.
My mind is not fixed. AFAICS there are valid points remaining on both
sides of the discussion about loading data quickly into an existing
table.
> I do think it is valid concern about someone use the table between the
> CREATE and the ALTER TABLE RELIABILITY. One solution would be to allow
> the RELIABILITY as part of the CREATE TABLE, another is to tell users to
> create the table inside a transaction.
Neither solution works for this use case:
> > 3. How to load data quickly into an existing table (COPY)
This is the only use case for which ALTER TABLE ... EXCLUSIVE makes
sense. That option means that any write lock held upon the table would
be an EXCLUSIVE table lock, so would never be a performance gain with
single row INSERT, UPDATE or DELETEs.
Following Andrew's concerns, I'd also note that ALTER TABLE requires a
much higher level of privilege to operate than does COPY. That sounds
like it will make things more secure, but all it does is open up the
administrative rights, since full ownership rights must be obtained
merely to load data.
> Having COPY behave differently because it is
> in a transaction is fine as long as it is user-invisible
Good
> I think there is great utility in giving users one API, namely
> RELIABILITY (or some other keyword), and telling them that is where they
> control logging. I realize adding one keyword, LOCK, to an existing
> command isn't a big deal, but once you decentralize your API enough
> times, you end up with a terribly complex database system. It is this
> design rigidity that helps make PostgreSQL so much easier to use than
> other database systems.
I do see the appeal of your suggestion...
TRUNCATE is a special command to delete quickly. There is no requirement
to do an ALTER TABLE statement before that command executes.
Balance would suggest that a special command to load data quickly would
be reasonably accepted by users.
Minor points below:
> > > In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
> > > pg_dump *will* work with any other version of PostgreSQL, which *would
> > > not* be the case if we added ALTER TABLE ... RELIABILITY statements into
> > > it.
> >
> > Wrong --- the good thing about ALTER TABLE is that an old version of
> > Postgres would simply reject it and keep going. Therefore we could get
> > the speedup in dumps without losing compatibility, which is not true
> > of COPY LOCK.
That was pointing out one of Bruce's objections was not relevant because
it assumed COPY LOCK was required to make pg_restore go faster; that was
not the case - so there is no valid objection either way now.
> > BTW, this is a perfect example of the use-case for not abandoning a
> > dump-file load simply because one command fails. (We have relied on
> > this sort of reasoning many times before, too, for example by using
> > "SET default_with_oids" in preference to CREATE TABLE WITH/WITHOUT OIDS.)
> > I don't think that "wrap the whole load into begin/end" is really a very
> > workable answer, because there are far too many scenarios where you
> > can't do that. Another one where it doesn't help is a data-only dump.
Which is why --single-transaction is not the default, per the earlier
discussion on that point (on -patches).
> Yep, Tom is echoing my reaction. There is a temptation to add things up
> onto existing commands, e.g. LOCK, and while it works, it makes for some
> very complex user API's. Having COPY behave differently because it is
> in a transaction is fine as long as it is user-invisible, but once you
> require users to do that to get the speedup, it isn't user-invisible
> anymore.
>
> (I can see it now, "Why is pg_dump putting things in transactions?",
> "Because it prevents it from being logged." "Oh, should I be doing that
> in my code?" "Perhaps, if you want ..." You can see where that
> discussion is going. Having them see "ATER TABLE ... RELIBILITY
> TRUNCATE" is very clear, and very clear on how it can be used in user
> code.)
The above case is not an argument against COPY LOCK. Exactly what you
say above would still occur even when we have ALTER TABLE ...
RELIABILITY statement, since COPY LOCK and
COPY-optimized-within-same-transaction are different things.
Best Regards, Simon Riggs
From pgsql-hackers-owner+M78064=pgman=candle.pha.pa.us@postgresql.org Fri Dec 30 11:50:49 2005
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-ID: <200512301649.jBUGnxn21488@candle.pha.pa.us>
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
In-Reply-To: <1135948152.2862.113.camel@localhost.localdomain>
To: Simon Riggs <simon@2ndquadrant.com>
Date: Fri, 30 Dec 2005 11:49:59 -0500 (EST)
cc: Andrew Dunstan <andrew@dunslane.net>, Tom Lane <tgl@sss.pgh.pa.us>,
Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>,
Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
pgsql-hackers@postgresql.org
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.12 required=5 tests=[AWL=0.120]
X-Spam-Score: 0.12
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 8888
Simon Riggs wrote:
> On Thu, 2005-12-29 at 11:37 -0500, Bruce Momjian wrote:
> > Tom Lane wrote:
> > > Simon Riggs <simon@2ndquadrant.com> writes:
> > > > My view would be that this thread has been complex because everybody has
> > > > expressed a somewhat different requirement, which could be broken down
> > > > as:
> > > > 1. The need for a multi-user-accessible yet temporary table
> > > > 2. Loading data into a table immediately after it is created (i.e. in
> > > > same transaction), including but not limited to a reload from pg_dump
> > > > 3. How to load data quickly into an existing table (COPY)
> > > > 4. How to add/modify data quickly in an existing table (INSERT SELECT,
> > > > UPDATE)
>
> > > However, you then seem to be arguing for still using the COPY LOCK
> > > syntax, which I think Bruce intended would go away in favor of using
> > > these ALTER commands. Certainly that's what I'd prefer --- COPY has
> > > got too darn many options already.
>
> COPY LOCK was Tom's suggestion at the end of a long discussion thread on
> this precise issue. Nobody objected to it at that point; I implemented
> it *exactly* that way because I wanted to very visibly follow the
> consensus of the community, after informed debate.
> http://archives.postgresql.org/pgsql-hackers/2005-06/msg00068.php
>
> Please re-read the links to previous discussions.
> http://archives.postgresql.org/pgsql-hackers/2005-06/msg00069.php
> There are points there, not made by me, that still apply and need to be
> considered here, yet have not been.
Yes, I know we agreed to the COPY LOCK, but new features now being
requested, so we have to re-evaluate where we are going with COPY LOCK
to get a more consistent solution.
> Just to restate my current thinking:
> - agree we should have ALTER TABLE ... RELIABILITY DELETE ROWS
> - we should have COPY LOCK rather than
> ALTER TABLE .... RELIABILITY EXCLUSIVE
> (Though I welcome better wording and syntax in either case; it is the
> behaviour only that I discuss).
>
> It seems now that we have agreed approaches for (1), (2) and (4). Please
> note that I have listened to the needs of others with regard to
> requirement (1), as espoused by earlier by Hannu and again now by
> Martijn. Some of the points about requirement (3) I made in my previous
> post have not yet been addressed, IMHO.
>
> My mind is not fixed. AFAICS there are valid points remaining on both
> sides of the discussion about loading data quickly into an existing
> table.
>
> > I do think it is valid concern about someone use the table between the
> > CREATE and the ALTER TABLE RELIABILITY. One solution would be to allow
> > the RELIABILITY as part of the CREATE TABLE, another is to tell users to
> > create the table inside a transaction.
>
> Neither solution works for this use case:
>
> > > 3. How to load data quickly into an existing table (COPY)
>
> This is the only use case for which ALTER TABLE ... EXCLUSIVE makes
> sense. That option means that any write lock held upon the table would
> be an EXCLUSIVE table lock, so would never be a performance gain with
> single row INSERT, UPDATE or DELETEs.
Ah, but people wanted fast INSERT INTO ... SELECT, and that would use
EXCLUSIVE too. What about a massive UPDATE? Perhaps that could use
EXCLUSIVE? We don't want to add "LOCK" to every command that might use
EXCLUSIVE. ALTER is much better for this.
I agree if we thought EXCLUSIVE would only be used for COPY, we could
use LOCK, but I am thinking it will be used for other commands as well.
> Following Andrew's concerns, I'd also note that ALTER TABLE requires a
> much higher level of privilege to operate than does COPY. That sounds
> like it will make things more secure, but all it does is open up the
> administrative rights, since full ownership rights must be obtained
> merely to load data.
True, but as pointed out by others, I don't see that happening too
often.
> > Having COPY behave differently because it is
> > in a transaction is fine as long as it is user-invisible
>
> Good
>
> > I think there is great utility in giving users one API, namely
> > RELIABILITY (or some other keyword), and telling them that is where they
> > control logging. I realize adding one keyword, LOCK, to an existing
> > command isn't a big deal, but once you decentralize your API enough
> > times, you end up with a terribly complex database system. It is this
> > design rigidity that helps make PostgreSQL so much easier to use than
> > other database systems.
>
> I do see the appeal of your suggestion...
>
> TRUNCATE is a special command to delete quickly. There is no requirement
> to do an ALTER TABLE statement before that command executes.
The TRUNCATE happens during recovery. There is no user interaction. It
happens because we can't restore the contents of the table in a
consistent state because no logging was used. Basically, a table marked
RELIABILITY TRUNCATE would be truncated on a recovery start of the
postmaster.
> Balance would suggest that a special command to load data quickly would
> be reasonably accepted by users.
>
>
>
>
> Minor points below:
>
> > > > In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
> > > > pg_dump *will* work with any other version of PostgreSQL, which *would
> > > > not* be the case if we added ALTER TABLE ... RELIABILITY statements into
> > > > it.
> > >
> > > Wrong --- the good thing about ALTER TABLE is that an old version of
> > > Postgres would simply reject it and keep going. Therefore we could get
> > > the speedup in dumps without losing compatibility, which is not true
> > > of COPY LOCK.
>
> That was pointing out one of Bruce's objections was not relevant because
> it assumed COPY LOCK was required to make pg_restore go faster; that was
> not the case - so there is no valid objection either way now.
I don't consider the single-transaction to be a no-cost solution. You
are adding flags to commands, and you are using a dump layout for
performance where the purpose for the layout is not clear. The ALTER is
clear to the user, and it allows nologging operations to happen after
the table is created.
In fact, for use in pg_dump, I think DROP is the proper operation for
loading, not your transaction wrapping solution. We already agree we
need DROP (or TRUNCATE), so why not use that rather than the transaction
wrap idea?
> > > BTW, this is a perfect example of the use-case for not abandoning a
> > > dump-file load simply because one command fails. (We have relied on
> > > this sort of reasoning many times before, too, for example by using
> > > "SET default_with_oids" in preference to CREATE TABLE WITH/WITHOUT OIDS.)
> > > I don't think that "wrap the whole load into begin/end" is really a very
> > > workable answer, because there are far too many scenarios where you
> > > can't do that. Another one where it doesn't help is a data-only dump.
>
> Which is why --single-transaction is not the default, per the earlier
> discussion on that point (on -patches).
Right, but why not use DROP/TRUNCATE? That works for old dumps too, and
has no downsides, meaning it can be always on.
> > Yep, Tom is echoing my reaction. There is a temptation to add things up
> > onto existing commands, e.g. LOCK, and while it works, it makes for some
> > very complex user API's. Having COPY behave differently because it is
> > in a transaction is fine as long as it is user-invisible, but once you
> > require users to do that to get the speedup, it isn't user-invisible
> > anymore.
> >
> > (I can see it now, "Why is pg_dump putting things in transactions?",
> > "Because it prevents it from being logged." "Oh, should I be doing that
> > in my code?" "Perhaps, if you want ..." You can see where that
> > discussion is going. Having them see "ATER TABLE ... RELIBILITY
> > TRUNCATE" is very clear, and very clear on how it can be used in user
> > code.)
>
> The above case is not an argument against COPY LOCK. Exactly what you
> say above would still occur even when we have ALTER TABLE ...
> RELIABILITY statement, since COPY LOCK and
> COPY-optimized-within-same-transaction are different things.
See my posting above that we might want EXCLUSIVE for other commands,
meaning ALTER makes more sense.
So, to summarize, I think we should add DROP/TRUNCATE, and use that by
default (or optionally off?) in pg_dump, and, assuming we want EXCLUSIVE
for more than just COPY, we need to add ALTER TABLE EXCLUSIVE.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
From pgsql-hackers-owner+M78065=pgman=candle.pha.pa.us@postgresql.org Fri Dec 30 12:40:48 2005
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
Message-ID: <43B570C9.6060406@dunslane.net>
Date: Fri, 30 Dec 2005 12:39:21 -0500
From: Andrew Dunstan <andrew@dunslane.net>
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.7.12-1.3.1
X-Accept-Language: en-us, en
To: Tom Lane <tgl@sss.pgh.pa.us>
cc: simon@2ndquadrant.com, pgman@candle.pha.pa.us, kleptog@svana.org,
gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu,
pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
References: <1135948152.2862.113.camel@localhost.localdomain> <56737.68.143.134.146.1135954413.squirrel@www.dunslane.net> <11876.1135954626@sss.pgh.pa.us>
In-Reply-To: <11876.1135954626@sss.pgh.pa.us>
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.041 required=5 tests=[AWL=0.041]
X-Spam-Score: 0.041
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 1815
Tom Lane wrote:
>"Andrew Dunstan" <andrew@dunslane.net> writes:
>
>
>>Simon Riggs said:
>>
>>
>>>Following Andrew's concerns, I'd also note that ALTER TABLE requires a
>>>much higher level of privilege to operate than does COPY. That sounds
>>>like it will make things more secure, but all it does is open up the
>>>administrative rights, since full ownership rights must be obtained
>>>merely to load data.
>>>
>>>
>
>
>
>>My concern is more about making plain that this is for special operations,
>>not normal operations. Or maybe I have misunderstood the purpose.
>>
>>
>
>Rephrase that as "full ownership rights must be obtained to load data in
>a way that requires dropping any existing indexes and locking out other
>users of the table". I don't think the use-case for this will be very
>large for non-owners, or indeed even for owners except during initial
>table creation; and so I don't think the above argument is strong.
>
>
>
>
Those restrictions aren't true of Bruce's proposed drop and
delete/truncate recovery modes, are they?
People do crazy things in pursuit of performance. Illustration: a few
months ago I was instrumenting an app (based on MySQL/ISAM) and I
noticed that under load it simply didn't update the inventory properly -
of 1000 orders placed within a few seconds it might reduce inventory by
3 or 4. I reported this and they shrugged their shoulders and said
"well, we'd have to lock the table and that would slow everything down
...".
I just want to be sure we aren't providing a footgun. "Oh, just set
recovery mode to delete. It won't make any difference unless you crash
and you'll run faster."
cheers
andrew
---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
From pgsql-hackers-owner+M78066=pgman=candle.pha.pa.us@postgresql.org Fri Dec 30 12:58:52 2005
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-ID: <200512301758.jBUHwFv03107@candle.pha.pa.us>
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
In-Reply-To: <43B570C9.6060406@dunslane.net>
To: Andrew Dunstan <andrew@dunslane.net>
Date: Fri, 30 Dec 2005 12:58:15 -0500 (EST)
cc: Tom Lane <tgl@sss.pgh.pa.us>, simon@2ndquadrant.com, kleptog@svana.org,
gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu,
pgsql-hackers@postgresql.org
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.12 required=5 tests=[AWL=0.120]
X-Spam-Score: 0.12
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 1996
Andrew Dunstan wrote:
> >>My concern is more about making plain that this is for special operations,
> >>not normal operations. Or maybe I have misunderstood the purpose.
> >>
> >>
> >
> >Rephrase that as "full ownership rights must be obtained to load data in
> >a way that requires dropping any existing indexes and locking out other
> >users of the table". I don't think the use-case for this will be very
> >large for non-owners, or indeed even for owners except during initial
> >table creation; and so I don't think the above argument is strong.
> >
> >
> >
> >
>
> Those restrictions aren't true of Bruce's proposed drop and
> delete/truncate recovery modes, are they?
Only the owner could do the ALTER, for sure, but once the owner sets it,
any user with permission to write to the table would have those
characteristics.
> People do crazy things in pursuit of performance. Illustration: a few
> months ago I was instrumenting an app (based on MySQL/ISAM) and I
> noticed that under load it simply didn't update the inventory properly -
> of 1000 orders placed within a few seconds it might reduce inventory by
> 3 or 4. I reported this and they shrugged their shoulders and said
> "well, we'd have to lock the table and that would slow everything down
> ...".
>
> I just want to be sure we aren't providing a footgun. "Oh, just set
> recovery mode to delete. It won't make any difference unless you crash
> and you'll run faster."
I think we have to trust the object owner in this case. I don't know of
any super-user-only ALTER commands, but I suppose we could set it up
that way if we wanted.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
From pgsql-hackers-owner+M78070=pgman=candle.pha.pa.us@postgresql.org Fri Dec 30 14:29:06 2005
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
From: Simon Riggs <simon@2ndquadrant.com>
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Andrew Dunstan <andrew@dunslane.net>, Tom Lane <tgl@sss.pgh.pa.us>,
Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>,
Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
pgsql-hackers@postgresql.org
In-Reply-To: <200512301649.jBUGnxn21488@candle.pha.pa.us>
References: <200512301649.jBUGnxn21488@candle.pha.pa.us>
Date: Fri, 30 Dec 2005 19:28:41 +0000
Message-ID: <1135970921.5052.68.camel@localhost.localdomain>
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.034 required=5 tests=[AWL=0.034]
X-Spam-Score: 0.034
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 3112
On Fri, 2005-12-30 at 11:49 -0500, Bruce Momjian wrote:
> Yes, I know we agreed to the COPY LOCK, but new features now being
> requested, so we have to re-evaluate where we are going with COPY LOCK
> to get a more consistent solution.
Thank you.
> Ah, but people wanted fast INSERT INTO ... SELECT, and that would use
> EXCLUSIVE too. What about a massive UPDATE? Perhaps that could use
> EXCLUSIVE? We don't want to add "LOCK" to every command that might use
> EXCLUSIVE. ALTER is much better for this.
> I agree if we thought EXCLUSIVE would only be used for COPY, we could
> use LOCK, but I am thinking it will be used for other commands as well.
Agreed, I will look to implement this.
Could the internals of my recent patch be reviewed? Changing the user
interface is less of a problem than changing the internals, which is
where the hard work takes place. I do not want to extend this work
further only to have that part rejected later.
The implications of EXCLUSIVE are:
- there will be a check on each and every I, U, D to check the state of
the relation
- *every* operation that attempts a write lock will attempt to acquire
an EXCLUSIVE full table lock instead
- following successful completion of *each* DML statement, the relation
will be heap_sync'd involving a full scan of the buffer cache
Can I clarify the wording of the syntax? Is EXCLUSIVE the right word?
How about FASTLOAD or BULKLOAD? Those words seem less likely to be
misused in the future - i.e. we are invoking a special mode, rather than
invoking a special "go faster" option.
> I don't consider the single-transaction to be a no-cost solution. You
> are adding flags to commands, and you are using a dump layout for
> performance where the purpose for the layout is not clear. The ALTER is
> clear to the user, and it allows nologging operations to happen after
> the table is created.
>
> In fact, for use in pg_dump, I think DROP is the proper operation for
> loading, not your transaction wrapping solution. We already agree we
> need DROP (or TRUNCATE), so why not use that rather than the transaction
> wrap idea?
This was discussed on-list by 2 core team members, a committer and
myself, but I see no requirements change here. You even accepted the
invisible COPY optimization in your last post - why unpick that now?
Please forgive my tone, but I am lost for reasonable yet expressive
words.
The --single-transaction mode would apply even if the dump was created
using an earlier version of pg_dump. pg_dump has *not* been altered at
all. (And I would again add that the idea was not my own)
> So, to summarize, I think we should add DROP/TRUNCATE, and use that by
> default (or optionally off?) in pg_dump, and, assuming we want EXCLUSIVE
> for more than just COPY, we need to add ALTER TABLE EXCLUSIVE.
Would you mind stating again what you mean, just so I can understand
this? Your summary isn't enough.
Best Regards, Simon Riggs
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
From pgsql-hackers-owner+M78072=pgman=candle.pha.pa.us@postgresql.org Fri Dec 30 16:15:30 2005
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-ID: <200512302114.jBULEno02301@candle.pha.pa.us>
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
In-Reply-To: <1135970921.5052.68.camel@localhost.localdomain>
To: Simon Riggs <simon@2ndquadrant.com>
Date: Fri, 30 Dec 2005 16:14:49 -0500 (EST)
cc: Andrew Dunstan <andrew@dunslane.net>, Tom Lane <tgl@sss.pgh.pa.us>,
Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>,
Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
pgsql-hackers@postgresql.org
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.12 required=5 tests=[AWL=0.120]
X-Spam-Score: 0.12
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 7285
Simon Riggs wrote:
> On Fri, 2005-12-30 at 11:49 -0500, Bruce Momjian wrote:
>
> > Yes, I know we agreed to the COPY LOCK, but new features now being
> > requested, so we have to re-evaluate where we are going with COPY LOCK
> > to get a more consistent solution.
>
> Thank you.
Good. I think we can be happy that COPY LOCK didn't get into a release,
so we don't have to support it forever. When we are adding features, we
have to consider not only the current release, but future releases and
what people will ask for in the future so the syntax can be expanded
without breaking previous usage.
> > Ah, but people wanted fast INSERT INTO ... SELECT, and that would use
> > EXCLUSIVE too. What about a massive UPDATE? Perhaps that could use
> > EXCLUSIVE? We don't want to add "LOCK" to every command that might use
> > EXCLUSIVE. ALTER is much better for this.
>
> > I agree if we thought EXCLUSIVE would only be used for COPY, we could
> > use LOCK, but I am thinking it will be used for other commands as well.
>
> Agreed, I will look to implement this.
>
> Could the internals of my recent patch be reviewed? Changing the user
> interface is less of a problem than changing the internals, which is
> where the hard work takes place. I do not want to extend this work
> further only to have that part rejected later.
OK, I will look it over this week or next.
> The implications of EXCLUSIVE are:
> - there will be a check on each and every I, U, D to check the state of
> the relation
> - *every* operation that attempts a write lock will attempt to acquire
> an EXCLUSIVE full table lock instead
> - following successful completion of *each* DML statement, the relation
> will be heap_sync'd involving a full scan of the buffer cache
Yes, I think that is it. What we can do is implement EXCLUSIVE to
affect only COPY at this point, and document that, and later add other
commands.
> Can I clarify the wording of the syntax? Is EXCLUSIVE the right word?
> How about FASTLOAD or BULKLOAD? Those words seem less likely to be
> misused in the future - i.e. we are invoking a special mode, rather than
> invoking a special "go faster" option.
The problem with the FASTLOAD/BULKLOAD words is that EXCLUSIVE mode is
probably not the best for loading. I would think TRUNCATE would be a
better option.
In fact, in loading a table, I think both EXCLUSIVE and TRUNCATE would be
the same, mostly. You would create the table, set its RELIABILITY to
TRUNCATE, COPY into the table, then set the RELIABILITY to SHARE or
DEFAULT. The second ALTER has to sync all the dirty data blocks, which
the same thing EXCLUSIVE does at the conclusion of COPY.
So, we need a name for EXCLUSIVE mode that suggests how it is different
from TRUNCATE, and in this case, the difference is that EXCLUSIVE
preserves the previous contents of the table on recovery, while TRUNCATE
does not. Do you want to call the mode PRESERVE, or EXCLUSIVE WRITER?
Anyway, the keywords are easy to modify, even after the patch is
submitted. FYI, I usually go through keywords.c looking for a keyword
we already use.
> > I don't consider the single-transaction to be a no-cost solution. You
> > are adding flags to commands, and you are using a dump layout for
> > performance where the purpose for the layout is not clear. The ALTER is
> > clear to the user, and it allows nologging operations to happen after
> > the table is created.
> >
> > In fact, for use in pg_dump, I think DROP is the proper operation for
> > loading, not your transaction wrapping solution. We already agree we
> > need DROP (or TRUNCATE), so why not use that rather than the transaction
> > wrap idea?
>
> This was discussed on-list by 2 core team members, a committer and
> myself, but I see no requirements change here. You even accepted the
> invisible COPY optimization in your last post - why unpick that now?
> Please forgive my tone, but I am lost for reasonable yet expressive
> words.
Do you think you are the only one who has rewritten a patch multiple
times? We all have. The goal is to get the functionality into the
system in the most seamless way possible. Considering the number of
people who use PostgreSQL, if it takes use 10 tries, it is worth it
considering the thousands of people who will use it. Would you have us
include a sub-optimal patch and have thousands of people adjust to its
non-optimal functionality? I am sure you would not. Perhaps a company
would say, "Oh, just ship it", but we don't.
> The --single-transaction mode would apply even if the dump was created
> using an earlier version of pg_dump. pg_dump has *not* been altered at
> all. (And I would again add that the idea was not my own)
I assume you mean this:
http://archives.postgresql.org/pgsql-patches/2005-12/msg00257.php
I guess with the ALTER commands I don't see much value in the
--single-transaction flag. I am sure others suggested it, but would
they suggest it now given our current direction. The fact that the
patch was submitted does not give it any more weight --- the question is
does this feature make sense for 8.2. The goal is not to cram as many
optimizations into PostgreSQL as possible, the goal is to present a
consistent usable system to users.
> > So, to summarize, I think we should add DROP/TRUNCATE, and use that by
> > default (or optionally off?) in pg_dump, and, assuming we want EXCLUSIVE
> > for more than just COPY, we need to add ALTER TABLE EXCLUSIVE.
>
> Would you mind stating again what you mean, just so I can understand
> this? Your summary isn't enough.
New ALTER TABLE mode, perhaps call it PERSISTENCE:
ALTER TABLE tab PERSISTENCE DROP ON RECOVERY
ALTER TABLE tab PERSISTENCE TRUNCATE ON RECOVERY
These would drop or truncate all tables with this flag on a non-clean
start of the postmaster, and write something in the server logs.
However, I don't know that we have the code in place to DROP/TRUNCATE in
recovery mode, and it would affect all databases, so it could be quite
complex to implement. In this mode, no WAL logs would be written for
table modifications, though DDL commands would have to be logged.
ALTER TABLE tab PERSISTENCE PRESERVE (or STABLE?)
Table contents are preserved across recoveries, but data modifications
can happen only one at a time. I don't think we have a lock mode that
does this, so I am worried a new lock mode will have to be created. A
simplified solution at this stage would be to take an exclusive lock on
the table, but really we just need a single-writer table lock, which I
don't think we have. initially this can implemented to only affect COPY
but later can be done for other commands.
ALTER TABLE tab PERSISTENCE DEFAULT
This would be our current default mode, which is full concurrency and
persistence.
It took me over an hour to write this, but I feel the time is worth it
because of the number of users who use our software.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
From pgsql-hackers-owner+M78076=pgman=candle.pha.pa.us@postgresql.org Fri Dec 30 17:37:00 2005
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Simon Riggs <simon@2ndquadrant.com>, Andrew Dunstan <andrew@dunslane.net>,
Tom Lane <tgl@sss.pgh.pa.us>, Martijn van Oosterhout <kleptog@svana.org>,
Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>,
Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
References: <200512302114.jBULEno02301@candle.pha.pa.us>
In-Reply-To: <200512302114.jBULEno02301@candle.pha.pa.us>
From: Greg Stark <gsstark@mit.edu>
Organization: The Emacs Conspiracy; member since 1992
Date: 30 Dec 2005 17:36:24 -0500
Message-ID: <87mzii8b6f.fsf@stark.xeocode.com>
Lines: 28
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.4
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.113 required=5 tests=[AWL=0.113]
X-Spam-Score: 0.113
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 1424
As far as EXCLUSIVE or COPY LOCK goes, I think this would be useful
functionality but perhaps there doesn't have to be any proprietary user
interface to it at all. Why not just check if the conditions are already
present to allow the optimization and if so go ahead.
That is, if the current transaction already has an exclusive lock on the table
and there are no indexes (and PITR isn't active) then Postgres could go ahead
and use the same WAL skipping logic as the other operations that already so
so. This would work for inserts whether coming from COPY or plain SQL INSERTs.
The nice thing about this is that the user's SQL wouldn't need any proprietary
extensions at all. Just tell people to do
BEGIN;
LOCK TABLE foo;
COPY foo from ...
COMMIT;
There could be a COPY LOCK option to obtain a lock, but it would be purely for
user convenience so they don't have to bother with BEGIN and COMMIt.
The only downside is a check to see if an exclusive table lock is present on
every copy and insert. That might be significant but perhaps there are ways to
finess that. If not perhaps only doing it on COPY would be a good compromise.
--
greg
---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly
From pgsql-hackers-owner+M78077=pgman=candle.pha.pa.us@postgresql.org Fri Dec 30 17:47:18 2005
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-ID: <200512302246.jBUMkjF25196@candle.pha.pa.us>
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
In-Reply-To: <87mzii8b6f.fsf@stark.xeocode.com>
To: Greg Stark <gsstark@mit.edu>
Date: Fri, 30 Dec 2005 17:46:45 -0500 (EST)
cc: Simon Riggs <simon@2ndquadrant.com>, Andrew Dunstan <andrew@dunslane.net>,
Tom Lane <tgl@sss.pgh.pa.us>, Martijn van Oosterhout <kleptog@svana.org>,
Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
pgsql-hackers@postgresql.org
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.12 required=5 tests=[AWL=0.120]
X-Spam-Score: 0.12
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 2135
Greg Stark wrote:
>
> As far as EXCLUSIVE or COPY LOCK goes, I think this would be useful
> functionality but perhaps there doesn't have to be any proprietary user
> interface to it at all. Why not just check if the conditions are already
> present to allow the optimization and if so go ahead.
>
> That is, if the current transaction already has an exclusive lock on the table
> and there are no indexes (and PITR isn't active) then Postgres could go ahead
> and use the same WAL skipping logic as the other operations that already so
> so. This would work for inserts whether coming from COPY or plain SQL INSERTs.
>
> The nice thing about this is that the user's SQL wouldn't need any proprietary
> extensions at all. Just tell people to do
>
> BEGIN;
> LOCK TABLE foo;
> COPY foo from ...
> COMMIT;
>
> There could be a COPY LOCK option to obtain a lock, but it would be purely for
> user convenience so they don't have to bother with BEGIN and COMMIt.
>
> The only downside is a check to see if an exclusive table lock is present on
> every copy and insert. That might be significant but perhaps there are ways to
> finess that. If not perhaps only doing it on COPY would be a good compromise.
Well, again, if we wanted to use EXCLUSIVE only for COPY, this might
make sense. However, also consider that the idea for EXCLUSIVE was that
users could continue read-only queries on the table while it is being
loaded (like COPY allows now), and that in EXCLUSIVE mode, we are only
going to write into new pages.
If someone has an exclusive lock on the table and does a COPY or SELECT
INTO do we want to assume we are only going to write into new pages, and
do we want to force an exclusive lock rather than a single-writer lock?
I don't think so.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend
From mpaesold@gmx.at Sat Dec 31 06:59:51 2005
Date: Sat, 31 Dec 2005 12:59:44 +0100 (MET)
From: Michael Paesold <mpaesold@gmx.at>
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: simon@2ndquadrant.com, andrew@dunslane.net, tgl@sss.pgh.pa.us,
kleptog@svana.org, gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu,
pgsql-hackers@postgresql.org
References: <200512302114.jBULEno02301@candle.pha.pa.us>
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
X-Priority: 3 (Normal)
X-Authenticated: #1946847
Message-ID: <14969.1136030384@www6.gmx.net>
X-Mailer: WWW-Mail 1.6 (Global Message Exchange)
X-Flags: 0001
Content-Length: 1305
Bruce Momjian wrote:
> > The --single-transaction mode would apply even if the dump was created
> > using an earlier version of pg_dump. pg_dump has *not* been altered at
> > all. (And I would again add that the idea was not my own)
>
> I assume you mean this:
>
> http://archives.postgresql.org/pgsql-patches/2005-12/msg00257.php
>
> I guess with the ALTER commands I don't see much value in the
> --single-transaction flag. I am sure others suggested it, but would
> they suggest it now given our current direction.
I just want to add that --single-transaction has a value of it's own. There
were times when I wanted to restore parts of a dump all-or-nothing.
This is possible with PostgreSQL, unlike many other DBM systems, because
people like Tom Lane have invested in ensuring that all DDL is working
without implicitly committing an enclosing transaction.
Using pg_restore directly into a database, it is not possible to get a
single transaction right now. One has to restore to a file and manually
added BEGIN/COMMIT. Just for that I think --single-transaction is a great
addition and a missing feature.
I think more people have a use-case for that.
Best Regards,
Michael Paesold
--
Telefonieren Sie schon oder sparen Sie noch?
NEU: GMX Phone_Flat http://www.gmx.net/de/go/telefonie
From pgsql-hackers-owner+M78213=pgman=candle.pha.pa.us@postgresql.org Tue Jan 3 12:08:43 2006
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-ID: <200601031708.k03H85j27170@candle.pha.pa.us>
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
In-Reply-To: <17173.1136306881@sss.pgh.pa.us>
To: Tom Lane <tgl@sss.pgh.pa.us>
Date: Tue, 3 Jan 2006 12:08:05 -0500 (EST)
cc: Jim C. Nasby <jnasby@pervasive.com>,
Andrew Dunstan <andrew@dunslane.net>, simon@2ndquadrant.com,
kleptog@svana.org, gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu,
pgsql-hackers@postgresql.org
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.121 required=5 tests=[AWL=0.121]
X-Spam-Score: 0.121
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 1125
Tom Lane wrote:
> "Jim C. Nasby" <jnasby@pervasive.com> writes:
> > On Tue, Jan 03, 2006 at 11:26:51AM -0500, Tom Lane wrote:
> >> Such an ALTER would certainly require exclusive lock on the table,
> >> so I'm not sure that I see much use-case for doing it like that.
> >> You'd want to do the ALTER and commit so as not to lock other people
> >> out of the table entirely while doing the bulk data-pushing.
>
> > Maybe this just isn't clear, but would EXCLUSIVE block writes from all
> > other sessions then?
>
> I don't think it should (which implies that EXCLUSIVE is a bad name).
Agreed, EXCLUSIVE was used to mean an _exclusive_ writer. The new words
I proposed were PRESERVE or STABLE.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
From tgl@sss.pgh.pa.us Tue Jan 3 12:37:34 2006
To: Stephen Frost <sfrost@snowman.net>
cc: Jim C. Nasby <jnasby@pervasive.com>,
Bruce Momjian <pgman@candle.pha.pa.us>,
Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org,
simon@2ndquadrant.com, gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu,
pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
In-Reply-To: <20060103165359.GP6026@ns.snowman.net>
References: <200512291605.jBTG5gi00396@candle.pha.pa.us> <7966.1135873468@sss.pgh.pa.us> <20060103154521.GC82560@pervasive.com> <20060103162137.GO6026@ns.snowman.net> <16856.1136305742@sss.pgh.pa.us> <20060103165359.GP6026@ns.snowman.net>
Comments: In-reply-to Stephen Frost <sfrost@snowman.net>
message dated "Tue, 03 Jan 2006 11:54:01 -0500"
Date: Tue, 03 Jan 2006 12:37:32 -0500
Message-ID: <17841.1136309852@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Content-Length: 976
Stephen Frost <sfrost@snowman.net> writes:
> The problem is that you might want to grant 'truncate' to people who
> *aren't* particularly trusted. For truncate, at least I have a
> real-world use-case for it.
I don't find this use-case particularly convincing. If the users are
allowed to delete all data in a given table, then that table must be
dedicated to them anyway; so it's not that easy to see why you can't
risk giving them ownership rights on it. The worst they can do is
screw up their own data, no?
In any case, I don't see what's so wrong with the model of using
SECURITY DEFINER interface functions when you want a security
restriction that's finer-grain than the system provides. I really
*don't* want to see us trying to, say, categorize every variety of
ALTER TABLE as a separately grantable privilege. I could live with
something like a catchall "ADMIN" privilege ... except it's not
clear how that would differ from ownership.
regards, tom lane
From pgsql-hackers-owner+M78221=pgman=candle.pha.pa.us@postgresql.org Tue Jan 3 13:30:34 2006
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
Date: Tue, 3 Jan 2006 13:30:56 -0500
From: Stephen Frost <sfrost@snowman.net>
To: Tom Lane <tgl@sss.pgh.pa.us>
cc: Jim C. Nasby <jnasby@pervasive.com>,
Bruce Momjian <pgman@candle.pha.pa.us>,
Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org,
simon@2ndquadrant.com, gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu,
pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
Message-ID: <20060103183056.GR6026@ns.snowman.net>
Mail-Followup-To: Tom Lane <tgl@sss.pgh.pa.us>,
"Jim C. Nasby" <jnasby@pervasive.com>,
Bruce Momjian <pgman@candle.pha.pa.us>,
Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org,
simon@2ndquadrant.com, gsstark@mit.edu, pg@rbt.ca,
zhouqq@cs.toronto.edu, pgsql-hackers@postgresql.org
References: <200512291605.jBTG5gi00396@candle.pha.pa.us> <7966.1135873468@sss.pgh.pa.us> <20060103154521.GC82560@pervasive.com> <20060103162137.GO6026@ns.snowman.net> <16856.1136305742@sss.pgh.pa.us> <20060103165359.GP6026@ns.snowman.net> <17841.1136309852@sss.pgh.pa.us>
Content-Disposition: inline
In-Reply-To: <17841.1136309852@sss.pgh.pa.us>
X-Editor: Vim http://www.vim.org/
X-Info: http://www.snowman.net
X-Operating-System: Linux/2.4.24ns.3.0 (i686)
X-Uptime: 12:39:16 up 206 days, 9:50, 11 users, load average: 0.02, 0.05, 0.05
User-Agent: Mutt/1.5.9i
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.105 required=5 tests=[AWL=0.105]
X-Spam-Score: 0.105
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 2666
-- Start of PGP signed section.
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> I don't find this use-case particularly convincing. If the users are
> allowed to delete all data in a given table, then that table must be
> dedicated to them anyway; so it's not that easy to see why you can't
> risk giving them ownership rights on it. The worst they can do is
> screw up their own data, no?
Being able to delete all data in a given table in no way implies
ownership rights. The tables are part of a specification which the
users are being asked to respond to. Being able to change the table
types or remove the constraints put on the tables would allow the
users to upload garbage which would then affect downstream processing.
We can't guarentee this won't happen anyway but we try to confine the
things they can mess up to a reasonable set which we can check for (and
do, through a rather involved error checking system). There are *alot*
of things built on top of the table structures and having them change
would basically break the whole system (without the appropriate changes
being made to the other parts of the system).
> In any case, I don't see what's so wrong with the model of using
> SECURITY DEFINER interface functions when you want a security
> restriction that's finer-grain than the system provides. I really
> *don't* want to see us trying to, say, categorize every variety of
> ALTER TABLE as a separately grantable privilege. I could live with
> something like a catchall "ADMIN" privilege ... except it's not
> clear how that would differ from ownership.
I don't think anyone's asked for 'ALTER TABLE' privileges to be
seperately grantable. It seems to me that the privileges which *need*
to be grantable are ones associated with DML statements. I would
classify TRUNCATE, VACUUM and ANALYZE as DML statements (along with
select, insert, update, and delete). They're PostgreSQL-specific DML
statements but they still fall into that category. I don't think
it's a coincidence that the SQL-defined DML statements are all,
individually, grantable.
That doesn't mean I think we should get rid of RULE, REFERENCES or
TRIGGER, though honestly I've very rarely needed to grant any of them
(I don't think I've ever granted RULE or TRIGGER...). References is
DDL-oriented, but for *other* tables; RULE and TRIGGER are DDL and I
can't really justify why someone other than the owner would need them
but I'm guessing someone's using them. I don't think their existance
should imply that if we ever change the grants again we have to include
all types of 'ALTER TABLE', etc, though.
Thanks,
Stephen
-- End of PGP section, PGP failed!
From sfrost@snowman.net Tue Jan 3 13:30:13 2006
Date: Tue, 3 Jan 2006 13:30:56 -0500
From: Stephen Frost <sfrost@snowman.net>
To: Tom Lane <tgl@sss.pgh.pa.us>
cc: Jim C. Nasby <jnasby@pervasive.com>,
Bruce Momjian <pgman@candle.pha.pa.us>,
Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org,
simon@2ndquadrant.com, gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu,
pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
Message-ID: <20060103183056.GR6026@ns.snowman.net>
Mail-Followup-To: Tom Lane <tgl@sss.pgh.pa.us>,
"Jim C. Nasby" <jnasby@pervasive.com>,
Bruce Momjian <pgman@candle.pha.pa.us>,
Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org,
simon@2ndquadrant.com, gsstark@mit.edu, pg@rbt.ca,
zhouqq@cs.toronto.edu, pgsql-hackers@postgresql.org
References: <200512291605.jBTG5gi00396@candle.pha.pa.us> <7966.1135873468@sss.pgh.pa.us> <20060103154521.GC82560@pervasive.com> <20060103162137.GO6026@ns.snowman.net> <16856.1136305742@sss.pgh.pa.us> <20060103165359.GP6026@ns.snowman.net> <17841.1136309852@sss.pgh.pa.us>
Content-Disposition: inline
In-Reply-To: <17841.1136309852@sss.pgh.pa.us>
X-Editor: Vim http://www.vim.org/
X-Info: http://www.snowman.net
X-Operating-System: Linux/2.4.24ns.3.0 (i686)
X-Uptime: 12:39:16 up 206 days, 9:50, 11 users, load average: 0.02, 0.05, 0.05
User-Agent: Mutt/1.5.9i
Content-Length: 2666
-- Start of PGP signed section.
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> I don't find this use-case particularly convincing. If the users are
> allowed to delete all data in a given table, then that table must be
> dedicated to them anyway; so it's not that easy to see why you can't
> risk giving them ownership rights on it. The worst they can do is
> screw up their own data, no?
Being able to delete all data in a given table in no way implies
ownership rights. The tables are part of a specification which the
users are being asked to respond to. Being able to change the table
types or remove the constraints put on the tables would allow the
users to upload garbage which would then affect downstream processing.
We can't guarentee this won't happen anyway but we try to confine the
things they can mess up to a reasonable set which we can check for (and
do, through a rather involved error checking system). There are *alot*
of things built on top of the table structures and having them change
would basically break the whole system (without the appropriate changes
being made to the other parts of the system).
> In any case, I don't see what's so wrong with the model of using
> SECURITY DEFINER interface functions when you want a security
> restriction that's finer-grain than the system provides. I really
> *don't* want to see us trying to, say, categorize every variety of
> ALTER TABLE as a separately grantable privilege. I could live with
> something like a catchall "ADMIN" privilege ... except it's not
> clear how that would differ from ownership.
I don't think anyone's asked for 'ALTER TABLE' privileges to be
seperately grantable. It seems to me that the privileges which *need*
to be grantable are ones associated with DML statements. I would
classify TRUNCATE, VACUUM and ANALYZE as DML statements (along with
select, insert, update, and delete). They're PostgreSQL-specific DML
statements but they still fall into that category. I don't think
it's a coincidence that the SQL-defined DML statements are all,
individually, grantable.
That doesn't mean I think we should get rid of RULE, REFERENCES or
TRIGGER, though honestly I've very rarely needed to grant any of them
(I don't think I've ever granted RULE or TRIGGER...). References is
DDL-oriented, but for *other* tables; RULE and TRIGGER are DDL and I
can't really justify why someone other than the owner would need them
but I'm guessing someone's using them. I don't think their existance
should imply that if we ever change the grants again we have to include
all types of 'ALTER TABLE', etc, though.
Thanks,
Stephen
-- End of PGP section, PGP failed!
From pgsql-hackers-owner+M78233=pgman=candle.pha.pa.us@postgresql.org Tue Jan 3 17:39:06 2006
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-ID: <200601032238.k03McP804163@candle.pha.pa.us>
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
In-Reply-To: <20060103212750.GT82560@pervasive.com>
To: Jim C. Nasby <jnasby@pervasive.com>
Date: Tue, 3 Jan 2006 17:38:25 -0500 (EST)
cc: Tom Lane <tgl@sss.pgh.pa.us>, Andrew Dunstan <andrew@dunslane.net>,
simon@2ndquadrant.com, kleptog@svana.org, gsstark@mit.edu, pg@rbt.ca,
zhouqq@cs.toronto.edu, pgsql-hackers@postgresql.org
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.121 required=5 tests=[AWL=0.121]
X-Spam-Score: 0.121
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 1714
Jim C. Nasby wrote:
> > We would be creating a new lock type for this.
>
> Sorry if I've just missed this in the thread, but what would the new
> lock type do? My impression is that as it stands you can either do:
>
> BEGIN;
> ALTER TABLE EXCLUSIVE;
> ...
> ALTER TABLE SHARE; --fsync
> COMMIT;
>
> Which would block all other access to the table as soon as the first
> ALTER TABLE happens. Or you can:
>
> ALTER TABLE EXCLUSIVE;
> ...
> ALTER TABLE SHARE;
>
> Which means that between the two ALTER TABLES every backend that does
> DML on that table will not have that DML logged, but because there's no
> exclusive lock that DML would be allowed to occur.
Right, the DML will be single-threaded and fsync of all dirty pages will
happen before commit of each transaction.
> BTW, there might be some usecase for the second scenario, in which case
> it would probably be better to tell the user to aquire a table-lock on
> their own rather than do it automatically as part of the update...
> > Basically meaning your idea of update while EXCLUSIVE/PRESERVE/STABLE is
> > happening is never going to be implemented because it is just too hard
> > to do, and too prone to error.
>
> What I figured. Never hurts to ask though. :)
Actually, it does hurt because it generates discussion volume for no
purpose.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
From pgsql-hackers-owner+M78234=pgman=candle.pha.pa.us@postgresql.org Tue Jan 3 17:54:16 2006
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
From: Simon Riggs <simon@2ndquadrant.com>
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Andrew Dunstan <andrew@dunslane.net>, Tom Lane <tgl@sss.pgh.pa.us>,
Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>,
Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
pgsql-hackers@postgresql.org
In-Reply-To: <200512302114.jBULEno02301@candle.pha.pa.us>
References: <200512302114.jBULEno02301@candle.pha.pa.us>
Date: Tue, 03 Jan 2006 22:53:53 +0000
Message-ID: <1136328833.5052.223.camel@localhost.localdomain>
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.04 required=5 tests=[AWL=0.040]
X-Spam-Score: 0.04
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 5373
On Fri, 2005-12-30 at 16:14 -0500, Bruce Momjian wrote:
> Simon Riggs wrote:
> > The implications of EXCLUSIVE are:
> > - there will be a check on each and every I, U, D to check the state of
> > the relation
> > - *every* operation that attempts a write lock will attempt to acquire
> > an EXCLUSIVE full table lock instead
> > - following successful completion of *each* DML statement, the relation
> > will be heap_sync'd involving a full scan of the buffer cache
>
> Yes, I think that is it. What we can do is implement EXCLUSIVE to
> affect only COPY at this point, and document that, and later add other
> commands.
>
> > Can I clarify the wording of the syntax? Is EXCLUSIVE the right word?
> > How about FASTLOAD or BULKLOAD? Those words seem less likely to be
> > misused in the future - i.e. we are invoking a special mode, rather than
> > invoking a special "go faster" option.
>
> The problem with the FASTLOAD/BULKLOAD words is that EXCLUSIVE mode is
> probably not the best for loading. I would think TRUNCATE would be a
> better option.
>
> In fact, in loading a table, I think both EXCLUSIVE and TRUNCATE would be
> the same, mostly. You would create the table, set its RELIABILITY to
> TRUNCATE, COPY into the table, then set the RELIABILITY to SHARE or
> DEFAULT. The second ALTER has to sync all the dirty data blocks, which
> the same thing EXCLUSIVE does at the conclusion of COPY.
>
> So, we need a name for EXCLUSIVE mode that suggests how it is different
> from TRUNCATE, and in this case, the difference is that EXCLUSIVE
> preserves the previous contents of the table on recovery, while TRUNCATE
> does not. Do you want to call the mode PRESERVE, or EXCLUSIVE WRITER?
> Anyway, the keywords are easy to modify, even after the patch is
> submitted. FYI, I usually go through keywords.c looking for a keyword
> we already use.
I'm very happy for suggestions on what these new modes are called.
> > > So, to summarize, I think we should add DROP/TRUNCATE, and use that by
> > > default (or optionally off?) in pg_dump, and, assuming we want EXCLUSIVE
> > > for more than just COPY, we need to add ALTER TABLE EXCLUSIVE.
> >
> > Would you mind stating again what you mean, just so I can understand
> > this? Your summary isn't enough.
>
> New ALTER TABLE mode, perhaps call it PERSISTENCE:
>
> ALTER TABLE tab PERSISTENCE DROP ON RECOVERY
> ALTER TABLE tab PERSISTENCE TRUNCATE ON RECOVERY
>
> These would drop or truncate all tables with this flag on a non-clean
> start of the postmaster, and write something in the server logs.
> However, I don't know that we have the code in place to DROP/TRUNCATE in
> recovery mode, and it would affect all databases, so it could be quite
> complex to implement. In this mode, no WAL logs would be written for
> table modifications, though DDL commands would have to be logged.
Right now, this will be a TODO item... it looks like it will take some
thought to implement correctly.
> ALTER TABLE tab PERSISTENCE PRESERVE (or STABLE?)
>
> Table contents are preserved across recoveries, but data modifications
> can happen only one at a time. I don't think we have a lock mode that
> does this, so I am worried a new lock mode will have to be created. A
> simplified solution at this stage would be to take an exclusive lock on
> the table, but really we just need a single-writer table lock, which I
> don't think we have. initially this can implemented to only affect COPY
> but later can be done for other commands.
ExclusiveLock locks out everything apart from readers, no new lock mode
AFAICS. Implementing that is little additional work for COPY.
Tom had a concern about setting this for I, U, D commands via the
executor. Not sure what the details of that are, as yet.
We can use either of the unlogged modes for pg_dump, so I'd suggest its
this one. Everybody happy with this being the new default in pg_dump, or
should it be an option?
> ALTER TABLE tab PERSISTENCE DEFAULT
>
> This would be our current default mode, which is full concurrency and
> persistence.
I'm thinking whether the ALTER TABLE statement might be better with two
bool flags rather than a 3-state char.
flag 1: ENABLE LOGGING | DISABLE LOGGING
flag 2: FULL RECOVERY | TRUNCATE ON RECOVERY
Giving 3 possible sets of options:
-- the default
ALTER TABLE mytable ENABLE LOGGING FULL RECOVERY; (default)
-- EXCLUSIVE mode
ALTER TABLE mytable DISABLE LOGGING FULL RECOVERY;
...which would be used like this
ALTER TABLE mytable DISABLE LOGGING;
COPY or other bulk data manipulation SQL
ALTER TABLE mytable ENABLE LOGGING;
...since FULL RECOVERY is the default.
-- multiuser temp table mode
ALTER TABLE mytable DISABLE LOGGING TRUNCATE ON RECOVERY;
...which would usually be left on all the time
which only uses one new keyword LOGGING and yet all the modes are fairly
explicit as to what they do.
An alternative might be the slightly more verbose:
ALTER TABLE mytable DISABLE LOGGING FORCE EXCLUSIVE TABLE LOCK;
which would be turned off by
ALTER TABLE mytable ENABLE LOGGING;
Comments?
Best Regards, Simon Riggs
---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly
From simon@2ndquadrant.com Tue Jan 3 18:10:32 2006
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
From: Simon Riggs <simon@2ndquadrant.com>
To: Jim C. Nasby <jnasby@pervasive.com>,
Bruce Momjian <pgman@candle.pha.pa.us>
cc: Tom Lane <tgl@sss.pgh.pa.us>, Andrew Dunstan <andrew@dunslane.net>,
kleptog@svana.org, gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu,
pgsql-hackers@postgresql.org
In-Reply-To: <200601032120.k03LKl609990@candle.pha.pa.us>
References: <200601032120.k03LKl609990@candle.pha.pa.us>
Date: Tue, 03 Jan 2006 23:10:16 +0000
Message-ID: <1136329816.5052.239.camel@localhost.localdomain>
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
Content-Length: 2118
On Tue, 2006-01-03 at 16:20 -0500, Bruce Momjian wrote:
> Jim C. Nasby wrote:
> > Idealistically, if EXCLUSIVE/PRESERVE/STABLE does it's thing by only
> > appending new pages, it would be nice if other backends could continue
> > performing updates at the same time, assuming there's free space
> > available elsewhere within the table (and that you'd be able to recover
> > those logged changes regardless of the non-logged operations). But
> > that's a pretty lofty goal...
>
> "Idealistically", yep. It would be great if we could put a helmet on
> and the computer would read your mind. :-)
>
> Basically meaning your idea of update while EXCLUSIVE/PRESERVE/STABLE is
> happening is never going to be implemented because it is just too hard
> to do, and too prone to error.
The reason for locking the whole table was to ensure that we do not have
a mixture of logged and non-logged writers writing to the same data
blocks, since that could damage blocks unrecoverably in the event of a
crash. (Though perhaps only if full_block_writes is on)
The ALTER TABLE .. EXCLUSIVE/(insert name) mode would mean that *any*
backend who took a write lock on the table, would lock out the whole
table. So this new mode is not restricted to the job/user who ran the
ALTER TABLE command. (I would note that that is how Oracle and Teradata
do this for pre-load utility table locking, but why should we follow
them on that?)
Currently, when we add a new row when the FSM is empty, we check the
last block of the table. That would cause multiple writers to access the
same blocks and so we would be in danger. The only way to avoid that
would be for logged writers (who would use the FSM if it were not empty)
to notify back to the FSM that they have just added a block - and remove
the behaviour to look for the last block.
Anyway, one step at a time. *Maybe* we can do that in the future, but
right now I'd like to add the basic fast write/load functionality.
Also, I think I will do the docs first this time, just so everyone can
read what we're getting ahead of time, to ensure we all agree.
Best Regards, Simon Riggs
From pgsql-hackers-owner+M78236=pgman=candle.pha.pa.us@postgresql.org Tue Jan 3 18:24:20 2006
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
From: Simon Riggs <simon@2ndquadrant.com>
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Jim C. Nasby <jnasby@pervasive.com>, Tom Lane <tgl@sss.pgh.pa.us>,
Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org, gsstark@mit.edu,
pg@rbt.ca, zhouqq@cs.toronto.edu, pgsql-hackers@postgresql.org
In-Reply-To: <200601032238.k03McP804163@candle.pha.pa.us>
References: <200601032238.k03McP804163@candle.pha.pa.us>
Date: Tue, 03 Jan 2006 23:23:54 +0000
Message-ID: <1136330634.5052.247.camel@localhost.localdomain>
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.043 required=5 tests=[AWL=0.043]
X-Spam-Score: 0.043
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 725
On Tue, 2006-01-03 at 17:38 -0500, Bruce Momjian wrote:
> Right, the DML will be single-threaded and fsync of all dirty pages will
> happen before commit of each transaction.
heap_sync() would occur at end of statement, as it does with CTAS. We
could delay until EOT but I'm not sure I see why; in most cases they'd
be the same point anyway.
I'd been toying with the idea of making the freshly added blocks live
only in temp_buffers to avoid the shared_buffers overhead, but that was
starting to sounds too wierd for my liking.
Best Regards, Simon Riggs
---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
From simon@2ndquadrant.com Tue Jan 3 18:58:13 2006
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
From: Simon Riggs <simon@2ndquadrant.com>
To: Michael Paesold <mpaesold@gmx.at>
cc: Bruce Momjian <pgman@candle.pha.pa.us>, andrew@dunslane.net,
tgl@sss.pgh.pa.us, kleptog@svana.org, gsstark@mit.edu, pg@rbt.ca,
zhouqq@cs.toronto.edu, pgsql-hackers@postgresql.org
In-Reply-To: <14969.1136030384@www6.gmx.net>
References: <200512302114.jBULEno02301@candle.pha.pa.us>
<14969.1136030384@www6.gmx.net>
Date: Tue, 03 Jan 2006 23:58:09 +0000
Message-ID: <1136332689.5052.263.camel@localhost.localdomain>
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
Content-Length: 1493
On Sat, 2005-12-31 at 12:59 +0100, Michael Paesold wrote:
> Bruce Momjian wrote:
>
> > > The --single-transaction mode would apply even if the dump was created
> > > using an earlier version of pg_dump. pg_dump has *not* been altered at
> > > all. (And I would again add that the idea was not my own)
> >
> > I assume you mean this:
> >
> > http://archives.postgresql.org/pgsql-patches/2005-12/msg00257.php
> >
> > I guess with the ALTER commands I don't see much value in the
> > --single-transaction flag. I am sure others suggested it, but would
> > they suggest it now given our current direction.
>
> I just want to add that --single-transaction has a value of it's own. There
> were times when I wanted to restore parts of a dump all-or-nothing.
>
> This is possible with PostgreSQL, unlike many other DBM systems, because
> people like Tom Lane have invested in ensuring that all DDL is working
> without implicitly committing an enclosing transaction.
>
> Using pg_restore directly into a database, it is not possible to get a
> single transaction right now. One has to restore to a file and manually
> added BEGIN/COMMIT. Just for that I think --single-transaction is a great
> addition and a missing feature.
>
> I think more people have a use-case for that.
I did originally separate the --single-transaction patch for this
reason. I think its a valid patch on its own and its wrapped and ready
to go, with some deletions from the doc patch.
Best Regards, Simon Riggs
From pgsql-hackers-owner+M78239=pgman=candle.pha.pa.us@postgresql.org Tue Jan 3 19:12:18 2006
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
From: Simon Riggs <simon@2ndquadrant.com>
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Tom Lane <tgl@sss.pgh.pa.us>, Martijn van Oosterhout <kleptog@svana.org>,
Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>,
Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
In-Reply-To: <200512291637.jBTGbdC03848@candle.pha.pa.us>
References: <200512291637.jBTGbdC03848@candle.pha.pa.us>
Date: Wed, 04 Jan 2006 00:11:55 +0000
Message-ID: <1136333515.5052.273.camel@localhost.localdomain>
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.045 required=5 tests=[AWL=0.045]
X-Spam-Score: 0.045
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 1200
On Thu, 2005-12-29 at 11:37 -0500, Bruce Momjian wrote:
> Having COPY behave differently because it is
> in a transaction is fine as long as it is user-invisible, but once you
> require users to do that to get the speedup, it isn't user-invisible
> anymore.
Since we're agreed on adding ALTER TABLE rather than COPY LOCK, we have
our explicit mechanism for speedup.
However, it costs a single line of code and very very little execution
time to add in the optimization to COPY to make it bypass WAL when
executed in the same transaction that created the table. Everything else
is already there.
As part of the use_wal test:
+ if (resultRelInfo->ri_NumIndices == 0 &&
+ !XLogArchivingActive() &&
>> (cstate->rel->rd_createSubid != InvalidSubTransactionId ))
+ use_wal = false;
the value is already retrieved from cache...
Can anyone see a reason *not* to put that change in also? We just don't
advertise it as the "suggested" route to gaining performance, nor would
we rely on it for pg_dump/restore performance.
Best Regards, Simon Riggs
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
From pgsql-hackers-owner+M78303=pgman=candle.pha.pa.us@postgresql.org Thu Jan 5 12:23:39 2006
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
X-Greylist: from auto-whitelisted by SQLgrey-
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-ID: <200601051722.k05HMSM02052@candle.pha.pa.us>
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
In-Reply-To: <1136328833.5052.223.camel@localhost.localdomain>
To: Simon Riggs <simon@2ndquadrant.com>
Date: Thu, 5 Jan 2006 12:22:28 -0500 (EST)
cc: Andrew Dunstan <andrew@dunslane.net>, Tom Lane <tgl@sss.pgh.pa.us>,
Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>,
Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
pgsql-hackers@postgresql.org
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.12 required=5 tests=[AWL=0.120]
X-Spam-Score: 0.12
X-Spam-Level:
X-Mailing-List: pgsql-hackers
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
List-Help: <mailto:majordomo@postgresql.org?body=help>
List-Id: <pgsql-hackers.postgresql.org>
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
List-Post: <mailto:pgsql-hackers@postgresql.org>
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Length: 6020
Simon Riggs wrote:
> > So, we need a name for EXCLUSIVE mode that suggests how it is different
> > from TRUNCATE, and in this case, the difference is that EXCLUSIVE
> > preserves the previous contents of the table on recovery, while TRUNCATE
> > does not. Do you want to call the mode PRESERVE, or EXCLUSIVE WRITER?
> > Anyway, the keywords are easy to modify, even after the patch is
> > submitted. FYI, I usually go through keywords.c looking for a keyword
> > we already use.
>
> I'm very happy for suggestions on what these new modes are called.
>
> > > > So, to summarize, I think we should add DROP/TRUNCATE, and use that by
> > > > default (or optionally off?) in pg_dump, and, assuming we want EXCLUSIVE
> > > > for more than just COPY, we need to add ALTER TABLE EXCLUSIVE.
> > >
> > > Would you mind stating again what you mean, just so I can understand
> > > this? Your summary isn't enough.
> >
> > New ALTER TABLE mode, perhaps call it PERSISTENCE:
> >
> > ALTER TABLE tab PERSISTENCE DROP ON RECOVERY
> > ALTER TABLE tab PERSISTENCE TRUNCATE ON RECOVERY
> >
> > These would drop or truncate all tables with this flag on a non-clean
> > start of the postmaster, and write something in the server logs.
> > However, I don't know that we have the code in place to DROP/TRUNCATE in
> > recovery mode, and it would affect all databases, so it could be quite
> > complex to implement. In this mode, no WAL logs would be written for
> > table modifications, though DDL commands would have to be logged.
>
> Right now, this will be a TODO item... it looks like it will take some
> thought to implement correctly.
OK, I know my suggestions have made it more complicated.
TODO added:
* Allow control over which tables are WAL-logged
Allow tables to bypass WAL writes and just fsync() dirty pages on
commit. To do this, only a single writer can modify the table, and
writes must happen only on new pages. Readers can continue accessing
the table. This would affect COPY, and perhaps INSERT/UPDATE too.
Another option is to avoid transaction logging entirely and truncate
or drop the table on crash recovery. These should be implemented
using ALTER TABLE, e.g. ALTER TABLE PERSISTENCE [ DROP | TRUNCATE |
STABLE | DEFAULT ]. Tables using non-default logging should not use
referential integrity with default-logging tables, and tables using
stable logging probably can not have indexes. [walcontrol]
> > ALTER TABLE tab PERSISTENCE PRESERVE (or STABLE?)
> >
> > Table contents are preserved across recoveries, but data modifications
> > can happen only one at a time. I don't think we have a lock mode that
> > does this, so I am worried a new lock mode will have to be created. A
> > simplified solution at this stage would be to take an exclusive lock on
> > the table, but really we just need a single-writer table lock, which I
> > don't think we have. initially this can implemented to only affect COPY
> > but later can be done for other commands.
>
> ExclusiveLock locks out everything apart from readers, no new lock mode
> AFAICS. Implementing that is little additional work for COPY.
Nice.
> Tom had a concern about setting this for I, U, D commands via the
> executor. Not sure what the details of that are, as yet.
That is much more complicated than the COPY-only idea, for sure. I am
thinking we could add the ALTER syntax and just do COPY at this stage,
meaning that I/U/D still do full logging until we get to improving them.
The big benefit is that the user API doesn't need to change when we
improve the code. In fact I think we could do the TRUNCATE/DROP easily
for I/U/D, but the STABLE option would require work and we don't need to
implement it in the first patch.
> We can use either of the unlogged modes for pg_dump, so I'd suggest its
> this one. Everybody happy with this being the new default in pg_dump, or
> should it be an option?
>
> > ALTER TABLE tab PERSISTENCE DEFAULT
> >
> > This would be our current default mode, which is full concurrency and
> > persistence.
>
> I'm thinking whether the ALTER TABLE statement might be better with two
> bool flags rather than a 3-state char.
>
> flag 1: ENABLE LOGGING | DISABLE LOGGING
>
> flag 2: FULL RECOVERY | TRUNCATE ON RECOVERY
>
> Giving 3 possible sets of options:
>
> -- the default
> ALTER TABLE mytable ENABLE LOGGING FULL RECOVERY; (default)
>
> -- EXCLUSIVE mode
> ALTER TABLE mytable DISABLE LOGGING FULL RECOVERY;
> ...which would be used like this
> ALTER TABLE mytable DISABLE LOGGING;
> COPY or other bulk data manipulation SQL
> ALTER TABLE mytable ENABLE LOGGING;
> ...since FULL RECOVERY is the default.
>
> -- multiuser temp table mode
> ALTER TABLE mytable DISABLE LOGGING TRUNCATE ON RECOVERY;
> ...which would usually be left on all the time
>
> which only uses one new keyword LOGGING and yet all the modes are fairly
> explicit as to what they do.
>
> An alternative might be the slightly more verbose:
> ALTER TABLE mytable DISABLE LOGGING FORCE EXCLUSIVE TABLE LOCK;
> which would be turned off by
> ALTER TABLE mytable ENABLE LOGGING;
>
> Comments?
I had the same idea originally, but avoided it because the logging
really does affect what other options you can use. For example, if you
want truncate on recovery, you certainly do not want logging, so it
seems the options are not really independent. In fact if someone asks
for truncate on recovery, do we automatically turn off logging for them,
or throw an error, or a warning. It just seemed too error-prone and
confusing, though perhaps more logical. Of course, if others like the
above, we can do it.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
From simon@2ndquadrant.com Thu Jan 5 16:56:25 2006
Return-path: <simon@2ndquadrant.com>
Received: from smtp.nildram.co.uk (smtp.nildram.co.uk [195.112.4.54])
by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k05LuPb02246
for <pgman@candle.pha.pa.us>; Thu, 5 Jan 2006 16:56:25 -0500 (EST)
Received: from [192.168.0.3] (unknown [84.12.184.6])
by smtp.nildram.co.uk (Postfix) with ESMTP
id A9F0F268C4E; Thu, 5 Jan 2006 21:56:18 +0000 (GMT)
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
From: Simon Riggs <simon@2ndquadrant.com>
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Tom Lane <tgl@sss.pgh.pa.us>, Martijn van Oosterhout <kleptog@svana.org>,
Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>,
Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
In-Reply-To: <200601051727.k05HR5p02803@candle.pha.pa.us>
References: <200601051727.k05HR5p02803@candle.pha.pa.us>
Content-Type: text/plain
Date: Thu, 05 Jan 2006 21:56:21 +0000
Message-ID: <1136498181.21025.285.camel@localhost.localdomain>
MIME-Version: 1.0
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
Content-Transfer-Encoding: 7bit
Status: OR
On Thu, 2006-01-05 at 12:27 -0500, Bruce Momjian wrote:
> Seems like a nice optimization.
Negative thoughts: Toast tables have a toast index on them, yes? We have
agreed that we cannot use the optimization if we have indexes on the
main table. It follows that we cannot use the optimization if we have
*any* toasted data, since that would require a pointer between two
blocks, which would not be correctly recovered following a crash. If we
log the toast table then there could be a mismatch between heap and
toast table; if we don't log the toast table there could be a mismatch
between toast table and toast index.
We can test to see if the toast table is empty when we do ALTER TABLE,
but loading operations may try to create toasted data rows.
Presumably that means we must either:
i) abort a COPY if we get a toastable value
ii) if we get a toastable value, insert the row into a new block, which
we do logging of, then also log the toast insert and the toast index
insert - i.e. some blocks we log, others not
This is still useful for many applications, IMHO, but the list of
restrictions seems to be growing. Worse, we wouldn't know that the toast
tables were empty until after we did the COPY TO for a pg_dump, so we
wouldn't be able to retrospectively add an ALTER TABLE command ahead of
the COPY.
Thoughts? Hopefully there are some flaws in my thinking here,
Best Regards, Simon Riggs
|