lsn和redo

本文详细介绍了PostgreSQL中的逻辑序列号(lsn)的三种表示形式,包括pg_controldata中的输出、pg中的控制函数表示和xlog文件名的结构。此外,文章还讨论了redopoint的变化,通过实际操作演示了checkpoint和redopoint的计算过程。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

瀚高数据库
目录
文档用途
详细信息
相关文档

记录lsn三种记录形式,以及redo对lsn的操作。

详细信息

一、lsn的三种形式

1. pg_controldata中可以看到这样的lsn表示:

Latest checkpoint location: 0/1548018Latest
checkpoint’s REDO location: 0/1547FC8
Latest checkpoint’s REDO WAL file: 000000010000000000000001

2.pg中的一些控制函数也可以看到类似的表示:

psql (16.1)                                                                                                                                                              

Type "help" for help.                                                                                                                                                    

                                                                                                                                                                         

postgres=# select pg_current_wal_lsn();                                                                                                                                  

 pg_current_wal_lsn                                                                                                                                                      

--------------------                                                                                                                                                     

 0/15480C8                                                                                                                                                               

(1 row)                                                                                                                                                                  

                                                                                                         

[postgres@node1 pg_wal]$ pg_controldata | grep REDO

Latest checkpoint's REDO location:    0/15482D8

Latest checkpoint's REDO WAL file:    000000010000000000000001

                                                               

postgres=# select pg_walfile_name('0/15480C8');                                                                                                                          

     pg_walfile_name                                                                                                                                                     

--------------------------                                                                                                                                               

 000000010000000000000001                                                                                                                                                

(1 row)  

3. pg中的xlog文件命名看起来是另一种形式:

-rw-------. 1 postgres postgres 16777216 Jan 5 10:11 000000010000000000000001

二、解析文件名含义

一、文件名含义

000000010000000000000001
00000001    00000000    00000001

timeline          logid            segid

二、控制函数查出来的’A/B’形式的lsn含义

 0/15480C8

            0             1          5480C8

            logid       segid       offset

三、pg内部变量含义:就是lsn不带’/’

(gdb) p RedoRecPtr

$6 = 22315736

(gdb) p/x RedoRecPtr

$7 = 0x15482d8



            0          1          5482d8

            logid       segid       offset

后两种是等价的。

xlog文件名是把lsn最后6个16进制省略了,也就是一个16MB的文件中可以保存省略的偏移量offset。

offset范围:0x000000 - 0xFFFFFF = 16MB 所以16MB的文件的每个字节都可以通过offset索引到。

如要调试,可参考pg16恢复过程的调用栈:

(gdb) bt

#0  ReadCheckpointRecord (xlogprefetcher=0xf7dde8, RecPtr=22315736, replayTLI=1) at xlogrecovery.c:3971

#1  0x0000000000599c6f in InitWalRecovery (ControlFile=0x7fffe41d4980, wasShutdown_ptr=0x7fffffffde2f, haveBackupLabel_ptr=0x7fffffffde2d, 

    haveTblspcMap_ptr=0x7fffffffde2e) at xlogrecovery.c:769

#2  0x0000000000589710 in StartupXLOG () at xlog.c:5171

#3  0x00000000009073af in StartupProcessMain () at startup.c:282

#4  0x00000000008fbd49 in AuxiliaryProcessMain (auxtype=StartupProcess) at auxprocess.c:141

#5  0x00000000009061d7 in StartChildProcess (type=StartupProcess) at postmaster.c:5372

#6  0x0000000000901959 in PostmasterMain (argc=1, argv=0xf7bf20) at postmaster.c:1458

#7  0x00000000007cb6d1 in main (argc=1, argv=0xf7bf20) at main.c:198

三、Redo point变更

使用pg_controldata查看当前ControlFile内容:

[postgres@node1 pg_wal]$ pg_controldata | grep REDO

Latest checkpoint's REDO location:    0/15482D8

Latest checkpoint's REDO WAL file:    000000010000000000000001

插入几条测试数据:

psql (16.1)

Type "help" for help.



postgres=# insert into test values(1);

INSERT 0 1

postgres=# insert into test values(2);

INSERT 0 1

postgres=# insert into test values(3);

INSERT 0 1

postgres=# 

使用pg_waldump查看wal文件:

......

......

rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/0156AC60, prev 0/0156A9D8, desc: RUNNING_XACTS nextXid 735 latestCompletedXid 734 oldestRunningXid 735

rmgr: Heap        len (rec/tot):     59/    59, tx:        735, lsn: 0/0156AC98, prev 0/0156AC60, desc: INSERT+INIT off: 1, flags: 0x00, blkref #0: rel 1663/5/16387 blk 0

rmgr: Btree       len (rec/tot):     90/    90, tx:        735, lsn: 0/0156ACD8, prev 0/0156AC98, desc: NEWROOT level: 0, blkref #0: rel 1663/5/16390 blk 1, blkref #2: rel 1663/5/16390 blk 0

rmgr: Btree       len (rec/tot):     64/    64, tx:        735, lsn: 0/0156AD38, prev 0/0156ACD8, desc: INSERT_LEAF off: 1, blkref #0: rel 1663/5/16390 blk 1

rmgr: Transaction len (rec/tot):     34/    34, tx:        735, lsn: 0/0156AD78, prev 0/0156AD38, desc: COMMIT 2024-01-05 17:31:07.213249 CST

rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/0156ADA0, prev 0/0156AD78, desc: RUNNING_XACTS nextXid 736 latestCompletedXid 735 oldestRunningXid 736

rmgr: Heap        len (rec/tot):     59/    59, tx:        736, lsn: 0/0156ADD8, prev 0/0156ADA0, desc: INSERT off: 2, flags: 0x00, blkref #0: rel 1663/5/16387 blk 0

rmgr: Btree       len (rec/tot):     64/    64, tx:        736, lsn: 0/0156AE18, prev 0/0156ADD8, desc: INSERT_LEAF off: 2, blkref #0: rel 1663/5/16390 blk 1

rmgr: Transaction len (rec/tot):     34/    34, tx:        736, lsn: 0/0156AE58, prev 0/0156AE18, desc: COMMIT 2024-01-05 17:31:16.074250 CST

rmgr: Heap        len (rec/tot):     59/    59, tx:        737, lsn: 0/0156AE80, prev 0/0156AE58, desc: INSERT off: 3, flags: 0x00, blkref #0: rel 1663/5/16387 blk 0

rmgr: Btree       len (rec/tot):     64/    64, tx:        737, lsn: 0/0156AEC0, prev 0/0156AE80, desc: INSERT_LEAF off: 3, blkref #0: rel 1663/5/16390 blk 1

rmgr: Transaction len (rec/tot):     34/    34, tx:        737, lsn: 0/0156AF00, prev 0/0156AEC0, desc: COMMIT 2024-01-05 17:31:22.578713 CST

rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/0156AF28, prev 0/0156AF00, desc: RUNNING_XACTS nextXid 738 latestCompletedXid 737 oldestRunningXid 738

pg_waldump: error: error in WAL record at 0/156AF28: invalid record length at 0/156AF60: expected at least 24, got 0

最后一个XLOG Record记录的位置为0/0156AF28,加上记录大小50(十六进制为0x32),位置为0/0156AF5A,按理论上来说,如果现在执行checkpoint,该位置为REDO point.

执行checkpoint:

postgres=# checkpoint ;

2024-01-05 17:41:28.687 CST [9080] LOG:  checkpoint starting: immediate force wait

2024-01-05 17:41:28.710 CST [9080] LOG:  checkpoint complete: wrote 40 buffers (0.2%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.007 s, sync=0.004 s, total=0.024 s; sync files=33, longest=0.003 s, average=0.001 s; distance=139 kB, estimate=139 kB; lsn=0/156AF98, redo lsn=0/156AF60

CHECKPOINT

postgres=# 

pg_controldata查看:

[postgres@node1 pg_wal]$ pg_controldata|grep 'REDO location'
Latest checkpoint's REDO location:    0/156AF60

再次查看XLOG Record记录:

......
rmgr: Btree       len (rec/tot):     64/    64, tx:        735, lsn: 0/0156AD38, prev 0/0156ACD8, desc: INSERT_LEAF off: 1, blkref #0: rel 1663/5/16390 blk 1

rmgr: Transaction len (rec/tot):     34/    34, tx:        735, lsn: 0/0156AD78, prev 0/0156AD38, desc: COMMIT 2024-01-05 17:31:07.213249 CST

rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/0156ADA0, prev 0/0156AD78, desc: RUNNING_XACTS nextXid 736 latestCompletedXid 735 oldestRunningXid 736

rmgr: Heap        len (rec/tot):     59/    59, tx:        736, lsn: 0/0156ADD8, prev 0/0156ADA0, desc: INSERT off: 2, flags: 0x00, blkref #0: rel 1663/5/16387 blk 0

rmgr: Btree       len (rec/tot):     64/    64, tx:        736, lsn: 0/0156AE18, prev 0/0156ADD8, desc: INSERT_LEAF off: 2, blkref #0: rel 1663/5/16390 blk 1

rmgr: Transaction len (rec/tot):     34/    34, tx:        736, lsn: 0/0156AE58, prev 0/0156AE18, desc: COMMIT 2024-01-05 17:31:16.074250 CST

rmgr: Heap        len (rec/tot):     59/    59, tx:        737, lsn: 0/0156AE80, prev 0/0156AE58, desc: INSERT off: 3, flags: 0x00, blkref #0: rel 1663/5/16387 blk 0

rmgr: Btree       len (rec/tot):     64/    64, tx:        737, lsn: 0/0156AEC0, prev 0/0156AE80, desc: INSERT_LEAF off: 3, blkref #0: rel 1663/5/16390 blk 1

rmgr: Transaction len (rec/tot):     34/    34, tx:        737, lsn: 0/0156AF00, prev 0/0156AEC0, desc: COMMIT 2024-01-05 17:31:22.578713 CST

rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/0156AF28, prev 0/0156AF00, desc: RUNNING_XACTS nextXid 738 latestCompletedXid 737 oldestRunningXid 738

rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/0156AF60, prev 0/0156AF28, desc: RUNNING_XACTS nextXid 738 latestCompletedXid 737 oldestRunningXid 738

rmgr: XLOG        len (rec/tot):    114/   114, tx:          0, lsn: 0/0156AF98, prev 0/0156AF60, desc: CHECKPOINT_ONLINE redo 0/156AF60; tli 1; prev tli 1; fpw true; xid 0:738; oid 24579; multi 1; offset 0; oldest xid 722 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 738; online

可以看到0/0156AF60是最后一条checkpoint记录的起始写入位置,REDO point为0/0156AF60,比预想的要多出6个字节(0/0156AF60-0/0156AF5A = 6)。

[postgres@node1 pg_wal]$ echo "obase=10;ibase=16;56AF28"|bc

5680936

[postgres@node1 pg_wal]$ hexdump -C 000000010000000000000001 -s 5680936 -n 50

0056af28  32 00 00 00 00 00 00 00  00 af 56 01 00 00 00 00  |2.........V.....|

0056af38  10 08 00 00 09 be d0 ba  ff 18 00 00 00 00 00 00  |................|

0056af48  00 00 00 e3 bd 2d e2 02  00 00 e2 02 00 00 e1 02  |.....-..........|

0056af58  00 00                                             |..|

0056af5a

[postgres@node1 pg_wal]$ 

XLOG Record的头部首先是XLogRecord结构体,第一个域是uint32的record的大小,即0x00000032,十进制为50,跟pg_waldump打印出的一样。

[postgres@node1 pg_wal]$ hexdump -C 000000010000000000000001 -s 5680986 -n 6

0056af5a  00 00 00 00 00 00                                 |......|

0056af60

[postgres@node1 pg_wal]$ 

后续的6个字节均为0x00

[postgres@node1 pg_wal]$ hexdump -C 000000010000000000000001 -s 5681048 -n 114

0056af98  72 00 00 00 00 00 00 00  60 af 56 01 00 00 00 00  |r.......`.V.....|

0056afa8  10 00 00 00 e0 5a e5 07  ff 58 60 af 56 01 00 00  |.....Z...X`.V...|

0056afb8  00 00 01 00 00 00 01 00  00 00 01 00 00 00 00 00  |................|

0056afc8  00 00 e2 02 00 00 00 00  00 00 03 60 00 00 01 00  |...........`....|

0056afd8  00 00 00 00 00 00 d2 02  00 00 01 00 00 00 01 00  |................|

0056afe8  00 00 01 00 00 00 00 00  00 00 c8 ce 97 65 00 00  |.............e..|

0056aff8  00 00 00 00 00 00 00 00  00 00 e2 02 00 00 00 00  |................|

0056b008  00 00                                             |..|

0056b00

image.png

上述图有点老了,pg16CheckPoint是88B,图片来源:https://www.interdb.jp/pg/pgsql09/04.html

大小为0x00000072,即114B(头部XLogRecord 24B + XLogRecordDataHeaderShort 2B + Checkpoint 88B )。从FF 58为XLogRecordDataHeaderShort结构体的内容,0xFF为标志位,0x58为Data的大小(即88B).

checkpoint记录的内容详见Checkpoint结构体,该结构体第一个域字段为8个字节的LSN–>0x00000000 0156AF60,即REDO point:0/0156AF60.

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值