Canal starts normally but cannot read the binlog: the meta.dat file is the cause
canal
alibaba/canal: Canal is a distributed database synchronization system open-sourced by Alibaba. It parses MySQL binlogs to provide real-time incremental data subscription and consumption, and is widely used for capturing database change events, data migration, and cache updates.
Project page: https://gitcode.com/gh_mirrors/ca/canal

Canal went down for an unknown reason. After restarting it, the adapter side also logged a normal startup, yet nothing was synced. After a long hunt, the root cause turned out to be a configuration problem on the canal server side: the meta.dat file under the conf directory records a binlog position whose file no longer exists on the MySQL server. The error log looks like this:
2019-09-15 23:59:21.853 [destination = testcore , address = /172.18.108.67:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> begin to find start position, it will be long time for reset or first position
2019-09-15 23:59:21.854 [destination = testcore , address = /172.18.108.67:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - prepare to find start position just last position
{"identity":{"slaveId":-1,"sourceAddress":{"address":"172.18.108.67","port":3306}},"postion": {"gtid":"","included":false,"journalName":"mysql-bin.000030","position":832575421,"serverId":10867,"timestamp":1567176390000}}
2019-09-15 23:59:21.855 [destination = testcore , address = /172.18.108.67:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> find start position successfully, EntryPosition[included=false,journalName=mysql-bin.000030,position=832575421,serverId=10867,gtid=,timestamp=1567176390000] cost : 2ms , the next step is binlog dump
2019-09-15 23:59:21.857 [destination = testcore , address = /172.18.108.67:3306 , EventParser] ERROR c.a.o.canal.parse.inbound.mysql.dbsync.DirectLogFetcher - I/O error while reading from client socket
java.io.IOException: Received error packet: errno = 1236, sqlstate = HY000 errmsg = Could not find first log file name in binary log index file
at com.alibaba.otter.canal.parse.inbound.mysql.dbsync.DirectLogFetcher.fetch(DirectLogFetcher.java:102) ~[canal.parse-1.1.3.jar:na]
at com.alibaba.otter.canal.parse.inbound.mysql.MysqlConnection.dump(MysqlConnection.java:235) [canal.parse-1.1.3.jar:na]
at com.alibaba.otter.canal.parse.inbound.AbstractEventParser$3.run(AbstractEventParser.java:257) [canal.parse-1.1.3.jar:na]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
2019-09-15 23:59:21.857 [destination = testcore , address = /172.18.108.67:3306 , EventParser] ERROR c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - dump address /172.18.108.67:3306 has an error, retrying. caused by
java.io.IOException: Received error packet: errno = 1236, sqlstate = HY000 errmsg = Could not find first log file name in binary log index file
at com.alibaba.otter.canal.parse.inbound.mysql.dbsync.DirectLogFetcher.fetch(DirectLogFetcher.java:102) ~[canal.parse-1.1.3.jar:na]
at com.alibaba.otter.canal.parse.inbound.mysql.MysqlConnection.dump(MysqlConnection.java:235) ~[canal.parse-1.1.3.jar:na]
at com.alibaba.otter.canal.parse.inbound.AbstractEventParser$3.run(AbstractEventParser.java:257) ~[canal.parse-1.1.3.jar:na]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
2019-09-15 23:59:21.857 [destination = testcore , address = /172.18.108.67:3306 , EventParser] ERROR com.alibaba.otter.canal.common.alarm.LogAlarmHandler - destination:testcore[java.io.IOException: Received error packet: errno = 1236, sqlstate = HY000 errmsg = Could not find first log file name in binary log index file
at com.alibaba.otter.canal.parse.inbound.mysql.dbsync.DirectLogFetcher.fetch(DirectLogFetcher.java:102)
at com.alibaba.otter.canal.parse.inbound.mysql.MysqlConnection.dump(MysqlConnection.java:235)
at com.alibaba.otter.canal.parse.inbound.AbstractEventParser$3.run(AbstractEventParser.java:257)
at java.lang.Thread.run(Thread.java:745)
]
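The JSON line in the log above is the position canal recorded in meta.dat: it wants to resume from mysql-bin.000030, but MySQL error 1236 ("Could not find first log file name in binary log index file") means the server no longer has that file. A quick way to confirm this is to pull journalName out of meta.dat and compare it with the output of SHOW BINARY LOGS on the source. A minimal sketch, assuming a default tar.gz deployment layout (the path and connection details are illustrative, not from the original post):

```shell
# Sketch: check whether the binlog file recorded in meta.dat still exists on MySQL.
# The meta.dat path and connection details below are illustrative assumptions.

# journalName is a plain string field in the one-line JSON, so sed is enough here
journal_name() {
    sed -n 's/.*"journalName":"\([^"]*\)".*/\1/p' "$1"
}

# Example usage (adjust path and credentials to your deployment):
#   journal_name /usr/local/canal/conf/testcore/meta.dat
#   mysql -h 172.18.108.67 -P 3306 -u canal -p -e 'SHOW BINARY LOGS;'
# If the reported file is absent from SHOW BINARY LOGS, MySQL has purged it,
# and any attempt to dump from that position fails with errno 1236.
```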
The fix is to delete the meta.dat file and restart the canal server; on startup canal re-resolves a start position against the binlog files that actually exist, and reading returns to normal.
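The recovery steps above can be sketched as a small script. This is not an official canal tool: CANAL_HOME, the instance name, and the stop.sh/startup.sh scripts are assumptions based on a standard tar.gz deployment, and the file is backed up rather than deleted outright so the old position can still be inspected afterwards:

```shell
# Sketch of the recovery procedure; paths and script names are assumptions
# based on a default canal tar.gz deployment.

reset_canal_meta() {
    canal_home=$1     # e.g. /usr/local/canal
    dest=$2           # instance name, e.g. testcore
    meta="$canal_home/conf/$dest/meta.dat"

    "$canal_home/bin/stop.sh"            # stop canal before touching meta.dat
    if [ -f "$meta" ]; then
        mv "$meta" "$meta.bak"           # keep a backup instead of deleting outright
    fi
    "$canal_home/bin/startup.sh"         # on restart canal re-resolves the start position
}

# Example: reset_canal_meta /usr/local/canal testcore
```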



