Canal starts normally but cannot read the binlog: the meta.dat file is the cause
canal
alibaba/canal: Canal is a distributed database synchronization system open-sourced by Alibaba. It parses MySQL binlogs to provide real-time incremental data subscription and consumption, and is widely used for capturing database change events, data migration, and cache updates.
Project page: https://gitcode.com/gh_mirrors/ca/canal

Canal went down for an unknown reason. After restarting it, the adapter side also logged a normal startup, yet nothing was synced. After a long hunt, the root cause turned out to be a configuration problem on the canal server side: the meta.dat file under the conf directory records a binlog position whose file no longer exists on the MySQL server. The error log looks like this:
2019-09-15 23:59:21.853 [destination = testcore , address = /172.18.108.67:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> begin to find start position, it will be long time for reset or first position
2019-09-15 23:59:21.854 [destination = testcore , address = /172.18.108.67:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - prepare to find start position just last position
{"identity":{"slaveId":-1,"sourceAddress":{"address":"172.18.108.67","port":3306}},"postion": {"gtid":"","included":false,"journalName":"mysql-bin.000030","position":832575421,"serverId":10867,"timestamp":1567176390000}}
2019-09-15 23:59:21.855 [destination = testcore , address = /172.18.108.67:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> find start position successfully, EntryPosition[included=false,journalName=mysql-bin.000030,position=832575421,serverId=10867,gtid=,timestamp=1567176390000] cost : 2ms , the next step is binlog dump
2019-09-15 23:59:21.857 [destination = testcore , address = /172.18.108.67:3306 , EventParser] ERROR c.a.o.canal.parse.inbound.mysql.dbsync.DirectLogFetcher - I/O error while reading from client socket
java.io.IOException: Received error packet: errno = 1236, sqlstate = HY000 errmsg = Could not find first log file name in binary log index file
at com.alibaba.otter.canal.parse.inbound.mysql.dbsync.DirectLogFetcher.fetch(DirectLogFetcher.java:102) ~[canal.parse-1.1.3.jar:na]
at com.alibaba.otter.canal.parse.inbound.mysql.MysqlConnection.dump(MysqlConnection.java:235) [canal.parse-1.1.3.jar:na]
at com.alibaba.otter.canal.parse.inbound.AbstractEventParser$3.run(AbstractEventParser.java:257) [canal.parse-1.1.3.jar:na]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
2019-09-15 23:59:21.857 [destination = testcore , address = /172.18.108.67:3306 , EventParser] ERROR c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - dump address /172.18.108.67:3306 has an error, retrying. caused by
java.io.IOException: Received error packet: errno = 1236, sqlstate = HY000 errmsg = Could not find first log file name in binary log index file
at com.alibaba.otter.canal.parse.inbound.mysql.dbsync.DirectLogFetcher.fetch(DirectLogFetcher.java:102) ~[canal.parse-1.1.3.jar:na]
at com.alibaba.otter.canal.parse.inbound.mysql.MysqlConnection.dump(MysqlConnection.java:235) ~[canal.parse-1.1.3.jar:na]
at com.alibaba.otter.canal.parse.inbound.AbstractEventParser$3.run(AbstractEventParser.java:257) ~[canal.parse-1.1.3.jar:na]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
2019-09-15 23:59:21.857 [destination = testcore , address = /172.18.108.67:3306 , EventParser] ERROR com.alibaba.otter.canal.common.alarm.LogAlarmHandler - destination:testcore[java.io.IOException: Received error packet: errno = 1236, sqlstate = HY000 errmsg = Could not find first log file name in binary log index file
at com.alibaba.otter.canal.parse.inbound.mysql.dbsync.DirectLogFetcher.fetch(DirectLogFetcher.java:102)
at com.alibaba.otter.canal.parse.inbound.mysql.MysqlConnection.dump(MysqlConnection.java:235)
at com.alibaba.otter.canal.parse.inbound.AbstractEventParser$3.run(AbstractEventParser.java:257)
at java.lang.Thread.run(Thread.java:745)
]
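The JSON line in the log above is the position canal recorded in meta.dat: it wants to resume from mysql-bin.000030, but MySQL error 1236 ("Could not find first log file name in binary log index file") means the server no longer has that file. A quick way to confirm this is to pull journalName out of meta.dat and compare it with the output of SHOW BINARY LOGS on the source. A minimal sketch, assuming a default tar.gz deployment layout (the path and connection details are illustrative, not from the original post):

```shell
# Sketch: check whether the binlog file recorded in meta.dat still exists on MySQL.
# The meta.dat path and connection details below are illustrative assumptions.

# journalName is a plain string field in the one-line JSON, so sed is enough here
journal_name() {
    sed -n 's/.*"journalName":"\([^"]*\)".*/\1/p' "$1"
}

# Example usage (adjust path and credentials to your deployment):
#   journal_name /usr/local/canal/conf/testcore/meta.dat
#   mysql -h 172.18.108.67 -P 3306 -u canal -p -e 'SHOW BINARY LOGS;'
# If the reported file is absent from SHOW BINARY LOGS, MySQL has purged it,
# and any attempt to dump from that position fails with errno 1236.
```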
The fix is to delete the meta.dat file and restart the canal server; on startup canal re-resolves a start position against the binlog files that actually exist, and reading returns to normal.
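The recovery steps above can be sketched as a small script. This is not an official canal tool: CANAL_HOME, the instance name, and the stop.sh/startup.sh scripts are assumptions based on a standard tar.gz deployment, and the file is backed up rather than deleted outright so the old position can still be inspected afterwards:

```shell
# Sketch of the recovery procedure; paths and script names are assumptions
# based on a default canal tar.gz deployment.

reset_canal_meta() {
    canal_home=$1     # e.g. /usr/local/canal
    dest=$2           # instance name, e.g. testcore
    meta="$canal_home/conf/$dest/meta.dat"

    "$canal_home/bin/stop.sh"            # stop canal before touching meta.dat
    if [ -f "$meta" ]; then
        mv "$meta" "$meta.bak"           # keep a backup instead of deleting outright
    fi
    "$canal_home/bin/startup.sh"         # on restart canal re-resolves the start position
}

# Example: reset_canal_meta /usr/local/canal testcore
```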



