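Recently our Sentry instance stopped collecting and aggregating data. I found that several containers were stuck in the restarting state, so I checked the logs of one of them.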
docker logs --tail 100 899f1a0c17c0
The output was:
08:48:17 [INFO] arroyo.processing.processor: Closing <arroyo.backends.kafka.consumer.KafkaConsumer object at 0x7f698915e090>...
08:48:17 [INFO] arroyo.processing.processor: Partitions to revoke: [Partition(topic=Topic(name='ingest-events'), index=0)]
08:48:17 [INFO] arroyo.processing.processor: Partition revocation complete.
08:48:17 [INFO] arroyo.processing.processor: Processor terminated
Traceback (most recent call last):
File "/.venv/bin/sentry", line 4, in <module>
raise SystemExit(main())
^^^^^^
File "/usr/src/sentry/src/sentry/runner/main.py", line 149, in main
func(**kwargs)
File "/.venv/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.11/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.11/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
return f(get_current_context(), *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/src/sentry/src/sentry/runner/decorators.py", line 83, in inner
return ctx.invoke(f, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.11/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
return f(get_current_context(), *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/src/sentry/src/sentry/runner/decorators.py", line 35, in inner
return ctx.invoke(f, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.11/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/src/sentry/src/sentry/runner/commands/run.py", line 386, in basic_consumer
run_processor_with_signals(processor, consumer_name)
File "/usr/src/sentry/src/sentry/utils/kafka.py", line 46, in run_processor_with_signals
processor.run()
File "/.venv/lib/python3.11/site-packages/arroyo/processing/processor.py", line 322, in run
self._run_once()
File "/.venv/lib/python3.11/site-packages/arroyo/processing/processor.py", line 384, in _run_once
self.__message = self.__consumer.poll(timeout=1.0)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.11/site-packages/arroyo/backends/kafka/consumer.py", line 414, in poll
raise OffsetOutOfRange(str(error))
arroyo.errors.OffsetOutOfRange: KafkaError{code=_AUTO_OFFSET_RESET,val=-140,str="fetch failed due to requested offset not available on the broker: Broker: Offset out of range (broker 1001)"}
The error that stands out is:
fetch failed due to requested offset not available on the broker: Broker: Offset out of range (broker 1001)
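Roughly, this means the consumer could not read a message at the offset Kafka had stored for it. Most likely the older messages in Kafka were lost for some reason, so the stored offset no longer points at anything and the consumer fails. After some research I found the following solution; the core idea is to reset the offsets of the affected Kafka topics.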
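The important part is working out which topic is affected. Going back to the log output above, one line contains Partition(topic=Topic(name='ingest-events'), index=0). That is the key detail: it tells us that the offset to reset belongs to the topic ingest-events.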
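When several consumers are restarting, picking the topic name out of each log by eye gets tedious. Below is a minimal sketch of how it could be extracted automatically. The function name extract_topics is my own (hypothetical), and it assumes the log lines use the same Partition(topic=Topic(name='...'), index=N) format shown above.

```shell
# Hypothetical helper: read log text on stdin and print the unique topic
# names found in "Topic(name='...')" fragments.
extract_topics() {
  grep -o "Topic(name='[^']*')" \
    | sed "s/^Topic(name='//; s/')\$//" \
    | sort -u
}

# Usage, with the container ID from this post:
# docker logs --tail 100 899f1a0c17c0 2>&1 | extract_topics
```

The 2>&1 in the usage line is there because docker logs replays whatever the container wrote to stderr on stderr.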
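Now for the reset itself. First stop all the containers and leave only kafka running; otherwise the reset may fail with: Assignments can only be reset if the group 'group' is inactive, but the current state is Stable. Then open a shell in the kafka container and check which consumer group ingest-events might belong to.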
docker exec -it 98a4adc6acdb bash
kafka-consumer-groups --bootstrap-server kafka:9092 --list
In the list, ingest-consumer looks like the most likely candidate, so inspect that group.
kafka-consumer-groups --bootstrap-server kafka:9092 --group ingest-consumer --describe
The output confirms that the topic is indeed consumed by this group.
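Instead of guessing the group from its name, the lookup can be scripted. This is only a sketch: groups_for_topic is a hypothetical helper of mine, the BOOTSTRAP variable is an assumption that defaults to the kafka:9092 address used above, and it relies on the topic name appearing as its own whitespace-separated field in the --describe output.

```shell
# Hypothetical helper: print every consumer group whose --describe output
# contains the given topic as an exact, whitespace-separated field.
groups_for_topic() {
  topic=$1
  kafka-consumer-groups --bootstrap-server "${BOOTSTRAP:-kafka:9092}" --list \
    | while read -r group; do
        [ -n "$group" ] || continue
        if kafka-consumer-groups --bootstrap-server "${BOOTSTRAP:-kafka:9092}" \
             --group "$group" --describe 2>/dev/null </dev/null \
           | awk -v t="$topic" '{ for (i = 1; i <= NF; i++) if ($i == t) found = 1 }
                                END { exit !found }'; then
          echo "$group"
        fi
      done
}

# Usage, inside the kafka container:
# groups_for_topic ingest-events
```

Matching on an exact field rather than a substring keeps a topic such as ingest-events from also matching longer topic names that merely start with the same text.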
Now reset the topic offsets.
kafka-consumer-groups --bootstrap-server 127.0.0.1:9092 --group ingest-consumer --topic ingest-events --reset-offsets --to-latest --execute
kafka-consumer-groups --bootstrap-server 127.0.0.1:9092 --group ingest-consumer --topic ingest-transactions --reset-offsets --to-latest --execute
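Note that --to-latest moves the group to the end of each partition, so any messages still waiting in those topics are skipped. Because of that, it can be worth previewing the reset first: kafka-consumer-groups accepts --dry-run in place of --execute and then only prints the offsets it would set. The sketch below prints the commands for review rather than running them. print_reset_commands is a hypothetical helper, and BOOTSTRAP is an assumed variable that defaults to the address used above.

```shell
# Hypothetical helper: print (not run) one reset command per topic.
# Arguments: GROUP MODE TOPIC..., where MODE is --dry-run or --execute.
print_reset_commands() {
  group=$1
  mode=$2
  shift 2
  for topic in "$@"; do
    printf 'kafka-consumer-groups --bootstrap-server %s --group %s --topic %s --reset-offsets --to-latest %s\n' \
      "${BOOTSTRAP:-127.0.0.1:9092}" "$group" "$topic" "$mode"
  done
}

# Preview first; switch to --execute once the planned offsets look right.
print_reset_commands ingest-consumer --dry-run ingest-events ingest-transactions
```

The printed lines can be pasted into the shell inside the kafka container, or piped to sh once they have been checked.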
Other articles that helped with this problem:
https://juejin.cn/post/7409523890853265434
https://github.com/getsentry/self-hosted/issues/478