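Raft leader node reports ERROR_TYPE_STATE_MACHINE; the services holding all persistent instances on that node become empty and cannot be found in the console, while other nodes can still find them #7887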

Closed
zz630 opened this issue Mar 4, 2022 · 1 comment
Labels
status/invalid This doesn't seem right

Comments


zz630 commented Mar 4, 2022

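Leader node A reports the error: Error [type=ERROR_TYPE_STATE_MACHINE, status=Status[EBUSY<1009>: FSMCaller is overload.]]
Storage is on Ceph: 5 nodes share one storage backend, each node keeps its data in a directory named after its own IP, and there are 20,000 persistent instances in total. At first, based on https://github.com/sofastack/sofa-jraft/issues/472, I suspected the state machine was overloaded. After I restarted that node, the nodes re-elected a leader, and both the state and the data were normal. I then registered one more persistent instance; some time later, none of the persistent instances on the new leader B could be found, and the jraft log reported the following error.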
2022-03-04 12:33:12,870 ERROR Encountered an error=Status[EBUSY<1009>: FSMCaller is overload.] on StateMachine com.alibaba.nacos.core.distributed.raft.NacosStateMachine, it's highly recommended to implement this method as raft stops working since some error occurs, you should figure out the cause and repair or remove this node.

com.alipay.sofa.jraft.error.RaftException: FSMCaller is overload.
at com.alipay.sofa.jraft.core.FSMCallerImpl.enqueueTask(FSMCallerImpl.java:236)
at com.alipay.sofa.jraft.core.FSMCallerImpl.onCommitted(FSMCallerImpl.java:245)
at com.alipay.sofa.jraft.core.BallotBox.commitAt(BallotBox.java:137)
at com.alipay.sofa.jraft.core.Replicator.onAppendEntriesReturned(Replicator.java:1428)
at com.alipay.sofa.jraft.core.Replicator.onRpcReturned(Replicator.java:1258)
at com.alipay.sofa.jraft.core.Replicator$4.run(Replicator.java:1589)
at com.alipay.sofa.jraft.rpc.impl.AbstractClientService$1.complete(AbstractClientService.java:241)
at ...MpscSingleThreadExecutor.lambda$doStartWorker$3(MpscSingleThreadExecutor.java:263)
at java.lang.Thread.run(Thread.java:748)
I see that JRaftServer calls nodeOptions.setEnableMetrics(true); but I could not find where the metrics are output.
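For readers unfamiliar with the error in the stack trace above: the trace shows the exception being raised from an enqueue step (FSMCallerImpl.enqueueTask), and the log says raft stops working once the error occurs. The sketch below is a toy model of that behaviour, assuming a bounded queue with a non-blocking enqueue. It is not JRaft's code; the class OverloadSketch and all of its methods are hypothetical names used only for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Toy model of a bounded apply queue. Hypothetical; not JRaft's implementation. */
public class OverloadSketch {
    private final BlockingQueue<Long> queue;
    // Once set, the error is sticky: no further work is accepted.
    private volatile String error;

    OverloadSketch(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Producer side: non-blocking enqueue of a committed log index. */
    boolean onCommitted(long index) {
        if (error != null) {
            return false; // already failed, reject everything
        }
        if (!queue.offer(index)) {
            // Queue is full because the consumer cannot keep up.
            error = "EBUSY: apply queue is full at index " + index;
            return false;
        }
        return true;
    }

    /** Consumer side: apply one queued entry (stands in for a state machine write). */
    boolean applyOne() {
        return queue.poll() != null;
    }

    String error() {
        return error;
    }

    public static void main(String[] args) {
        OverloadSketch fsm = new OverloadSketch(4);
        for (long i = 1; i <= 10; i++) {
            if (i % 3 == 0) {
                fsm.applyOne(); // slow consumer: drains one entry per three commits
            }
            if (!fsm.onCommitted(i)) {
                System.out.println("rejected: " + fsm.error());
                break;
            }
        }
    }
}
```

In this model, anything that slows the consumer (for example, slow storage under the state machine) lets the queue fill up, after which the error stays set until the node is restarted, which matches the restart-then-recur pattern described above.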

  • Version [e.g. nacos-server 2.0.3]

zz630 commented Jul 8, 2022

Ceph I/O was slow. After switching to local disks, the problem has not occurred again.
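One rough way to compare two storage locations, such as a Ceph-backed mount and a local disk, is to time small synchronous writes in each directory. The following is a minimal sketch using only the Java standard library. The class name FsyncProbe, the method maxSyncMicros, and the write size and round count are assumptions for illustration, not part of Nacos or JRaft, and the numbers it prints are only a coarse indicator.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: times write+force cycles in a directory. */
public class FsyncProbe {

    /** Returns the slowest write+force cycle, in microseconds, over the given rounds. */
    static long maxSyncMicros(Path dir, int rounds, int bytesPerWrite) {
        Path file = null;
        long worst = 0;
        try {
            file = Files.createTempFile(dir, "fsync-probe", ".tmp");
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
                ByteBuffer buf = ByteBuffer.allocate(bytesPerWrite);
                for (int i = 0; i < rounds; i++) {
                    buf.clear(); // reset position so the whole buffer is written again
                    long start = System.nanoTime();
                    while (buf.hasRemaining()) {
                        ch.write(buf);
                    }
                    ch.force(true); // flush data and metadata to the storage device
                    worst = Math.max(worst, (System.nanoTime() - start) / 1_000);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (file != null) {
                try {
                    Files.deleteIfExists(file); // leave no probe file behind
                } catch (IOException ignored) {
                    // best-effort cleanup
                }
            }
        }
        return worst;
    }

    public static void main(String[] args) {
        // Pass the directory to test as the first argument; defaults to the temp dir.
        Path dir = Paths.get(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        System.out.println("worst write+force latency (us): " + maxSyncMicros(dir, 100, 4096));
    }
}
```

Running it once against the shared storage directory and once against a local directory gives a quick side-by-side view of worst-case sync latency.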

KomachiSion added the status/invalid (This doesn't seem right) label and removed the contribution welcome label on Sep 22, 2022