Cluster of 400 applications and 4,000 pods generating 3 billion segments per day. Looking for advice. #12235
liuxinagxiang started this conversation in General
Replies: 2 comments
-
A healthy Elasticsearch cluster is not necessarily powerful enough. Check the self-observability data, especially the OAP flush metrics. I believe it is too slow, so everything eventually ends up blocked.
-
Is your SkyWalking deployed on Kubernetes? If so, could you share the relevant documentation? I have also been making similar changes recently.
-
My project generates about 3 billion segments a day. The current architecture is skywalking-agent --> Kafka --> skywalking-L1-L2 --> ES.
The scale is as follows: 400 microservices, nearly 4,000 pods, 3 Kafka nodes, 6 skywalking-L1 nodes (Xmx8G Xms8G Xmn3G), 6 skywalking-L2 nodes (Xmx8G Xms8G Xmn3G), and 6 Elasticsearch nodes (Xmx16G Xms16G Xmn6G).
Versions: Kafka 3.7.0, SkyWalking 9.2, Elasticsearch 7.17.18.
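As a rough sanity check on the numbers above, the daily volume can be converted into average rates. This is only a sketch: it assumes the load is spread evenly over the day and evenly across nodes and pods, which real traffic is not, so peak rates will be higher than these averages. The function and variable names are made up for illustration.

```python
# Back-of-the-envelope throughput estimate from the figures in this post.
# Assumption: perfectly even load over 24 hours and across all nodes/pods.

SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(segments_per_day: int) -> float:
    """Average segments per second over a full day."""
    return segments_per_day / SECONDS_PER_DAY


def share_per_unit(rate: float, units: int) -> float:
    """Average rate handled (or produced) by each of `units` equal members."""
    return rate / units


segments_per_day = 3_000_000_000  # "about 3 billion segments a day"
l1_nodes = 6                      # skywalking-L1 nodes
pods = 4_000                      # "nearly 4,000 pods"

cluster_rate = average_rate(segments_per_day)
per_l1_node = share_per_unit(cluster_rate, l1_nodes)
per_pod = share_per_unit(cluster_rate, pods)

print(f"cluster average : {cluster_rate:,.0f} segments/s")
print(f"per L1 OAP node : {per_l1_node:,.0f} segments/s")
print(f"per pod         : {per_pod:.2f} segments/s")
```

On average this works out to roughly 35 thousand segments per second for the whole cluster, or a little under 6 thousand per L1 node, before accounting for any daily peaks.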
The current problem is that the Kafka and Elasticsearch clusters look normal, but the SkyWalking L1 and L2 clusters keep reporting errors and the SkyWalking OAP service consumes very slowly. After each restart, Kafka data is consumed normally for the first ten minutes, and then various timeout errors are reported. So I have moved L2 to a containerized deployment in the k8s cluster to avoid manually restarting OAP every time 😅
I want to know how to plan the cluster according to the size of my project. Which configurations can be optimized? Does anyone have any suggestions?
I have been referring to this article recently: https://skywalking.apache.org/zh/2022-08-30-pingan-jiankang/