Performance Testing
- Server number: 7; nimbus, web UI, and ZooKeeper run on one node, supervisors run on the remaining 6 nodes
- Hardware:
CPU: 24 cores (Intel(R) Xeon(R) CPU E5-2430 0 @ 2.20GHz)
Memory: 96 GB
Disk: 2 TB (SATA, 7200 rpm)
Network Adapter: 2 × 1 GbE
- Software:
OS: Red Hat Enterprise Linux 5.7 x86_64
Java: 1.6.0_32 (64-bit)
Storm: Storm-0.9.2-netty-p297
JStorm: jstorm-0.9.4.1
- JStorm testing source code
- Storm testing source code; the JStorm testing code is the same as Storm's except for the part that reads the ZooKeeper node, so there are two sets of testing code.
- JStorm is faster than Storm
- Throughput depends on the ratio of task number to worker number
- The bigger the message size, the higher the throughput, as long as CPU and network bandwidth are sufficient
- The simpler the topology, the higher the throughput
Keep executor/task parallelism unchanged and only vary the worker number.
Spout number: 18
Bolt number: 18
Acker number: 18
Message Size: 10 bytes
Max.spout.pending: 10000
topology.performance.metrics: false
topology.alimonitor.metrics.post: false
disruptor.use.sleep: false
Topology Level: one kind of spout, one kind of bolt, shuffle grouping
Remarks: in the chart, the X-axis is the worker number and the Y-axis is the spout emit count per 10 seconds. A sketch of the corresponding topology setup follows.
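Below is a minimal sketch of how such a fixed-parallelism test topology could be wired up with the Storm 0.9.x Java API. The class names (WorkerScalingTest, FixedPayloadSpout, ForwardBolt) and the topology name are placeholders rather than the actual benchmark code; the parallelism and configuration values are taken from the list above.

```java
import java.util.Map;

import backtype.storm.Config;
import backtype.storm.StormSubmitter;
import backtype.storm.spout.SpoutOutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.topology.base.BaseRichSpout;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

public class WorkerScalingTest {

    // Spout emitting a fixed 10-byte payload with a message id, so that ackers
    // track the tuple and max.spout.pending actually throttles the spout.
    public static class FixedPayloadSpout extends BaseRichSpout {
        private SpoutOutputCollector collector;
        private long msgId = 0;
        private final String payload = "aaaaaaaaaa";   // 10 bytes

        @Override
        public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
            this.collector = collector;
        }

        @Override
        public void nextTuple() {
            collector.emit(new Values(payload), msgId++);
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("payload"));
        }
    }

    // Pass-through bolt; BaseBasicBolt anchors and acks automatically after execute().
    public static class ForwardBolt extends BaseBasicBolt {
        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
            collector.emit(new Values(input.getString(0)));
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("payload"));
        }
    }

    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        // One kind of spout, one kind of bolt, shuffle grouping, 18 tasks each.
        builder.setSpout("spout", new FixedPayloadSpout(), 18);
        builder.setBolt("bolt", new ForwardBolt(), 18).shuffleGrouping("spout");

        Config conf = new Config();
        conf.setNumAckers(18);
        conf.setMaxSpoutPending(10000);
        // Only the worker number changes between runs, e.g. 6, 12, 18, 24.
        conf.setNumWorkers(Integer.parseInt(args[0]));
        // Switches listed in the test configuration above (JStorm-specific keys).
        conf.put("topology.performance.metrics", false);
        conf.put("topology.alimonitor.metrics.post", false);
        conf.put("disruptor.use.sleep", false);

        StormSubmitter.submitTopology("throughput-test", conf, builder.createTopology());
    }
}
```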
- JStorm's throughput is higher than Storm's
- JStorm uses less CPU
- Both JStorm and Storm reach their best throughput with 12 workers (18 spouts + 18 bolts + 18 ackers = 54 tasks, about 4.5 tasks per worker); increasing the worker number beyond that reduces throughput. The root causes are as follows:
- The higher the task/worker ratio, the more data is passed within a single JVM, which avoids network cost and worker-to-worker serialization/deserialization.
- When the task/worker ratio is too high, thread context switching costs a lot of CPU, so throughput does not simply increase as the ratio grows.
- With a fixed number of servers, increasing the worker number also increases the per-worker common threads (total dispatch thread, total sending thread, Netty client threads). These threads cost extra CPU, so adding workers does not simply improve performance.
Vary the message size and observe how the throughput changes; a spout sketch with a configurable payload size follows the configuration list below.
Spout number: 18
Bolt number: 18
Acker number: 18
Worker number:
Max.spout.pending: 10000
topology.performance.metrics: false
topology.alimonitor.metrics.post: false
disruptor.use.sleep: false
Topology Level: one kind of spout, one kind of bolt, shuffle grouping
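For the message-size test, the same topology layout can be reused with the spout payload size made configurable, so one jar can be submitted once per message size. Below is a minimal sketch; the class name SizedPayloadSpout and the test.message.size config key are hypothetical and not part of Storm/JStorm or the original test code.

```java
import java.util.Arrays;
import java.util.Map;

import backtype.storm.spout.SpoutOutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichSpout;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;

// Spout whose payload size is taken from the topology config, so the same
// topology can be run with 10-byte, 100-byte, 1000-byte ... messages.
public class SizedPayloadSpout extends BaseRichSpout {
    private SpoutOutputCollector collector;
    private byte[] payload;
    private long msgId = 0;

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        this.collector = collector;
        int size = ((Number) conf.get("test.message.size")).intValue(); // hypothetical key
        this.payload = new byte[size];
        Arrays.fill(this.payload, (byte) 'a');
    }

    @Override
    public void nextTuple() {
        // Emit with a message id so ackers and max.spout.pending apply.
        collector.emit(new Values(payload), msgId++);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("payload"));
    }
}
```

The size would then be set per run with conf.put("test.message.size", 10) (or 100, 1000, ...) before submitting the topology.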
- Increasing the message size improves both JStorm and Storm throughput, as long as CPU and network bandwidth are sufficient
- JStorm's throughput is normally higher than Storm's
JStorm 0.9.0 performance is very good: the maximum sending speed of a single worker is 110,000 QPS with Netty, and 120,000 QPS with ZeroMQ.
Conclusion
- With Netty, JStorm 0.9.0 is 10% faster than Storm 0.9.0, and the JStorm Netty plugin is stable while Storm's Netty plugin is unstable.
- With ZeroMQ, JStorm 0.9.0 is 30% faster than Storm 0.9.0.
Reason
- ZeroMQ usage saves one memory copy.
- Added a deserialization thread.
- Rewrote the sampling code, significantly reducing the sampling overhead.
- Optimized the ack code.
- Optimized the performance of the buffer map.
- Java is lower-level than Clojure.
Testing
Testing Example
The testing example is https://github.com/longdafeng/storm-examples
Testing Environment
Five physical machines, each with 16 cores and 98 GB of memory
uname -a :
Linux dwcache1 2.6.32-220.23.1.tb735.el5.x86_64 #1 SMP Tue Aug 14 16:03:04 CST 2012 x86_64 x86_64 x86_64 GNU/Linux
Testing Results
JStorm with Netty: spout sending QPS is 110,000
Storm with Netty: spout sending QPS is 100,000 (the screenshot shows the QPS of the upper-layer application, not including ack QPS; the spout sending QPS is exactly twice the application QPS).
JStorm with ZeroMQ: spout sending QPS is 120,000
Storm with ZeroMQ: spout sending QPS is 90,000 (the screenshot shows the QPS of the upper-layer application, not including ack QPS; the spout sending QPS is exactly twice the application QPS).