This repository has been archived by the owner on Jun 16, 2023. It is now read-only.

Performance Testing

longdafeng edited this page Sep 30, 2014 · 5 revisions

Preparation

  • Servers: 7. Nimbus, the web UI, and ZooKeeper run on one node; supervisors run on the remaining 6 nodes.
  • Hardware:
CPU: 24 cores (Intel(R) Xeon(R) CPU E5-2430 0 @ 2.20GHz)
Memory: 96 GB
Disk: 2 TB (SATA, 7200 rpm)
Network Adapter: 2 x 1 GbE
  • Software:
OS: Red Hat Enterprise Linux 5.7 x86_64
Java: java version "1.6.0_32", 64-bit
Storm: Storm-0.9.2-netty-p297
JStorm: jstorm-0.9.4.1 

Conclusion

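  1. JStorm is faster than Storm.
  2. The ratio of task number to worker number affects throughput: a higher ratio keeps more tuples inside one JVM, but too high a ratio costs CPU in thread context switches.
  3. The larger the message size, the higher the throughput, provided CPU and network bandwidth are sufficient.
  4. The simpler the topology, the higher the throughput.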

Performance of changing worker number

Test

Keep the executor/task parallelism unchanged and vary only the worker number.

Setting

Spout number: 18
Bolt number: 18
Acker number: 18
Message Size: 10 bytes
Max.spout.pending: 10000
topology.performance.metrics: false
topology.alimonitor.metrics.post: false
disruptor.use.sleep: false
Topology Level: one kind of spout, one kind of bolt, shuffle grouping
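
As a rough sketch, the settings above can be written as a configuration map that is reused across runs, with only the worker number changing. The keys `topology.max.spout.pending`, `topology.acker.executors`, and `topology.workers` are my assumed mapping of the labels above onto Storm 0.9 configuration names; the helper `settings_for` and the example worker counts are hypothetical and are not taken from the test itself.

```python
# Settings from the list above, written as a Storm-style configuration map.
# Keys marked "assumed" map this page's labels onto Storm 0.9 config names;
# the page itself does not spell these keys out.
BASE_SETTINGS = {
    "topology.max.spout.pending": 10000,        # assumed key for "Max.spout.pending"
    "topology.acker.executors": 18,             # assumed key for "Acker number"
    "topology.performance.metrics": False,      # as listed
    "topology.alimonitor.metrics.post": False,  # as listed
    "disruptor.use.sleep": False,               # as listed
}

SPOUT_PARALLELISM = 18   # "Spout number"
BOLT_PARALLELISM = 18    # "Bolt number"
MESSAGE_SIZE_BYTES = 10  # "Message Size"


def settings_for(worker_count):
    """Return a copy of the base settings with the worker count filled in."""
    if worker_count < 1:
        raise ValueError("worker_count must be at least 1")
    settings = dict(BASE_SETTINGS)
    settings["topology.workers"] = worker_count  # assumed key for the worker number
    return settings


# Hypothetical sweep: the page does not list which worker counts were tested.
EXAMPLE_WORKER_COUNTS = [6, 12, 18]
runs = [settings_for(w) for w in EXAMPLE_WORKER_COUNTS]
```

Copying the base map for every run keeps the parallelism and the metrics switches identical, so the worker number is the only variable between runs.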

Test Result

Throughput VS workers

Remarks: The horizontal axis is the worker number; the vertical axis is the number of tuples the spouts emit every 10 seconds.

CPU Usage VS workers

Conclusion

  1. JStorm's throughput is higher than Storm's.
  2. JStorm uses less CPU.
  3. Both JStorm and Storm reach their best throughput at 12 workers; with more workers, throughput drops. The root causes are listed in items 4 to 6.
  4. The higher the ratio of task number to worker number, the more data is passed within a single JVM, which avoids network cost and serialization/deserialization work.
  5. When the ratio of task number to worker number is too high, thread context switching consumes a lot of CPU, so throughput does not simply keep rising as the ratio grows.
  6. With a fixed number of servers, adding workers also adds each worker's common threads, such as the dispatch thread, the sending thread, and the Netty client threads. These threads consume CPU, so adding workers does not simply improve performance.
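
The effect of the task/worker ratio on in-JVM traffic can be illustrated with a small model. This is only a sketch under stated assumptions: tasks are placed on workers round-robin (not JStorm's actual scheduler), shuffle grouping picks each bolt task with equal probability, and acker tasks are ignored. The function names `assign_round_robin` and `local_fraction` are hypothetical.

```python
from fractions import Fraction


def assign_round_robin(task_count, worker_count):
    """Place tasks 0..task_count-1 on workers in round-robin order (assumed placement)."""
    return [task % worker_count for task in range(task_count)]


def local_fraction(spouts, bolts, workers):
    """Average share of shuffle-grouped tuples whose target bolt task
    lives in the same worker (JVM) as the emitting spout task."""
    spout_workers = assign_round_robin(spouts, workers)
    bolt_workers = assign_round_robin(bolts, workers)
    total = Fraction(0)
    for spout_worker in spout_workers:
        # Bolt tasks sharing a worker with this spout task need no network hop.
        same_jvm = sum(1 for bolt_worker in bolt_workers if bolt_worker == spout_worker)
        total += Fraction(same_jvm, bolts)
    return total / spouts


if __name__ == "__main__":
    for workers in (1, 6, 12, 18):
        share = local_fraction(18, 18, workers)
        print(f"{workers:2d} workers: {float(share):.3f} of tuples stay in-JVM")
```

Under this model, the in-JVM share shrinks as workers are added, which matches the reasoning in item 4. The model does not capture the context-switch and common-thread costs in items 5 and 6, so it cannot by itself explain why 12 workers performed best.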

Test Message Size

Test

Set different message sizes and check how the throughput changes.

Setting

Spout number: 18
Bolt number: 18
Acker number: 18
Worker number: 
Max.spout.pending: 10000
topology.performance.metrics: false
topology.alimonitor.metrics.post: false
disruptor.use.sleep: false
Topology Level: one kind of spout, one kind of bolt, shuffle grouping

Test Result

Throughput VS Message Size

CPU Usage VS Message Size

Conclusion

  1. Increasing the message size improves the throughput of both JStorm and Storm, as long as CPU and network bandwidth are sufficient.
  2. JStorm's throughput is normally higher than Storm's.
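
To make the "network bandwidth is sufficient" condition concrete, the sketch below estimates how much of a server's link capacity the tuple payload alone would use. It assumes the two 1 Gb Ethernet adapters per server from the hardware list, counts payload bytes only (no serialization, framing, or ack overhead), and uses a hypothetical tuple rate of 100,000 per second; none of these numbers are measurements from this test.

```python
LINK_COUNT = 2                        # two 1 Gb Ethernet adapters per server
LINK_BITS_PER_SECOND = 1_000_000_000  # nominal rate of one adapter


def payload_bytes_per_second(tuples_per_second, message_size_bytes):
    """Payload volume only; ignores serialization and protocol overhead."""
    return tuples_per_second * message_size_bytes


def link_utilization(tuples_per_second, message_size_bytes):
    """Fraction of the nominal link capacity consumed by tuple payload."""
    capacity_bytes = LINK_COUNT * LINK_BITS_PER_SECOND / 8
    return payload_bytes_per_second(tuples_per_second, message_size_bytes) / capacity_bytes


if __name__ == "__main__":
    hypothetical_rate = 100_000  # tuples per second, for illustration only
    for size in (10, 100, 1000):
        used = link_utilization(hypothetical_rate, size)
        print(f"{size:5d}-byte messages: {used:.1%} of link capacity")
```

With small messages the payload uses only a tiny share of the link, so the per-tuple processing cost dominates. As messages grow, the same tuple rate moves far more bytes, until the link or the CPU becomes the limit.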

JStorm 0.9.0 performs very well: the maximum sending speed of a single worker is 110,000 QPS with Netty and 120,000 QPS with ZeroMQ.

Conclusion

  • With Netty, JStorm 0.9.0 is 10% faster than Storm 0.9.0. The JStorm Netty plugin is stable, while Storm's Netty plugin is unstable.
  • With ZeroMQ, JStorm 0.9.0 is 30% faster than Storm 0.9.0.

Reason

  • With ZeroMQ, one memory copy is eliminated.
  • Deserialization threads were added.
  • The sampling code was rewritten, significantly reducing the impact of sampling.
  • The ack code was optimized.
  • The performance of the buffer map was optimized.
  • Java is lower-level than Clojure.

Testing

Testing Example

The testing example is at https://github.com/longdafeng/storm-examples

Testing Environment

Five physical machines, each with 16 cores and 98 GB of memory

uname -a :
Linux dwcache1 2.6.32-220.23.1.tb735.el5.x86_64 #1 SMP Tue Aug 14 16:03:04 CST 2012 x86_64 x86_64 x86_64 GNU/Linux

Testing Results

JStorm with netty, Spout sending QPS is 110,000

jstorm.0.9.0.netty

Storm with netty, Spout sending QPS is 100,000 (the screenshot shows the QPS of the upper-layer application, which excludes ack traffic; the Spout sending QPS is exactly twice the application QPS).

storm.0.9.0.netty

JStorm with zeromq, Spout sending QPS is 120,000

jstorm.0.9.0.zmq

Storm with zeromq, Spout sending QPS is 90,000 (the screenshot shows the QPS of the upper-layer application, which excludes ack traffic; the Spout sending QPS is exactly twice the application QPS).

storm.0.9.0.zmq
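
The percentages in the conclusion can be checked against the four QPS figures above. The sketch below does that arithmetic. The names `RESULTS`, `speedup_percent`, and `spout_qps_from_application_qps` are hypothetical, and the doubling rule is simply the relationship stated for the Storm screenshots.

```python
# Spout sending QPS as reported in the testing results above.
RESULTS = {
    "netty": {"jstorm": 110_000, "storm": 100_000},
    "zeromq": {"jstorm": 120_000, "storm": 90_000},
}


def speedup_percent(jstorm_qps, storm_qps):
    """How much faster JStorm is than Storm, in percent."""
    return (jstorm_qps / storm_qps - 1) * 100


def spout_qps_from_application_qps(application_qps):
    """The Storm screenshots exclude ack traffic; the page states that the
    spout sending QPS is exactly twice the application QPS."""
    return 2 * application_qps


if __name__ == "__main__":
    for transport, qps in RESULTS.items():
        gain = speedup_percent(qps["jstorm"], qps["storm"])
        print(f"{transport}: JStorm is {gain:.1f}% faster")
```

With these inputs the Netty case comes out at 10%, and the ZeroMQ case at roughly 33%, which the conclusion reports as 30%.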
