Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

performance degradation when no handler #5257

Closed
ckoutsouridis opened this issue Jul 26, 2024 · 11 comments
Closed

performance degradation when no handler #5257

ckoutsouridis opened this issue Jul 26, 2024 · 11 comments
Labels

Comments

@ckoutsouridis
Copy link

ckoutsouridis commented Jul 26, 2024

i have noticed that when we publish messages on addresses with no handlers, it is actually more expensive than publishing messages on addresses with simple consumers.

the bellow example demonstrates

    @Test
    void noHandlers() {

        var vertx = Vertx.vertx();

        var eventBus = vertx.eventBus();

        AtomicLong atomicLong = new AtomicLong();
        eventBus.localConsumer("test", event -> {
            atomicLong.incrementAndGet();
        });
        eventBus.publish("test", "payload");

        eventBus.publish("test", "payload");

        eventBus.publish("test", "payload");

        long start = System.currentTimeMillis();
        for (int i = 0; i < 100000; i++) {
            eventBus.publish("test", "payload");
        }
        long end = System.currentTimeMillis();
        System.out.println("time with consumer: "+ (end - start));


        start = System.currentTimeMillis();
        for (int i = 0; i < 100000; i++) {
            eventBus.publish("no addr", "payload");
        }
        end = System.currentTimeMillis();
        System.out.println("time with consumer: "+ (end - start));
    }

which outputs on my machine

time with consumer: 185
time with no consumer: 895

i believe this boils down to the way the no handlers is modelled by throwing exceptions and concatenating strings in EventBusImpl.deliverMessageLocally

    ConcurrentCyclicSequence<HandlerHolder> handlers = handlerMap.get(msg.address());
    boolean messageLocal = isMessageLocal(msg);
    if (handlers != null) {
       ...
      return null;
    } else {
      if (metrics != null) {
        metrics.messageReceived(msg.address(), !msg.isSend(), messageLocal, 0);
      }
      return new ReplyException(ReplyFailure.NO_HANDLERS, "No handlers for address " + msg.address);
    }
@tsegismont
Copy link
Contributor

Would you be able to create a microbenchmark with JMH (we have tests like this already in Vert.x core)?

With such a test it would be possible to validate changes making the no_handlers case more efficient.

@jpforevers
Copy link

jpforevers commented Dec 3, 2024

Amazing, someone has already encountered this problem and raised it.This is indeed a problem. I have already proven it locally.

I have an MQTT broker developed based on Vert.x MQTT, which passes event messages through the event bus. Event messages sent through the event bus may not necessarily have handlers to handle, and without handlers, the overall performance of the MQTT broker experiences a significant decline.

It took me a long time to initially identify that the performance issue was caused by sending event messages without a handler. Specifically, in this commit: vxmq. In this commit, I cancelled the registration of empty handlers for all event bus addresses, resulting in a sharp decline in performance.

@tsegismont @ckoutsouridis

@jpforevers
Copy link

jpforevers commented Dec 3, 2024

Amazing, someone has already encountered this problem and raised it.This is indeed a problem. I have already proven it locally.

I have an MQTT broker developed based on vert. x MQTT, which passes event messages through the event bus. Event messages sent through the event bus may not necessarily have handlers to handle, and without handlers, the overall performance of the MQTT broker experiences a significant decline.

It took me a long time to initially identify that the performance issue was caused by sending event messages without a handler. Specifically, in this commit: vxmq. In this commit, I cancelled the registration of empty handlers for all event bus addresses, resulting in a sharp decline in performance.

@tsegismont @ckoutsouridis

Ironically, the name of this commit is 'Optimize performance: remove an empty event consumption that may be due to performance issues', because I thought removing the empty handlers would improve performance, but it actually reduced performance. HaHa.

@jpforevers
Copy link

Amazing, someone has already encountered this problem and raised it.This is indeed a problem. I have already proven it locally.

I have an MQTT broker developed based on Vert.x MQTT, which passes event messages through the event bus. Event messages sent through the event bus may not necessarily have handlers to handle, and without handlers, the overall performance of the MQTT broker experiences a significant decline.

It took me a long time to initially identify that the performance issue was caused by sending event messages without a handler. Specifically, in this commit: vxmq. In this commit, I cancelled the registration of empty handlers for all event bus addresses, resulting in a sharp decline in performance.

@tsegismont @ckoutsouridis

It should be clarified that I have only confirmed that canceling the registration of empty handlers for all eventbus addresses will result in performance degradation, but I do not know the specific reason

@vietj
Copy link
Member

vietj commented Dec 4, 2024

anyone can share a benchmark ?

@jpforevers
Copy link

anyone can share a benchmark ?

I am testing locally and will provide the benchmark asap.

@jpforevers
Copy link

jpforevers commented Dec 17, 2024

anyone can share a benchmark ?

This is my benchmark testing project: eventbus-with-without-handler-benchmark. Start by: mvn clean compile exec:java, this is result on my computer:

Benchmark                                             (mode)   Mode  Cnt       Score      Error  Units
EventBusBenchmark.sendToAddressWithHandler     non-clustered  thrpt   25  151029.036 ± 2798.349  ops/s
EventBusBenchmark.sendToAddressWithHandler         clustered  thrpt   25  143005.199 ± 2476.310  ops/s
EventBusBenchmark.sendToAddressWithoutHandler  non-clustered  thrpt   25  150226.319 ± 3519.441  ops/s
EventBusBenchmark.sendToAddressWithoutHandler      clustered  thrpt   25   35797.632 ±  633.378  ops/s

The conclusion is that in cluster mode, there is a significant decrease in performance when publishing data to eventbus addresses without registered handlers. This is consistent with the performance of my MQTT broker code.

@vietj @tsegismont @ckoutsouridis

@jpforevers
Copy link

@vietj @tsegismont Please checkout my benchmark.

@tsegismont tsegismont added question and removed bug labels Jan 3, 2025
@tsegismont
Copy link
Contributor

@ckoutsouridis I tried the test from the issue's description and got opposite results. Anyway, this is not an actual benchmark so we can draw conclusions from it.

I wrote a simple JMH benchmark:

package io.vertx.benchmarks;

import io.vertx.core.Vertx;
import io.vertx.core.eventbus.EventBus;
import io.vertx.core.eventbus.Message;
import io.vertx.core.eventbus.MessageConsumer;
import org.openjdk.jmh.annotations.*;

import java.util.concurrent.TimeUnit;

@Warmup(iterations = 10, time = 200, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 10, time = 200, timeUnit = TimeUnit.MILLISECONDS)
@Threads(1)
@BenchmarkMode(Mode.Throughput)
@Fork(value = 2)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Thread)
public class EventBusSendAndPublishBenchmark {

  private static final String ADDRESS_WITH_CONSUMER = "foo";
  private static final String ADDRESS_WITHOUT_CONSUMER = "bar";
  private static final String MSG = "msg";

  Vertx vertx;
  EventBus eventBus;
  MessageConsumer<String> consumer;

  @Setup
  public void setup() {
    vertx = Vertx.vertx();
    eventBus = vertx.eventBus();
    consumer = eventBus.consumer(ADDRESS_WITH_CONSUMER, EventBusSendAndPublishBenchmark::consume);
  }

  @CompilerControl(CompilerControl.Mode.DONT_INLINE)
  public static void consume(Message<String> ignore) {
  }

  @TearDown
  public void tearDown() throws Exception {
    consumer.unregister().await();
    vertx.close().await();
  }

  @Benchmark
  public void publish() {
    eventBus.publish(ADDRESS_WITH_CONSUMER, MSG);
  }

  @Benchmark
  public void publishNoHandlers() {
    eventBus.publish(ADDRESS_WITHOUT_CONSUMER, MSG);
  }
}

And got these results (admittedly, on my laptop):

Benchmark                                           Mode  Cnt     Score     Error   Units
EventBusSendAndPublishBenchmark.publish            thrpt   20  1505.896 ± 913.070  ops/ms
EventBusSendAndPublishBenchmark.publishNoHandlers  thrpt   20  4902.226 ± 618.335  ops/ms

There's quite some noise, but the results don't show any degradation when there are no handlers.

@tsegismont
Copy link
Contributor

@jpforevers the problem you are talking about is different: you're pointing to the fact that, in clustered mode, it's less performant to send (or publish) when there are no handlers.

This is expected: we can cache and maintain eventbus subscription lookups when there is a handler, we cannot when there isn't (we can't cache all lookups that would result in an empty result).

@tsegismont
Copy link
Contributor

Closing the issue as it is seems there is no bug denoted by the issue

@tsegismont tsegismont closed this as not planned Won't fix, can't repro, duplicate, stale Jan 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants