Skip to content
This repository has been archived by the owner on Oct 16, 2024. It is now read-only.

Improve performance of getOperationName #230

Merged
merged 1 commit into from
May 2, 2022

Conversation

loicmathieu
Copy link
Contributor

This is a copy of my PR on the (dead) original java interceptor lib.

I recently performance load test on an application using the java interceptor to send traces to Jaeger.
Inside a performance profile, I saw a high cost (3% total CPU cost, but more than 60% of span creation+start+finish time) for the getOperationMethod due to the usage of a StringFormatter.

I did some small experiment with JHM, and using a StringBuilder with an initial capacity seems a better option (10x quicker). As creating span occurs in the hot path of an application, I think it's a good thing to remove this hotspot.

I try several approach, this seems to be the better. We can go with simple string concatenation if the code seems too complex, as since Java 9 (and the string concatenation improvement) it is as performant as using a StringBuilder.

Here is my JMH benchmark

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Thread)
public class StringFormatVsJoinerVsBuilderVsConcat {

    private Method method;

    @Setup
    public void setup() throws NoSuchMethodException {
        method = TestClass.class.getMethod("testMethod");
    }

    @Benchmark
    public String stringFormat() {
        return String.format("%s.%s", method.getDeclaringClass().getName(), method.getName());
    }

    @Benchmark
    public String stringJoiner() {
        StringJoiner stringJoiner = new StringJoiner(".");
        stringJoiner.add(method.getDeclaringClass().getName());
        stringJoiner.add(method.getName());
        return stringJoiner.toString();
    }

    @Benchmark
    public String stringBuilder() {
        StringBuilder stringBuilder = new StringBuilder();
        stringBuilder.append(method.getDeclaringClass().getName());
        stringBuilder.append('.');
        stringBuilder.append(method.getName());
        return stringBuilder.toString();
    }

    @Benchmark
    public String stringBuilderWithCapacity() {
        int capacity = method.getDeclaringClass().getName().length() + method.getName().length() + 1;
        StringBuilder stringBuilder = new StringBuilder(capacity);
        stringBuilder.append(method.getDeclaringClass().getName());
        stringBuilder.append('.');
        stringBuilder.append(method.getName());
        return stringBuilder.toString();
    }

    @Benchmark
    public String stringConcat() {
        return method.getDeclaringClass().getName() + '.' + method.getName();
    }

    public static class TestClass {
        public void testMethod() {
            //do nothing here
        }
    }
}

And the result on my laptop (Linux - Java 11 - 6 cores)

Benchmark                                                        Mode  Cnt    Score    Error  Units
StringFormatVsJoinerVsBuilderVsConcat.stringBuilder              avgt    5   59,093 ±  6,031  ns/op
StringFormatVsJoinerVsBuilderVsConcat.stringBuilderWithCapacity  avgt    5   40,480 ±  3,536  ns/op
StringFormatVsJoinerVsBuilderVsConcat.stringConcat               avgt    5   60,190 ±  5,230  ns/op
StringFormatVsJoinerVsBuilderVsConcat.stringFormat               avgt    5  543,775 ± 35,095  ns/op
StringFormatVsJoinerVsBuilderVsConcat.stringJoiner               avgt    5   62,161 ±  2,027  ns/op

@loicmathieu
Copy link
Contributor Author

See the PR on the (dead) original repo: opentracing-contrib/java-interceptors#16

@Ladicek
Copy link
Contributor

Ladicek commented Apr 26, 2022

I generally like this, but immediately thought: are the duplicate pieces of code in your new implementation any bad? I added this method to your benchmark:

    @Benchmark
    public String ladicek() {
        String className = method.getDeclaringClass().getName();
        String methodName = method.getName();
        int capacity = className.length() + methodName.length() + 1;
        return new StringBuilder(capacity).append(className).append('.').append(methodName).toString();
    }

and ran it using OpenJDK 1.8.0_312 on my desktop machine (which was completely idle, I wasn't even looged into a GUI session). These are the results:

Benchmark                                                        Mode  Cnt    Score   Error  Units
StringFormatVsJoinerVsBuilderVsConcat.ladicek                    avgt   25   16.411 ± 0.186  ns/op
StringFormatVsJoinerVsBuilderVsConcat.stringBuilder              avgt   25   90.206 ± 1.100  ns/op
StringFormatVsJoinerVsBuilderVsConcat.stringBuilderWithCapacity  avgt   25   51.072 ± 0.502  ns/op
StringFormatVsJoinerVsBuilderVsConcat.stringConcat               avgt   25   89.840 ± 0.289  ns/op
StringFormatVsJoinerVsBuilderVsConcat.stringFormat               avgt   25  432.582 ± 6.439  ns/op
StringFormatVsJoinerVsBuilderVsConcat.stringJoiner               avgt   25  104.483 ± 1.661  ns/op

Looks like the duplicate pieces are any bad and simplifying the code per above should lead to even bigger gains :-)

@loicmathieu
Copy link
Contributor Author

@Ladicek this looks interesting ! I can incorporate your change in my PR and run it again on Java 11 as it may behave differently than 8.

I forgot that even if the method starts with get they may not be trivial getters ;) the reflection API always deserve caregfully crafted usage.

@Ladicek
Copy link
Contributor

Ladicek commented Apr 26, 2022

I'm running the benchmark on JDK 11 now, but it takes a while :-)

@Ladicek
Copy link
Contributor

Ladicek commented Apr 26, 2022

Ah and Class.getName() and Method.getName() actually are trivial getters, but for some reason, it seems the JIT doesn't optimize the code so well. I don't know if the usage of reflection prevents some optimizations or whether I'm just running the benchmark wrongly :-)

@loicmathieu
Copy link
Contributor Author

Class.getName() is not trivial, and there is a call to getDeclaringClass also but I won't expect such big perf differencies here as the JIT should properly inline everything. Maybe there is a special treatment for Class and Method.
I'm running the benchmark on my laptop too, wait and see.

@loicmathieu
Copy link
Contributor Author

Here are the result on JDK 11, the new impl is in stringBuilderWithCapacityImproved and there is no noticeable performance improvement on my setup. However, I can update the code if you want?

Benchmark                                                                Mode  Cnt    Score   Error  Units
StringFormatVsJoinerVsBuilderVsConcat.stringBuilder                      avgt   25   62,927 ± 0,720  ns/op
StringFormatVsJoinerVsBuilderVsConcat.stringBuilderWithCapacity          avgt   25   44,191 ± 0,067  ns/op
StringFormatVsJoinerVsBuilderVsConcat.stringBuilderWithCapacityImproved  avgt   25   43,504 ± 0,086  ns/op
StringFormatVsJoinerVsBuilderVsConcat.stringConcat                       avgt   25   63,333 ± 0,244  ns/op
StringFormatVsJoinerVsBuilderVsConcat.stringFormat                       avgt   25  582,902 ± 3,452  ns/op
StringFormatVsJoinerVsBuilderVsConcat.stringJoiner                       avgt   25   63,200 ± 0,134  ns/op

@Ladicek
Copy link
Contributor

Ladicek commented Apr 26, 2022

Here are my results on OpenJDK 11.0.14.1:

Benchmark                                                        Mode  Cnt    Score    Error  Units
StringFormatVsJoinerVsBuilderVsConcat.ladicek                    avgt   25   15.474 ±  0.119  ns/op
StringFormatVsJoinerVsBuilderVsConcat.stringBuilder              avgt   25   55.076 ±  0.909  ns/op
StringFormatVsJoinerVsBuilderVsConcat.stringBuilderWithCapacity  avgt   25   48.263 ±  0.201  ns/op
StringFormatVsJoinerVsBuilderVsConcat.stringConcat               avgt   25   54.160 ±  0.854  ns/op
StringFormatVsJoinerVsBuilderVsConcat.stringFormat               avgt   25  346.262 ± 20.013  ns/op
StringFormatVsJoinerVsBuilderVsConcat.stringJoiner               avgt   25   52.265 ±  0.531  ns/op

I can't really explain the difference :-)

@loicmathieu
Copy link
Contributor Author

I can't really explain the difference :-)

Well, we'll need to ge deeper using profiler, gc stats, asm ... Or we can just decide to go with it ;)

I updated my patch to call the methods only once.

@loicmathieu
Copy link
Contributor Author

Fun fact, by updating my benchmark method from

    @Benchmark
    public String stringBuilderWithCapacityImproved() {
        String className = method.getDeclaringClass().getName();
        String methodName = method.getName();
        int capacity = className.length() + methodName.length() + 1;
        StringBuilder stringBuilder = new StringBuilder(capacity);
        stringBuilder.append(className);
        stringBuilder.append('.');
        stringBuilder.append(methodName);
        return stringBuilder.toString();
    }

To

    @Benchmark
    public String stringBuilderWithCapacityImproved() {
        String className = method.getDeclaringClass().getName();
        String methodName = method.getName();
        int capacity = className.length() + methodName.length() + 1;
        return new StringBuilder(capacity).append(className).append('.').append(methodName).toString();
    }

Which should translate to the same bytecode I have different results which are more aligned with what you observe. I'll dig a little more tomorrow.

@loicmathieu
Copy link
Contributor Author

I relaunch the test yesterday and ... they now perform the same ...

Anyway, the current code may performe 10x better, can we proceed with it @Ladicek ?

@Ladicek
Copy link
Contributor

Ladicek commented Apr 29, 2022

LGTM, but I'll leave the decision to @radcortez :-)

@radcortez radcortez merged commit 8097a51 into smallrye:main May 2, 2022
@radcortez radcortez added this to the 2.1.1 milestone Aug 3, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants