Improve performance by skipping bytes instead of parsing again on a second pass (or first) #17

dkomanov · 2024-09-16T15:49:30Z

For a 2 GB heap dump:

$ wc large.hprof 
   2545071   20258239 2438383468 large.hprof

Baseline performance (ParallelGC is much better):

$ time java -Xmx5G -XX:+UseParallelGC -cp $CLASSPATH edu.tufts.eaftan.hprofparser.Parse --handler=edu.tufts.eaftan.hprofparser.handler.NullRecordHandler large.hprof

real	0m34.781s
user	0m39.424s
sys	0m1.759s

$ time java -Xmx5G -XX:+UseG1GC -cp $CLASSPATH edu.tufts.eaftan.hprofparser.Parse --handler=edu.tufts.eaftan.hprofparser.handler.NullRecordHandler large.hprof

real	0m42.754s
user	0m57.383s
sys	0m3.121s

After optimizations:

$ time java -Xmx5G -XX:+UseParallelGC -cp $CLASSPATH edu.tufts.eaftan.hprofparser.Parse --handler=edu.tufts.eaftan.hprofparser.handler.NullRecordHandler large.hprof

real	0m24.539s
user	0m25.911s
sys	0m1.373s

Other possible optimizations:

Increase the buffer size for BufferedInputStream. I tried it: the gain is at most 200 milliseconds, which is smaller than the run-to-run variability.
Change the primArrayDump interface to use native primitive arrays instead of Value<?>[]. This gives a significant gain: 16 seconds versus 24 seconds.

In local tests this change showed an improvement from 34 seconds to 24 seconds for a 2 GB heap dump.
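
For readers who want to see the idea in isolation, here is a minimal sketch of skipping a record body instead of reading it. It uses a toy length-prefixed format (a 4-byte length followed by the payload), not the real hprof layout, and the names SkipSketch, skipFully, countRecords and sample are hypothetical and do not come from this repository. The loop in skipFully is there because InputStream.skip may skip fewer bytes than requested.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration; not code from this pull request.
class SkipSketch {
    /** Skips exactly n bytes or throws EOFException. */
    static void skipFully(DataInputStream in, long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped > 0) {
                n -= skipped;
            } else if (in.read() >= 0) {
                n--; // skip() made no progress, so consume a single byte instead
            } else {
                throw new EOFException("stream ended with " + n + " bytes left to skip");
            }
        }
    }

    /** Counts records of the toy form [int length][payload]. */
    static int countRecords(byte[] data, boolean needPayload) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = 0;
            while (in.available() > 0) {
                int length = in.readInt();
                if (needPayload) {
                    byte[] payload = new byte[length]; // allocate and copy the body
                    in.readFully(payload);
                } else {
                    skipFully(in, length); // no allocation, no copy
                }
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds toy input: one zero-filled record per requested payload length. */
    static byte[] sample(int... payloadLengths) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (int len : payloadLengths) {
                out.writeInt(len);
                out.write(new byte[len]);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = sample(3, 0, 10);
        System.out.println(countRecords(data, true));  // reads every payload
        System.out.println(countRecords(data, false)); // skips every payload
    }
}
```

Both calls see the same record boundaries; the skipping path simply avoids materializing bodies that a given pass does not use.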

dkomanov · 2024-09-16T15:53:22Z

This is what the primArrayDump optimization would look like: dkomanov@81891b2
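
As a rough illustration of why primitive arrays help (a sketch only, not the code in the linked commit): decoding into an int[] needs a single allocation, while a wrapper-based array creates an object per element. The Boxed class below is a hypothetical stand-in for a per-element wrapper such as Value<?>, and PrimArraySketch with its two methods is likewise made up for this example.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration; not the handler interface of this project.
class PrimArraySketch {
    /** Hypothetical stand-in for a per-element wrapper type. */
    static final class Boxed<T> {
        final T value;

        Boxed(T value) {
            this.value = value;
        }
    }

    /** Wrapper-based decoding: one wrapper object per element, usually plus a boxed Integer. */
    static Boxed<?>[] decodeBoxed(byte[] raw) {
        ByteBuffer buf = ByteBuffer.wrap(raw); // ByteBuffer is big-endian by default
        Boxed<?>[] out = new Boxed<?>[raw.length / Integer.BYTES];
        for (int i = 0; i < out.length; i++) {
            out[i] = new Boxed<>(buf.getInt());
        }
        return out;
    }

    /** Primitive decoding: a single int[] allocation and a bulk copy. */
    static int[] decodePrimitive(byte[] raw) {
        int[] out = new int[raw.length / Integer.BYTES];
        ByteBuffer.wrap(raw).asIntBuffer().get(out);
        return out;
    }

    public static void main(String[] args) {
        byte[] raw = {0, 0, 0, 1, 0, 0, 1, 0}; // two big-endian ints: 1 and 256
        System.out.println(decodePrimitive(raw)[1]);
        System.out.println(decodeBoxed(raw)[1].value);
    }
}
```

Both methods decode the same values; the difference is the number of objects the garbage collector has to track for large arrays.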

dkomanov added 3 commits September 16, 2024 18:31

Remove unused mySkipBytes

20ca6b3

Use try-with-resources, skip bytes on the second pass

d40270c

Improve performance by skipping bytes on passes when data is not needed

2c9209a

In local tests it showed improvement: 34 seconds -> 24 seconds for 2GB heap dump.

dkomanov force-pushed the master branch from c7a5e1e to 2c9209a · September 16, 2024 15:52
