Optimize memory usage by avoiding intermediate buffer in message serialization #928
…alization This commit replaces the use of an intermediate buffer in the message serialization process with a direct write-to-buffer approach. The original implementation used MustMarshalBinary() which involved an extra memory copy to an intermediate buffer before writing to the final writeBuffer, leading to high memory consumption for large messages. The new WriteTo function writes message data directly to the writeBuffer, significantly reducing memory overhead and CPU time spent on garbage collection.
Optimization Documentation Summary
Experimental Environment:
CPU Usage Comparison Table
Memory Allocation and Transfer Rate Comparison Table
Based on the aforementioned experimental environment and test results, the following conclusions can be drawn regarding the improvements:
In summary, the experimental results indicate that the implemented optimizations have succeeded in reducing memory consumption while simultaneously boosting data transmission performance. This represents a significant performance optimization within the BitTorrent client. Future efforts will focus on further optimization to ensure continued enhancement of speed while further reducing resource consumption.
Could you include a benchmark for peerConnMsgWriter.write? I suspect that precomputing the message length and growing the buffer manually aren't necessary. I would expect the majority of the gains to come just from writing directly into the buffer, which should retain its size and underlying storage every time the buffers are swapped in the writer routine.
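The point about the buffer retaining its storage across swaps can be sketched as follows. This is an assumption-laden illustration, not the library's writer: `swapWriter`, `write`, and `flush` are hypothetical names, and the only behavior relied on is that `bytes.Buffer.Reset` empties the buffer while keeping its underlying storage.

```go
package main

import (
	"bytes"
	"fmt"
)

// swapWriter is a hypothetical double-buffered writer: producers append to
// front while the writer routine drains back.
type swapWriter struct {
	front, back *bytes.Buffer
}

func newSwapWriter() *swapWriter {
	return &swapWriter{front: new(bytes.Buffer), back: new(bytes.Buffer)}
}

// write appends encoded message bytes to the front buffer.
func (w *swapWriter) write(p []byte) { w.front.Write(p) }

// flush swaps the buffers, "sends" the filled one, and resets it.
// It returns the number of bytes that would have been sent.
func (w *swapWriter) flush() int {
	w.front, w.back = w.back, w.front
	n := w.back.Len()
	// A real writer would send w.back.Bytes() to the peer here.
	w.back.Reset() // length drops to zero, capacity is kept
	return n
}

func main() {
	w := newSwapWriter()
	block := make([]byte, 16<<10) // 16 KiB, the default block size

	for round := 1; round <= 4; round++ {
		w.write(block)
		sent := w.flush()
		fmt.Printf("round %d: sent=%d front cap=%d back cap=%d\n",
			round, sent, w.front.Cap(), w.back.Cap())
	}
}
```

After the first two rounds both buffers have grown once, and later rounds reuse that storage, which is why manual growth may add little on top of writing directly into the buffer.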
Thanks for your feedback. I'll update the community when I have some benchmark results to share, and conduct a detailed analysis to assess the benefits of precomputing the message length.
Thanks!
Benchmarking Report
Performed on a macOS system with an Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz (amd64 architecture).
Benchmark Results
Below is a tabulated summary showing how each function performs with various piece lengths:
The new commit has removed the code for manually growing the buffer. The benefit of handling the buffer growth manually appears to be marginal.
Thanks, I'll check it out soon.
I made some tweaks to the benchmarks to make them more consistent. In particular, one form was always using the 4M length. I also added the most common size actually in use (16KiB, the default block size). I think a few types have gone missing in the changes; I'll fix those up too and likely merge shortly.
The start and stop of the timer actually caused issues with the timing, as the reset is too small to have any effect on the benchmark.
I reduced the changes a bit to reuse the old payload writing code, since HashRequest was missing. The MarshalBinary performance is doubled, but the write-to-buffer path is not quite as fast (though still 16x faster than it was). I'll merge what I have so far, but if you want to make tweaks to reduce the allocations further to get the extra 50% back, I think the code base is in a better position to take those changes, and it will be clearer what is being done to get that. I should add that the allocated space is massively reduced. Sorry if the changes were pretty heavy-handed, but I think restructuring the write to buffer, and then doing smaller cleanups in a separate change, is less risky.
I apologize for the oversight with the HashRequest; it was unintentional and I regret any inconvenience caused. Also, I'm grateful for your thorough work on the updates. Collaborating with you has been incredibly enlightening, and I look forward to doing so again.