GODRIVER-3090 Optimize logging truncation for large documents #1699

Merged
merged 34 commits into from
Aug 6, 2024


@timothy-kim-mongo timothy-kim-mongo commented Jul 3, 2024

GODRIVER-3090

Summary

This pull request optimizes logging truncation for large documents in the Go Driver. Logging large BSON documents and payloads can degrade driver performance and cause resource contention, so truncating them efficiently matters for application performance and operational efficiency.

The key change is a new StringN method on bsoncore.Document. It stringifies a BSON document as extended JSON up to a specified byte limit (N), and its truncation logic accounts for multi-byte characters. The method builds on the existing logging truncation algorithm, so the truncated output stays accurate while avoiding unnecessary work.
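To illustrate what "accounts for multi-byte characters" means, here is a minimal sketch of byte-limited truncation that never splits a UTF-8 rune. The helper name truncateUTF8 is hypothetical and this is not the driver's code; it only assumes the input is valid UTF-8.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateUTF8 is a hypothetical helper, not the driver's implementation.
// It returns at most n bytes of s and never cuts a multi-byte rune in half.
func truncateUTF8(s string, n int) string {
	if n <= 0 {
		return ""
	}
	if len(s) <= n {
		return s
	}
	// s[n] is the first byte that would be dropped. If it is a UTF-8
	// continuation byte, back up until the cut lands on a rune boundary.
	for n > 0 && !utf8.RuneStart(s[n]) {
		n--
	}
	return s[:n]
}

func main() {
	// "é" occupies two bytes, so a limit of 8 would fall in the middle of it;
	// the helper cuts before the rune instead.
	fmt.Println(truncateUTF8(`{"k":"héllo"}`, 8))
}
```

The result can be shorter than the limit by up to three bytes, which is the trade-off for keeping the output valid UTF-8.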

The pull request also adds StringN to the other BSON element types (Array.StringN, Value.StringN, etc.), each built on the same truncation logic. This keeps truncation consistent across element types, such as arrays and nested documents, when they exceed the byte limit. Unit tests and benchmarks accompany the changes; the benchmarks show reduced execution time (ns/op) and less memory allocated (B/op) when handling large BSON payloads.

In short, this pull request improves driver performance by optimizing BSON document truncation for logging. It makes the MongoDB Go Driver more scalable and reliable when handling large BSON payloads, which benefits applications that depend heavily on logging.

@timothy-kim-mongo timothy-kim-mongo self-assigned this Jul 3, 2024
@timothy-kim-mongo
Contributor Author

Still looking for additional ways of optimizing

@timothy-kim-mongo timothy-kim-mongo marked this pull request as draft July 3, 2024 21:45
@mongodb-drivers-pr-bot mongodb-drivers-pr-bot bot added the priority-3-low Low Priority PR for Review label Jul 8, 2024
Contributor

mongodb-drivers-pr-bot bot commented Jul 8, 2024

API Change Report

./x/bsonx/bsoncore

compatible changes

Array.StringN: added
Document.StringN: added
Element.StringN: added
Value.StringN: added

@timothy-kim-mongo timothy-kim-mongo marked this pull request as ready for review July 12, 2024 17:43
x/bsonx/bsoncore/document.go
Comment on lines +117 to +119

	if buf.Len() == n {
		return buf.String()
	}
Contributor Author


Adding this check fixed a bug in one of the test cases, where the truncated result had a length of 11 instead of 10.
It also sped up the benchmark for truncating massive arrays.

Before the change:
BenchmarkRawString/massive_arrays_StringN-10    642       1860169 ns/op    632576 B/op    411 allocs/op

After the change:
BenchmarkRawString/massive_arrays_StringN-10    105404    10738 ns/op      15744 B/op     396 allocs/op
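The effect of an early return like this can be sketched outside the driver. The function joinN below is a hypothetical stand-in for a StringN-style method, not the code from document.go: it renders a list into a buffer with a byte budget n and returns as soon as the buffer holds exactly n bytes, so the remaining elements are never rendered. It assumes ASCII input and ignores rune boundaries to keep the sketch short.

```go
package main

import (
	"fmt"
	"strings"
)

// joinN is a hypothetical stand-in for a StringN-style method. It renders
// elems as a bracketed, comma-separated list but writes at most n bytes.
// ASCII-only sketch: rune boundaries are not considered here.
func joinN(elems []string, n int) string {
	if n <= 0 {
		return ""
	}
	var buf strings.Builder
	buf.WriteByte('[')
	for i, e := range elems {
		// Early exit: once the budget is spent, return immediately
		// instead of rendering the remaining elements.
		if buf.Len() == n {
			return buf.String()
		}
		if i > 0 {
			buf.WriteByte(',')
			// Re-check after the separator so the result cannot
			// overshoot n by one byte.
			if buf.Len() == n {
				return buf.String()
			}
		}
		// Write only as much of this element as the budget allows.
		if room := n - buf.Len(); len(e) > room {
			e = e[:room]
		}
		buf.WriteString(e)
	}
	if buf.Len() < n {
		buf.WriteByte(']')
	}
	return buf.String()
}

func main() {
	elems := []string{"10", "20", "30"}
	fmt.Println(joinN(elems, 5))
	fmt.Println(joinN(elems, 100))
}
```

Checking the budget after every write, including the separator, is what guards against an off-by-one result of the kind described above.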

prestonvasquez
prestonvasquez previously approved these changes Aug 2, 2024
Collaborator

@prestonvasquez prestonvasquez left a comment


Great work @timothy-kim-mongo 🔧. If you replace the _StringN benchmarks with the old solution:

    str := bsoncore.Document(bs).String()
    bsoncoreutil.Truncate(str, 1024)

and report allocations, you get a 3000x allocation improvement!

BenchmarkRawString/massive_arrays_StringN-10_old    39        31489319 ns/op    60814318 B/op    1200137 allocs/op
BenchmarkRawString/massive_arrays_StringN-10        117501    10122 ns/op       15744 B/op       396 allocs/op
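The shape of this comparison can be reproduced with a toy example. The sketch below does not use bsoncore or bsoncoreutil; renderAll and renderN are hypothetical functions that mimic "stringify everything, then truncate" and "stop once the limit is reached", and testing.AllocsPerRun from the standard library reports the average allocations for each. The numbers it prints depend on the machine and Go version and are not comparable to the driver benchmarks above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"testing"
)

// sink keeps results alive so the compiler cannot discard the work.
var sink string

// renderAll mimics the old approach: render every value, truncate later.
func renderAll(vals []int) string {
	var buf strings.Builder
	for _, v := range vals {
		buf.WriteString(strconv.Itoa(v))
		buf.WriteByte(',')
	}
	return buf.String()
}

// renderN mimics the new approach: stop rendering once n bytes exist.
func renderN(vals []int, n int) string {
	if n <= 0 {
		return ""
	}
	var buf strings.Builder
	for _, v := range vals {
		if buf.Len() >= n {
			break
		}
		buf.WriteString(strconv.Itoa(v))
		buf.WriteByte(',')
	}
	s := buf.String()
	if len(s) > n {
		s = s[:n] // trim the overshoot from the last value written
	}
	return s
}

func main() {
	vals := make([]int, 100000)
	for i := range vals {
		vals[i] = i
	}
	const limit = 1024

	oldAllocs := testing.AllocsPerRun(10, func() {
		s := renderAll(vals)
		if len(s) > limit {
			s = s[:limit]
		}
		sink = s
	})
	newAllocs := testing.AllocsPerRun(10, func() {
		sink = renderN(vals, limit)
	})
	fmt.Printf("render all, then truncate: %.0f allocs/run\n", oldAllocs)
	fmt.Printf("render up to the limit:    %.0f allocs/run\n", newAllocs)
}
```

Both functions produce the same truncated string; the difference is only in how much work is done before the limit is applied.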

Collaborator

@matthewdale matthewdale left a comment


Looks good! 👍

@timothy-kim-mongo timothy-kim-mongo merged commit b0d91d0 into mongodb:master Aug 6, 2024
30 of 33 checks passed
@timothy-kim-mongo timothy-kim-mongo deleted the GODRIVER-3090 branch August 6, 2024 06:13
Labels
priority-3-low Low Priority PR for Review
3 participants