UPBKit explores integrating upb with the Google Objective-C Protocol Buffers library. (I'll refer to the Google Objective-C Protocol Buffers library as "GPB" from now on. GPB is the three-letter Objective-C prefix that the library uses for its classes, and is likely an abbreviation for Google Protocol Buffers.) It leverages upb's fast protobuf parsing (a.k.a. decoding/deserialization) to parse protos faster, while keeping the familiar Objective-C protobuf API that folks are used to. If your application deserializes large proto objects, you could see a significant speedup. This works by using upb to decode the serialized protobuf, then copying the data from the decoded upb_Message to a standard GPBMessage object. The copying sounds slow, but current benchmarks show that decoding with upb and then copying is often still faster than a normal GPB parse.
UPBKit is a research project, and isn't intended for production use at the moment.
Caveat: memory usage will probably be higher, because the upb_Message and the GPBMessage Objective-C object graph are both present at the same time. However, as with all things performance, it depends on which tradeoffs are important to you.
UPBKit can also create submessages lazily, which can dramatically speed up parse time (10× faster isn't out of the question). This is similar to Swift's lazy stored properties. Take this example from the proto3 language guide:
message SearchResponse {
  repeated Result results = 1; // `results` is an NSArray<GPBMessage *> object
}

message Result {
  string url = 1; // `url` is an NSString object
  string title = 2; // `title` is an NSString object
  repeated string snippets = 3; // `snippets` is an NSArray<NSString *> object
}
Here, the only Objective-C object that would be created is the top-level SearchResponse object. UPBKit can create the results sub-object as a special "lazy object" that is fully created only when it's used. Compare this to the full object graph that would be instantiated from a normal parse: SearchResponse, the NSArray for results, the NSStrings for url and title, and the NSArray of NSStrings for snippets. The parse-time savings become greater as protos become larger, since submessages, strings and bytes appear frequently.
Note that upb still does a full parse of the serialized proto, including validating all strings as UTF-8 where the protobuf specification requires it. It's only the Objective-C object graph that's lazily created.
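To make the idea concrete, here is a minimal sketch of lazy materialization in plain C. It is not UPBKit's implementation and it does not use the upb or GPB APIs. Every name in it (DecodedResult, ResultObject, LazyResult, and the lazy_result_* functions) is hypothetical and exists only for illustration. DecodedResult stands in for data that has already been fully parsed, and ResultObject stands in for the heavier user-facing object that is built only on first access.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for already-decoded message data. */
typedef struct {
    const char *url;
    const char *title;
} DecodedResult;

/* Hypothetical stand-in for the heavier user-facing object. */
typedef struct {
    char *url;
    char *title;
} ResultObject;

/* Lazy wrapper: keeps the decoded data and builds the object on demand. */
typedef struct {
    const DecodedResult *decoded;
    ResultObject *object;   /* NULL until the first access */
    int materialize_count;  /* how many times the object was built */
} LazyResult;

/* Returns a heap-allocated copy of s, or NULL if allocation fails. */
static char *copy_string(const char *s) {
    size_t n = strlen(s) + 1;
    char *out = malloc(n);
    if (out != NULL) {
        memcpy(out, s, n);
    }
    return out;
}

/* Creating the wrapper is cheap: nothing is copied yet. */
LazyResult lazy_result_make(const DecodedResult *decoded) {
    LazyResult lazy = { decoded, NULL, 0 };
    return lazy;
}

/* The first call builds the object; later calls return the cached one. */
ResultObject *lazy_result_get(LazyResult *lazy) {
    if (lazy->object == NULL) {
        ResultObject *obj = malloc(sizeof *obj);
        if (obj == NULL) {
            return NULL;
        }
        obj->url = copy_string(lazy->decoded->url);
        obj->title = copy_string(lazy->decoded->title);
        lazy->object = obj;
        lazy->materialize_count++;
    }
    return lazy->object;
}

/* Releases the materialized object, if there is one. */
void lazy_result_free(LazyResult *lazy) {
    if (lazy->object != NULL) {
        free(lazy->object->url);
        free(lazy->object->title);
        free(lazy->object);
        lazy->object = NULL;
    }
}
```

A caller that never touches a given result never pays for the string copies, which is where the savings would come from in this simplified model.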
UPBKit uses Bazel as its primary build system. Xcode projects are also provided in the source code repository for convenience (and are generated by Bazel).
The author uses Visual Studio Code with the following extensions for development:
- Bazel
- clangd
- vscode-proto3
We use compile_commands.json to integrate with VSCode's semantic language features. To refresh the compile_commands database, run:
bazel run @hedron_compile_commands//:refresh_all
You probably want to install SwiftProtobuf to tinker with the Swift bindings. On macOS:
brew install swift-protobuf
We use rules_xcodeproj to generate Xcode projects from Bazel's BUILD files. To refresh the Xcode projects:
bazel run //GPBExtensions/tests:benchmark_xcodeproj
bazel run //GPBExtensions/tests:GPBMessage_UPBDecodingTest_xcodeproj
Bazel downloads external dependencies during a build, so it needs Internet access. If that surprises you, you can fetch the dependencies ahead of time:
bazel fetch //...
See CONTRIBUTING.md for details.
Apache 2.0; see LICENSE for details.
This project is not an official Google project. It is not supported by Google and Google specifically disclaims all warranties as to its quality, merchantability, or fitness for a particular purpose.