andrep/upbkit

UPBKit

UPBKit explores integrating upb with the Google Objective-C Protocol Buffers library. (I'll refer to that library as "GPB" from now on: GPB is the three-letter Objective-C prefix that the library uses for its classes, and is likely an abbreviation for Google Protocol Buffers.) It uses upb's fast protobuf parsing (a.k.a. decoding or deserialization) to parse protos faster, while keeping the familiar Objective-C protobuf API that folks are used to. If your application deserializes large proto objects, you could see a significant speedup.

This works by using upb to decode the serialized protobuf, then copying the data from the decoded upb_Message to a standard GPBMessage object. The extra copy sounds slow, but current benchmarks show that decoding with upb and then copying is often faster than parsing with GPB directly.

UPBKit is a research project and isn't intended for production use at the moment.

Caveat: memory usage will probably be higher, because the upb_Message and the GPBMessage Objective-C object graph are both present. As with all things performance, it depends on which tradeoffs are important to you.
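To make the decode-then-copy flow concrete, here is a minimal Python sketch of the same two-stage shape. It is not UPBKit's code and does not use the upb or GPB APIs: decode_fields is a hypothetical stand-in for the upb decode step, and the Result dataclass is a hypothetical stand-in for a GPBMessage subclass. It handles only length-delimited fields (wire type 2), which is all that a message made of strings needs.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_fields(buf: bytes) -> dict[int, list[bytes]]:
    """Stage 1 (hypothetical stand-in for upb): split wire bytes into raw fields."""
    fields: dict[int, list[bytes]] = {}
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x7
        if wire_type != 2:  # this sketch only supports length-delimited fields
            raise ValueError(f"unsupported wire type {wire_type}")
        length, pos = read_varint(buf, pos)
        fields.setdefault(field_number, []).append(buf[pos:pos + length])
        pos += length
    return fields


@dataclass
class Result:
    """Stage 2 target (hypothetical stand-in for a GPBMessage subclass)."""
    url: str = ""
    title: str = ""
    snippets: list[str] = field(default_factory=list)


def copy_to_result(fields: dict[int, list[bytes]]) -> Result:
    """Stage 2: copy the decoded data into the application-facing object."""
    result = Result()
    if 1 in fields:
        result.url = fields[1][-1].decode("utf-8")    # last value wins
    if 2 in fields:
        result.title = fields[2][-1].decode("utf-8")
    result.snippets = [s.decode("utf-8") for s in fields.get(3, [])]
    return result


# Hand-encoded Result: url="a.io", title="Hi", snippets=["x", "yz"]
wire = b"\x0a\x04a.io" b"\x12\x02Hi" b"\x1a\x01x" b"\x1a\x02yz"
decoded = copy_to_result(decode_fields(wire))
```

The sketch also shows where the extra memory comes from: both the intermediate fields dictionary and the final Result object hold the data until the intermediate is released.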

Lazy Creation

UPBKit can also create submessages lazily, which can dramatically speed up parse time (10× faster isn't out of the question). This is similar to Swift's lazy stored properties. Taking this example from the proto3 language guide:

message SearchResponse {
  repeated Result results = 1;  // `results` is an NSArray<GPBMessage *> object
}

message Result {
  string url = 1;  // `url` is an NSString object
  string title = 2;  // `title` is an NSString object
  repeated string snippets = 3;  // `snippets` is an NSArray<NSString *> object
}

With lazy creation, the only Objective-C object created at parse time is the top-level SearchResponse object. UPBKit can create the results sub-object as a special "lazy object" that is fully created only when it's first used. Compare this to the full object graph that a normal parse would instantiate: SearchResponse, the NSArray for results, each Result message, the NSStrings for url and title, and the NSArray of NSStrings for snippets. The parse-time savings grow as protos become larger, since submessages, strings and bytes appear frequently.

Note that upb still does a full parse of the serialized proto, including validating all strings as UTF-8 where the protobuf specification requires it. It's only the Objective-C object graph that's lazily created.
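The lazy-creation idea can be sketched in a few lines of Python. This is an illustration of the general technique only, not UPBKit's implementation: LazyResults, SearchResponse and build_count are hypothetical names, and the list of dictionaries passed in plays the role of data that upb has already fully parsed.

```python
class LazyResults:
    """Hypothetical lazy stand-in for the repeated `results` field."""

    def __init__(self, decoded_rows):
        # decoded_rows plays the role of the already-parsed upb data.
        self._decoded_rows = decoded_rows
        self._objects = None   # no object graph has been built yet
        self.build_count = 0   # how many times the graph was built

    def _materialize(self):
        # Build the object graph on first use, then reuse the cached copy.
        if self._objects is None:
            self.build_count += 1
            self._objects = [dict(row) for row in self._decoded_rows]
        return self._objects

    def __len__(self):
        return len(self._materialize())

    def __getitem__(self, index):
        return self._materialize()[index]


class SearchResponse:
    """Hypothetical top-level message: the only object created eagerly."""

    def __init__(self, decoded_rows):
        self.results = LazyResults(decoded_rows)


response = SearchResponse([{"url": "a.io", "title": "Hi", "snippets": ["x"]}])
built_before = response.results.build_count   # nothing materialized yet
first_title = response.results[0]["title"]    # first access builds the graph
second_url = response.results[0]["url"]       # second access reuses the cache
built_after = response.results.build_count
```

Constructing SearchResponse costs the same regardless of how many results there are. The cost of building the sub-objects is paid once, on first access, and never if the field is not read.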

Hacking

UPBKit uses Bazel as its primary build system. Xcode projects are also provided in the source code repository for convenience (and are generated by Bazel).

Visual Studio Code integration

The author uses Visual Studio Code with the following extensions for development:

  • Bazel
  • clangd
  • vscode-proto3

We use compile_commands.json to integrate with Visual Studio Code's semantic language features. To refresh the compile_commands database, run:

bazel run @hedron_compile_commands//:refresh_all

You probably want to install SwiftProtobuf to tinker with the Swift bindings. On macOS:

brew install swift-protobuf

Xcode

We use rules_xcodeproj to generate Xcode projects from Bazel's BUILD files. To refresh the Xcode projects:

bazel run //GPBExtensions/tests:benchmark_xcodeproj
bazel run //GPBExtensions/tests:GPBMessage_UPBDecodingTest_xcodeproj

Offline Bazel

If you're surprised that Bazel needs Internet access to do a build, prefetching the external dependencies that the build needs may help:

bazel fetch //...

Contributing

See CONTRIBUTING.md for details.

License

Apache 2.0; see LICENSE for details.

Disclaimer

This project is not an official Google project. It is not supported by Google and Google specifically disclaims all warranties as to its quality, merchantability, or fitness for a particular purpose.
