This repository is no longer maintained, as it is no longer used by any of the teams that have been maintaining it.
If you are looking to load protobuf messages into BigQuery, consider using the Google BigQuery Storage API.
Functionality for converting between Protocol Buffers and Avro. For example, this can be used to bulk load protobuf messages into BigQuery.
Examples use the following protobuf message:
message Book {
  string name = 1;
  string author = 2;
  string title = 3;
  bool read = 4;
}
Avro schema inference for arbitrary protobuf messages.
func ExampleInferSchema() {
	msg := &library.Book{}
	schema, err := protoavro.InferSchema(msg.ProtoReflect().Descriptor())
	if err != nil {
		panic(err)
	}
	expected := avro.Nullable(avro.Record{
		Type:      avro.RecordType,
		Name:      "Book",
		Namespace: "google.example.library.v1",
		Fields: []avro.Field{
			{Name: "name", Type: avro.Nullable(avro.String())},
			{Name: "author", Type: avro.Nullable(avro.String())},
			{Name: "title", Type: avro.Nullable(avro.String())},
			{Name: "read", Type: avro.Nullable(avro.Boolean())},
		},
	})
	fmt.Println(cmp.Equal(expected, schema))
	// Output: true
}
Writes protobuf messages to an Object Container File.
func ExampleMarshaler() {
	var msg library.Book
	var b bytes.Buffer
	marshaller, err := protoavro.NewMarshaler(msg.ProtoReflect().Descriptor(), &b)
	if err != nil {
		panic(err)
	}
	if err := marshaller.Marshal(
		&library.Book{
			Name:   "shelves/1/books/1",
			Title:  "Harry Potter",
			Author: "J. K. Rowling",
		},
		&library.Book{
			Name:   "shelves/1/books/2",
			Title:  "Lord of the Rings",
			Author: "J. R. R. Tolkien",
		},
	); err != nil {
		panic(err)
	}
}
Reads protobuf messages from an Object Container File.
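func ExampleUnmarshaler() {
	// reader is a placeholder here; in real use it must read from an Object Container File.
	var reader io.Reader
	unmarshaller, err := protoavro.NewUnmarshaler(reader)
	if err != nil {
		panic(err)
	}
	// Scan advances to the next message; Unmarshal decodes it into msg.
	for unmarshaller.Scan() {
		var msg library.Book
		if err := unmarshaller.Unmarshal(&msg); err != nil {
			panic(err)
		}
	}
}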
Messages are mapped as nullable records in Avro. All fields will be nullable. Fields will have the same casing as in the protobuf descriptor.
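As a rough illustration of the nullable-record mapping, the sketch below renders the Book message as Avro schema JSON using only the standard library. The `field`, `record`, and `nullable` names are hypothetical helpers written for this example and are not part of protoavro; the exact JSON the library emits may differ in detail.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// field is a hypothetical stand-in for one Avro record field.
type field struct {
	Name string `json:"name"`
	Type any    `json:"type"`
}

// record is a hypothetical stand-in for an Avro record schema.
type record struct {
	Type      string  `json:"type"`
	Name      string  `json:"name"`
	Namespace string  `json:"namespace"`
	Fields    []field `json:"fields"`
}

// nullable wraps a schema in an Avro union whose first branch is "null".
func nullable(schema any) []any {
	return []any{"null", schema}
}

// bookSchemaJSON renders Book as a nullable record whose fields are all
// nullable and keep the casing used in the protobuf descriptor.
func bookSchemaJSON() (string, error) {
	schema := nullable(record{
		Type:      "record",
		Name:      "Book",
		Namespace: "google.example.library.v1",
		Fields: []field{
			{Name: "name", Type: nullable("string")},
			{Name: "author", Type: nullable("string")},
			{Name: "title", Type: nullable("string")},
			{Name: "read", Type: nullable("boolean")},
		},
	})
	b, err := json.Marshal(schema)
	return string(b), err
}

func main() {
	s, err := bookSchemaJSON()
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

Each `["null", ...]` union is what makes a record or field nullable in Avro.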
Oneofs are mapped to nullable fields in Avro, where at most one field will be set at a time.
Maps are mapped as a list of records with two fields, key and value. Order of map entries is undefined.
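The map mapping can be sketched in plain Go as follows. This is a minimal illustration, not the library's implementation: the `entry` type and both helper functions are hypothetical, and the sort exists only to make the printed output reproducible.

```go
package main

import (
	"fmt"
	"sort"
)

// entry is a hypothetical record with the two fields of a map entry.
type entry struct {
	Key   string
	Value int64
}

// mapToEntries flattens a map into a list of key/value records.
// Go randomizes map iteration order, so the entries are sorted by key here
// purely for reproducible output; readers should not rely on any order.
func mapToEntries(m map[string]int64) []entry {
	entries := make([]entry, 0, len(m))
	for k, v := range m {
		entries = append(entries, entry{Key: k, Value: v})
	}
	sort.Slice(entries, func(i, j int) bool { return entries[i].Key < entries[j].Key })
	return entries
}

// entriesToMap rebuilds the map without depending on entry order.
func entriesToMap(entries []entry) map[string]int64 {
	m := make(map[string]int64, len(entries))
	for _, e := range entries {
		m[e.Key] = e.Value
	}
	return m
}

func main() {
	entries := mapToEntries(map[string]int64{"b": 2, "a": 1})
	fmt.Println(entries)                    // [{a 1} {b 2}]
	fmt.Println(entriesToMap(entries)["b"]) // 2
}
```

Because entry order is undefined, a consumer that needs lookups should rebuild a map, as `entriesToMap` does, rather than index into the list by position.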
Enums are mapped as enums of string values in Avro.
Some well-known types have a special mapping:
Protobuf | Avro |
---|---|
wrappers (e.g. google.protobuf.DoubleValue) | Nullable scalars (e.g. [null, double]) |
google.protobuf.Any | string containing JSON encoding of Any |
google.protobuf.Struct | string containing JSON encoding of Struct |
google.protobuf.Timestamp | long.timestamp-micros |
google.protobuf.Duration | float (seconds) |
google.type.Date | int.date |
google.type.TimeOfDay | long.time-micros |
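To make the `int.date` and `long.time-micros` rows concrete: the Avro `date` logical type stores the number of days since 1970-01-01, and `time-micros` stores microseconds after midnight. The sketch below computes both with the standard library. The function names are hypothetical and this is not the library's code; it also assumes a fully specified date, whereas google.type.Date permits zero year, month, or day.

```go
package main

import (
	"fmt"
	"time"
)

// daysSinceEpoch returns the Avro "date" value for a calendar date:
// the number of days since 1970-01-01.
func daysSinceEpoch(year, month, day int) int32 {
	t := time.Date(year, time.Month(month), day, 0, 0, 0, 0, time.UTC)
	// Midnight UTC is always a whole multiple of 86400 seconds from the epoch.
	return int32(t.Unix() / 86400)
}

// microsSinceMidnight returns the Avro "time-micros" value for a time of day.
// Sub-microsecond nanos are dropped.
func microsSinceMidnight(hours, minutes, seconds, nanos int) int64 {
	d := time.Duration(hours)*time.Hour +
		time.Duration(minutes)*time.Minute +
		time.Duration(seconds)*time.Second +
		time.Duration(nanos)*time.Nanosecond
	return d.Microseconds()
}

func main() {
	fmt.Println(daysSinceEpoch(2000, 1, 1))          // 10957
	fmt.Println(microsSinceMidnight(12, 34, 56, 789)) // 45296000000
}
```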
Avro does not have a native type for timestamps with nanosecond precision. google.protobuf.Timestamp and google.type.TimeOfDay are truncated to microsecond precision when encoded as Avro.
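The effect of the truncation can be seen with a small standard-library sketch. The helper names are hypothetical, and the code only mirrors the seconds/nanos layout of google.protobuf.Timestamp rather than using the protobuf types; it assumes Go 1.17 or later for `UnixMicro`.

```go
package main

import (
	"fmt"
	"time"
)

// timestampMicros converts a seconds/nanos pair (the layout used by
// google.protobuf.Timestamp) to microseconds since the Unix epoch,
// dropping anything finer than a microsecond.
func timestampMicros(seconds int64, nanos int32) int64 {
	return time.Unix(seconds, int64(nanos)).UnixMicro()
}

// roundTripNanos returns the nanos component that survives encoding to
// microseconds and decoding again.
func roundTripNanos(seconds int64, nanos int32) int32 {
	micros := timestampMicros(seconds, nanos)
	return int32(time.UnixMicro(micros).Nanosecond())
}

func main() {
	fmt.Println(timestampMicros(1, 123456789)) // 1123456
	fmt.Println(roundTripNanos(1, 123456789))  // 123456000
}
```

The last three digits of the nanos value (789 here) are lost, so a value read back from Avro may not compare equal to the original message.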