[improve][docs] Add schema compatibility between Go and Java client #656

Closed
139 changes: 137 additions & 2 deletions docs/client-libraries-schema.md
@@ -3,7 +3,10 @@ id: client-libraries-schema
title: Work with schema
sidebar_label: "Work with schema"
---

````mdx-code-block
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
````

## Get started with schema

@@ -142,4 +145,136 @@ while True:
    except Exception:
        # Message failed to be processed
        consumer.negative_acknowledge(msg)
```

## Work with the Go schema
Working with schemas in the Go client differs slightly from working with schemas in the Java client.
This section describes schema compatibility between the Go client and the Java client.

### Avro/JSON Schema
Avro and JSON schemas in Go and Java are compatible, but the two clients define schemas differently.
````mdx-code-block
<Tabs
defaultValue="AVRO"
values={[{"label":"AVRO","value":"AVRO"},{"label":"JSON","value":"JSON"}]}>

<TabItem value="AVRO">
The Go client typically creates schemas from a schema definition, which is a JSON string, whereas the Java client often derives them from class types. As a result, Java makes non-primitive fields nullable by default, while a Go schema definition must state the nullability of each field explicitly.

```go
// Compatible with defining a schema in Java
exampleSchemaDefCompatible := NewAvroSchema(`{"fields":
[
{"name":"id","type":"int"},{"default":null,"name":"name","type":["null","string"]}
],
"name":"MyAvro","namespace":"schemaNotFoundTestCase","type":"record"}`, nil)
// Not compatible with defining a schema in Java
Member

Does this refer to the following Java example?

If the Go client can be used like this, the Java client should support it too, right?

In the Java client, there are two ways:

  1. Use SchemaDefinition to set the schema JSON definition:
          String jsonDef = "{\"fields\":[{\"name\":\"id\",\"type\":\"int\"},{\"default\":null,\"name\":\"name\",\"type\":[\"null\",\"string\"]}],\"name\":\"MyAvro\",\"namespace\":\"schemaNotFoundTestCase\",\"type\":\"record\"}";
          Schema schema = Schema.AVRO(SchemaDefinition.builder().withJsonDef(jsonDef).build());
  2. Or disallow null fields:
      SchemaDefinition<Example> schemaDefinition =
              SchemaDefinition.<Example>builder().withPojo(Example.class).withAlwaysAllowNull(false).build();
      Schema<Example> schema = Schema.AVRO(schemaDefinition);

      Producer<Example> producer = pulsarClient.newProducer(schema)
              .topic(topic).create();

We should show the solution.

Contributor Author

Oh, you are right.

exampleSchemaDefIncompatible := NewAvroSchema(`{"fields":
[
{"name":"id","type":"int"},{"default":null,"name":"name","type":["string"]}
],
"name":"MyAvro","namespace":"schemaNotFoundTestCase","type":"record"}`, nil)
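// client and topic are assumed to be defined elsewhere
producer, err := client.CreateProducer(ProducerOptions{Topic: topic, Schema: exampleSchemaDefCompatible})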

```
```java
@AllArgsConstructor
@NoArgsConstructor
public static class Example {
    public String name;
    public int id;
}

Producer<Example> producer = pulsarClient.newProducer(Schema.AVRO(Example.class))
        .topic(topic).create();
```
Another way to stay compatible is to create the schema from a schema definition in the Java client as well, setting `withAlwaysAllowNull(false)` so that fields are not nullable by default.
```java
SchemaDefinition<Example> schemaDefinition =
        SchemaDefinition.<Example>builder().withPojo(Example.class).withAlwaysAllowNull(false).build();
Schema<Example> schema = Schema.AVRO(schemaDefinition);

Producer<Example> producer = pulsarClient.newProducer(schema)
        .topic(topic).create();
```
</TabItem>
<TabItem value="JSON">
The Go client typically creates schemas from a schema definition, which is a JSON string, whereas the Java client often derives them from class types. As a result, Java makes non-primitive fields nullable by default, while a Go schema definition must state the nullability of each field explicitly.

```go
// Compatible with defining a schema in Java
exampleSchemaDefCompatible := "{\"type\":\"record\",\"name\":\"Example\",\"namespace\":\"test\"," +
"\"fields\":[{\"name\":\"ID\",\"type\":\"int\"},{\"name\":\"Name\",\"type\":[\"null\", \"string\"]}]}"

consumerJSCompatible := NewJSONSchema(exampleSchemaDefCompatible, nil)
// Not compatible with defining a schema in Java
Member

Which Java schema definition is this incompatible with?

Contributor Author

As with Avro, the problem is with nullable fields.

exampleSchemaDefIncompatible := "{\"type\":\"record\",\"name\":\"Example\",\"namespace\":\"test\"," +
"\"fields\":[{\"name\":\"ID\",\"type\":\"int\"},{\"name\":\"Name\",\"type\":\"string\"}]}"

consumerJSIncompatible := NewJSONSchema(exampleSchemaDefIncompatible, nil)
```

```java
@AllArgsConstructor
@NoArgsConstructor
public static class Example {
    public String name;
    public int id;
}

Producer<Example> producer = pulsarClient.newProducer(Schema.JSON(Example.class))
        .topic(topic).create();
```

Another way to stay compatible is to create the schema from a schema definition in the Java client as well, setting `withAlwaysAllowNull(false)` so that fields are not nullable by default.
```java
SchemaDefinition<Example> schemaDefinition =
        SchemaDefinition.<Example>builder().withPojo(Example.class).withAlwaysAllowNull(false).build();
Schema<Example> schema = Schema.JSON(schemaDefinition);

Producer<Example> producer = pulsarClient.newProducer(schema)
        .topic(topic).create();
```

</TabItem>
</Tabs>
````
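
The nullability rule described in the tabs above can be checked before a schema definition is handed to the Go client. The following is a minimal sketch that uses only the Go standard library, not the Pulsar client: the helper `nullableFields` and the types `avroRecord` and `avroField` are hypothetical names introduced for illustration. It parses a schema definition and reports which fields are declared as a union that includes `"null"`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// avroField mirrors the parts of an Avro record field that this sketch inspects.
type avroField struct {
	Name string          `json:"name"`
	Type json.RawMessage `json:"type"`
}

// avroRecord mirrors the top level of an Avro record schema definition.
type avroRecord struct {
	Name   string      `json:"name"`
	Fields []avroField `json:"fields"`
}

// nullableFields (hypothetical helper) reports, per field name, whether the
// field type is a union that includes "null".
func nullableFields(schemaDef string) (map[string]bool, error) {
	var rec avroRecord
	if err := json.Unmarshal([]byte(schemaDef), &rec); err != nil {
		return nil, err
	}
	result := make(map[string]bool, len(rec.Fields))
	for _, f := range rec.Fields {
		result[f.Name] = false
		var union []json.RawMessage
		// A non-union type such as "int" does not decode as a slice and stays non-nullable.
		if err := json.Unmarshal(f.Type, &union); err != nil {
			continue
		}
		for _, member := range union {
			var s string
			if json.Unmarshal(member, &s) == nil && s == "null" {
				result[f.Name] = true
			}
		}
	}
	return result, nil
}

// Definitions taken from the JSON tab: Name is nullable in the first and not in the second.
const compatibleDef = `{"type":"record","name":"Example","namespace":"test",
"fields":[{"name":"ID","type":"int"},{"name":"Name","type":["null","string"]}]}`

const incompatibleDef = `{"type":"record","name":"Example","namespace":"test",
"fields":[{"name":"ID","type":"int"},{"name":"Name","type":"string"}]}`

func main() {
	for _, def := range []string{compatibleDef, incompatibleDef} {
		fields, err := nullableFields(def)
		if err != nil {
			fmt.Println("invalid schema definition:", err)
			continue
		}
		fmt.Println(fields)
	}
}
```

A check like this only inspects the definition text. Whether two schemas are accepted as compatible is still decided by the broker's compatibility check.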

### Proto Schema
Proto and ProtoNative schemas are not fully compatible between the Go and Java clients, because Avro Proto currently does not provide full compatibility between Java and Go.

```proto
message TestMessage {
string stringField = 1;
int32 intField = 2;
}
```

In Java, the schema definition can be parsed from the message class.
```json
protoSchemaDef = "{\"type\":\"record\",\"name\":\"TestMessage\",\"namespace\":\"org.apache.pulsar.client.api.schema.proto.Test\",\"fields\":[{\"name\":\"stringField\",\"type\":{\"type\":\"string\",\"avro.java.string\":\"String\"},\"default\":\"\"},{\"name\":\"intField\",\"type\":\"int\",\"default\":0}]}"

```

In Go, the schema definition must be written manually.
```json
protoSchemaDef = "{\"type\":\"record\",\"name\":\"Example\",\"namespace\":\"test\"," +
"\"fields\":[{\"name\":\"num\",\"type\":\"int\"},{\"name\":\"msf\",\"type\":\"string\"}]}"
```
To address the incompatibility between Proto and ProtoNative types, you can follow this approach:
1. In the Java client, parse the message using the Avro Proto library to obtain the schema definition.
2. Use this obtained schema definition in the Go client to ensure both clients use the same schema definition.
```json
protoSchemaDef = "{\"type\":\"record\",\"name\":\"TestMessage\",\"namespace\":\"org.apache.pulsar.client.api.schema.proto.Test\",\"fields\":[{\"name\":\"stringField\",\"type\":{\"type\":\"string\",\"avro.java.string\":\"String\"},\"default\":\"\"},{\"name\":\"intField\",\"type\":\"int\",\"default\":0}]}"

```
3. Modify the Go Proto message by adding compatibility extensions. For example, add the `[(avro_java_string) = "String"]` extension to string fields.
```proto
message TestMessage {
string stringField = 1 [(avro_java_string) = "String"];
int32 intField = 2;
}
```
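
To see why step 2 matters, it can help to compare the definition obtained from the Java client with a hand-written one. The sketch below uses only the Go standard library. The helper `sameSchemaDef` and the constant `handWrittenDef` are hypothetical names introduced for illustration, and structural JSON equality is an assumption made here for the comparison, not the rule the broker applies.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// sameSchemaDef (hypothetical helper) reports whether two schema definitions
// decode to the same JSON structure, ignoring whitespace and object key order.
func sameSchemaDef(a, b string) (bool, error) {
	var left, right interface{}
	if err := json.Unmarshal([]byte(a), &left); err != nil {
		return false, fmt.Errorf("first definition: %w", err)
	}
	if err := json.Unmarshal([]byte(b), &right); err != nil {
		return false, fmt.Errorf("second definition: %w", err)
	}
	return reflect.DeepEqual(left, right), nil
}

// javaDef is the definition obtained from the Java client for TestMessage.
const javaDef = `{"type":"record","name":"TestMessage","namespace":"org.apache.pulsar.client.api.schema.proto.Test","fields":[{"name":"stringField","type":{"type":"string","avro.java.string":"String"},"default":""},{"name":"intField","type":"int","default":0}]}`

// handWrittenDef is a hypothetical hand-written definition of the same message
// that omits the avro.java.string property and the field defaults.
const handWrittenDef = `{"type":"record","name":"TestMessage","namespace":"org.apache.pulsar.client.api.schema.proto.Test","fields":[{"name":"stringField","type":"string"},{"name":"intField","type":"int"}]}`

func main() {
	same, err := sameSchemaDef(javaDef, handWrittenDef)
	if err != nil {
		fmt.Println("comparison failed:", err)
		return
	}
	fmt.Println("hand-written definition matches Java definition:", same)
}
```

Reusing the Java-derived string verbatim in the Go client avoids this kind of drift, because both clients then register exactly the same definition.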

### ProtoNative Schema
Like Proto schemas, ProtoNative schemas are incompatible between the Java and Go clients. To address this, use a unified schema definition and add the `[(avro_java_string) = "String"]` extension to the Go client's Proto message.