Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨Use base64 to encode mssql binary field #33755

Merged
merged 11 commits into from
Jan 4, 2024
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ data:
connectorSubtype: database
connectorType: source
definitionId: b5ea17b1-f170-46dc-bc31-cc744ca984c1
dockerImageTag: 3.4.0
dockerImageTag: 3.4.1
dockerRepository: airbyte/source-mssql
documentationUrl: https://docs.airbyte.com/integrations/sources/mssql
githubIssueLabel: source-mssql
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -157,6 +157,8 @@ static Properties getDebeziumProperties(final JdbcDatabase database, final Confi
props.setProperty("provide.transaction.metadata", "false");

props.setProperty("converters", "mssql_converter");
props.setProperty("binary.handling.mode", "base64");

props.setProperty("mssql_converter.type", MssqlDebeziumConverter.class.getName());

// If new stream(s) are added after a previously successful sync,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,6 @@
import com.microsoft.sqlserver.jdbc.SQLServerResultSetMetaData;
import io.airbyte.cdk.db.jdbc.JdbcSourceOperations;
import io.airbyte.protocol.models.JsonSchemaType;
import java.nio.charset.Charset;
import java.sql.JDBCType;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
Expand All @@ -29,6 +28,7 @@
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import microsoft.sql.DateTimeOffset;
import org.apache.commons.codec.binary.Base64;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

Expand Down Expand Up @@ -128,7 +128,7 @@ protected void putBinary(final ObjectNode node,
final int index)
throws SQLException {
final byte[] bytes = resultSet.getBytes(index);
final String value = new String(bytes, Charset.defaultCharset());
final String value = Base64.encodeBase64String(bytes);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this I think is in line with what we do in other java connectors 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can add a test also for binary type

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that mean that a binary field in the source database is encoded as a AirbyteType String? Is that something that was discussed, or just how it was implemented for simplicity reasons?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the current behavior for binary column is undefined in our public doc (and see slack discussion with destination team here https://airbytehq-team.slack.com/archives/C03C4AVJWG4/p1700173565688209), it sounds like base64 is the direction we are heading to; so yes mostly for simplicity reason.

node.put(columnName, value);
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -257,7 +257,7 @@ protected void initTests() {
.sourceType("binary")
.airbyteType(JsonSchemaType.STRING_BASE_64)
.addInsertValues("CAST( 'A' AS BINARY(1))", "null")
.addExpectedValues("A", null)
.addExpectedValues("QQ==", null)
.createTablePatternSql(CREATE_TABLE_SQL)
.build());

Expand All @@ -267,7 +267,7 @@ protected void initTests() {
.fullSourceDataType("varbinary(3)")
.airbyteType(JsonSchemaType.STRING_BASE_64)
.addInsertValues("CAST( 'ABC' AS VARBINARY)", "null")
.addExpectedValues("ABC", null)
.addExpectedValues("QUJD", null)
.createTablePatternSql(CREATE_TABLE_SQL)
.build());

Expand Down
1 change: 1 addition & 0 deletions docs/integrations/sources/mssql.md
Original file line number Diff line number Diff line change
Expand Up @@ -342,6 +342,7 @@ WHERE actor_definition_id ='b5ea17b1-f170-46dc-bc31-cc744ca984c1' AND (configura

| Version | Date | Pull Request | Subject |
|:--------|:-----------|:------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------|
| 3.4.1 | 2024-01-02 | [33755](https://github.com/airbytehq/airbyte/pull/33755) | Encode binary to base64 format |
| 3.4.0 | 2023-12-19 | [33481](https://github.com/airbytehq/airbyte/pull/33481) | Remove LEGACY state flag |
| 3.3.2 | 2023-12-14 | [33505](https://github.com/airbytehq/airbyte/pull/33225) | Using the released CDK. |
| 3.3.1 | 2023-12-12 | [33225](https://github.com/airbytehq/airbyte/pull/33225) | extracting MsSql specific files out of the CDK. |
Expand Down
Loading