doc(fivetran_sdk): Updated development guide for partners #43

Closed
wants to merge 20 commits into from
Changes from 18 commits
49 changes: 43 additions & 6 deletions development-guide.md
@@ -61,13 +61,32 @@ The following are hard requirements to be able to deploy Partner code to Fivetra
- Encrypt HTTP requests: Things like URLs, URL parameters, and query params are always encrypted for logging, and customer approval is needed to decrypt and examine them.


## Connector Guidelines
## Setup Form Guidelines
- Keep the form clear and concise, only requesting essential information for successful connector setup.
- Use clear and descriptive labels for each form field. Avoid technical jargon if possible.
- Organize the fields in a logical order that reflects the setup process.

### RPC Calls
#### ConfigurationForm
This operation retrieves all the setup form fields and tests information. You can provide various parameters for the fields to enhance the user experience, such as descriptions, optional fields, and more.

#### Test
The `ConfigurationForm` call retrieves the tests that need to be executed during connection setup. This operation then runs a test with the customer's credentials as parameters and should return a success or failure indication for the test execution.
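
The relationship between the two calls can be sketched in plain Python. This is only an illustration: `FormField`, `SetupTest`, `configuration_form`, `run_test`, and the `api_key`/`region` fields are hypothetical names, not the generated SDK classes or RPC signatures.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class FormField:
    """One setup form field (hypothetical stand-in for the SDK message)."""
    name: str
    label: str
    required: bool = True
    description: str = ""


@dataclass
class SetupTest:
    """One test the platform should run during connection setup."""
    name: str
    label: str


def configuration_form() -> Tuple[List[FormField], List[SetupTest]]:
    """Return the form fields and the tests to run, as ConfigurationForm would."""
    fields = [
        FormField("api_key", "API key", description="Key used to call the source API"),
        FormField("region", "Region", required=False),
    ]
    tests = [SetupTest("connect", "Test connection")]
    return fields, tests


def run_test(name: str, configuration: Dict[str, str]) -> Tuple[bool, str]:
    """Run one named test with the customer's configuration.

    Returns (success, failure_message); the message is empty on success.
    """
    if name != "connect":
        return False, f"unknown test: {name}"
    if not configuration.get("api_key"):
        return False, "API key is missing"
    return True, ""
```

Returning a short, human-readable failure message alongside the failure flag keeps the setup form useful to customers who cannot see partner logs.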

## Source Connector Guidelines

- Don't push anything other than source data to the destination. State will be saved to production DB and returned in `UpdateRequest`.
- Don't forget to handle new schemas/tables/columns per the information and user choices in `UpdateRequest#selection`.
- Make sure you checkpoint at least once an hour. The more frequently you do it, the better.
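
One way to meet the checkpoint guideline is to interleave checkpoints with records on a timer. The sketch below is an assumption-laden illustration: `with_checkpoints`, the 15-minute interval, and the use of a record count as the saved state are all hypothetical choices, not part of the SDK.

```python
import time
from typing import Any, Callable, Iterable, Iterator, Tuple

# Assumed interval; the guideline only requires at least one checkpoint per hour.
CHECKPOINT_INTERVAL_SECONDS = 15 * 60


def with_checkpoints(
    records: Iterable[Any],
    interval: float = CHECKPOINT_INTERVAL_SECONDS,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Tuple[str, Any]]:
    """Yield ("record", r) events with ("checkpoint", n) events mixed in.

    n is the number of records emitted so far, standing in for real state.
    """
    last_time = clock()
    count = 0
    checkpointed = 0
    for record in records:
        count += 1
        yield ("record", record)
        now = clock()
        if now - last_time >= interval:
            yield ("checkpoint", count)
            last_time = now
            checkpointed = count
    # Always finish with a checkpoint unless the last record already had one.
    if count != checkpointed:
        yield ("checkpoint", count)
```

A real connector would translate each event into the corresponding streamed response and store a resumable cursor rather than a count.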

## Destination Guidelines
### RPC Calls
#### Schema
This operation retrieves the customer's schemas, tables, and columns. It also includes an optional `selection_not_supported` field that indicates whether customers can select or deselect tables and columns within the Fivetran dashboard.

#### Update
This operation should retrieve data from the source. We send a request using the `UpdateRequest` message, which includes the customer's state, credentials, and schema information. The response, streaming through the `UpdateResponse` message, can contain data records and other supported operations.

## Destination Connector Guidelines

- Do not push anything other than source data to the destination.

@@ -88,7 +107,7 @@ Batch files are compressed using [ZSTD](https://en.wikipedia.org/wiki/Zstd)
### Batch Files
- Each batch file is size limited to 100MB
- Number of records in each batch file can vary depending on row size
- Currently we only support CSV file format
- Currently we support CSV and PARQUET file formats

#### CSV
- Fivetran creates batch files using `com.fasterxml.jackson.dataformat.csv.CsvSchema`, which by default doesn't treat backslash as an escape character. If you read the batch files, make sure that you don't treat backslash as an escape character either.
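
As an illustration of the backslash rule, here is a minimal Python reader. It assumes the batch file has already been decrypted and decompressed into text, and that embedded quotes are written as doubled quotes; `read_batch_rows` and the sample data are hypothetical.

```python
import csv
import io
from typing import List


def read_batch_rows(text: str) -> List[List[str]]:
    """Parse CSV text while keeping backslashes as literal characters."""
    # escapechar=None is the csv module default; it is spelled out here to make
    # the "no backslash escaping" requirement explicit.
    reader = csv.reader(io.StringIO(text, newline=""), escapechar=None, doublequote=True)
    return list(reader)


# Sample content: a Windows-style path and a value that ends in a backslash.
sample = 'id,path\r\n1,"C:\\temp\\new, folder"\r\n2,"say ""hi""\\"\r\n'
rows = read_batch_rows(sample)
```

A parser that treats backslash as an escape character would read the trailing `\"` in the last row as an escaped quote and misparse the record.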
@@ -105,13 +124,18 @@ This operation should report all columns in the destination table, including Fiv
- This operation might be requested for a table that does not exist in the destination. In that case, it should NOT fail, simply ignore the request and return `success = true`.
- `utc_delete_before` has millisecond precision.

#### WriteBatchRequest
- `replace_files` is for `upsert` operation where the rows should be inserted if they don't exist or updated if they do. Each row will always provide values for all columns. Set the `_fivetran_synced` column in the destination with the values coming in from the csv files.
#### WriteBatch
This operation provides details about the batch files containing the records to be pushed to the destination. We provide the `WriteBatchRequest` parameter that contains all the information required for you to read the batch files. Here are some of the fields included in the request message:
- `replace_files` is for the `upsert` operation where the rows should be inserted if they don't exist or updated if they do. Each row will always provide values for all columns. Set the `_fivetran_synced` column in the destination with the values coming in from the .CSV files.

- `update_files` is for `update` operation where modified columns have actual values whereas unmodified columns have the special value `unmodified_string` in `CsvFileParams`. Soft-deleted rows will arrive in here as well. Update the `_fivetran_synced` column in the destination with the values coming in from the csv files.
- `update_files` is for the `update` operation where modified columns have actual values whereas unmodified columns have the special value `unmodified_string` in `CsvFileParams`. Soft-deleted rows will arrive in here as well. Update the `_fivetran_synced` column in the destination with the values coming in from the .CSV files.

- `delete_files` is for the `hard delete` operation. Use primary key columns (or the `_fivetran_id` system column for primary-keyless tables) to perform `DELETE FROM`.

- `keys` is a map that provides a list of secret keys, one for each batch file, that can be used to decrypt them.

- `file_params` provides information about the file type and any configurations applied to it, such as encryption or compression.

Also, Fivetran will deduplicate operations such that each primary key will show up only once in any of the operations.
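
The semantics of the three file groups can be sketched against an in-memory table keyed by primary key. This is a simplified model under stated assumptions: `apply_batch` is a hypothetical helper, rows are already parsed into dicts, `"<UNMODIFIED>"` is a placeholder for the real `unmodified_string` value, and skipping updates for rows that are not present is an assumption rather than documented behavior.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, str]
UNMODIFIED = "<UNMODIFIED>"  # placeholder; the real value comes from CsvFileParams


def apply_batch(
    table: Dict[Tuple[str, ...], Row],
    key_columns: List[str],
    replace_rows: Iterable[Row] = (),
    update_rows: Iterable[Row] = (),
    delete_rows: Iterable[Row] = (),
    unmodified: str = UNMODIFIED,
) -> Dict[Tuple[str, ...], Row]:
    """Apply upserts, partial updates, and hard deletes to `table` in place."""

    def key_of(row: Row) -> Tuple[str, ...]:
        return tuple(row[column] for column in key_columns)

    # replace_files: every column is present, so the whole row is overwritten.
    for row in replace_rows:
        table[key_of(row)] = dict(row)

    # update_files: only columns that differ from the marker are written.
    for row in update_rows:
        existing = table.get(key_of(row))
        if existing is None:
            continue  # assumption: nothing to update
        for column, value in row.items():
            if value != unmodified:
                existing[column] = value

    # delete_files: remove the row identified by its key columns.
    for row in delete_rows:
        table.pop(key_of(row), None)

    return table
```

For example, an update row that carries a new `_fivetran_synced` value and the marker in every other column changes only `_fivetran_synced` in the stored row.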

Do not assume order of columns in the batch files. Always read the CSV file header to determine column order.

Suggested change
Do not assume order of columns in the batch files. Always read the CSV file header to determine column order.
Do not assume order of columns in the batch files. Always read the columns of batch file to determine column order.

@@ -120,6 +144,19 @@ Do not assume order of columns in the batch files. Always read the CSV file head
- `null_string` value is used to represent `NULL` value in all batch files.
- `unmodified_string` value is used to indicate columns in `update_files` where the values did not change.
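
A minimal sketch of decoding these markers while taking column order from the file itself. It assumes the file is already decrypted and decompressed CSV text; `decode_rows`, the `UNMODIFIED` sentinel, and the `<NULL>`/`<UNMODIFIED>` marker values in the usage note are hypothetical placeholders for the values supplied in the request.

```python
import csv
import io
from typing import Any, Dict, Iterator

# Sentinel object so an unchanged column can never be confused with real data.
UNMODIFIED = object()


def decode_rows(text: str, null_string: str, unmodified_string: str) -> Iterator[Dict[str, Any]]:
    """Yield one dict per record, keyed by the column names in the header row."""
    # DictReader takes the column order from the header, so nothing is assumed.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    for raw in reader:
        row: Dict[str, Any] = {}
        for column, value in raw.items():
            if value == null_string:
                row[column] = None
            elif value == unmodified_string:
                row[column] = UNMODIFIED
            else:
                row[column] = value
        yield row
```

Calling `decode_rows("name,id\r\n<NULL>,1\r\n", "<NULL>", "<UNMODIFIED>")` yields a single row whose `name` is `None` and whose `id` is the string `"1"`.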

#### Capabilities
This operation lets the partner code declare its choices for the capabilities listed below:

- Datatype Mappings: Provides the option to map destination data types to Fivetran data types.
- Max value for columns: Provides an option to specify the maximum value for each data type.

#### AlterTable
This operation is used to communicate changes to a table as specific update operations. The `SchemaDiff` message within the `AlterTableRequest` parameter provides the details:
- Adding a column (`add_column`): Fivetran uses this field to provide information about a new column to be added in a destination table.
- Update column type (`change_column_type`): This field provides information on the updated type of a column in the source that needs to be reflected in a destination table.
- Primary key updates (`updated_primary_keys`): If the primary key has changed, this field lists all the columns used in the updated primary key.
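
The three kinds of change can be modeled against a simple in-memory table definition. This sketch uses plain Python values instead of the generated protobuf messages; `apply_schema_diff` and its keyword arguments are hypothetical and only mirror the field names listed above.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def apply_schema_diff(
    columns: Dict[str, str],
    primary_keys: Sequence[str],
    *,
    add_column: Optional[Tuple[str, str]] = None,
    change_column_type: Optional[Tuple[str, str]] = None,
    updated_primary_keys: Optional[Sequence[str]] = None,
) -> Tuple[Dict[str, str], List[str]]:
    """Return a new (columns, primary_keys) pair with the requested change applied.

    `columns` maps column name to type name; each (name, type) tuple describes
    one column change.
    """
    new_columns = dict(columns)
    new_keys = list(primary_keys)

    if add_column is not None:
        name, type_name = add_column
        # setdefault keeps the call idempotent if the column already exists.
        new_columns.setdefault(name, type_name)

    if change_column_type is not None:
        name, type_name = change_column_type
        if name not in new_columns:
            raise KeyError(f"cannot change type of unknown column: {name}")
        new_columns[name] = type_name

    if updated_primary_keys is not None:
        missing = [key for key in updated_primary_keys if key not in new_columns]
        if missing:
            raise KeyError(f"primary key columns not in table: {missing}")
        # The list replaces the old key entirely; it is not merged with it.
        new_keys = list(updated_primary_keys)

    return new_columns, new_keys
```

A real destination would translate each branch into the matching DDL statement for its database, which varies by SQL dialect.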


### Examples of Data Types
Examples of each [DataType](https://github.com/fivetran/fivetran_sdk/blob/main/common.proto#L73C6-L73C14) as they would appear in CSV batch files are as follows:
- UNSPECIFIED: This data type will never appear in batch files