Skip to content
This repository has been archived by the owner on May 27, 2024. It is now read-only.

BOR-2992: Set destination stream buffer size to 5mb #3

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

hemant6488
Copy link

No description provided.

@hemant6488 hemant6488 changed the title Set destination stream buffer size to 5mb BOR-2992: Set destination stream buffer size to 5mb Sep 12, 2022
@github-actions
Copy link

NOTE ⚠️ Changes in this PR affect the following connectors. Make sure to run corresponding integration tests:

  • source-postgres
  • destination-pubsub
  • destination-s3
  • source-sftp
  • source-clickhouse-strict-encrypt
  • destination-redis
  • destination-pulsar
  • destination-azure-blob-storage
  • destination-keen
  • source-tidb
  • source-snowflake
  • source-clickhouse
  • destination-kinesis
  • source-bigquery
  • destination-cassandra
  • source-cockroachdb-strict-encrypt
  • source-oracle
  • source-db2
  • source-scaffold-java-jdbc
  • source-e2e-test
  • source-postgres-strict-encrypt
  • source-cockroachdb
  • destination-jdbc
  • destination-meilisearch
  • destination-mqtt
  • destination-clickhouse-strict-encrypt
  • destination-mariadb-columnstore
  • source-e2e-test-cloud
  • destination-local-json
  • destination-dev-null
  • destination-scylla
  • source-mysql
  • destination-tidb
  • source-relational-db
  • source-alloydb
  • source-mssql-strict-encrypt
  • destination-postgres-strict-encrypt
  • destination-mysql-strict-encrypt
  • destination-elasticsearch
  • destination-mssql
  • destination-kafka
  • destination-oracle
  • destination-postgres
  • source-mysql-strict-encrypt
  • source-mongodb-strict-encrypt
  • destination-mongodb
  • destination-dynamodb
  • destination-mssql-strict-encrypt
  • source-kafka
  • destination-rockset
  • destination-e2e-test
  • destination-snowflake
  • source-oracle-strict-encrypt
  • destination-redshift
  • source-db2-strict-encrypt
  • destination-databricks
  • destination-gcs
  • source-mssql
  • destination-mysql
  • destination-oracle-strict-encrypt
  • destination-clickhouse
  • destination-csv
  • destination-mongodb-strict-encrypt
  • source-redshift
  • destination-bigquery-denormalized
  • source-mongodb-v2
  • destination-bigquery
  • source-elasticsearch
  • source-jdbc

@hemant6488
Copy link
Author

@void666 @jhecking This cannot be configured via environment variables, because env variables are never passed down to the connectors, they're handled at the airbyte-worker level. Plus this exists in the connector base interface, so I don't think it's a good idea to use the airbyte-config package inside the bases.

@jhecking
Copy link

As discussed, we don't need to use env variables. We can pass the desired max. buffer size as part of the connector config. The new config value also has to be added to the spec for the destination-s3 connector.

Comment on lines +52 to 65
public FileBuffer(final String fileExtension, final Integer perStreamBufferSizeMb) {
this.fileExtension = fileExtension;
this.maxConcurrentStreams = DEFAULT_MAX_CONCURRENT_STREAM_IN_BUFFER;
this.maxPerStreamBufferSizeBytes = perStreamBufferSizeMb * 1024 * 1024;
tempFile = null;
outputStream = null;
}

public FileBuffer(final String fileExtension, final int maxConcurrentStreams) {
this.fileExtension = fileExtension;
this.maxConcurrentStreams = maxConcurrentStreams;
this.maxPerStreamBufferSizeBytes = MAX_PER_STREAM_BUFFER_SIZE_BYTES;
tempFile = null;
outputStream = null;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I am understanding how the constructor overriding will work here
Both are of the same type, both are accepting two parameters.

FileBuffer(xlsx, 100) -> intending to override the maxConcurrentStreams
FileBuffer(xlsx, 200)  -> intending to override the perStreamBufferSizeMb

I would recommend using a three-parameter constructor, instead of overriding the two-parameter one. Since the types are castable (Integer -> int, and int -> Integer), the other usages will have an impact if we go with two parameter overrides.

@@ -8,10 +8,13 @@

public interface S3FormatConfig {

Integer DEFAULT_FILE_CHUNK_SIZE_MB = 200;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be a derived value from FileBuffer's MAX_PER_STREAM_BUFFER_SIZE_BYTES. If the default value is changed there, should be reflected here as well.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
3 participants