-
Notifications
You must be signed in to change notification settings - Fork 3
Conversation
|
||
@Configuration | ||
@ConfigurationProperties | ||
public class OrphanedDataStrategyConfig { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so we need a suffix on this one since it's not the actual OrphanedDataStrategy
. I see we have one other class in this package called EventReceiverConfiguration
- should we follow the naming convention and change Config
->Configuration
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes good point, one day I'll name these config classes appropriately...
import com.hotels.bdp.circustrain.api.conf.OrphanedDataStrategy; | ||
|
||
@Configuration | ||
@ConfigurationProperties |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this need a prefix "orphaned-data-strategy" to get bound to the value from the YAML?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so because in this case there are no "sub" properties. Spring is clever enough to know to map the yaml property "orphaned-data-strategy" to the field in here called "orphanedDataStrategy". I've tested this out on EMR doing an actual replication and it works. Shame there's no integration tests in SY like in CT to test out different config settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think can be separate we already have an issue: #5
@@ -147,6 +164,7 @@ The table below describes all the available configuration values for Shunting Ya | |||
|`sqs.queue`|Yes|Fully qualified URI of the [AWS SQS](https://aws.amazon.com/sqs/) Queue to read the Hive events from.| | |||
|`sqs.wait.time.seconds`|No|Wait time in seconds for which the receiver will poll the SQS queue for a batch of messages. Default is 10 seconds. Read more about long polling with AWS SQS [here](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-long-polling.html).| | |||
|`source-table-filter.table-names`|No|A list of tables selected for Shunting Yard replication. Supported format: `database_1.table_1, database_2.table_2`. If these are not provided, Shunting Yard will not replicate any tables.| | |||
|`orphaned-data-strategy`|No|Orphaned data strategy to use for replications. Possible values: `NONE` and `HOUSEKEEPING`. Default is `HOUSEKEEPING`.| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm missing where this default happens in the code? I see you've added the ability to set an orphaned data strategy but is there a test (maybe an existing one so it doesn't show up in the PR) that shows that if you don't set this then the value is HOUSEKEEPING
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this happens in OrphanedDataStrategyConfig
line 26 the default is if no yaml configuration overrides this initial setting.
So there's no test for this because i guess it would require an integration test of sorts in order to test the mapping from yaml to bean, and we don't have these in SY at the moment. should I make it part of this PR to add these?
README.md
Outdated
@@ -163,6 +181,18 @@ Graphite configuration can be passed to Shunting Yard using an optional `--ct-co | |||
namespace: com.company.shuntingyard | |||
prefix: dev | |||
|
|||
### Configuring table transformations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section in not required to be added coz it's already available in CT's readme.
Instead, I suggest we add a section after Housekeeping
section below titled "Using Beekeeper for Housekeeping" and then provide the table-properties
that need to be set in order to activate Beekeeper for your target tables / refer to the README of Beekeeper where you mention it. Similar to how its mentioned for configuring Housekeeping: https://github.com/ExpediaGroup/shunting-yard/blob/master/README.md#sample-ct-configyml-for-housekeeping
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will update this
@@ -96,6 +96,7 @@ public void withMultiValuePartitionFilter() { | |||
assertThat(replication.getReplicaTable().getDatabaseName()).isEqualTo("replicaDatabaseName"); | |||
assertThat(replication.getReplicaTable().getTableName()).isEqualTo("replicaTableName"); | |||
assertThat(replication.getReplicaTable().getTableLocation()).isEqualTo("replicaTableLocation"); | |||
assertThat(replication.getOrphanedDataStrategy()).isEqualTo(OrphanedDataStrategy.HOUSEKEEPING); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will be good to add a test when OrphanedDataStrategy
is null.
The CircusTrainConfig
doesn't do a null check on this argument at the moment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, will add this in, thanks
Co-Authored-By: Abhimanyu Gupta <[email protected]>
Co-Authored-By: Abhimanyu Gupta <[email protected]>
…unting-yard into support-table-parameters
Co-Authored-By: Abhimanyu Gupta <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, let's just leave this to be updated and merged once we've released a new version of CT.
This now uses the latest Circus Train version, also tested this on EMR and it works, so going to merge. |
Making a small update for SY to support the latest Circus Train and to allow skipping CT's housekeeping process.
Main change is in
CircusTrainRunner