-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(low-code): added flatten_lists option to FlattenFields transformation #206
base: main
Are you sure you want to change the base?
Conversation
📝 WalkthroughWalkthroughThis pull request introduces a new Changes
Sequence DiagramsequenceDiagram
participant User
participant FlattenFields
participant Record
User->>FlattenFields: Configure flatten_lists
alt flatten_lists is True
FlattenFields->>Record: Flatten list structures
else flatten_lists is False
FlattenFields->>Record: Preserve original list structure
end
Possibly Related PRs
Suggested Reviewers
Hey there! 👋 I noticed you've added a really neat configuration option for list flattening. Quick question: have you considered adding some documentation about the new 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms (4)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
airbyte_cdk/sources/declarative/transformations/flatten_fields.py (1)
14-14
: Consider adding docstring for the new parameter?The new
flatten_lists
parameter could benefit from a docstring explaining its purpose and default behavior, wdyt? For example:""" :param flatten_lists: Whether to flatten list fields into separate fields with numeric indices. If True (default), a list [1, 2, 3] becomes {"0": 1, "1": 2, "2": 3}. If False, lists are preserved in their original form. """airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)
1883-1887
: The implementation looks good! Would you consider enhancing the documentation? wdyt?The property is well-structured and maintains backward compatibility. Consider adding more detailed documentation with examples to help users understand the behavior:
flatten_lists: title: Flatten Lists - description: Whether to flatten lists or leave it as is. Default is True. + description: | + Controls whether array fields should be flattened during transformation. When set to true (default), + arrays are flattened into dot-notation fields (e.g., {"array": [1, 2]} becomes {"array.1": 1, "array.2": 2}). + When set to false, arrays remain unchanged, which is useful for connectors like Airtable that require + preserving array structures. + examples: + - true # Flattens arrays: {"array": [1, 2]} -> {"array.1": 1, "array.2": 2} + - false # Preserves arrays: {"array": [1, 2]} -> {"array": [1, 2]}
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
(1 hunks)airbyte_cdk/sources/declarative/models/declarative_component_schema.py
(1 hunks)airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
(1 hunks)airbyte_cdk/sources/declarative/transformations/flatten_fields.py
(2 hunks)unit_tests/sources/declarative/transformations/test_flatten_fields.py
(3 hunks)
🧰 Additional context used
🪛 GitHub Actions: Linters
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
[error] 636-636: Argument "flatten_lists" to "FlattenFields" has incompatible type "bool | None"; expected "bool"
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (3)
airbyte_cdk/sources/declarative/transformations/flatten_fields.py (1)
44-44
: LGTM! Clean implementation of the conditional list flatteningThe addition of
and self.flatten_lists
condition is elegant and maintains backward compatibility.unit_tests/sources/declarative/transformations/test_flatten_fields.py (1)
53-83
: LGTM! Great test coverageThe new test cases thoroughly cover both simple and nested list scenarios when
flatten_lists=False
. Particularly good to see the mixed case where some fields are flattened while lists are preserved.airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
726-730
: LGTM! Well-documented schema fieldThe
flatten_lists
field is properly defined with clear description and correct default value.
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
Outdated
Show resolved
Hide resolved
/autofix
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the exhaustive test cases. I'll approve even regardless of some nit comments
[ | ||
({"FirstName": "John", "LastName": "Doe"}, {"FirstName": "John", "LastName": "Doe"}), | ||
({"123Number": 123, "456Another123": 456}, {"123Number": 123, "456Another123": 456}), | ||
(True, {"FirstName": "John", "LastName": "Doe"}, {"FirstName": "John", "LastName": "Doe"}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really attached to this but I'm wondering what is your opinion on this. It is becoming harder and harder when we have a lot of cases to relate to which parameter is what. Do you think it would be overkill to have _FLATTEN_LISTS = True
and _DO_NOT_FLATTEN_LISTS = False
and use them here for extra clarity? Or who it be better to have ids for each test cases? Or should we do both?
@@ -11,6 +11,8 @@ | |||
|
|||
@dataclass | |||
class FlattenFields(RecordTransformation): | |||
flatten_lists: bool = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we make this a private fields by prefixing this by _
?
Problem
FlattenFields
transformation updates array fields by defaultfrom
{"array": [1, 2, 3]}
to
{"array.1": 1, "array.2": 2, "array.3": 3}
(for more examples see
unit_tests/sources/declarative/transformations/test_flatten_fields.py
)Some connectors (e.g
airtable
) don't need such transformation of arrays, we just need to keep it as is.Solution
Added
flatten_lists
toFlattenFields
(default is True) to be able to set up needed behavior of the transformation.Summary by CodeRabbit
New Features
flatten_lists
configuration to control list flattening behavior during data transformations.Tests
flatten_lists
configuration option, including scenarios for bothTrue
andFalse
settings.