Workbench: Implement a more deterministic solution for validating COType #6198

Open
sharadsw opened this issue Feb 5, 2025 · 0 comments
Labels: 1 - Enhancement (Improvements or extensions to existing behavior), 2 - WorkBench (Issues that are related to the WorkBench)

sharadsw commented Feb 5, 2025

Follow-up to #6194

As a quick fix, COType validation uses a naive approach to find the COType value in a given row: it picks the COType column by matching each column name against a fixed list of strings. There could be scenarios where this is too simplistic, since the lookup depends entirely on how the column happens to be named. This can be improved by passing a parameter to ScopedTreeRecord that indicates which column holds the COType, ideally obtained by parsing the rest of the dataset beforehand.

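from typing import Dict, Optional, Tuple

Row = Dict[str, str]  # assumed shape: column name -> cell value

# Column names treated as the COType column (exact string match)
COL_NAMES = ["Type", "Collection Object Type"]

def find_cotype_in_row(row: Row) -> Optional[Tuple[str, str]]:
    # Return (column name, value) for the first matching column, else None
    for col_name, value in row.items():
        if col_name in COL_NAMES:
            return col_name, value
    return None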
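
A rough sketch of the direction proposed above, not the actual Specify implementation: the COType column is resolved once per dataset and then passed in explicitly, instead of being guessed from each row. The names `resolve_cotype_column`, `get_cotype`, and `mapped_column` are hypothetical, `Row` is assumed to be a plain dict of column name to value, and how the resolved column would actually reach ScopedTreeRecord is left open.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, str]  # assumed shape: column name -> cell value

# Fallback names, only used when no explicit mapping is available
COL_NAMES = ["Type", "Collection Object Type"]


def resolve_cotype_column(
    headers: List[str], mapped_column: Optional[str] = None
) -> Optional[str]:
    """Decide once per dataset which column holds the COType value.

    `mapped_column` is a hypothetical value obtained from parsing the
    dataset beforehand; it takes priority over name matching.
    """
    if mapped_column is not None:
        # Trust the explicit mapping only if the column really exists
        return mapped_column if mapped_column in headers else None

    candidates = [h for h in headers if h in COL_NAMES]
    if len(candidates) > 1:
        # Refuse to guess when several columns match by name
        raise ValueError(f"Ambiguous COType columns: {candidates}")
    return candidates[0] if candidates else None


def get_cotype(row: Row, cotype_column: Optional[str]) -> Optional[Tuple[str, str]]:
    """Read the COType from a row using the column resolved up front."""
    if cotype_column is None or cotype_column not in row:
        return None
    return cotype_column, row[cotype_column]


# Usage: resolve once, then reuse the result for every row
headers = ["Catalog Number", "COT", "Type"]
column = resolve_cotype_column(headers, mapped_column="COT")
result = get_cotype({"Catalog Number": "1", "COT": "Mineral", "Type": "x"}, column)
```

Resolving the column once makes the result independent of row ordering and of incidental column names, and an ambiguous dataset fails loudly instead of silently picking the first match.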

sharadsw added the 1 - Enhancement and 2 - WorkBench labels Feb 5, 2025
sharadsw self-assigned this Feb 6, 2025