fix(table_metadata): longType truncate the ROW/MAP structure #31587

Open, wants to merge 3 commits into master


justinpark
Member

@justinpark justinpark commented Dec 20, 2024

SUMMARY

Currently, the column-type stringification does not expose the nested structure of multi-level MAP and ROW types: longType is truncated to the bare type name.
This change makes longType include the full structure of MAP/ROW types in Trino. (FYI: Superset does NOT consume the longType value yet, but it will be used for #26395.)

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

api/v1/database//table_metadata

Before:

    {
      "comment": null,
      "keys": [],
      "longType": "BOOLEAN",
      "name": "is_boolean",
      "type": "BOOLEAN"
    },
    {
      "comment": null,
      "keys": [],
      "longType": "ROW",
      "name": "map_value",
      "type": "ROW"
    },

After:

    {
      "comment": null,
      "keys": [],
      "longType": "BOOLEAN()",
      "name": "is_boolean",
      "type": "BOOLEAN"
    },
    {
      "comment": null,
      "keys": [],
      "longType": "ROW([('reason', VARCHAR()), ('response', VARCHAR()), ('is_bool', BOOLEAN()), ('ids', ARRAY(VARCHAR()))])",
      "name": "map_value",
      "type": "ROW"
    },

TESTING INSTRUCTIONS

Unit tests have been added.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@korbit-ai korbit-ai bot left a comment

I've completed my review and didn't find any issues... but I did find this penguin.

 __
( o>
///\
\V_/_
Suppressed issues based on your team's Korbit activity

  • This issue (lines 59:61): Using a broad catch-all Exception without logging or providing context about the error
  • Is similar to: Silent Exception Suppression
  • Because: Similar issues were not addressed in the past

When you react to issues (for example, an upvote or downvote) or you fix them, Korbit will tune future reviews based on these signals.

Files scanned
File Path Reviewed
superset/databases/utils.py



@michael-s-molina michael-s-molina added the review:checkpoint Last PR reviewed during the daily review standup label Dec 20, 2024
codecov bot commented Dec 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.32%. Comparing base (76d897e) to head (a996ad2).
Report is 1387 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff             @@
##           master   #31587       +/-   ##
===========================================
+ Coverage   60.48%   83.32%   +22.83%     
===========================================
  Files        1931      544     -1387     
  Lines       76236    38927    -37309     
  Branches     8568        0     -8568     
===========================================
- Hits        46114    32436    -13678     
+ Misses      28017     6491    -21526     
+ Partials     2105        0     -2105     
Flag Coverage Δ
hive 48.50% <0.00%> (-0.66%) ⬇️
javascript ?
mysql 75.84% <100.00%> (?)
postgres 75.91% <100.00%> (?)
presto 53.05% <0.00%> (-0.75%) ⬇️
python 83.32% <100.00%> (+19.83%) ⬆️
sqlite 75.43% <100.00%> (?)
unit 60.79% <100.00%> (+3.17%) ⬆️


@justinpark justinpark force-pushed the fix--col-long-data-type branch 2 times, most recently from d004737 to fa5c083 Compare January 6, 2025 22:57
@@ -55,7 +55,7 @@ def get_indexes_metadata(

 def get_col_type(col: dict[Any, Any]) -> str:
     try:
-        dtype = f"{col['type']}"
+        dtype = f"{[col['type']]}"[1:-1]

Member

Hard for me to make sense of this line of code before and after the change. Maybe needs a comment, docstring and/or better typing?

Also if this is solving a trino-specific issue, does it belong somewhere in db_engine_spec?
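
For readers puzzling over the changed line, here is a minimal sketch of why the two forms can differ. Formatting an object in an f-string goes through `str()`, while formatting a list renders each element with `repr()`; the `[1:-1]` slice then strips the surrounding brackets. `FakeRowType` is a hypothetical stand-in, not a real SQLAlchemy or Trino class, and its `__str__`/`__repr__` outputs are assumptions chosen to mimic the before/after shapes in the PR description.

```python
class FakeRowType:
    """Hypothetical stand-in for a column type object (not a real library class)."""

    def __init__(self, fields):
        self.fields = fields

    def __str__(self):
        # Assumed short form, like the truncated longType shown in "Before".
        return "ROW"

    def __repr__(self):
        # Assumed long form that includes the nested structure.
        return f"ROW({self.fields!r})"


col = {"type": FakeRowType([("reason", "VARCHAR"), ("is_bool", "BOOLEAN")])}

old_style = f"{col['type']}"          # f-string formatting falls back to str()
new_style = f"{[col['type']]}"[1:-1]  # a list renders its elements with repr()
explicit = repr(col["type"])          # same text without the list round-trip

print(old_style)
print(new_style)
```

Under these assumptions the list trick and a plain `repr()` call give the same string, which may be the easier form to read and document.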

@sadpandajoe sadpandajoe removed the review:checkpoint Last PR reviewed during the daily review standup label Jan 7, 2025
@justinpark justinpark force-pushed the fix--col-long-data-type branch from fa5c083 to caa2c39 Compare January 7, 2025 23:24
@mistercrunch
Member

Not directly related, but while trying to add support for Python 3.12 in another PR, I noticed a few things that are related to this:

  • newer versions of pandas remove support for passing SQLAlchemy types to to_sql's dtype arg
  • some Presto-specific code around the codebase does just this: DateTime if "presto" else string ...

@justinpark justinpark force-pushed the fix--col-long-data-type branch from caa2c39 to f36bec0 Compare January 8, 2025 18:18
@justinpark justinpark force-pushed the fix--col-long-data-type branch from f36bec0 to a996ad2 Compare January 30, 2025 22:58
@justinpark
Member Author

justinpark commented Jan 30, 2025

@mistercrunch In that case, what do you think about calling repr() directly?
For SQLAlchemy types, repr carries information that matches longType better than str does, and I don't think it's risky to switch the part used for longType to repr().
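
A rough sketch of what the repr-based variant could look like. This is not the code in the PR: `get_col_type_repr`, the logged fallback to the class name, and the two stand-in type classes are all hypothetical, and the fallback behavior is an assumption about what a caller would want when a type object cannot be rendered.

```python
import logging
from typing import Any

logger = logging.getLogger(__name__)


def get_col_type_repr(col: dict[Any, Any]) -> str:
    """Hypothetical variant: build the long type string from repr()."""
    try:
        return repr(col["type"])
    except Exception:  # assumption: some type objects may fail to render
        # Log the failure instead of silently swallowing it, then fall back
        # to the bare class name.
        logger.warning("Could not render column type", exc_info=True)
        return col["type"].__class__.__name__


class FakeBoolean:
    """Stand-in type whose repr mimics the 'After' output in the description."""

    def __repr__(self):
        return "BOOLEAN()"


class BrokenType:
    """Stand-in type whose repr raises, to exercise the fallback path."""

    def __repr__(self):
        raise ValueError("cannot render")


print(get_col_type_repr({"type": FakeBoolean()}))
print(get_col_type_repr({"type": BrokenType()}))
```

Calling `repr()` explicitly states the intent in the code itself, so it would not need the list-and-slice step to be explained in a comment.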

4 participants