Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add encoder/decoder in structureDataset for snowflake. #1811

Closed
wants to merge 16 commits into from

Conversation

hhcs9527
Copy link
Contributor

@hhcs9527 hhcs9527 commented Aug 30, 2023

TL;DR

  1. Add encoder/decoder in structureDataset for the snowflake.
  2. Currently support the new protocol, snowflake, and sf.
    we can add them here, the default is a list of additional protocols, like the following.
image

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Complete description

Example code for user, we can parse the uro with the following format of uri,
"snowflake://{dummy_user}:{dummy_account}/{dummy_warehouse}/{dummy_database}/{dummy_schema}/{dummy_table}"

import mock
import pandas as pd
import pytest
from typing_extensions import Annotated

from flytekit import StructuredDataset, kwtypes, task, workflow, Secret

@task(secret_requests=[Secret(group="snowflake", key="private_key.pem")])
def t1() -> StructuredDataset:
    df = pd.DataFrame({"NAME": ["Tom", "Joseph"], "AGE": [220, 27]})
    return StructuredDataset(
        dataframe=df, uri="sf://phuang9527:rq06467.us-east4.gcp/COMPUTE_WH/AGENT/PEOPLE/PEOPLE"
    )

@task(secret_requests=[Secret(group="snowflake", key="private_key.pem")])
def t2() -> StructuredDataset:
    df = pd.DataFrame({"NAME": ["Tom", "Joseph"], "AGE": [235, 27]})
    return StructuredDataset(
        dataframe=df, uri="snowflake://phuang9527:rq06467.us-east4.gcp/COMPUTE_WH/AGENT/PEOPLE/PEOPLE"
    )

@task(secret_requests=[Secret(group="snowflake", key="private_key.pem")])
def t3(sd: StructuredDataset) -> pd.DataFrame:
    return sd.open(pd.DataFrame).all() # we will load the whole table for user


@workflow
def wf() -> pd.DataFrame:
    t1()
    return t3(sd=t2())

if __name__ == "__main__":
    wf()

Tracking Issue

Flyte Agent Ecosystem

@codecov
Copy link

codecov bot commented Aug 30, 2023

Codecov Report

Attention: Patch coverage is 59.72222% with 29 lines in your changes missing coverage. Please review.

Project coverage is 54.84%. Comparing base (660f5aa) to head (0b47872).
Report is 671 commits behind head on master.

Files with missing lines Patch % Lines
flytekit/types/structured/snowflake.py 45.65% 25 Missing ⚠️
flytekit/core/type_engine.py 50.00% 2 Missing ⚠️
flytekit/types/structured/__init__.py 71.42% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1811      +/-   ##
==========================================
+ Coverage   54.33%   54.84%   +0.51%     
==========================================
  Files         288      295       +7     
  Lines       21934    22282     +348     
  Branches     3344     3357      +13     
==========================================
+ Hits        11917    12221     +304     
- Misses       9855     9902      +47     
+ Partials      162      159       -3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@hhcs9527 hhcs9527 mentioned this pull request Aug 30, 2023
8 tasks
@hhcs9527 hhcs9527 marked this pull request as ready for review August 31, 2023 09:48
@hhcs9527
Copy link
Contributor Author

Hi @pingsutw , just finished add ingencoder/decoder in structureDataset for snowflake and unit test, please approve the testing and take a look.

@hhcs9527 hhcs9527 requested a review from pingsutw September 1, 2023 04:38
@hhcs9527
Copy link
Contributor Author

hhcs9527 commented Sep 8, 2023

Hi @pingsutw, please approve the CI test.

@hhcs9527 hhcs9527 force-pushed the snowflake-structure-dataset branch from 7fac349 to 09052cb Compare September 10, 2023 09:39
@hhcs9527
Copy link
Contributor Author

Hi @pingsutw, please approve the CI test.

@hhcs9527
Copy link
Contributor Author

Hi @pingsutw, just finished the changes, Please take a look.

@hhcs9527
Copy link
Contributor Author

Hi @pingsutw, just finished the changes to get, please review.
Thanks a lot!

pingsutw
pingsutw previously approved these changes Sep 14, 2023
flytekit/types/structured/__init__.py Outdated Show resolved Hide resolved
flytekit/types/structured/snowflake.py Outdated Show resolved Hide resolved
@hhcs9527
Copy link
Contributor Author

Hi @pingsutw, just finished the changes to get, please review.
Thanks a lot!

pingsutw
pingsutw previously approved these changes Sep 15, 2023
@hhcs9527 hhcs9527 force-pushed the snowflake-structure-dataset branch from 13c6d5b to ba3d2fc Compare September 27, 2023 16:59
Signed-off-by: HH <[email protected]>
@hhcs9527 hhcs9527 requested a review from pingsutw September 27, 2023 17:38
@hhcs9527
Copy link
Contributor Author

Hi @pingsutw,
I finished the code change to support multiple protocols and fixed the URI.

I will create the new PR to support the new URI.
new URI: uri="snowflake://dummy_user:dummy_account/dummy_warehouse/dummy_database/dummy_schema/dummy_table

Thanks a lot

@Future-Outlier Future-Outlier mentioned this pull request Jul 25, 2024
3 tasks
@Future-Outlier
Copy link
Member

This is done by other PR, so I'm going to close it, thank you @hhcs9527 !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants