[FEAT] connect: createDataFrame #3363

Merged: 1 commit merged into main from andrew/connect-create-dataframe on Dec 5, 2024

andrewgazelka
Member

No description provided.

Member Author

andrewgazelka commented Nov 20, 2024

This stack of pull requests is managed by Graphite.

@andrewgazelka force-pushed the andrew/connect-column-operations branch from 22d7e93 to 83f2117 on November 20, 2024 18:25
@andrewgazelka force-pushed the andrew/connect-create-dataframe branch from 864f8ba to 2c4ef28 on November 20, 2024 18:25
@andrewgazelka force-pushed the andrew/connect-column-operations branch from 83f2117 to df7ffd5 on November 20, 2024 18:32
@andrewgazelka force-pushed the andrew/connect-create-dataframe branch from 2c4ef28 to 421d94b on November 20, 2024 18:33
@andrewgazelka force-pushed the andrew/connect-column-operations branch from df7ffd5 to 5c5279f on November 20, 2024 18:43
@andrewgazelka force-pushed the andrew/connect-create-dataframe branch from 421d94b to 1b03d12 on November 20, 2024 18:43
@andrewgazelka force-pushed the andrew/connect-column-operations branch from 5c5279f to ba57801 on November 20, 2024 18:48
@andrewgazelka force-pushed the andrew/connect-create-dataframe branch from 1b03d12 to 95bd95e on November 20, 2024 18:48
@andrewgazelka force-pushed the andrew/connect-column-operations branch from ba57801 to 81ff3b6 on November 20, 2024 19:32
@andrewgazelka force-pushed the andrew/connect-create-dataframe branch from 95bd95e to 866aa05 on November 20, 2024 19:32
@andrewgazelka force-pushed the andrew/connect-column-operations branch from 81ff3b6 to 5ddc229 on November 20, 2024 22:12
@andrewgazelka force-pushed the andrew/connect-create-dataframe branch from 866aa05 to 59f0592 on November 20, 2024 23:25
@andrewgazelka force-pushed the andrew/connect-column-operations branch from 5ddc229 to 470d2de on November 20, 2024 23:28
@andrewgazelka force-pushed the andrew/connect-create-dataframe branch from 59f0592 to e319fb8 on November 20, 2024 23:29
@andrewgazelka force-pushed the andrew/connect-column-operations branch from 470d2de to 89e89e8 on November 20, 2024 23:40
@andrewgazelka force-pushed the andrew/connect-create-dataframe branch from e319fb8 to b0aeaec on November 20, 2024 23:40
@andrewgazelka force-pushed the andrew/connect-column-operations branch from 89e89e8 to 36a8d0c on November 21, 2024 00:13
@andrewgazelka force-pushed the andrew/connect-create-dataframe branch from b0aeaec to 1380cd4 on November 21, 2024 00:14
@andrewgazelka marked this pull request as ready for review on November 21, 2024 00:32
@andrewgazelka force-pushed the andrew/connect-column-operations branch from 36a8d0c to 6cfef48 on November 21, 2024 00:39
@andrewgazelka force-pushed the andrew/connect-create-dataframe branch from 596e61f to de51b3c on December 4, 2024 02:08
@andrewgazelka changed the base branch from andrew/connect-column-operations to graphite-base/3363 on December 4, 2024 02:32
@andrewgazelka force-pushed the andrew/connect-create-dataframe branch from de51b3c to aefae24 on December 4, 2024 02:41
@andrewgazelka changed the base branch from graphite-base/3363 to main on December 4, 2024 02:42
@andrewgazelka force-pushed the andrew/connect-create-dataframe branch from aefae24 to 6f3d6aa on December 4, 2024 02:42
@github-actions bot added the enhancement (New feature or request) label on Dec 4, 2024
@andrewgazelka force-pushed the andrew/connect-create-dataframe branch from 6f3d6aa to d35c3e7 on December 4, 2024 03:02
@andrewgazelka removed the request for review from colin-ho on December 4, 2024 03:02
@andrewgazelka force-pushed the andrew/connect-create-dataframe branch 2 times, most recently from 0d0dbe5 to e82f9e5 on December 4, 2024 03:10
@andrewgazelka requested a review from jaychia on December 4, 2024 03:13

codspeed-hq bot commented Dec 4, 2024

CodSpeed Performance Report

Merging #3363 will degrade performance by 66.8%

Comparing andrew/connect-create-dataframe (69b2c29) with main (d1d0fab)

Summary

⚡ 1 improvements
❌ 1 regressions
✅ 15 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| Benchmark | main | andrew/connect-create-dataframe | Change |
|---|---|---|---|
| test_iter_rows_first_row[100 Small Files] | 107.5 ms | 323.8 ms | -66.8% |
| test_show[100 Small Files] | 23.9 ms | 16 ms | +48.71% |
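
The Change column can look surprising at first: a run that takes roughly three times as long is reported as -66.8%, not -200%. The figures are consistent with a speed ratio (time on main divided by time on the branch, minus one). The sketch below is an assumption inferred from the numbers in the report, not CodSpeed's documented formula, and `relative_change` is a hypothetical helper name.

```python
def relative_change(base_ms: float, head_ms: float) -> float:
    """Speed ratio minus one; negative means the head branch is slower.

    Assumed formula, inferred from the report's own numbers.
    """
    return base_ms / head_ms - 1.0


# Timings copied from the benchmarks breakdown (main vs. the PR branch).
iter_rows = relative_change(107.5, 323.8)
show = relative_change(23.9, 16.0)

print(f"test_iter_rows_first_row: {iter_rows:+.1%}")
print(f"test_show: {show:+.1%}")
```

The first value reproduces the reported -66.8%. The second lands near, but not exactly on, the reported +48.71%, presumably because the displayed timings are rounded.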

@andrewgazelka force-pushed the andrew/connect-create-dataframe branch from c75d606 to f636d74 on December 4, 2024 07:31

codecov bot commented Dec 4, 2024

Codecov Report

Attention: Patch coverage is 57.83133% with 175 lines in your changes missing coverage. Please review.

Project coverage is 77.29%. Comparing base (de4fe50) to head (69b2c29).
Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/daft-connect/src/translation/datatype/codec.rs | 47.33% | 89 Missing ⚠️ |
| ...ect/src/translation/logical_plan/local_relation.rs | 61.01% | 69 Missing ⚠️ |
| daft/dataframe/dataframe.py | 8.33% | 11 Missing ⚠️ |
| ...-connect/src/translation/logical_plan/aggregate.rs | 0.00% | 5 Missing ⚠️ |
| ...daft-connect/src/translation/logical_plan/to_df.rs | 94.73% | 1 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3363      +/-   ##
==========================================
- Coverage   77.35%   77.29%   -0.06%     
==========================================
  Files         696      699       +3     
  Lines       84849    85242     +393     
==========================================
+ Hits        65631    65887     +256     
- Misses      19218    19355     +137     

| Files with missing lines | Coverage Δ |
|---|---|
| src/daft-connect/src/lib.rs | 66.34% <100.00%> (+2.13%) ⬆️ |
| src/daft-connect/src/op/execute/root.rs | 95.45% <100.00%> (-0.20%) ⬇️ |
| src/daft-connect/src/translation/datatype.rs | 20.81% <ø> (+10.40%) ⬆️ |
| src/daft-connect/src/translation/logical_plan.rs | 80.64% <100.00%> (+9.21%) ⬆️ |
| ...ft-connect/src/translation/logical_plan/project.rs | 88.88% <100.00%> (ø) |
| ...daft-connect/src/translation/logical_plan/range.rs | 62.50% <100.00%> (ø) |
| src/daft-connect/src/translation/schema.rs | 100.00% <ø> (ø) |
| src/daft-local-execution/src/pipeline.rs | 94.48% <100.00%> (+0.02%) ⬆️ |
| src/daft-logical-plan/src/builder.rs | 91.66% <ø> (ø) |
| ...daft-connect/src/translation/logical_plan/to_df.rs | 94.73% <94.73%> (ø) |

... and 4 more

... and 5 files with indirect coverage changes
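
The report's headline numbers can be cross-checked against each other. The sketch below only redoes the arithmetic on figures quoted in the report; the variable names are illustrative, and the "implied patch size" is a derived value, not something Codecov states.

```python
# Missing lines per file, copied from the "Files with missing lines" table
# (truncated paths are kept exactly as the report shows them).
missing = {
    "src/daft-connect/src/translation/datatype/codec.rs": 89,
    "...ect/src/translation/logical_plan/local_relation.rs": 69,
    "daft/dataframe/dataframe.py": 11,
    "...-connect/src/translation/logical_plan/aggregate.rs": 5,
    "...daft-connect/src/translation/logical_plan/to_df.rs": 1,
}
total_missing = sum(missing.values())

# Patch coverage is the covered share of the changed lines, so the quoted
# percentage implies how many lines the patch touched (a derived figure).
patch_coverage = 0.5783133
implied_patch_lines = round(total_missing / (1 - patch_coverage))

# Project coverage is hits / lines, taken from the coverage diff.
base_cov = 65631 / 84849
head_cov = 65887 / 85242

print(f"missing lines in patch: {total_missing}")
print(f"implied patch size: {implied_patch_lines} lines")
print(f"project coverage: {base_cov:.2%} -> {head_cov:.2%}")
```

The per-file counts sum to the 175 missing lines in the headline, and the hits-over-lines ratios match the 77.35% and 77.29% shown in the diff.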

@andrewgazelka requested review from universalmind303 and removed request for jaychia on December 4, 2024 08:54
Contributor

@universalmind303 left a comment

Most of the feedback can be addressed later, but I definitely think we should remove the arrow_format dependency before merging.

Also, the with_columns -> select change is needed before merging.

@andrewgazelka force-pushed the andrew/connect-create-dataframe branch 3 times, most recently from 1d96dba to 0453921 on December 4, 2024 23:50
Cargo.toml (outdated, resolved)
@andrewgazelka force-pushed the andrew/connect-create-dataframe branch from 0453921 to 69b2c29 on December 4, 2024 23:53
@andrewgazelka enabled auto-merge (squash) on December 4, 2024 23:56
@andrewgazelka merged commit 86523a0 into main on Dec 5, 2024
42 of 43 checks passed
@andrewgazelka deleted the andrew/connect-create-dataframe branch on December 5, 2024 00:15
Labels: enhancement (New feature or request)
3 participants