Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: data frame with lazy relation AltrepDataFrameRelation #960

Open
wants to merge 57 commits into
base: main
Choose a base branch
from

Conversation

Antonov548
Copy link
Contributor

@Antonov548 Antonov548 commented Jan 6, 2025

Closes #949

Copy link
Collaborator

@krlmlr krlmlr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect the test output to be unchanged? Please also add rel_filter2() .

projections.push_back(std::move(dexpr));
}

duckdb::rel_extptr_t rel = cpp11::as_cpp<cpp11::decay_t<duckdb::rel_extptr_t>>(rapi_rel_from_df(con, df, false));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rapi_() functions are external, this coupling is a bit too tight. Perhaps extract an internal function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just temporarily

@@ -162,6 +162,28 @@ using namespace cpp11;
return make_external_prot<RelationWrapper>("duckdb_relation", prot, res);
}

[[cpp11::register]] SEXP rapi_rel_filter2(data_frame df, duckdb::conn_eptr_t con, list exprs) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move these to a separate .cpp file and keep them in the same order as here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still valid. Can you please follow up?

@krlmlr
Copy link
Collaborator

krlmlr commented Jan 7, 2025

I like the idea of rel_from_any_df(), please proceed.

@Antonov548
Copy link
Contributor Author

@krlmlr seems it works. Maybe it also have to checked with more examples/tests

@krlmlr
Copy link
Collaborator

krlmlr commented Jan 10, 2025

Current state:

drv <- duckdb::duckdb()
con <- DBI::dbConnect(drv)
df1 <- tibble::tibble(a = 1)

"mutate"
#> [1] "mutate"
df2 <- duckdb:::rel_project2(
  df1,
  con,
  list(
    {
      tmp_expr <- duckdb:::expr_reference("a")
      duckdb:::expr_set_alias(tmp_expr, "a")
      tmp_expr
    },
    {
      tmp_expr <- duckdb:::expr_constant(2)
      duckdb:::expr_set_alias(tmp_expr, "b")
      tmp_expr
    }
  )
)
"filter"
#> [1] "filter"
df3 <- duckdb:::rel_filter2(
  df2,
  con,
  list(
    duckdb:::expr_comparison(
      "==",
      list(
        duckdb:::expr_reference("b"),
        duckdb:::expr_constant(2)
      )
    )
  )
)

duckdb:::rel_from_altrep_df(df3)
#> DuckDB Relation: 
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Filter [(b = 2.0)]
#>   Projection [a as a, 2.0 as b]
#>     r_dataframe_scan(0x1349a2ba8)
#> 
#> ---------------------
#> -- Result Columns  --
#> ---------------------
#> - a (DOUBLE)
#> - b (DOUBLE)

# This materializes the data frame
df2
#>   a b
#> 1 1 2

# Expecting this to use a data frame scan only, without a projection
rel2 <- duckdb:::rel_from_altrep_df(df2)
rel2
#> Error in `print()`:
#> ! RebuildRelationException

# Expecting this to use filter only, with a data frame scan based on rel2
rel3 <- duckdb:::rel_from_altrep_df(df3)
rel3
#> Error in `print()`:
#> ! RebuildRelationException

Created on 2025-01-10 with reprex v2.1.1

};
return values;
}

template<>
const char* EnumUtil::ToChars<RelationType>(RelationType value) {
return StringUtil::EnumToString(GetRelationTypeValues(), 28, "RelationType", static_cast<uint32_t>(value));
return StringUtil::EnumToString(GetRelationTypeValues(), 29, "RelationType", static_cast<uint32_t>(value));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would that be 255 here and below?

Suggested change
return StringUtil::EnumToString(GetRelationTypeValues(), 29, "RelationType", static_cast<uint32_t>(value));
return StringUtil::EnumToString(GetRelationTypeValues(), 255, "RelationType", static_cast<uint32_t>(value));

@krlmlr
Copy link
Collaborator

krlmlr commented Feb 14, 2025

The branch has conflicts. Can you please git merge main -X theirs and then run cpp11::cpp_register() ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create and use adapter subclass of Relational that forwards to a parent object or to a data frame scan
2 participants