Minimize the dependency on SessionState
#11420
Comments
In theory I think you are right. However, I am worried about removing SessionState from the
Here is one alternate proposal: #10782 (comment) (basically treat SessionState as part of the catalog API).
Another potentially hacky idea is to do something like pass SessionState as an
    async fn scan(
        &self,
        state: &Any,
        projection: Option<&Vec<usize>>,
        _filters: &[Expr],
        _limit: Option<usize>,
    ) -> Result<Arc<dyn ExecutionPlan>> {
Which would break the explicit dependency but implementors of
    let state = state.as_any().downcast_ref::<SessionState>().unwrap();
🤔
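To make the trade-off of the `Any` idea concrete, here is a minimal, std-only sketch. Everything in it is a simplified stand-in and an assumption for illustration: the `SessionState` struct with a single `batch_size` field, the synchronous two-argument `scan`, and the `String` return value are not DataFusion's actual types or signatures. It only shows that the trait no longer names `SessionState`, while the implementor still has to downcast at runtime.

```rust
use std::any::Any;

// Hypothetical stand-in for DataFusion's SessionState; only one field is modeled.
struct SessionState {
    batch_size: usize,
}

// Hypothetical, simplified provider trait: `scan` takes an opaque `&dyn Any`,
// so the crate defining the trait does not need to name `SessionState`.
trait TableProvider {
    fn scan(&self, state: &dyn Any, limit: Option<usize>) -> Result<String, String>;
}

struct MemTable;

impl TableProvider for MemTable {
    fn scan(&self, state: &dyn Any, limit: Option<usize>) -> Result<String, String> {
        // The implementor recovers the concrete type at runtime. Returning an
        // error instead of calling `unwrap()` avoids a panic on a type mismatch.
        let state = state
            .downcast_ref::<SessionState>()
            .ok_or_else(|| "expected a SessionState".to_string())?;
        Ok(format!("scan(batch_size={}, limit={:?})", state.batch_size, limit))
    }
}

fn main() {
    let state = SessionState { batch_size: 8192 };
    let plan = MemTable.scan(&state, Some(10)).unwrap();
    println!("{}", plan);

    // Passing any other type is only caught at runtime, not at compile time.
    assert!(MemTable.scan(&42_i32, None).is_err());
}
```

The compile-time dependency is gone, but so is the compile-time check: a caller can pass any `'static` value, and the mistake only surfaces when `scan` runs.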
@alamb I agree, removing it or using The easiest way to address this is to move I see why it is used in so many places, and I do support the idea that it is important to do so. Perhaps we could just identify all truly common types like this and move them into a sub-crate, which could be done without too much effort. Such a change could incorporate other types of common objects too. In this way anytime someone wants to break something out of
Is your feature request related to a problem or challenge?
There are many functions in `datafusion-core` that take `SessionState` as an argument but actually rely on only a portion of it. This adds an unnecessary dependency, which blocks us from extracting modules out of core (#10782).
For example, if we want to pull `CatalogProvider` out of core, we need to pull out `TableProvider` first. But `TableProvider` has a `scan` function that takes `SessionState`, which contains `CatalogProviderList`, so there is a circular dependency. Similar issues are already mentioned in #11182.
Describe the solution you'd like
I think we need to redesign the functions that take `SessionState` and minimize their dependencies.
Given one of the `scan` functions here, we can see that we only need `state.config_options().explain` and `state.execution_props()` instead of the whole `SessionState`:
datafusion/datafusion/core/src/datasource/memory.rs
Lines 207 to 245 in 4bed04e
In this case, we can create a `TableProviderContext` that encapsulates a subset of the information from `SessionState`.
The same idea applies to `PhysicalPlanner`: we usually just need `PhysicalOptimizeRules` or other information about the plan.
datafusion/datafusion/core/src/physical_planner.rs
Lines 364 to 385 in 4bed04e
And for `create_initial_plan` in `DefaultPhysicalPlanner`, `ExecutionOptions` is all we need, nothing else.
datafusion/datafusion/core/src/physical_planner.rs
Lines 582 to 585 in 4bed04e
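As a rough illustration of the context-struct idea, the following std-only sketch shows a narrow `TableProviderContext` borrowed from a larger session object. All types here are hypothetical stand-ins with made-up fields (`logical_plan_only`, `query_start_ms`, `catalog_names`), and the `table_provider_context` helper and the free `scan` function are assumptions for the example, not DataFusion's API. The point is only that `scan` can be written against the two pieces of state it actually reads.

```rust
// Hypothetical stand-ins; the real DataFusion types carry many more fields.
#[derive(Debug, Default)]
struct ExplainOptions {
    logical_plan_only: bool,
}

#[derive(Debug, Default)]
struct ExecutionProps {
    query_start_ms: u64,
}

// Stand-in for the "big" session object that also owns catalog information.
#[derive(Debug, Default)]
struct SessionState {
    explain: ExplainOptions,
    execution_props: ExecutionProps,
    catalog_names: Vec<String>, // not needed by `scan`
}

// Narrow context: borrows only what a table scan reads.
struct TableProviderContext<'a> {
    explain: &'a ExplainOptions,
    execution_props: &'a ExecutionProps,
}

impl SessionState {
    // Hypothetical helper that projects the session down to the narrow context.
    fn table_provider_context(&self) -> TableProviderContext<'_> {
        TableProviderContext {
            explain: &self.explain,
            execution_props: &self.execution_props,
        }
    }
}

// `scan` depends on the context only, so it could live in a crate that
// knows nothing about `SessionState` or the catalog types.
fn scan(ctx: &TableProviderContext<'_>, limit: Option<usize>) -> String {
    format!(
        "scan(logical_only={}, start={}, limit={:?})",
        ctx.explain.logical_plan_only, ctx.execution_props.query_start_ms, limit
    )
}

fn main() {
    let state = SessionState {
        execution_props: ExecutionProps { query_start_ms: 1_000 },
        catalog_names: vec!["datafusion".to_string()],
        ..Default::default()
    };
    println!("{}", scan(&state.table_provider_context(), Some(5)));
    println!("catalogs still owned by the session: {:?}", state.catalog_names);
}
```

Because the context only borrows from the session, building it is cheap, and a planner-side equivalent holding just the optimizer rules or `ExecutionOptions` could presumably follow the same pattern.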
Describe alternatives you've considered
No response
Additional context
No response