Make OperationInitializer non-static #4919

Merged Dec 14, 2023 (23 commits)

niloc132 (Member) commented Dec 6, 2023

This patch makes the OperationInitializer part of the ExecutionContext, allowing consumers to specify a preferred instance. ThreadInitializationFactory is also changed to be built in Dagger instead of only by configuration. Combined, these changes make it easier to reason about startup, requiring that dependencies are declared explicitly rather than implicitly referenced and used.

It is likely that some of the work done here should be refactored further into some JobScheduler factory or assisted injection, but OperationInitializer is still referenced directly by InitialFilterExecution, so this would require more refactoring.

Ideally, OperationInitializer.NON_PARALLELIZABLE would be used instead of the OperationInitializerThreadPool's ThreadLocal, but some services restore the user's own exec context. This is a shortcoming of the patch, and possibly should be resolved before merging.

Downstream users will need to consider whether to capture the OperationInitializer when they create new exec contexts. Generally, capturing it will be desired.
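
A minimal sketch of what "capturing" the initializer when building a new context could look like. `SketchInitializer`, `SketchContext`, and the builder methods `setInitializer` and `captureInitializer` are hypothetical stand-ins invented for illustration, not the actual Deephaven types or method names. The point is only that the initializer is carried by the context and must be supplied or captured explicitly, rather than read from a static.

```java
import java.util.concurrent.Future;

// Hypothetical, simplified stand-ins; these are NOT the real Deephaven types.
interface SketchInitializer {
    boolean canParallelize();

    Future<?> submit(Runnable task);
}

final class SketchContext {
    // Stand-in for the context currently installed on this thread.
    private static final ThreadLocal<SketchContext> CURRENT = new ThreadLocal<>();

    private final SketchInitializer initializer;

    private SketchContext(SketchInitializer initializer) {
        this.initializer = initializer;
    }

    static Builder newBuilder() {
        return new Builder();
    }

    static SketchContext current() {
        return CURRENT.get();
    }

    SketchInitializer initializer() {
        return initializer;
    }

    // Installs this context while the task runs, then restores the previous one.
    void apply(Runnable task) {
        final SketchContext previous = CURRENT.get();
        CURRENT.set(this);
        try {
            task.run();
        } finally {
            if (previous == null) {
                CURRENT.remove();
            } else {
                CURRENT.set(previous);
            }
        }
    }

    static final class Builder {
        private SketchInitializer initializer;

        // Explicitly supplies the initializer for the new context.
        Builder setInitializer(SketchInitializer initializer) {
            this.initializer = initializer;
            return this;
        }

        // Reuses the initializer of the context installed on the calling thread.
        Builder captureInitializer() {
            final SketchContext current = CURRENT.get();
            if (current == null) {
                throw new IllegalStateException("No context installed on this thread");
            }
            this.initializer = current.initializer;
            return this;
        }

        SketchContext build() {
            // The dependency must be declared; there is no static fallback.
            if (initializer == null) {
                throw new IllegalStateException("Initializer must be set or captured");
            }
            return new SketchContext(initializer);
        }
    }
}
```

In this sketch, a context derived inside `outer.apply(...)` via `captureInitializer()` shares the outer context's initializer, while forgetting to set or capture one fails at `build()` instead of silently falling back to a global.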

Partial #4040

@niloc132 niloc132 added this to the December 2023 milestone Dec 6, 2023
niloc132 (Member Author) left a comment

    @Override
    public Future<?> submit(Runnable runnable) {
        return executorService.submit(runnable);
Member Author

This is consistent with existing behavior, but I wonder whether we should check canParallelize before doing this, since it can be false. In practice all callers already do that, so it should be overkill, but a check might catch bugs, or just let a caller execute something directly without its own canParallelize check.

Member

An assertion sounds reasonable; enforcing that we check encourages a code pattern that avoids creating the runnable.

Member Author

An assertion doesn't work, it turns out - SelectColumnLayer appears to deliberately submit more work from the OITP threads to that thread pool. Crucially, it doesn't block on that work being returned, so this should be safe.

I'm going to defer other refactoring here to minimize risk of introducing bugs that I can't guard against like this.
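
The trade-off discussed in this thread can be sketched as follows. `PoolInitializerSketch` and `runOrSubmit` are hypothetical names made up for this example, and the ThreadLocal marker is an assumption about how pool threads might be identified; none of this is the real OperationInitializerThreadPool code. `submit` deliberately carries no `canParallelize()` assertion, so a pool thread may still enqueue more work as long as it does not block on it, while the guard lives in a caller-side helper.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a thread-pool-backed initializer; not the real Deephaven class.
final class PoolInitializerSketch {
    // Marks threads that belong to an initializer pool (assumed mechanism).
    private static final ThreadLocal<Boolean> ON_POOL_THREAD = ThreadLocal.withInitial(() -> false);

    private final ExecutorService executorService;

    PoolInitializerSketch(int threads) {
        executorService = Executors.newFixedThreadPool(threads, r -> {
            final Thread t = new Thread(() -> {
                ON_POOL_THREAD.set(true);
                r.run();
            });
            t.setDaemon(true);
            return t;
        });
    }

    // False on pool threads, so guarded callers do not fan out again from inside the pool.
    boolean canParallelize() {
        return !ON_POOL_THREAD.get();
    }

    // No canParallelize() assertion here: pool threads may enqueue more work,
    // provided they do not block waiting for it.
    Future<?> submit(Runnable task) {
        return executorService.submit(task);
    }

    // Caller-side guard: hand off to the pool when allowed, otherwise run inline.
    Future<?> runOrSubmit(Runnable task) {
        if (canParallelize()) {
            return submit(task);
        }
        task.run();
        return CompletableFuture.completedFuture(null);
    }

    void shutdown() {
        executorService.shutdown();
    }
}
```

Called from an ordinary thread, `runOrSubmit` hands the task to the pool; called from a pool thread, it runs the task inline, which avoids a pool thread waiting on work queued behind it.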

@@ -83,20 +84,23 @@ private static void setContext(final ExecutionContext context) {
private final QueryScope queryScope;
private final QueryCompiler queryCompiler;
private final UpdateGraph updateGraph;
private final OperationInitializer operationInitializer;
Member

Note to other reviewers:
I think this is a reasonable thing to add to the execution context, and that it's the right interface to add. I'd welcome consensus-building here.

/**
* @return Whether the current thread can parallelize operations using this OperationInitialization.
*/
boolean canParallelize();
Member

Can we eliminate this method by requiring parallel operations to install a "same thread" initializer to prevent nested parallelism?

Member Author

As above with the SelectColumnLayer, this doesn't look feasible at this time.
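
For reference, the "same thread" initializer proposed in this thread might look roughly like the sketch below. `SameThreadInitializerSketch` is a hypothetical name, and running the task inline inside `submit` is an assumption made for illustration; the behavior of the real OperationInitializer.NON_PARALLELIZABLE is not shown in this conversation and may differ.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;

// Hypothetical sketch of a "same thread" initializer; not the real Deephaven API.
final class SameThreadInitializerSketch {
    // Never offers parallelism, so nested operations stay on the calling thread.
    boolean canParallelize() {
        return false;
    }

    // Runs the task on the calling thread and returns an already-completed future.
    Future<?> submit(Runnable task) {
        final CompletableFuture<Void> result = new CompletableFuture<>();
        try {
            task.run();
            result.complete(null);
        } catch (RuntimeException e) {
            // Surface the failure through the future, as a pool would.
            result.completeExceptionally(e);
        }
        return result;
    }
}
```

Installing such an initializer while a parallel operation runs would make nested fan-out impossible by construction, but it would also break code that intentionally queues follow-up work to the pool, which is the objection raised above.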

rcaudy (Member) commented Dec 6, 2023

PartitionedTableImpl L299 and L356:
We must not use the same operation initializer as the currently-installed one, unless it's the non-parallelizable one.
Only really necessary if we get rid of the isInitializationThread ThreadLocal to dictate canParallelize()... might be a nice improvement....

chipkent (Member) previously approved these changes Dec 6, 2023

Python LGTM

@niloc132 niloc132 enabled auto-merge (squash) December 13, 2023 20:44
@niloc132 niloc132 merged commit 2f5a054 into deephaven:main Dec 14, 2023
19 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Dec 14, 2023
@deephaven-internal
Contributor

Labels indicate documentation is required. Issues for documentation have been opened:

How-to: https://github.com/deephaven/deephaven.io/issues/3545
Reference: https://github.com/deephaven/deephaven.io/issues/3544
