Change runner logic to not create pool for sequential runner #4502

merelcht · 2025-02-19T13:53:45Z

Description

Address #4486

Development notes

Verified that this fixes the issues mentioned in Galileo-Galilei/kedro-mlflow#624 and kedro-org/kedro-plugins#1012

Developer Certificate of Origin

We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a Signed-off-by line in the commit message. See our wiki for guidance.

If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.

Checklist

Read the contributing guidelines
Signed off each commit with a Developer Certificate of Origin (DCO)
Opened this PR as a 'Draft Pull Request' if it is work-in-progress
Updated the documentation to reflect the code changes
Added a description of this change in the RELEASE.md file
Added tests to cover my changes
Checked if this change will affect Kedro-Viz, and if so, communicated that with the Viz team

Signed-off-by: Merel Theisen <[email protected]>

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

DimedS

Thanks, @merelcht! That’s a more robust approach. One small question from my side

kedro/runner/runner.py

ElenaKhaustova · 2025-02-20T14:54:13Z

kedro/runner/runner.py

@@ -226,7 +226,30 @@ def _run(
        done = None
        max_workers = self._get_required_workers_count(pipeline)

-        with self._get_executor(max_workers) as pool:
+        pool = self._get_executor(max_workers)


It looks good to me from a logical point of view.

But I would suggest moving this into SequentialRunner._run. Otherwise, we modify the behaviour of the base class based on what inherits from it, which is not entirely correct from an implementation point of view, and AbstractRunner._run becomes too long. I understand that it will require some duplication, but in SequentialRunner._run we can add a note explaining why the implementation is kept that way. Adding it to AbstractRunner._run will overload it even more.

merelcht · Feb 25, 2025

I don't have a very strong opinion on this, but my counterargument is: wouldn't it be confusing that the thread and parallel logic is in the AbstractRunner._run method but the sequential logic isn't?

DimedS · Feb 25, 2025

I agree with Elena that, from a Pythonic perspective, common logic should be placed in _run() within the Abstract class. If there are exceptions, we should override the common logic with specific behavior, which is how runners worked previously. However, I thought the goal of the previous PR, which Merel is currently modifying, was to centralise the runner's logic within the Abstract class.

We had already decided that the _run() function in the abstract class would rely on _get_executor(), which would be implemented specifically in different subclasses. I don't see any issues with this approach. For me, the main question is how large and readable AbstractRunner._run() will be. As Merel pointed out, consolidating all the running logic in one place will be beneficial, which was also the intention of the previous PR.
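
To make the approach described above concrete, here is a minimal sketch. It is not Kedro's actual implementation: it assumes a heavily simplified `_run` that executes a plain list of callables instead of pipeline nodes, and the names `SketchRunner`, `SketchSequentialRunner`, `SketchThreadRunner` and `thread_id` are hypothetical. The only parts taken from this PR are the `_get_executor(max_workers)` call and the idea that the sequential runner gets no pool.

```python
from __future__ import annotations

import threading
from abc import ABC, abstractmethod
from concurrent.futures import Executor, ThreadPoolExecutor
from typing import Callable


class SketchRunner(ABC):
    """Hypothetical stand-in for AbstractRunner (assumption, not Kedro code)."""

    @abstractmethod
    def _get_executor(self, max_workers: int) -> Executor | None:
        """Return a pool, or None when tasks should run in the calling thread."""

    def _run(self, tasks: list[Callable[[], int]], max_workers: int = 2) -> list[int]:
        pool = self._get_executor(max_workers)
        if pool is None:
            # Sequential path: no pool is created, so every task runs
            # in the thread that called _run.
            return [task() for task in tasks]
        # Pooled path: the context manager shuts the pool down on exit.
        with pool:
            futures = [pool.submit(task) for task in tasks]
            return [future.result() for future in futures]


class SketchSequentialRunner(SketchRunner):
    def _get_executor(self, max_workers: int) -> None:
        # No executor at all for sequential execution.
        return None


class SketchThreadRunner(SketchRunner):
    def _get_executor(self, max_workers: int) -> Executor:
        return ThreadPoolExecutor(max_workers=max_workers)


def thread_id() -> int:
    """Hypothetical task that reports which thread executed it."""
    return threading.get_ident()
```

With this shape, all branching stays in one method, at the cost of a longer base-class `_run`, which is the trade-off discussed in this thread.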

merelcht · 2025-02-25T11:01:23Z

I addressed @ElenaKhaustova's comment about moving the sequential logic to the SequentialRunner in d6a247c. I personally don't have a huge preference for doing it that way or keeping it inside the abstract runner. They both have pros and cons. My main con for having the sequential runner override the run method is only that it might be confusing that the abstract runner has the logic for thread and parallel execution but not sequential, but arguably it's a bit cleaner.

@DimedS any thoughts?

astrojuanlu · 2025-02-25T11:57:28Z

kedro/runner/runner.py

-                    )
-                    self._release_datasets(node, catalog, load_counts, pipeline)
+        pool = self._get_executor(max_workers)
+        if pool is not None:


Not sure if it's related to what @merelcht is saying, but for the record I was skimming this code and thought "what if pool is None?" (there's no else branch here). Just a comment from the peanut gallery.

merelcht · Feb 25, 2025

Yeah exactly! I think it might not be immediately obvious that that logic is now inside SequentialRunner.
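
The concern can be illustrated with a hypothetical reconstruction. This is not the code from d6a247c: it assumes a simplified `_run` over plain callables, and the class names `SketchBaseRunner`, `OverrideSequentialRunner` and `OverrideThreadRunner` are made up for illustration. It shows a base `_run` with an `if pool is not None:` guard and no `else` branch, alongside a sequential subclass that overrides `_run`.

```python
from __future__ import annotations

from concurrent.futures import Executor, ThreadPoolExecutor
from typing import Callable


class SketchBaseRunner:
    """Hypothetical base class: only the pooled path lives here (assumption)."""

    def _get_executor(self, max_workers: int) -> Executor | None:
        # Not abstract in this sketch: runners without a pool inherit None.
        return None

    def _run(self, tasks: list[Callable[[], int]], max_workers: int = 2) -> list[int]:
        results = []
        pool = self._get_executor(max_workers)
        if pool is not None:
            with pool:
                futures = [pool.submit(task) for task in tasks]
                results = [future.result() for future in futures]
        # No else branch: when pool is None, nothing runs here and
        # results stays empty.
        return results


class OverrideSequentialRunner(SketchBaseRunner):
    def _run(self, tasks: list[Callable[[], int]], max_workers: int = 1) -> list[int]:
        # The sequential logic lives in the override, away from the base class.
        return [task() for task in tasks]


class OverrideThreadRunner(SketchBaseRunner):
    def _get_executor(self, max_workers: int) -> Executor:
        return ThreadPoolExecutor(max_workers=max_workers)
```

In this sketch, a reader of the base `_run` alone cannot tell where the `None` case is handled, which is the readability point raised above.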

DimedS · 2025-02-25T17:35:18Z

> I addressed @ElenaKhaustova's comment about moving the sequential logic to the SequentialRunner in d6a247c. I personally don't have a huge preference for doing it that way or keeping it inside the abstract runner. They both have pros and cons. My main con for having the sequential runner override the run method is only that it might be confusing that the abstract runner has the logic for thread and parallel execution but not sequential, but arguably it's a bit cleaner.
>
> @DimedS any thoughts?

I agree. I also think that, given the current state, it would be beneficial to keep all the running logic in the Abstract class. I have provided a full comment in the thread: #4502 (comment).

Change runner logic to not create pool for sequential runner

941e132

Signed-off-by: Merel Theisen <[email protected]>

merelcht requested a review from Copilot February 19, 2025 13:53

Copilot AI reviewed Feb 19, 2025

merelcht linked an issue Feb 19, 2025 that may be closed by this pull request

SequentialRunner might fail with non thread-safe code #4486

Open

merelcht added 2 commits February 19, 2025 14:59

Merge branch 'main' into fix/sequential-runner-threads

e610ce5

Merge branch 'main' into fix/sequential-runner-threads

d4bd76e

merelcht self-assigned this Feb 19, 2025

Update release notes

7b3bd81

Signed-off-by: Merel Theisen <[email protected]>

merelcht mentioned this pull request Feb 20, 2025

Improve complexity of run algorithm #4373

Closed

merelcht requested review from DimedS and ElenaKhaustova February 20, 2025 09:47

DimedS approved these changes Feb 20, 2025

merelcht requested a review from astrojuanlu February 20, 2025 13:05

ElenaKhaustova reviewed Feb 20, 2025

merelcht and others added 2 commits February 25, 2025 11:34

Merge branch 'main' into fix/sequential-runner-threads

4c474f4

Move sequential runner logic to sequential runner class

d6a247c

Signed-off-by: Merel Theisen <[email protected]>

merelcht added 2 commits February 25, 2025 12:02

Fix lint

89c236b

Signed-off-by: Merel Theisen <[email protected]>

Pass on get executor for SequentialRunner

d765a72

Signed-off-by: Merel Theisen <[email protected]>

astrojuanlu reviewed Feb 25, 2025

Make _get_executor method not abstract

f1390b8

Signed-off-by: Merel Theisen <[email protected]>

merelcht and others added 2 commits February 26, 2025 12:36

Merge branch 'main' into fix/sequential-runner-threads

135eb0c

Revert sequential running logic back to AbstractRunner

d7a7d03

Signed-off-by: Merel Theisen <[email protected]>

merelcht requested a review from ElenaKhaustova February 26, 2025 11:45
