WPF Parallelization fails #687

Open
sezelt opened this issue Sep 18, 2024 · 3 comments
@sezelt
Member

sezelt commented Sep 18, 2024

There have been reports of the mpire distributed WPF optimization failing with errors like

AttributeError: Can't pickle local object 'WholePatternFit._fit_distributed.<locals>.f'

This appears to only affect Windows systems.

I think the fix is to enable dill serialization:

        with WorkerPool(
            n_jobs=num_jobs,
            shared_objects=fit_opts,
        ) as pool:
            results = pool.map(

should be modified to become:

        with WorkerPool(
            n_jobs=num_jobs,
            shared_objects=fit_opts,
            use_dill=True,
        ) as pool:

mpire starts its workers differently on Windows vs UNIX, so there can be serialization errors that only show up on one platform when transmitting complicated objects to the workers. Unfortunately I do not have a Windows machine to test this on at the moment, so someone else will have to try this and let us know.
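
The pickling failure itself can be reproduced without mpire or Windows. Below is a minimal sketch using only the standard-library pickle module. `Fitter`, `make_local_worker`, `module_level_worker` and `try_pickle` are hypothetical names made up for illustration, not code from this project. It only shows that a function defined inside a method cannot be pickled by reference, which is the kind of serialization a freshly started worker process depends on.

```python
import pickle


class Fitter:
    """Hypothetical stand-in for a class that builds its worker function locally."""

    def make_local_worker(self):
        # Defined inside a method, so its qualified name contains "<locals>".
        def f(x):
            return x + 1

        return f


def module_level_worker(x):
    """Same work, but defined at module level so pickle can look it up by name."""
    return x + 1


def try_pickle(obj):
    """Return None if obj pickles cleanly, otherwise a description of the error."""
    try:
        pickle.dumps(obj)
    except Exception as err:  # exception type and wording vary by Python version
        return f"{type(err).__name__}: {err}"
    return None


local_error = try_pickle(Fitter().make_local_worker())
module_error = try_pickle(module_level_worker)

print("local function: ", local_error)
print("module function:", module_error)
```

Run as a script, the module-level function is expected to pickle while the local one is not. dill can serialize local functions by value, which is presumably why `use_dill=True` would get past this particular error.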

@gvarnavi
Member

Unfortunately, I don't think this is as simple as setting use_dill=True since "copy-on-write" is not available on the default Windows "spawn" start_method. See more detailed explanation on related SSB PR here.
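
For anyone checking which situation their machine is in, here is a small sketch using only the standard-library multiprocessing module. `start_method_report` is a hypothetical helper name. It reports which start methods the current platform offers; "fork" is not offered on Windows, so copy-on-write sharing is not an option there.

```python
import multiprocessing as mp
import sys


def start_method_report():
    """Summarize which process start methods this platform offers."""
    available = mp.get_all_start_methods()
    return {
        "platform": sys.platform,
        "available": available,
        # "fork" is what makes copy-on-write sharing possible.
        "fork_available": "fork" in available,
    }


report = start_method_report()
print(report)
```

This only inspects the standard library. How mpire picks its own start method is a separate question and is not covered by this sketch.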

@sezelt
Member Author

sezelt commented Jan 18, 2025

I don't think this is related to copy-on-write behavior (though I could be wrong). A potentially important difference is that in the SSB PR each worker is writing its results into the shared array, while in WPF the worker functions return their results and it all gets collected after map finishes.

@gvarnavi
Member

To clarify what I meant: simply adding use_dill=True will (probably) work, in that it won't raise an error, but it will still need to use the "spawn" start_method and thus will be much slower than on Unix, where we can use the "fork" method, which features copy-on-write.

Essentially, your code comment just above fit_inputs will no longer hold:

        # hopefully the data entries remain as views until dispatch time...
        fit_inputs = ...

That might be fine for WPF (which has a slow runtime anyway, so the initialization cost might be negligible), but this was not the case for SSB in my tests.
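
To make the point about views concrete, here is a minimal sketch, assuming the inputs are NumPy arrays. The array shape and the names `data`, `view`, `payload` and `restored` are made up for illustration. It shows only that a view stops being a view once it is pickled, which is how data reaches a spawned process. It says nothing about mpire's internals or about how large the overhead is in WPF.

```python
import pickle

import numpy as np

# Hypothetical stack of diffraction patterns.
data = np.zeros((4, 256, 256), dtype=np.float32)
view = data[0]  # basic slicing returns a view, no copy is made yet

# Crossing a process boundary under "spawn" means going through pickle.
payload = pickle.dumps(view, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(payload)

print("view shares memory with data:    ", np.shares_memory(view, data))
print("restored shares memory with data:", np.shares_memory(restored, data))
print("view bytes:", view.nbytes, "payload bytes:", len(payload))
```

The pickled payload is at least as large as the view itself, so every pattern sent to a worker this way is copied in full.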
