Multiprocessing with Pool won't work on Mac with M1 chip #87

Open
Jayshil opened this issue Oct 10, 2022 · 3 comments

@Jayshil
Contributor

Jayshil commented Oct 10, 2022

Hi @nespinoza,

When I use the nthreads option in juliet.fit to enable multiprocessing with dynesty, I get the following error:

RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

At first, I thought the multiprocessing implemented in dynesty was causing this issue. However, upon digging into the details, I found that it has more to do with how multiprocessing is implemented in juliet: the culprit is line 1723. A proper way to do this (according to this StackOverflow answer) is the following:

import contextlib
from multiprocessing import get_context
...
# Create the pool from an explicit "fork" context instead of the platform default.
with contextlib.closing(get_context("fork").Pool(processes=self.nthreads - 1)) as executor:
    sampler = DynestySampler(self.loglike,
                             self.prior_transform_r,
                             self.data.nparams,
                             pool=executor,
                             queue_size=self.nthreads,
                             **d_args)
    sampler.run_nested(**ds_args)
    results = sampler.results

(Instead of calling Pool directly from multiprocessing, one should create it through multiprocessing.get_context("fork").Pool.) Making this change solves the above issue.
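For readers who want to try the idea outside juliet, here is a minimal, self-contained sketch of a pool created from an explicit "fork" context. The helper name map_with_fork_pool is hypothetical and is not part of juliet or dynesty. As background, and only as a possible explanation for the error above: since Python 3.8 the default start method on macOS is "spawn", which re-imports the main module in each child process.

```python
import contextlib
import math
import multiprocessing


def map_with_fork_pool(func, items, nprocs):
    """Map func over items with a Pool built from the "fork" context.

    Hypothetical helper for illustration only.
    """
    # get_context("fork") raises ValueError on platforms without fork (e.g. Windows).
    ctx = multiprocessing.get_context("fork")
    # closing() calls pool.close() on exit; join() then waits for the workers.
    with contextlib.closing(ctx.Pool(processes=nprocs)) as pool:
        results = pool.map(func, items)
    pool.join()
    return results


if __name__ == "__main__":
    print(map_with_fork_pool(math.sqrt, [1.0, 4.0, 9.0], 2))
```

Using a context object keeps the choice local to this one pool and does not change the start method for the rest of the program.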

However, I am not opening a pull request, since this issue appears to be specific to Mac users with an M1 chip, and I have no idea how this change would affect systems without an M1. Also, since the issue lies in how Pool is used, it should affect the emcee and zeus samplers as well, which also use Pool for multiprocessing (though I haven't tested samplers other than dynesty).

Cheers,
Jayshil

@nespinoza
Owner

Hi @Jayshil! Thanks for this. Can you confirm that the way you propose to do multiprocessing is the way to make this work? Also, changing this would require testing on non-M1 computers as well, to ensure that it does not fix things for M1 users while breaking everyone else's code.
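One possible way to limit the risk for other systems, sketched here under the assumption that "fork" should be used only where the platform offers it (Windows, for example, does not), is to check the available start methods first and fall back to the platform default otherwise. The function name preferred_context is hypothetical and is not part of juliet.

```python
import multiprocessing


def preferred_context(preferred="fork"):
    """Return a multiprocessing context, using `preferred` when it is available.

    Hypothetical helper for illustration only.
    """
    if preferred in multiprocessing.get_all_start_methods():
        return multiprocessing.get_context(preferred)
    # Fall back to the platform's default start method.
    return multiprocessing.get_context()


if __name__ == "__main__":
    ctx = preferred_context("fork")
    print(ctx.get_start_method())
```

This does not replace testing on non-M1 machines; it only avoids requesting a start method that the platform does not provide.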

I'll leave it as an enhancement for now, but if this is investigated further, I would be happy to change the way in which pool is handled.

N.

@nespinoza nespinoza self-assigned this Feb 15, 2023
@nespinoza nespinoza added the enhancement New feature or request label Feb 15, 2023
@Jayshil
Contributor Author

Jayshil commented Feb 15, 2023

Hi @nespinoza,

I am not sure whether this is the only way to make this work; there may be other ways to resolve the issue that I am unaware of. But I can confirm that it is at least one way to make it work on M1 computers (it runs smoothly on my machine). I also don't know whether it would work on non-M1 computers. Unfortunately, I do not have enough time to investigate this thoroughly.

I think your suggestion to leave this as an enhancement for now is appropriate. This will let users know that there is an issue with Pool on M1 Macs and that there is a possible way to resolve it.

Cheers,
Jayshil

@Jayshil
Contributor Author

Jayshil commented Mar 14, 2023

Hi @nespinoza,

A colleague of mine pointed out that, instead of editing the juliet source code, one can simply add the following two lines to one's own script to make multiprocessing work on M1 Macs:

import multiprocessing
multiprocessing.set_start_method('fork')

This works with dynesty, at least.
So I suggest we leave the juliet source code as it is, but put this "hack" somewhere in the documentation. We can close this issue after that.
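If this goes into the documentation, a slightly more defensive form of the two lines might look like the sketch below. It assumes the call is made once, in the user's main script, before any pool exists; set_start_method raises RuntimeError if the start method has already been fixed, so that case is caught here. The function name use_fork_start_method is hypothetical.

```python
import multiprocessing


def use_fork_start_method():
    """Select "fork" if possible and return the start method now in effect.

    Hypothetical helper for illustration only.
    """
    if "fork" in multiprocessing.get_all_start_methods():
        try:
            multiprocessing.set_start_method("fork")
        except RuntimeError:
            # The start method was already fixed earlier in this process.
            pass
    return multiprocessing.get_start_method()


if __name__ == "__main__":
    print(use_fork_start_method())
    # ... load the data and call juliet's fit with the nthreads option here ...
```

Placing the call under the `if __name__ == "__main__":` guard follows the idiom quoted in the error message at the top of this issue.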

Cheers,
Jayshil
