Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intermittent failures in test_parameter_optimization_ddm[LLVM-optuna_cmaes_sampler] #2874

Open
jvesely opened this issue Jan 4, 2024 · 5 comments
Assignees
Labels
bug Should work but doesn't PEC Parameter Estimation Composition triaged The issue has been understood and the root cause is known

Comments

@jvesely
Copy link
Collaborator

jvesely commented Jan 4, 2024

The test fails by selecting the wrong result ~3/100 times.

PRNG determinism in PECOptimizationFunction is done (in a hacky way) by replacing optuna's PRNG instance:
psyneulink/core/components/functions/nonstateful/fitfunctions.py:802

                # We need to hook into Optuna's random number generator here so that we can allow PsyNeuLink's RNS to
                # determine the seed for Optuna's RNG. Pretty hacky unfortunately.
                opt_func._rng = np.random.RandomState(self.owner.initial_seed)

which is both insufficient and incorrect.

It is insufficient because it only works for instances of optuna.RandomSampler. However, this test uses CmaEsSampler.
Moreover only 1 dimension of sampling is used and optuna switches to an embedded RandomSampler, which uses a different PRNG. CmaEsSampler is also using PRNG in a different member;

cmaes instantiates its PRNG in opt_func._cma_rng and the independent sampler PRNG is in opt_func._independent_sampler._rng.

The hacky assignment also causes issues when updating to optuna 3.4.0 and 3.5.0 since the numpy PRNG instance was moved to _rng.rng leading to the following error:

self._rng.rng.uniform(trans.bounds[:, 0], trans.bounds[:, 1])
E       AttributeError: 'numpy.random.mtrand.RandomState' object has no attribute 'rng'

The immediate solution to the intermittency is to instantiate optuna samplers with a fixed seed.

The hacky assignment of PRNG should also be removed. If using the same seed as parent OCM (self.owner.initial_seed) is required, the PECOptimizationFunction needs to instance new optuna samplers using the correct seed.

@jvesely jvesely added bug Should work but doesn't PEC Parameter Estimation Composition labels Jan 4, 2024
jvesely added a commit to jvesely/PsyNeuLink that referenced this issue Jan 4, 2024
…er instances

The test is using instantiated samplers so it needs to provide fixed
seeds to guarantee deterministic behaviour.
Passes 100 iterations of
test_parameter_optimization_ddm[LLVM-optuna_cmaes_sampler] without
failures.

Bug: PrincetonUniversity#2874
Signed-off-by: Jan Vesely <[email protected]>
jvesely added a commit to jvesely/PsyNeuLink that referenced this issue Jan 4, 2024
…er instances

The test is using instantiated samplers so it needs to provide fixed
seeds to guarantee deterministic behaviour.
Passes 100 iterations of
test_parameter_optimization_ddm[LLVM-optuna_cmaes_sampler] without
failures.

Bug: PrincetonUniversity#2874
Signed-off-by: Jan Vesely <[email protected]>
@davidt0x
Copy link
Collaborator

davidt0x commented Jan 4, 2024

Hey @jvesely,

Yeah, I meant to revisit, thanks for catching it. I think I have worked out a better way with this method. The issue is that I want to allow users to pass in instantiated optuna samplers but the RandomState is created when the sampler is created. However, I didn't notice that they have a reseed_rng() method, so I can change the seed and call reseed_rng() to update the actual random state. I think this is cleaner approach. Though, now that I am thinking, if the user passed a seed to the sampler when instantiating it we should probably respect that seed over the PNL seed I guess. I have the proposed change here: https://github.com/PrincetonUniversity/PsyNeuLink/tree/fix_optuna_rng. What do you think?

Also, not sure this will fix the intermittent failure issue. Do you think that is what is causing this? I can't figure that one out for the life of me. Mainly because I can't reproduce locally at all.

@davidt0x
Copy link
Collaborator

davidt0x commented Jan 4, 2024

Oh, I just saw your closed PR where you say fixing the seed issue did fix the intermittent failures. Can you test the above approach where the RNG is reseeded when using the PNL seed? Are you just running the test 100 times?

@jvesely
Copy link
Collaborator Author

jvesely commented Jan 4, 2024

I don't see how calling reseed_rng helps here. The implementation is not reading the seed from anywhere [0,1].
The only way I found to set a custom seed is in the sampler constructor.

Yeah:

for i in `seq 100`; do echo $i; pytest -k cmaes tests/composition/test_parameterestimationcomposition.py -n0; done | tee pec.out

showed 3 failures in initial testing.

[0] https://github.com/optuna/optuna/blob/master/optuna/samplers/_random.py#L43
[1] https://github.com/optuna/optuna/blob/master/optuna/samplers/_cmaes.py#L363

@davidt0x
Copy link
Collaborator

davidt0x commented Jan 5, 2024

Oh your right, I figured it cached the seed. Well, maybe we will just have to settle for something like a warning message when seed is passed to PEC with an Optuna sampler that states that the seed needs to be specified on the sampler as well to get deterministic behavior. Anyway, thanks for figuring this all out!

@jvesely
Copy link
Collaborator Author

jvesely commented Jan 5, 2024

Oh your right, I figured it cached the seed. Well, maybe we will just have to settle for something like a warning message when seed is passed to PEC with an Optuna sampler that states that the seed needs to be specified on the sampler as well to get deterministic behavior. Anyway, thanks for figuring this all out!

to make things more interesting CmaEs sampler reseeds the PRNG regularly, so it looks like we have very little control over the random sequences unless we want to restrict allowed samplers to those that can be hacked around.

I'll leave this issue open to do something about the

opt_func._rng = np.random.RandomState(self.owner.initial_seed)

line. I think it can just be removed since it only works for RandomSampler, and even for that it breaks in optuna 3.4.0/3.5.0.

@jvesely jvesely added the triaged The issue has been understood and the root cause is known label Jul 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Should work but doesn't PEC Parameter Estimation Composition triaged The issue has been understood and the root cause is known
Projects
None yet
Development

No branches or pull requests

2 participants