
Return incomplete results when simulation errors out #212

Closed
brandon-holt opened this issue Apr 18, 2024 · 10 comments · Fixed by #213
Assignees: AdrianSosic
Labels: enhancement (Expand / change existing functionality), question (Further information is requested)

Comments

@brandon-holt
Contributor

Hi! I was wondering if it would be possible to add a feature where simulations will still return the results compiled up to the point of an error?

The situation I'm running into when running on larger datasets is a botorch error to the tune of
All attempts to fit the model have failed.

I am in the process of troubleshooting what about the dataset is causing the failure, but in the meantime it would be nice to see the results up to that point, which should include dozens of batches of experiments.

Also, if you have any experience with what might be causing an error like this, that would be helpful!

Referring to this comment in a botorch thread: pytorch/botorch#1226 (comment)
I initially wondered if this could be my issue, but baybe should prevent this from being an issue since it identifies duplicate parameter values and randomly picks one.

@brandon-holt
Contributor Author

Just as a temporary local fix, I am adding a catch-all except clause in the DOE loop and breaking out to return the incomplete results.

while k_iteration < limit:
    # Get the next recommendations and corresponding measurements
    try:
        measured = campaign.recommend(batch_size=batch_size)
    except Exception:
        # Stop the loop but keep the results collected so far
        break

@AdrianSosic
Collaborator

Hi @brandon-holt 👋🏼 One part that can be easily answered: your suggestion of providing access to partial simulation results sounds absolutely reasonable, and I can confidently say that we'll incorporate some appropriate mechanism into the refactored module (so far, we simply haven't had the need for it because the simulations always succeeded). The small challenge I see here is that clean handling requires more than just returning the incomplete dataframe (your workaround) or passing through the exception (the current logic) because:

  • In the former, we lose the error message plus it's not trivial to even identify that an error occurred.
  • In the latter, we lose the data.

Also, the mechanism needs to be compatible with all simulation layers we offer (i.e., simulating a single campaign vs. simulating multiple campaigns, etc.). However, I think I already have some good ideas for how this can be accomplished.
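One way to keep both the data and the error message is to pair the partial dataframe with the exception in a small result container. This is only a minimal sketch under assumed names (`SimulationResult` and `run_simulation` are hypothetical, not BayBE's actual API):

```python
from dataclasses import dataclass
from typing import Optional

import pandas as pd


@dataclass
class SimulationResult:
    """Pairs the (possibly partial) results with the error that ended the run."""

    data: pd.DataFrame
    error: Optional[Exception] = None

    @property
    def is_complete(self) -> bool:
        # The run finished normally iff no exception was captured
        return self.error is None


def run_simulation(n_iterations: int) -> SimulationResult:
    """Collect per-iteration rows; on failure, return what was gathered so far."""
    rows = []
    try:
        for k in range(n_iterations):
            if k == 3:  # stand-in for a model-fitting failure
                raise RuntimeError("All attempts to fit the model have failed.")
            rows.append({"iteration": k, "value": 2 * k})
    except Exception as exc:
        # Keep the partial data AND the error, instead of losing one of them
        return SimulationResult(pd.DataFrame(rows), error=exc)
    return SimulationResult(pd.DataFrame(rows))
```

With such a container, the caller can both inspect the incomplete dataframe and see exactly which error terminated the run.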

That said, I've nothing against providing a quick workaround to unblock you, as long as the changes do not cause backward compatibility issues later. Let me draft a quick PR and see what my colleagues think about it 👍🏼 will tag you there.

@AdrianSosic
Collaborator

Now, the more worrisome part. So far, I haven't experienced any of the problems you describe. While I can offer to debug/investigate the botorch internals if we can come up with a minimal reproducing example, I would only do that as a last resort and first see if we can get a better understanding of what's going on.

So here are a few things we should consider first:

  • Have you already tried what has been suggested in the thread and checked if it mitigates the problem?
  • You are right that only one of the parameter duplicates is picked per simulation step, but be aware that the recommender might still suggest the same configuration in a later iteration if the predictive model favors it! (From a statistical perspective, this is indeed the correct thing to do if we assume a noisy environment.) You can explicitly forbid this by setting allow_recommending_already_measured to False, which all our "pure" recommenders support. That way, we know for certain that no duplicates can appear in the training data throughout the simulation.
  • Perhaps the issue of "being close" should be interpreted not only at the level of parameter configurations but also at the level of individual feature columns, i.e., I could imagine that numerical issues appear when columns are highly correlated or show only little variation in their values. We have an open work item that will allow activating/controlling automatic feature decorrelation, but we are not fully there yet. Still, it could help to have a look at the comp_df of your searchspace to see if there is anything suspicious ...
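The check suggested in the last point can be run directly on the searchspace's comp_df. A minimal pandas-based sketch (the function name and threshold values are illustrative, not part of BayBE):

```python
import numpy as np
import pandas as pd


def suspicious_columns(comp_df, corr_threshold=0.95, var_threshold=1e-8):
    """Flag near-constant columns and highly correlated column pairs."""
    # Columns with (almost) no variation are numerically problematic
    low_variance = [c for c in comp_df.columns if comp_df[c].var() < var_threshold]
    corr = comp_df.drop(columns=low_variance).corr().abs()
    # Keep only the upper triangle to skip self- and duplicate pairs
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    upper = corr.where(mask)
    pairs = [
        (a, b, float(upper.loc[a, b]))
        for a in upper.index
        for b in upper.columns
        if pd.notna(upper.loc[a, b]) and upper.loc[a, b] > corr_threshold
    ]
    return low_variance, pairs
```

Any reported pair or near-constant column would be a candidate explanation for the fitting instability.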

@AdrianSosic AdrianSosic added enhancement Expand / change existing functionality question Further information is requested labels Apr 19, 2024
@brandon-holt
Contributor Author

Heyo! These are good points, I will look into them and let you know what I find!!

@brandon-holt
Contributor Author

@AdrianSosic Okay, so after some quick and dirty initial testing, it looks like allow_recommending_already_measured actually causes it to fail sooner, which is potentially interesting! But I am running more extensive tests to make meaningful comparisons, which should be done in a day or so, and I'll let you know what I find.

In the meantime, I'm looking into the features in my comp_df for each parameter in my search space to see if any features are highly correlated. Attaching here if you're curious!

comp_df.zip

@AdrianSosic
Collaborator

You mean it fails sooner if you set the attribute to False? That's indeed a bit surprising. Curious what's going on here... 🤔 I'll let you first finish your tests and then we can have a look 👍🏼

@brandon-holt
Contributor Author

@AdrianSosic Yes, so it definitely appears that setting allow_recommending_already_measured=False causes the model to reach a failure point sooner. In my tests with my dataset, which would take ~1000 iterations to test every datapoint, the model fails after:

Setting                                        Result
allow_recommending_already_measured=True       >200 iterations
allow_recommending_already_measured=False      >100 iterations

This could make sense because when the model isn't allowed to pick 'repeated' measurements, it is more likely to reach the model-breaking outliers/datapoints faster. However, our hypothesis was that the 'repeated' measurements that have the same features but disparate target values were in fact the ones that were breaking the model.
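The hypothesis above can be checked directly on the measurement data: group by the parameter columns and flag configurations that repeat with disagreeing target values. A minimal sketch (function and column names are hypothetical):

```python
import pandas as pd


def conflicting_duplicates(df, param_cols, target_col):
    """Return rows whose parameter configuration repeats with disagreeing targets."""
    grouped = df.groupby(param_cols)[target_col]
    counts = grouped.transform("size")  # how often each configuration occurs
    spread = grouped.transform(lambda s: s.max() - s.min())  # target disagreement
    return df[(counts > 1) & (spread > 0)]
```

If removing the returned rows makes the failures disappear, that would support the "same features, disparate targets" explanation; if not, the outlier explanation becomes more likely.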

@AdrianSosic AdrianSosic self-assigned this Apr 26, 2024
@brandon-holt
Contributor Author

brandon-holt commented Apr 26, 2024

@AdrianSosic Hey, just adding a repro for you in case it helps. Just a heads up: running it as-is will take ~200-300 GB of RAM. If that's problematic, you could bump percent_discretize up to 20-50 to reduce the amount of memory required. Just note that I think there may be a point above which the issue disappears, so keep that in mind if you go to higher discretization factors.

The only concern is that by replacing my molecules' SMILES with random ones, it may change what solves the issue, but after running some initial tests on my end, it looks like the behavior is similar to the results shown in the table above.

baybe212_repeat_measurements.zip

@AdrianSosic
Collaborator

Hi @brandon-holt, thanks for sharing. I would like to have a look but need to postpone this until next week. Currently, we are a bit swamped with open PRs and features to be merged + need to release 0.9.0 asap, which will keep me busy for a while. Will let you know once I've had a chance to look 🙃

@brandon-holt
Contributor Author

@AdrianSosic No worries, thanks for the heads up!

AdrianSosic added a commit that referenced this issue May 13, 2024
Fixes #212 by returning incomplete simulation results with a warning
instead of terminating the simulation when encountering an exception.
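The pattern described in the commit message can be sketched as follows. This is a simplified stand-in for illustration, not BayBE's actual implementation:

```python
import warnings

import pandas as pd


def simulate(n_iterations: int) -> pd.DataFrame:
    """Run a DOE loop; on error, emit a warning and return the partial results."""
    results = []
    for k in range(n_iterations):
        try:
            if k == 2:  # stand-in for a model-fitting failure
                raise RuntimeError("All attempts to fit the model have failed.")
            results.append({"iteration": k, "value": k**2})
        except Exception as exc:
            # Surface the error without discarding the data collected so far
            warnings.warn(f"Simulation interrupted at iteration {k}: {exc}")
            break
    return pd.DataFrame(results)
```

The caller receives the incomplete dataframe, while the warning preserves the error message that would previously have terminated the whole simulation.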