
Small Refactorings to the DoE Strategy #514

Draft
wants to merge 1 commit into base: main
Conversation

jduerholt
Contributor

I was looking at the DoEStrategy and spotted some issues and possible improvements:

  • Make strategy.set_candidates versatile enough to also handle partially fixed candidates, without overriding the method just for the DoEStrategy.
  • For dealing with categoricals, a one-hot encoding is performed and new continuous features are introduced. We should reuse the already implemented method for one-hot encoding of categoricals there, so that we do not duplicate code. Furthermore, the continuous features are currently named after the category names, which breaks when two categorical variables share at least one category with the same name. We should follow the general convention and always name them <feature.key>_<category_name>.
  • Currently, it only conditions the generated experiments on the candidates. It should also condition on self.experiments, as this is the usual way most ACB APIs work: using ask and tell and conditioning on self.experiments. For this, I assume it needs to be combined with the partially_fixed_candidates under the hood.
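To illustrate the naming collision from the second point, here is a minimal sketch (the feature names are hypothetical, not taken from the codebase): two categorical features that share the category "low" produce ambiguous columns if encoded by category name alone, while prefixing with the feature key keeps them distinct. `pandas.get_dummies` already follows this `<feature.key>_<category_name>` convention by default.

```python
import pandas as pd

# Two hypothetical categorical features that share the category "low".
df = pd.DataFrame(
    {
        "temperature": ["low", "high", "low"],
        "pressure": ["low", "medium", "high"],
    }
)

# get_dummies prefixes each one-hot column with the feature key by default,
# so the shared category "low" yields two distinct columns.
encoded = pd.get_dummies(df, columns=["temperature", "pressure"])
print(sorted(encoded.columns))
# "temperature_low" and "pressure_low" remain distinguishable; naming the
# columns after the category alone would collapse both into a single "low".
```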

As I am not so deep into the DoEStrategy, I only started to work on these improvements. Maybe you @Osburg have time to proceed with it; I assume you are much faster on this than I am. I was struggling especially with how to combine the candidates and experiments. The rest is in principle done. If you do not have time, just feel free to tell me, then I will dig deeper ;)

cc: @dlinzner-bcs

@Osburg
Collaborator

Osburg commented Feb 7, 2025

Hey Johannes :)

thanks for pointing this out! I can definitely do it, but I cannot promise when I will have time for it (certainly not before the end of next week :/). If it is not urgent for you, we can assign this to me and I'll work on it when I have time. If it is urgent, I think it is better to declare this an open task for @dlinzner-bcs, you, and me, and whoever has time first and feels like it works on it. What do you think?

Cheers
Aaron

@jduerholt
Contributor Author

jduerholt commented Feb 7, 2025

Thanks @Osburg, it is not urgent; if we can get it sorted out sometime over the next month, that would be great.

Two other questions in this context (the first being kind of urgent :)):

  • What do you need this sorting for here:
    torch.sum(torch.sort(torch.pdist(X))[0][: self.n_experiments]).backward()
    I do not understand it. If it is not needed, we can also get rid of it. An answer here is a bit more urgent, as I am trying to understand it for trying out something similar.
  • Why do you go through pandas here:
    X = pd.DataFrame(
    It is not needed, is it? And it only creates overhead, right?
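For context on the first question, here is a minimal, self-contained sketch of what the quoted line appears to compute: summing the smallest pairwise distances between design points (a maximin-style criterion) and backpropagating through it. The `n_experiments` value and the random design `X` are illustrative, not taken from the DoE code; the exact role of the sort in the strategy is what the question above asks about.

```python
import torch

n_experiments = 3  # illustrative value, stands in for self.n_experiments
X = torch.rand(5, 2, requires_grad=True)  # hypothetical 5-point design in 2D

dists = torch.pdist(X)                           # all pairwise distances, shape (10,)
smallest = torch.sort(dists)[0][:n_experiments]  # the n_experiments smallest distances
loss = torch.sum(smallest)
loss.backward()                                  # gradients flow back into X
print(X.grad.shape)                              # gradient has the same shape as X
```

Penalizing only the smallest distances pushes the closest pairs of points apart, which is why the sort (rather than a plain sum over all distances) changes the behavior.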

@jduerholt
Contributor Author

First question is clarified ;)

@Osburg
Collaborator

Osburg commented Feb 7, 2025

Regarding the second: I think you are right, this is not necessary, and I guess we can remove it.
