
Small Refactorings to the DoE Strategy #514

Draft
wants to merge 1 commit into base: main
Conversation

jduerholt
Contributor

I was looking at the DoEStrategy and spotted some issues and possible improvements:

  • Make strategy.set_candidates versatile enough to also handle partially fixed candidates, without overriding the method just for the DoEStrategy.
  • For dealing with categoricals, a one-hot encoding is performed and new continuous features are introduced. We should reuse the already implemented method for one-hot encoding of categoricals there, so that we do not duplicate code. Furthermore, the continuous features are currently named after the category names, which breaks when two categorical variables share at least one category with the same name. We should follow the general convention and always name them <feature.key>_<category_name>.
  • Currently, it only conditions the generated experiments on the candidates. It should also condition on self.experiments, as this is the usual way most ACB APIs work: using ask and tell and conditioning on self.experiments. For this, I assume it needs to be combined with the partially_fixed_candidates under the hood.
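To illustrate the naming collision from the second point, here is a minimal sketch (the feature names are hypothetical, not taken from the codebase): two categorical features that share the category "low" produce ambiguous columns if encoded by category name alone, while prefixing with the feature key keeps them distinct. `pandas.get_dummies` already follows this `<feature.key>_<category_name>` convention by default.

```python
import pandas as pd

# Two hypothetical categorical features that share the category "low".
df = pd.DataFrame(
    {
        "temperature": ["low", "high", "low"],
        "pressure": ["low", "medium", "high"],
    }
)

# get_dummies prefixes each one-hot column with the feature key by default,
# so the shared category "low" yields two distinct columns.
encoded = pd.get_dummies(df, columns=["temperature", "pressure"])
print(sorted(encoded.columns))
# "temperature_low" and "pressure_low" remain distinguishable; naming the
# columns after the category alone would collapse both into a single "low".
```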

As I am not so deep into the DoEStrategy, I only started to work on these improvements. Maybe you @Osburg have time to proceed with it; I assume you are much faster on this than I am. I was struggling especially with how to combine the candidates and experiments. The rest is in principle done. If you do not have time, just feel free to tell me, then I will dig deeper ;)

cc: @dlinzner-bcs

@Osburg
Collaborator

Osburg commented Feb 7, 2025

Hey Johannes :)

thanks for pointing this out! I can definitely do it, but I cannot promise when I will have time for it (certainly not before the end of next week :/). If it is not urgent for you, we can assign this to me and I'll work on it when I have time. If it is urgent, I think it is better to declare this an open task for @dlinzner-bcs, you, and me, and whoever has time first and feels like it works on it. What do you think?

Cheers
Aaron

@jduerholt
Contributor Author

jduerholt commented Feb 7, 2025

Thanks @Osburg, it is not urgent; if we can get it sorted out sometime over the next month, that would be great.

Two other questions in this context (the first being kind of urgent :)):

  • What do you need this sorting for here:
    torch.sum(torch.sort(torch.pdist(X))[0][: self.n_experiments]).backward()
    I do not understand it. If it is not needed, we can also get rid of it. An answer here is a bit more urgent, as I am trying to understand it for trying out something similar.
  • Why do you go through pandas here:
    X = pd.DataFrame(
    It is not needed, is it? And it only creates overhead, right?
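For context on the first question, here is a minimal, self-contained sketch of what the quoted line appears to compute: summing the smallest pairwise distances between design points (a maximin-style criterion) and backpropagating through it. The `n_experiments` value and the random design `X` are illustrative, not taken from the DoE code; the exact role of the sort in the strategy is what the question above asks about.

```python
import torch

n_experiments = 3  # illustrative value, stands in for self.n_experiments
X = torch.rand(5, 2, requires_grad=True)  # hypothetical 5-point design in 2D

dists = torch.pdist(X)                           # all pairwise distances, shape (10,)
smallest = torch.sort(dists)[0][:n_experiments]  # the n_experiments smallest distances
loss = torch.sum(smallest)
loss.backward()                                  # gradients flow back into X
print(X.grad.shape)                              # gradient has the same shape as X
```

Penalizing only the smallest distances pushes the closest pairs of points apart, which is why the sort (rather than a plain sum over all distances) changes the behavior.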

@jduerholt
Contributor Author

First question is clarified ;)

@Osburg
Collaborator

Osburg commented Feb 7, 2025

Regarding the second: I think you are right, this is not necessary, and I guess we can remove it.
