Non-equidistant discrete dimensions for optimize_acqf_mixed_alternating
#2923
base: main
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

@@           Coverage Diff            @@
##              main    #2923   +/-   ##
=========================================
  Coverage   100.00%  100.00%
=========================================
  Files          212      212
  Lines        19819    19852      +33
=========================================
+ Hits         19819    19852      +33

View full report in Codecov by Sentry.
I didn't go through in full detail; at a high level this makes sense. The main thing I'm worried about is that if there are many dimensions and many values, this could end up becoming quite slow. Do you have any timing / profiling data on this?
In principle this should be the same speed as the original implementation; the main difference is the rounding operation. Are you interested in a profiling of this operation? Or do you also see speed issues in other parts of the code?
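For illustration, here is a minimal sketch (an assumed helper, not the PR's implementation) of what rounding to the nearest allowed value of a non-equidistant discrete dimension looks like:

```python
import torch

def round_to_nearest_allowed(x: torch.Tensor, values: list[float]) -> torch.Tensor:
    """Round each entry of x to the closest of the allowed (non-equidistant) values."""
    t_values = torch.tensor(values, dtype=x.dtype, device=x.device)
    # Distance from every entry of x to every allowed value; pick the closest one.
    idx = (x.unsqueeze(-1) - t_values).abs().argmin(dim=-1)
    return t_values[idx]

x = torch.tensor([0.4, 4.2, 11.0])
print(round_to_nearest_allowed(x, [0.1, 0.5, 5.0, 10.0]))  # tensor([ 0.5000,  5.0000, 10.0000])
```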
Ah, now I see, you see the main problem in get_nearest_neighbors.
Here is some code for profiling the mentioned operation:

from botorch.optim.optimize_mixed import get_nearest_neighbors
import time
import torch

d = 100  # total number of dimensions
nd = 90  # number of discrete dimensions
n_discrete_values = 300

current_x = torch.rand(d)
discrete_dims = {i: list(range(n_discrete_values + 1)) for i in range(nd)}
for i in range(nd):
    current_x[i] = torch.randint(0, n_discrete_values, (1,)).item()
bounds = torch.tensor([[0.0] * d, [n_discrete_values] * d])

start = time.time()
neighbors = get_nearest_neighbors(
    current_x=current_x, discrete_dims=discrete_dims, bounds=bounds
)
end = time.time()
print(f"Time taken to find nearest neighbors: {end - start:.4f} seconds")

For this rather extreme case, it takes around 0.006 seconds on my MacBook with an M1 processor. The original implementation takes around 0.001 seconds for this case, so the new one is definitely slower, but I think it will not be the rate-limiting step (speaking as a chemist :D), or?
Yeah, that seems like it should be fine.
Hi @Balandat, I implemented your comments, so it should be ready for re-review. Best, Johannes
Generally this lgtm - @saitcakmak could you take a look as well since this changes some things deep down in some code that I believe you wrote?
This seems reasonable overall. My main concern is with the clamping operations, which may be introducing a bug. Curious to hear your take on extending this to categoricals as well.
t_values = torch.tensor(
    values, device=unique_neighbors.device, dtype=unique_neighbors.dtype
)
idx = unique_neighbors[:, dim].long().clamp(0, len(values) - 1)
Why do we need to clamp here? Both plus_neighbors and minus_neighbors have been clamped, right? If we actually need to clamp here, I think that may produce duplicate neighbors.
I will investigate this after fixing the clamping from above.
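To make the duplicate-neighbor concern concrete, here is a minimal self-contained sketch (made-up values, not the PR's code) of mapping index-space neighbors back to allowed values; the clamp only changes anything if an index can be out of range, and in that case it collapses two neighbors onto the same value:

```python
import torch

values = [0.1, 0.5, 2.0, 7.5]  # allowed (non-equidistant) values, assumed
t_values = torch.tensor(values)

# Index-space neighbors for one dimension; the last index is out of range.
idx = torch.tensor([2, 3, 4])

# Clamping maps the out-of-range index 4 onto 3, so the last two neighbors
# both become 7.5 -- the duplicate discussed above. If the indices are already
# guaranteed to be in [0, len(values) - 1], the clamp is a no-op.
mapped = t_values[idx.clamp(0, len(values) - 1)]
print(mapped)  # tensor([2.0000, 7.5000, 7.5000])
```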
- plus_neighbors = current_x.repeat(num_discrete, 1)
- plus_neighbors[:, discrete_dims] += diag_ones
+ plus_neighbors = current_x_int.repeat(num_discrete, 1)
+ plus_neighbors[:, t_discrete_dims] += diag_ones
  plus_neighbors.clamp_(max=bounds[1])
Does it make sense to clamp to bounds anymore, since the neighbors now have index values that are between 0 & len(values) - 1?
Good catch, we need to clamp, but not with the bounds ;) I overlooked this. I will change it.
@jduerholt this still needs to be changed it seems
- minus_neighbors = current_x.repeat(num_discrete, 1)
- minus_neighbors[:, discrete_dims] -= diag_ones
+ minus_neighbors = current_x_int.repeat(num_discrete, 1)
+ minus_neighbors[:, t_discrete_dims] -= diag_ones
  minus_neighbors.clamp_(min=bounds[0])
Same question about clamping here. I think this one is particularly problematic, since bounds[0] may be some large number even though the index value has a fixed range. I guess the same argument applies to bounds[1], since the bounds can be negative as well.
Totally correct, we need to clamp to zero here.
same here
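To make the suggested fix concrete, here is a minimal sketch (variable names mirror the snippets above, but the values and the exact fix are my assumptions, not the PR's code) of clamping the index-space neighbors per dimension to [0, n_values - 1] rather than to the value-space bounds:

```python
import torch

# Assumed mapping from discrete dimension index to its allowed values.
discrete_dims = {0: [0.1, 0.5, 2.0], 2: [-3.0, -1.0, 4.0, 10.0]}
t_discrete_dims = torch.tensor(sorted(discrete_dims.keys()))
num_discrete = len(discrete_dims)

current_x_int = torch.tensor([1.0, 0.7, 3.0])  # discrete dims hold index values
diag_ones = torch.eye(num_discrete)

plus_neighbors = current_x_int.repeat(num_discrete, 1)
plus_neighbors[:, t_discrete_dims] += diag_ones
minus_neighbors = current_x_int.repeat(num_discrete, 1)
minus_neighbors[:, t_discrete_dims] -= diag_ones

# Clamp each discrete dimension to its own index range [0, n_values - 1]
# instead of the value-space bounds, which no longer apply in index space.
max_idx = torch.tensor([len(discrete_dims[int(d)]) - 1.0 for d in t_discrete_dims])
plus_neighbors[:, t_discrete_dims] = torch.minimum(plus_neighbors[:, t_discrete_dims], max_idx)
minus_neighbors[:, t_discrete_dims] = minus_neighbors[:, t_discrete_dims].clamp(min=0.0)

# Here the plus-neighbor of dim 2 gets capped at index 3 instead of running to 4.
print(plus_neighbors)
print(minus_neighbors)
```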
discrete_dims: A dictionary mapping indices of discrete and binary
    dimensions to a list of allowed values for that dimension.
cat_dims: A list of indices corresponding to categorical parameters.
So, the categoricals are still required to be [0, 1, 2, ...] but discrete dimensions are relaxed. I think it'd be good to treat them equally here if it's not too much extra work.
Yeah, this PR is only dealing with the discrete ones. My plan is to file a follow-up PR in which I introduce the same syntax for categoricals. My plan was to get feedback on the discrete ones first, so that there is less work for the categoricals ;) Do you want to have the categoricals also in this PR?
I think it's fine to work incrementally and deal with the categoricals in a separate PR, so long as this PR clearly states that limitation in the docstring and errors out if the input doesn't comply with that requirement.
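For illustration, here is a sketch of how a call could look with the dict-based discrete_dims this PR introduces; the toy data/model and the keyword arguments other than discrete_dims are assumptions on my part, based on the existing optimize_acqf_mixed_alternating signature:

```python
import torch
from botorch.acquisition import LogExpectedImprovement
from botorch.fit import fit_gpytorch_mll
from botorch.models import SingleTaskGP
from botorch.optim.optimize_mixed import optimize_acqf_mixed_alternating
from gpytorch.mlls import ExactMarginalLogLikelihood

# Toy problem: dims 0-1 continuous, dim 2 discrete with non-equidistant values.
bounds = torch.tensor([[0.0, 0.0, 0.5], [1.0, 1.0, 20.0]], dtype=torch.double)
allowed = torch.tensor([0.5, 1.0, 5.0, 20.0], dtype=torch.double)
train_X = torch.rand(10, 3, dtype=torch.double)
train_X[:, 2] = allowed[torch.randint(0, 4, (10,))]
train_Y = train_X.sum(dim=-1, keepdim=True)

model = SingleTaskGP(train_X, train_Y)
fit_gpytorch_mll(ExactMarginalLogLikelihood(model.likelihood, model))
acqf = LogExpectedImprovement(model=model, best_f=train_Y.max())

# With this PR, discrete_dims maps each discrete dimension to its allowed
# values (non-equidistant here) instead of assuming consecutive integers.
candidate, value = optimize_acqf_mixed_alternating(
    acq_function=acqf,
    bounds=bounds,
    discrete_dims={2: [0.5, 1.0, 5.0, 20.0]},
    q=1,
    num_restarts=4,
    raw_samples=32,
)
print(candidate)
```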
Looks like a couple of items are still open, o/w this is getting there!
Motivation
As discussed in #2904, this is the first PR in a planned series of PRs adding additional functionality to optimize_acqf_mixed_alternating. In this one, non-equidistant discrete dimensions are introduced.

CC: @TobyBoyne
Have you read the Contributing Guidelines on pull requests?
Yes.
Test Plan
Unit tests.