Skip to content

Conversation

@jduerholt
Copy link
Collaborator

Motivation

As I am currently refactoring our internal codebase, I had a look at @sdaulton PR regarding probabilistic reparameterization.
From my understanding one has to use it by representing the categoricals by a one hot encoding for the reparmeterized ACQF and then eventually transforming the input to a numerical represenation via OneHotToNumeric especially when one wants to use it togehter with MixedSingleTaskGP. Currently MixedSingleTaskGP is very strict on which input transforms are allowed. This PR lifts the restrictions to make it usable with OneHotToNumeric`.

Note that the transform also has to be instantiated with transform_on_train = False and train_X has to be transformed before it is passed to the constructor of MixedSingleTaskGP, else the indices for the different kernels are mixed up.

Have you read the Contributing Guidelines on pull requests?

Yes.

Test Plan

Unit tests.

Related PRs

#1534

@facebook-github-bot facebook-github-bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label Dec 15, 2022
@codecov
Copy link

codecov bot commented Dec 15, 2022

Codecov Report

Merging #1568 (2cd11f2) into main (cbd5002) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##              main     #1568   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files          153       153           
  Lines        13651     13646    -5     
=========================================
- Hits         13651     13646    -5     
Impacted Files Coverage Δ
botorch/models/gp_regression_mixed.py 100.00% <100.00%> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@Balandat
Copy link
Contributor

@saitcakmak this is an interesting one - any thoughts how this would relate to / interact with the input transform refactor that has been on your mind?

@saitcakmak
Copy link
Contributor

@Balandat This just removes some user facing checks. It'll require the users be more careful about what input transform they use but it shouldn't affect anything else otherwise.

@jduerholt
Copy link
Collaborator Author

One other thing in the context of probabilistic reparameterization:

Based on @sdaulton PR on probabilistic reparameterization, I asked myself how one would set up parameterized categoricals via probabilistic reparameterization.

An example would be to have a categorical variable named molecule with 10 different categories (molecules). I am able to parametrize every molecule by a descriptor vector of length three, meaning three numbers representing three properties of every molecule (I think you did something like this in the original paper).

From what I understood from @sdaulton's implementation one would encode this parametrized categorical still as one-hot for the optimizer and the reparametrized acqfs, but would need an input transform OneHotToDescriptor which takes the one-hot encoded categorical and translates it to the vector representation in the spirit of OneHotToNumeric, or is this wrong?

@jduerholt
Copy link
Collaborator Author

@Balandat This just removes some user facing checks. It'll require the users be more careful about what input transform they use but it shouldn't affect anything else otherwise.

@Balandat @saitcakmak Is this fine for you or should I change something in addition before the merge?

@Balandat
Copy link
Contributor

I am fine with merging this in as is

@facebook-github-bot
Copy link
Contributor

@Balandat has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@Balandat merged this pull request in 6090284.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed Do not delete this pull request or issue due to inactivity. Merged

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants