
Add "fit only pytorch models" flag #291

Open
radka-j opened this issue Feb 3, 2025 · 5 comments
@radka-j
Member

radka-j commented Feb 3, 2025

Allow users to easily choose to fit only emulators that have a PyTorch backend (currently these are GPs and CNPs). This is useful in cases where a downstream task relies on a PyTorch model.

@mastoffel
Collaborator

Just adding here that when this flag is enabled, we should also deactivate data pre-processing, since that is done through scikit-learn pipelines. We can then raise a warning telling the user to pre-process manually beforehand.
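To make the warning concrete, the manual step it would point to might look like this (a hedged sketch: `X`, `y` and the choice of `StandardScaler` are illustrative, not AutoEmulate's actual pipeline):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical illustration: if AutoEmulate skips its internal
# scikit-learn pipelines when returning pure PyTorch models, the user
# standardises the inputs themselves before fitting.
X = np.random.rand(100, 5)
y = np.random.rand(100, 2)

scaler_X = StandardScaler().fit(X)
X_scaled = scaler_X.transform(X)
# Keep scaler_X around: new points must be transformed the same way
# before being passed to the extracted PyTorch model.
```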

@mastoffel
Collaborator

A bit of a brain dump here so that we can discuss this: I've just had a look at this issue and #295 (i.e. running and extracting PyTorch models). It's not as straightforward as I thought, because both Neural Processes and GPs need objects/data outside of the PyTorch object itself, and both are quite specific. AutoEmulate handles these things in the background, but by stripping away the estimator object and returning the pure PyTorch object, we lose this functionality and leave it to the user to figure out.

  1. The CNP forward method needs context and target points:

def forward(self, X_context, y_context, X_target=None, context_mask=None):

AutoEmulate internally just takes the training data as context points, so in the predict method the user only has to provide target points X. For training, the CNP uses a dataset that coordinates the sampling and is slightly unusual, as it creates a meta-dataset from a normal dataset. So to do further training, the user would need that object too, I guess. For an Attentive CNP, the user would also need a context_mask for training (which the PyTorch dataset in AutoEmulate also takes care of).

  2. The GP PyTorch object returns a MultivariateNormal object, rather than posterior predictions. To get those, we need a likelihood function, see here. We can extract this one from the object, though.
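To make point 1 concrete, here is a minimal, hypothetical stand-in for the CNP interface (`TinyCNP` and its mean-pooling aggregation are invented for illustration; only the forward signature mirrors the real model):

```python
import torch

# TinyCNP stands in for the real CNP: its forward needs context points,
# which AutoEmulate normally supplies from the stored training data.
class TinyCNP(torch.nn.Module):
    def __init__(self, x_dim, y_dim):
        super().__init__()
        self.net = torch.nn.Linear(x_dim + y_dim + x_dim, y_dim)

    def forward(self, X_context, y_context, X_target=None, context_mask=None):
        # Real CNPs encode and aggregate the context set; here we just
        # mean-pool it and concatenate with each target point.
        ctx = torch.cat([X_context, y_context], dim=-1).mean(dim=0)
        ctx = ctx.expand(X_target.shape[0], -1)
        return self.net(torch.cat([ctx, X_target], dim=-1))

X_train, y_train = torch.randn(20, 3), torch.randn(20, 2)
model = TinyCNP(x_dim=3, y_dim=2)

# A user holding only `model` cannot predict: they also need the
# training data to pass as context.
preds = model(X_train, y_train, X_target=torch.randn(5, 3))
```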

So the question is how to go ahead. A few thoughts:

  • extract all relevant objects and return them in a tuple. This means all PyTorch models will have different outputs, which might be confusing.
  • extract only the main PyTorch object and write tutorials on how to use it further
  • something else?

Would be great to get your input here @marjanfamili @radka-j

@radka-j
Member Author

radka-j commented Feb 11, 2025

@mastoffel can you point me to where in the code autoemulate does these additional steps (I just don't know the codebase very well yet)?

@mastoffel
Collaborator

  • the PyTorch model underlying the CNP (and Attentive CNP) needs X_context, y_context and X_target, where X_target is the data to predict on and the context tensors are the training data. AutoEmulate just uses the training data, which it saved as attributes, as context data; see the predict function in condition_neural_process.py

  • the PyTorch model underlying the GP works on its own with just inputs X, but it returns a gpytorch.distributions.MultitaskMultivariateNormal, which is a distribution. To get actual posterior values, we need to provide a likelihood, which is done in the fit method of the wrapper class. There, we wrap the PyTorch model in a skorch ExactGPRegressor, which we provide with the likelihood and training specifics (n epochs, optimizer, etc.)

So to do a proper forward pass from inputs X to outputs y, we need additional objects (contexts for the CNP, a likelihood for the GP), which the user has to figure out themselves if we only provide the pure fitted PyTorch model.
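A sketch of the GP half of this, using torch.distributions in place of gpytorch (the shapes and noise value are illustrative assumptions, not AutoEmulate code):

```python
import torch
from torch.distributions import MultivariateNormal

# The raw GP model returns a latent distribution, not predictions.
latent = MultivariateNormal(torch.zeros(4), torch.eye(4))

# The likelihood adds observation noise on top of the latent function;
# only then do mean/variance describe actual posterior predictions.
noise = 0.1
predictive = MultivariateNormal(
    latent.mean, latent.covariance_matrix + noise * torch.eye(4)
)

y_pred = predictive.mean      # point predictions
y_var = predictive.variance   # predictive uncertainty per point
```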

@radka-j
Member Author

radka-j commented Feb 11, 2025

Thanks, this is really helpful!

I think we should return everything as a single torch object that holds the data it needs and has a predict() method that does the right thing with it. How difficult would this be?
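Something like this hypothetical wrapper, perhaps (names and structure are a sketch, not existing AutoEmulate code; `DummyCNP` is a stand-in showing where the real CNP or GP would slot in):

```python
import torch

# A single object bundling the pure PyTorch model with whatever extras
# it needs: context data for CNPs, a likelihood for GPs.
class StandaloneEmulator(torch.nn.Module):
    def __init__(self, model, X_context=None, y_context=None, likelihood=None):
        super().__init__()
        self.model = model
        self.X_context = X_context
        self.y_context = y_context
        self.likelihood = likelihood  # e.g. a gpytorch likelihood, or None

    @torch.no_grad()
    def predict(self, X):
        if self.X_context is not None:   # CNP-style: supply stored context
            out = self.model(self.X_context, self.y_context, X_target=X)
        else:                            # GP-style: plain forward pass
            out = self.model(X)
        if self.likelihood is not None:  # latent distribution -> predictions
            out = self.likelihood(out)
        return out

# Usage with a stand-in model:
class DummyCNP(torch.nn.Module):
    def forward(self, X_context, y_context, X_target=None, context_mask=None):
        return X_target.sum(dim=-1, keepdim=True)

emu = StandaloneEmulator(
    DummyCNP(), X_context=torch.randn(10, 3), y_context=torch.randn(10, 1)
)
preds = emu.predict(torch.ones(4, 3))
```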
