Added normalization for predictions. #91

Open
wants to merge 8 commits into
base: main
from
22 changes: 21 additions & 1 deletion aviary/predict.py
Owner

The aleatoric uncertainties would also need to be denormalized.

Author

I think line 108 already handles that case: when the model is robust, preds contains both the mean and the log of the aleatoric std (line 113):

preds, aleat_log_std = preds.T
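
To make the unpacking concrete, here is a minimal sketch with made-up numbers (not taken from aviary). It assumes a robust model emits exactly two values per sample, so the predictions form an `(n_samples, 2)` array, and shows how transposing that array splits it into a column of means and a column of log standard deviations.

```python
import numpy as np

# Hypothetical robust-model output: column 0 is the predicted mean,
# column 1 is the log of the aleatoric standard deviation.
preds = np.array([[1.0, -0.5], [2.0, 0.0], [3.0, 0.5]])

# Transposing gives shape (2, n_samples), so tuple unpacking yields one
# 1D array per original column.
mean, aleat_log_std = preds.T

# Exponentiate to go from log space back to a standard deviation.
aleat_std = np.exp(aleat_log_std)
```

Both values are still in normalized units at this point, which is why the denormalization question applies to each of them.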

Author

Hi, I understand the problem you are pointing out and have added a fix.

@@ -9,6 +9,7 @@
 import torch
 from tqdm import tqdm

+from aviary.core import Normalizer
 from aviary.utils import get_metrics, print_walltime

 if TYPE_CHECKING:
@@ -90,13 +91,32 @@ def make_ensemble_predictions(
         model = model_cls(**model_params)
         model.to(device)

-        model.load_state_dict(checkpoint["model_state"])
+        # some models save the state dict under a different key
+        state_dict_field = "model_state" if "model_state" in checkpoint else "state_dict"
+        model.load_state_dict(checkpoint[state_dict_field])

         with torch.no_grad():
             preds = np.concatenate(
                 [model(*inputs)[0].cpu().numpy() for inputs, *_ in data_loader]
             ).squeeze()

+        # denormalize predictions if a normalizer was used during training
+        if "normalizer_dict" in checkpoint:
+            assert (
+                task_type == "regression"
+            ), "Normalization only takes place for regression."
+            normalizer = Normalizer.from_state_dict(
+                checkpoint["normalizer_dict"][target_name]
+            )
+            if model.robust:
+                # denorm the mean and aleatoric uncertainties separately
+                mean, log_std = np.split(preds, 2, axis=1)
+                preds = normalizer.denorm(mean)
+                ale_std = np.exp(log_std) * normalizer.std
Owner

We need to convert this back to log space here, because the logic below expects a log std.

Owner

I think it would take less code to add the normalizer to the logic below than to introduce a new logic block.

+                preds = np.column_stack([preds, ale_std])
+            else:
+                preds = normalizer.denorm(preds)
+
         pred_col = f"{target_col}_pred_{idx}" if target_col else f"pred_{idx}"

         if model.robust:
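
The two review comments on the robust branch can be sketched as follows. This is a minimal illustration under stated assumptions, not the change that ended up in the PR: `ToyNormalizer` is a hypothetical stand-in for `aviary.core.Normalizer` that works on plain NumPy arrays and exposes `mean` and `std`, and `denorm_robust` is an invented helper name. It relies on the identity exp(log_std) * std = exp(log_std + log(std)), so the uncertainty can be rescaled without leaving log space.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class ToyNormalizer:
    """Hypothetical stand-in for a target normalizer (not aviary's class)."""

    mean: float
    std: float

    def denorm(self, normed: np.ndarray) -> np.ndarray:
        # Undo the transformation (x - mean) / std.
        return normed * self.std + self.mean


def denorm_robust(preds: np.ndarray, normalizer: ToyNormalizer) -> np.ndarray:
    """Denormalize (n_samples, 2) robust predictions, keeping the std in log space."""
    mean, log_std = preds.T
    mean = normalizer.denorm(mean)
    # A standard deviation is only scaled by the normalizer's std (it is not
    # shifted by the mean), and scaling by std means adding log(std) in log space.
    log_std = log_std + np.log(normalizer.std)
    return np.column_stack([mean, log_std])


normalizer = ToyNormalizer(mean=10.0, std=2.0)
preds = np.array([[0.5, 0.0], [-1.0, np.log(3.0)]])
denormed = denorm_robust(preds, normalizer)
```

Code that later unpacks with `preds, aleat_log_std = preds.T` would then receive a log std on the original target scale.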