Fix mc_calibration plot #294

Merged 1 commit into bayesflow-org:dev on Jan 28, 2025

Conversation

Kucharssim (Collaborator)

The mc_calibration plot did not work. Beyond the obvious problems (wrong argument names, etc.), prepare_plot_data() expected pred_models.shape == (num_datasets, num_samples, num_models) (where num_samples would be 1 in this case), whereas expected_calibration_error() expected pred_models.shape == (num_datasets, num_models). I loosened the check in prepare_plot_data() so that targets can be either posterior samples of shape (num_datasets, num_samples, num_variables) or point estimates of shape (num_datasets, num_variables).
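
For illustration, here is a minimal sketch of what such a loosened shape check could look like (the check_estimates_shape helper and its numpy-only signature are hypothetical, not the actual bayesflow implementation):

import numpy as np

def check_estimates_shape(estimates: np.ndarray) -> None:
    # Illustrative only: accept either point estimates of shape
    # (num_datasets, num_variables) or posterior samples of shape
    # (num_datasets, num_samples, num_variables).
    if estimates.ndim not in (2, 3):
        raise ValueError(
            "estimates must be point estimates (num_datasets, num_variables) or "
            "posterior samples (num_datasets, num_samples, num_variables), "
            f"but got shape {estimates.shape}"
        )

check_estimates_shape(np.zeros((1000, 3)))      # point estimates: passes
check_estimates_shape(np.zeros((1000, 50, 3)))  # posterior samples: passes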

I would be happy to write a test for this but I am not sure what is the desired standard for testing plotting functionality in this project. Please let me know.

You can test the PR with:

import keras
import bayesflow as bf

num_datasets = 1000
num_models = 3

# Simulate one-hot encoded true model indices.
true_models = keras.random.categorical(logits=keras.ops.zeros((num_datasets, num_models)), num_samples=1)
true_models = keras.ops.one_hot(true_models, num_classes=num_models)
true_models = keras.ops.reshape(true_models, (num_datasets, num_models))
true_models = keras.ops.convert_to_numpy(true_models)

# Simulate predicted model probabilities centered on the true models.
pred_models = keras.random.normal(shape=(num_datasets, num_models), mean=true_models)
pred_models = keras.ops.softmax(pred_models)
pred_models = keras.ops.convert_to_numpy(pred_models)

f = bf.diagnostics.mc_calibration(pred_models=pred_models, true_models=true_models)

@paul-buerkner (Contributor)

Thanks! Unfortunately, I don't have much experience with the diagnostics module myself. @stefanradev93, could you take a quick look?

@paul-buerkner removed their request for review on January 24, 2025, 15:33
@stefanradev93 (Contributor)

Hi Simon, thanks for taking this on! Indeed, the model comparison components have been somewhat neglected recently, so I am happy you noticed these bugs. Testing plots is pretty hard and would amount to testing the fidelity of the underlying data, so for now we just rely on using them across a variety of applications. Please fix the linter problems by including the pre-commit hook for auto-checking and fixing.

@Kucharssim (Collaborator, Author)

Thanks @stefanradev93!

Regarding the linter: I've already run the pre-commit hooks before making the PR, and running ruff locally passes all checks. The log of the GitHub Action here shows that the problematic files are ones I didn't touch; any idea what's happening here?

@stefanradev93 (Contributor)

That is pretty strange indeed. @LarsKue have you seen this before?

@LarsKue (Contributor) commented on Jan 28, 2025

I think it's a non-issue since the formatter and linter also check files that are not changed. If the target branch breaks them, so will the PR. We should fix this directly on dev.

Please merge dev into this branch before merging back into dev. Otherwise, I am ok with merging.

@Kucharssim force-pushed the minor-fix-diagnostics branch from 608fdd3 to 123ce33 on January 28, 2025, 09:19
@Kucharssim (Collaborator, Author)

Please merge dev into this branch before merging back into dev.

Ok, done. Now one of the tests failed on Windows 😄 I think that's unrelated too, though.

@LarsKue merged commit d990340 into bayesflow-org:dev on Jan 28, 2025
12 of 13 checks passed
han-ol pushed a commit to han-ol/bayesflow that referenced this pull request Jan 28, 2025