Implement proper truncation for prior distributions #335

Open: dweindl wants to merge 10 commits into develop


dweindl (Member) commented Dec 7, 2024

Currently, when sampled startpoints are outside the bounds, their value is set to the upper/lower bounds. This may put too much probability mass on the bounds.

With these changes, we properly sample from the respective truncated distributions.
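
A minimal sketch of the difference, using only the Python standard library rather than the petab API. The function names `sample_clipped` and `sample_truncated` are hypothetical, and a normal prior is assumed purely for illustration:

```python
import random
from statistics import NormalDist


def sample_clipped(dist, lb, ub, n, rng):
    """Old behavior: draw from the untruncated prior, then clip to the bounds."""
    return [min(max(rng.gauss(dist.mean, dist.stdev), lb), ub) for _ in range(n)]


def sample_truncated(dist, lb, ub, n, rng):
    """Inverse-CDF sampling restricted to [lb, ub].

    Assumes the bounds are moderate, so that the CDF at both bounds lies
    strictly between 0 and 1 (NormalDist.inv_cdf rejects 0 and 1).
    """
    cdf_lb, cdf_ub = dist.cdf(lb), dist.cdf(ub)
    samples = []
    for _ in range(n):
        # A uniform draw between the CDF values of the bounds maps back
        # to a point inside the bounds.
        u = rng.uniform(cdf_lb, cdf_ub)
        x = dist.inv_cdf(u)
        # Guard against floating-point rounding at the edges.
        samples.append(min(max(x, lb), ub))
    return samples
```

With a standard normal prior and bounds [-1, 1], roughly a third of the clipped samples land exactly on a bound, because about 32% of the untruncated probability mass lies outside the bounds. The truncated sampler spreads that mass over the interior instead.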

Closes #330.

This also evaluates all priors on the model parameter scale (instead of on the parameterScale scale; see PEtab-dev/PEtab#402).

👀 https://petab--335.org.readthedocs.build/projects/libpetab-python/en/335/example/distributions.html

dweindl self-assigned this on Dec 7, 2024
dweindl force-pushed the 330_truncated branch 3 times, most recently from 90946f3 to f4b5153, on Dec 11, 2024 15:53

codecov-commenter commented Dec 11, 2024

Codecov Report

Attention: Patch coverage is 86.50794% with 17 lines in your changes missing coverage. Please review.

Project coverage is 74.25%. Comparing base (6a433e0) to head (6f005b8).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| petab/v1/distributions.py | 85.05% | 11 Missing and 2 partials ⚠️ |
| petab/v1/priors.py | 89.47% | 3 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #335      +/-   ##
===========================================
- Coverage    74.66%   74.25%   -0.42%     
===========================================
  Files           56       56              
  Lines         5573     5647      +74     
  Branches       976      990      +14     
===========================================
+ Hits          4161     4193      +32     
- Misses        1040     1084      +44     
+ Partials       372      370       -2     


dweindl marked this pull request as ready for review on Dec 11, 2024 18:42
dweindl requested review from m-philipps and a team as code owners on Dec 11, 2024 18:42

Comment on lines 61 to 69
:param bounds_truncate: Whether the generated prior will be truncated
at the bounds.
If ``True``, the probability density will be rescaled
accordingly and the sample is generated from the truncated
distribution.
If ``False``, the probability density will not account for the
bounds, but any parameter samples outside the bounds will be set to
the value of the closest bound. In this case, the PDF might not match
the sample.
Member Author

True: new behavior
False: old behavior

PEtab specs are ambiguous there (https://github.com/PEtab-dev/PEtab/blob/b9e141dd75798d179c17262f085ed6cef8555b3e/doc/v1/documentation_data_format.rst?plain=1#L527-L529):

> Sampled points are clipped to lie inside the parameter boundaries specified by lowerBound and upperBound.

While I think the new behavior is more correct, I will wait another while before merging this.

Member

> PEtab specs are ambiguous there (https://github.com/PEtab-dev/PEtab/blob/b9e141dd75798d179c17262f085ed6cef8555b3e/doc/v1/documentation_data_format.rst?plain=1#L527-L529):
>
> > Sampled points are clipped to lie inside the parameter boundaries specified by lowerBound and upperBound.
>
> While I think the new behavior is more correct, I will wait another while before merging this.

I agree, but I would also be in favor of removing the old behavior entirely. Or "fix" it by resampling out-of-bounds samples.
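
The resampling variant mentioned here could look roughly like the following sketch. `sample_by_resampling` is a hypothetical helper written against the standard library only, not the petab implementation. Redrawing until a sample falls within the bounds yields draws from the truncated distribution, at the cost of many retries when little probability mass lies within the bounds:

```python
import random
from typing import Callable


def sample_by_resampling(
    draw: Callable[[random.Random], float],
    lb: float,
    ub: float,
    rng: random.Random,
    max_tries: int = 10_000,
) -> float:
    """Redraw from the untruncated prior until the sample lies in [lb, ub]."""
    for _ in range(max_tries):
        x = draw(rng)
        if lb <= x <= ub:
            return x
    # Give up instead of looping forever if the bounds are (almost) unreachable.
    raise RuntimeError(f"No sample within [{lb}, {ub}] after {max_tries} tries")


rng = random.Random(42)
samples = [
    sample_by_resampling(lambda r: r.gauss(0.0, 1.0), -1.0, 1.0, rng)
    for _ in range(1000)
]
```

The `max_tries` cap is a design choice for this sketch: it turns a badly chosen bound (for example, one far in the tail of the prior) into an explicit error rather than an endless loop.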

Member Author

Agreed that we should get rid of that. Happy to remove this option completely.

I will wait for some feedback to PEtab-dev/PEtab#591 before proceeding.

Member

Move this to some v1 subfolder? Now or later is fine. But I think priors will change a lot in v2

Member Author (dweindl, Dec 11, 2024)

I was thinking about moving it to https://github.com/PEtab-dev/PEtab/ at some point. It might also be helpful for non-python petab users.

Member

Sounds good!

@@ -151,15 +156,18 @@
{
"metadata": {},
"cell_type": "markdown",
-"source": "To prevent the sampled parameters from exceeding the bounds, the sampled parameters are clipped to the bounds. The bounds are defined in the parameter table. Note that the current implementation does not support sampling from a truncated distribution. Instead, the samples are clipped to the bounds. This may introduce unwanted bias, and thus, should only be used with caution (i.e., the bounds should be chosen wide enough):",
+"source": "The given distributions are truncated at the bounds defined in the parameter table:",
Member

Add something like "This results in a constant rescaling of the probability density, compared to the non-truncated version (https://en.wikipedia.org/wiki/Truncated_distribution), such that the probability density still integrates to 1."
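
For reference, the standard definition of a truncated density can be written as follows. The notation is an assumption made for this note and does not come from the PR: $f$ and $F$ denote the PDF and CDF of the untruncated prior, and $a$ and $b$ denote the lower and upper bound.

$$
f_{[a,b]}(x) =
\begin{cases}
\dfrac{f(x)}{F(b) - F(a)} & \text{if } a \le x \le b, \\
0 & \text{otherwise.}
\end{cases}
$$

Inside the bounds, the truncated density is the untruncated density divided by the constant $F(b) - F(a) \le 1$, so that $\int_a^b f_{[a,b]}(x)\,dx = 1$.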

petab/v1/distributions.py (outdated)
-def _undo_log(self, x: np.ndarray | float) -> np.ndarray | float:
-    """Undo the log transformation.
+def _exp(self, x: np.ndarray | float) -> np.ndarray | float:
+    """Exponentiate / undo the log transformation according.
Member

_undo_log made sense to me, since the point is to take the inverse of the log, but fine to change too

Suggested change
-"""Exponentiate / undo the log transformation according.
+"""Exponentiate / undo the log transformation if applicable.

Member Author

I found it too complicated, as exp is well understood, I think.

petab/v1/distributions.py (outdated)
:param x: The value at which to evaluate the CDF.
:return: The value of the CDF at ``x``.
"""
return self._cdf_transformed_untruncated(x) - self._cd_low
Member

Hm, shouldn't the CDF "grow" faster when the PDF is truncated? e.g. for a normal distribution, the CDF reaches 1 at +infty. For a truncated normal distribution, the CDF reaches 1 in a finite interval... so is it enough to just subtract the lower bound CDF value? Could you add a test/sanity check that the CDF is 0 at the lower bound (trivially correct here), and 1 at the upper bound?

Member Author (dweindl, Dec 11, 2024)

You are right, I missed the normalization.

Thanks, fixed.
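
The normalization discussed in this thread can be sketched as follows. This is an illustrative standalone version using `statistics.NormalDist`, not the code from `petab/v1/distributions.py`; the function name `truncated_cdf` and the chosen bounds are assumptions. It includes the sanity check requested above, namely that the CDF is 0 at the lower bound and 1 at the upper bound:

```python
from statistics import NormalDist


def truncated_cdf(dist: NormalDist, x: float, lb: float, ub: float) -> float:
    """CDF of `dist` truncated to [lb, ub]."""
    if x <= lb:
        return 0.0
    if x >= ub:
        return 1.0
    cdf_lb, cdf_ub = dist.cdf(lb), dist.cdf(ub)
    # Subtract the mass below the lower bound, then rescale by the mass
    # inside the bounds so that the result reaches 1 at the upper bound.
    return (dist.cdf(x) - cdf_lb) / (cdf_ub - cdf_lb)


dist = NormalDist(0.0, 1.0)
lb, ub = -1.0, 2.0

# Sanity checks at the bounds.
assert truncated_cdf(dist, lb, lb, ub) == 0.0
assert truncated_cdf(dist, ub, lb, ub) == 1.0

# Without the rescaling, the value at the upper bound stays below 1.
unnormalized_at_ub = dist.cdf(ub) - dist.cdf(lb)
assert unnormalized_at_ub < 1.0
```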

dweindl (Member Author) commented Dec 11, 2024

This does not yet address the issue of whether all priors are defined in the linear scale or the parameter scale PEtab-dev/PEtab#402.

dweindl linked an issue on Dec 11, 2024 that may be closed by this pull request

Successfully merging this pull request may close these issues.

Proper handling of parameter bounds in prior / startpoint sampling
3 participants