Improve Numerical Stability of Bernoulli CDF functions #2784

andrjohns · 2022-07-08T04:00:09Z

Summary

This PR updates the Bernoulli CDF functions (_cdf, _lcdf, and _lccdf) to operate on the log scale as much as possible, avoiding underflow and loss of precision for probabilities near 1.
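
To illustrate the motivation, here is a minimal sketch of the log-scale idea for a scalar theta and integer outcomes. It is not the code from this PR: bernoulli_lcdf_sketch and naive_bernoulli_lcdf are hypothetical names, only double arguments are handled, and there are no argument checks or autodiff types.

```cpp
#include <cmath>
#include <limits>
#include <vector>

// Naive version (hypothetical): multiply the per-observation CDF values and
// take the log at the end. The running product can underflow to 0.
inline double naive_bernoulli_lcdf(const std::vector<int>& n, double theta) {
  double cdf = 1.0;
  for (int n_i : n) {
    if (n_i < 0) {
      return -std::numeric_limits<double>::infinity();  // CDF is 0 below 0
    }
    if (n_i < 1) {
      cdf *= 1.0 - theta;  // P(N <= 0) = 1 - theta
    }
    // n_i >= 1 contributes a factor of 1
  }
  return std::log(cdf);
}

// Log-scale version (hypothetical): accumulate log(1 - theta) directly.
// std::log1p(-theta) keeps precision when theta is tiny, and a sum of logs
// does not underflow the way a product of small probabilities does.
inline double bernoulli_lcdf_sketch(const std::vector<int>& n, double theta) {
  double lcdf = 0.0;
  for (int n_i : n) {
    if (n_i < 0) {
      return -std::numeric_limits<double>::infinity();
    }
    if (n_i < 1) {
      lcdf += std::log1p(-theta);
    }
  }
  return lcdf;
}
```

With 400 observations equal to 0 and theta = 0.9, the naive product underflows to zero and its log is negative infinity, while the log-scale sum stays finite. With a single observation and theta = 1e-20, the naive version rounds 1 - theta to exactly 1 and returns 0, whereas the log-scale version returns a small negative value.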

Tests

Additional mix/prob tests have been added to ensure that the gradients aren't impacted (prim behaviour covered by the distribution tests)

Side Effects

N/A

Release notes

Improved numerical stability of Bernoulli CDF functions

Checklist

- Math issue: Improve Bernoulli (LC)CDF Numerical Stability #2783
- Copyright holder: Andrew Johnson

  The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
  - Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
  - Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
- the basic tests are passing
  - unit tests pass (to run, use: ./runTests.py test/unit)
  - header checks pass (make test-headers)
  - dependencies checks pass (make test-math-dependencies)
  - docs build (make doxygen)
  - code passes the built-in C++ standards checks (make cpplint)
- the code is written in idiomatic C++ and changes are documented in the doxygen
- the new changes are tested

bob-carpenter

Thanks, this looks great. I think there are a couple of name changes that would make this much easier to follow.

stan/math/prim/prob/bernoulli_lcdf.hpp

stan/math/prim/prob/bernoulli_cdf.hpp

stan/math/prim/prob/bernoulli_lcdf.hpp

bob-carpenter · 2022-07-12T14:15:11Z

@andrjohns: please let me know when this is ready to review and merge. Thanks!

andrjohns · 2022-07-13T07:13:46Z

Thanks @bob-carpenter! This is ready for another look. I've updated the vectorisation by adding a prim implementation of the select function, which the OpenCL code currently uses for the same purpose (ternary operations that are agnostic between scalar and Eigen inputs).
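
As a rough illustration of what "agnostic between scalar and Eigen inputs" can mean, the sketch below overloads a single name for a scalar condition and for an element-wise condition. It is an assumption-laden stand-in, not the implementation added in this PR: select_sketch is a hypothetical name, std::vector takes the place of Eigen types, and both branches are required to share one element type.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Scalar condition: behaves like the ternary operator.
template <typename T>
inline T select_sketch(bool c, const T& y_true, const T& y_false) {
  return c ? y_true : y_false;
}

// Element-wise condition: picks y_true[i] where c[i] is true and
// y_false[i] otherwise. All three inputs must have the same size.
template <typename T>
inline std::vector<T> select_sketch(const std::vector<bool>& c,
                                    const std::vector<T>& y_true,
                                    const std::vector<T>& y_false) {
  if (c.size() != y_true.size() || c.size() != y_false.size()) {
    throw std::invalid_argument("select_sketch: size mismatch");
  }
  std::vector<T> out(c.size());
  for (std::size_t i = 0; i < c.size(); ++i) {
    out[i] = c[i] ? y_true[i] : y_false[i];
  }
  return out;
}
```

Calling code can then write select_sketch(cond, a, b) without branching on whether cond is a single bool or a container of bools.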

andrjohns · 2022-10-21T07:30:13Z

@SteveBronder when you have a minute (no rush at all), can you have a look at this PR? It involves re-implementing the OpenCL code's select function, so it would be great to have eyes on it from someone who knows how it should behave.

spinkney · 2022-10-26T09:42:32Z

@bob-carpenter are you able to re-review this?

@SteveBronder are you able to double check the opencl select code?

spinkney · 2022-10-26T09:43:45Z

dismiss the closed/reopened, I hit the trackpad on my laptop by accident

don't think I understand the current C++ well enough

bob-carpenter · 2022-10-27T10:06:42Z

Thanks for the heads up. I just dismissed my review so that someone else could review it. I still don't feel I understand our new C++ conventions well enough to review PRs.

SteveBronder · 2022-10-27T21:56:21Z

Sorry didn't have time today but Tuesday I can look at this

SteveBronder

Few Qs around the new version of select. I also think we should just write an any() function that for bool just returns the input and for Eigen types holding bools calls the .any() method. Would make things simpler to read

SteveBronder · 2022-11-01T15:01:24Z

stan/math/prim/fun/select.hpp

+inline auto select(const bool c, const T_true y_true, const T_false y_false) {
+  return c ? y_true : y_false;
+}


Not sure if @t4c1 still checks GitHub, but I'm not sure whether we need common_type here or whether auto is fine. I wouldn't mind just using return_type_t<>, though that will only work with arithmetic types, since return_type_t has a minimum of double as the returned type. We could just write another overload to handle the double integral case, though.

t4c1 · 2022-11-01

I still get notifications if pinged. auto here will be the same as T_true (that is how the ternary operator works), so some common type is a better idea. Not sure if return_type will promote to var even if neither T_true nor T_false is var, but we do not want that here.

andrjohns · 2022-11-14

"auto here will be the same as T_true"

I've done some tests and it doesn't look like an issue when mixing types: https://godbolt.org/z/dvcxvvxhs

But let me know if I've missed something basic!
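
The point about mixed types can be checked at compile time for built-in arithmetic types. The sketch below is independent of the linked godbolt example (whose contents are not reproduced here) and uses a hypothetical name, select_scalar_sketch. It relies on the standard rule that a conditional expression with operands of different arithmetic types applies the usual arithmetic conversions, so the deduced return type matches std::common_type_t rather than T_true.

```cpp
#include <type_traits>

// Same shape as the scalar overload under review, with a deduced return type.
template <typename T_true, typename T_false>
inline auto select_scalar_sketch(const bool c, const T_true y_true,
                                 const T_false y_false) {
  return c ? y_true : y_false;
}

// int/double in either order deduces double, not the type of y_true.
static_assert(
    std::is_same<decltype(select_scalar_sketch(true, 1, 2.5)), double>::value,
    "int and double should deduce double");
static_assert(
    std::is_same<decltype(select_scalar_sketch(true, 2.5, 1)), double>::value,
    "double and int should deduce double");

// The bare conditional expression agrees with std::common_type_t.
static_assert(std::is_same<decltype(true ? 1 : 2.5),
                           std::common_type_t<int, double>>::value,
              "conditional type should match common_type for int and double");
```

This only covers built-in arithmetic types. For class types the conditional operator depends on the implicit conversions available between the two operands, so an explicit common or promoted type may still be the safer choice there.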

SteveBronder · 2022-11-01T15:04:09Z

stan/math/prim/fun/select.hpp

+  return y_true
+      .binaryExpr(y_false, [&](auto&& x, auto&& y) { return c ? x : y; })
+      .eval();


If c is constant here, should we just return y_true or y_false? We just need to apply the promotion rules to the output's scalar type with promote_scalar_t<return_type_t<T_true, T_false>>.

SteveBronder · 2022-11-01T15:15:48Z

stan/math/prim/fun/select.hpp

+  if (c) {
+    return y_true;
+  }
+
+  return y_true.unaryExpr([&](auto&& y) { return y_false; });
+}


I'd use

if () { } else { }

with promote_type_t again.

That is true for all of them.

SteveBronder · 2022-11-01T15:17:35Z

stan/math/prim/fun/select.hpp

+inline auto select(const T_bool c, const T_true y_true, const T_false y_false) {
+  return c.select(y_true, y_false).eval();
+}


Does this work if y_true has a double scalar type and y_false has an integer scalar type?

SteveBronder · 2022-11-01T15:19:50Z

stan/math/prim/prob/bernoulli_lccdf.hpp

+  }
+  if (sum(n_arr >= 1)) {


Suggested change

-  }
-  if (sum(n_arr >= 1)) {
+  } else if (sum(n_arr >= 1)) {

SteveBronder · 2022-11-01T15:21:14Z

stan/math/prim/prob/bernoulli_lcdf.hpp

+  if (sum(n_arr < 0)) {
+    return ops_partials.build(NEGATIVE_INFTY);


We could just write an any() function that takes in a scalar or vector and returns true or false. Think that would just be easier to read imo
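
A minimal sketch of that suggestion follows, under the assumption that std::vector<bool> stands in for an Eigen type holding bools (where the real helper would presumably call the .any() method). any_sketch is a hypothetical name, not the function that was eventually added.

```cpp
#include <algorithm>
#include <vector>

// Scalar case: a single bool is returned unchanged.
inline bool any_sketch(bool x) { return x; }

// Container case: true if at least one element is true. std::any_of stops
// at the first true element; an empty container yields false.
inline bool any_sketch(const std::vector<bool>& x) {
  return std::any_of(x.begin(), x.end(), [](bool b) { return b; });
}
```

A check such as if (any_sketch(n < 0)) then reads the same for scalar and vector arguments, and states the intent more directly than testing whether a sum of booleans is nonzero.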

andrjohns · 2022-11-15T09:29:19Z

@SteveBronder I'll add an any() function, since I agree that would be super handy. Should I open a separate PR for the select() and any() functions and tests, or are you happy for me to include them in this one?

andrjohns · 2022-12-10T15:52:16Z

I'll update this PR once the helper functions from #2852 have been added and merged

andrjohns · 2023-08-15T16:56:50Z

@SteveBronder would you mind having another look at this when you get a minute? No rush

SteveBronder

lgtm!

WardBrian · 2023-09-30T15:51:11Z

develop tests have been failing in the distribution tests since this was merged (https://jenkins.flatironinstitute.org/blue/organizations/jenkins/Stan%2FMath/detail/develop/194/pipeline/500)

I'm guessing this is due to -DSTAN_TEST_ROW_VECTORS which we don't use in CI for PRs

In file included from test/prob/bernoulli/bernoulli_cdf_00000_generated_ffv_test.cpp:3:
In file included from ./test/prob/test_fixture_distr.hpp:4:
In file included from ./stan/math/mix.hpp:4:
In file included from ./stan/math/mix/meta.hpp:6:
In file included from ./stan/math/fwd/core.hpp:4:
In file included from ./stan/math/fwd/core/fvar.hpp:4:
In file included from ./stan/math/prim/meta.hpp:72:
In file included from ./stan/math/prim/meta/append_return_type.hpp:4:
In file included from ./stan/math/prim/fun/Eigen.hpp:22:
In file included from lib/eigen_3.4.0/Eigen/Dense:1:
In file included from lib/eigen_3.4.0/unsupported/Eigen/../../Eigen/Core:295:
lib/eigen_3.4.0/Eigen/src/Core/PlainObjectBase.h:970:7: error: static_assert failed "INVALID_MATRIX_TEMPLATE_PARAMETERS"
      EIGEN_STATIC_ASSERT((EIGEN_IMPLIES(MaxRowsAtCompileTime==1 && MaxColsAtCompileTime!=1, (int(Options)&RowMajor)==RowMajor)
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
lib/eigen_3.4.0/Eigen/src/Core/util/StaticAssert.h:33:40: note: expanded from macro 'EIGEN_STATIC_ASSERT'
    #define EIGEN_STATIC_ASSERT(X,MSG) static_assert(X,#MSG);
                                       ^             ~
lib/eigen_3.4.0/Eigen/src/Core/Map.h:159:30: note: in instantiation of member function 'Eigen::PlainObjectBase<Eigen::Array<stan::math::var_value<double, void>, 1, -1, 0, 1, -1> >::_check_template_params' requested here
      PlainObjectType::Base::_check_template_params();
                             ^
./stan/math/rev/core/arena_matrix.hpp:62:9: note: in instantiation of member function 'Eigen::Map<Eigen::Array<stan::math::var_value<double, void>, 1, -1, 0, 1, -1>, 0, Eigen::Stride<0, 0> >::Map' requested here
      : Base::Map(

@andrjohns are you able to take a look at this soon? Otherwise I think we may need to revert this to unblock our CI
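
For readers unfamiliar with the assertion in the log, the sketch below mirrors only the first clause of the Eigen check quoted above, using plain integers in place of Eigen's template parameters. The function names are hypothetical. The constants are assumptions taken from Eigen 3.4, where RowMajor is the bit 0x1, ColMajor is 0, and a dynamic size is -1.

```cpp
// Assumed values from Eigen 3.4's storage options and Dynamic constant.
constexpr int kRowMajorBit = 0x1;
constexpr int kDynamic = -1;

constexpr bool implies(bool a, bool b) { return !a || b; }

// First clause of the quoted check: a type with exactly one row and more
// than one (or a dynamic number of) columns must use RowMajor storage.
constexpr bool row_vector_layout_ok(int max_rows, int max_cols, int options) {
  return implies(max_rows == 1 && max_cols != 1,
                 (options & kRowMajorBit) == kRowMajorBit);
}

// The failing type in the log is Array<var, 1, -1, 0, 1, -1>: one row,
// dynamic columns, and Options == 0 (column-major), which violates the rule.
static_assert(!row_vector_layout_ok(1, kDynamic, 0),
              "column-major row vector is rejected");
// The same shape with the RowMajor bit set passes this clause.
static_assert(row_vector_layout_ok(1, kDynamic, kRowMajorBit),
              "row-major row vector is accepted");
// A column vector (dynamic rows, one column) is unaffected by this clause.
static_assert(row_vector_layout_ok(kDynamic, 1, 0),
              "column vector is accepted");
```

This is consistent with the failure appearing only under -DSTAN_TEST_ROW_VECTORS, since that is when row-vector arguments are exercised, although the log alone does not show which line of the Bernoulli code builds the offending type.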

andrjohns · 2023-09-30T16:24:54Z

Ah damn, yeah I'll have a look now

andrjohns and others added 4 commits July 8, 2022 06:11

Numerically stable bernoulli cdf functions

2b8ae14

CDF initial value

c2022be

cpplint

a9d0800

[Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1

bcfaec1

bob-carpenter previously requested changes Jul 8, 2022


andrjohns and others added 11 commits July 10, 2022 19:40

Add select function, vectorise cdf

143f707

Vectorise lcdf and lccdf

62d8640

Update doc

0edeb5f

Merge commit 'e4b9bdece4250e3455d663e3155c1d3d4965c10d' into HEAD

fd6d39c

[Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1

f03c35f

Fix headers

2b3cf01

Fix return types

a371c2b

[Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1

2bade77

Ignore Eigen deprecation warning in expression tests

409c971

Update compiler flags

6220f14

Update includes

2c11c42

Merge branch 'stan-dev:develop' into issue-2783-bernoulli-cdf-stable

136fc9c

spinkney closed this Oct 26, 2022

spinkney reopened this Oct 26, 2022

Merge branch 'stan-dev:develop' into issue-2783-bernoulli-cdf-stable

ccde5d5

SteveBronder requested changes Nov 1, 2022


Merge branch 'develop' into issue-2783-bernoulli-cdf-stable

a4b384c

andrjohns and others added 2 commits November 14, 2022 14:42

review comments

da1f7a8

[Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1

abfa364

Merge branch 'stan-dev:develop' into issue-2783-bernoulli-cdf-stable

6103b16

This was referenced Dec 7, 2022

Add select() and any() helper functions #2852

Closed

Add vectorised select(), any(), and all() functions #2853

Merged

andrjohns added 4 commits August 11, 2023 15:53

Merge branch 'develop' into issue-2783-bernoulli-cdf-stable

53ce103

Tidy includes

9872fcf

Missing headers

bd11e54

Reduce unnecessary computation

8f5cdb8

andrjohns added 2 commits August 16, 2023 21:12

Remove select broadcast hack

cabafd6

Fix broadcast logic

47c4f9a

SteveBronder approved these changes Sep 6, 2023


andrjohns merged commit 9f2689e into stan-dev:develop Sep 19, 2023
7 checks passed

andrjohns deleted the issue-2783-bernoulli-cdf-stable branch September 19, 2023 13:23

andrjohns mentioned this pull request Sep 30, 2023

Fix Row-Vector Distribution test failures #2954

Merged


