inconsistency between Poisson loglikelihood value, gradient (when using multiplicative factors) and Hessian due to thresholding #1472
Note that in most cases,

Side note: the code has separate handling for the

Current impact estimation:
Nevertheless, these thresholds should be consistent, and in any case are probably surprising to most people (as they are not documented). Also, the current gradient strategy is sensitive to the scale of
After discussion with @mehrhardt, proposed strategy in first instance is
Things to consider:
There's another possible choice for the objective function:

This choice has a few advantages (and is also preferred by @mehrhardt):
After more discussions with @NicoleJurjew, two other things came up:
Proposal

@mehrhardt @gschramm any comments?
Other things noted while investigating all this:
To me, [5] / [6] with a default eps = 0 feel most "natural". How often does the "log(0)" problem occur in real PET/SPECT systems, where there should always be a finite amount of contamination (except for SPECT scans without scatter)?
In "traditional" algorithms without precorrections, if the background model is somewhat accurate, then I don't think it should ever occur. However, there are a few cases where it could occur
The challenge for STIR is that most people expect it to "just work" even if they throw weird stuff at it...
I see. I think some negative pixels, as long as the forward model stays positive, are ok. Wouldn't it make more sense to return infinity as soon as a single bin in the expectation is <= 0?
I don't disagree, but that's a different optimisation problem from what most people will expect.
Not all architectures can represent infinity (although these days that's possibly less of a worry, as the IEEE floating point standard is now very widely adopted, and I see that even CUDA represents infinity). Clearly, the above thresholding strategies all require checks and will slow things down, and indeed the proposed function is not differentiable. An alternative is to just not do any checks, and let the result be undefined. In my experience, that throws up problems very quickly though.
Summarising the proposal: for a user-configurable
All of these can be computed by summing over list-mode events, aside from the "approximate" version, which is essentially not available (unless you histogram first).

There's a corner case where

Obviously, once infinity is returned, ugly things will happen. The only generic solution for that is to set
Actually, I see now that
@KrisThielemans another possibility that came to my mind yesterday.

with

I think the only problem is that this cannot be evaluated in list-mode :-(
yeah... I prefer a strategy where results are identical in both, at least in principle. It simplifies testing as well!
Continuing from #1461...
We attempt to avoid zero or negatives in the log (or division by zero or negatives in the gradient) by thresholding the quotient. Unfortunately, the code is not consistent, even after many attempts.

First introducing notation: $T(e, y) = \max(e, y/Q)$ (i.e. lower threshold on $e$), and (full) forward projection $e = P\lambda+b$.
Formulas
A strategy could be
This function has a gradient w.r.t. $\lambda$ (with some element-wise multiplications etc., and ignoring the non-differentiable points)
with
This gradient is equivalent to back-projecting
`e >= y/Q ? y/e : 0` [4]

Log-likelihood code
The function [1] is what is computed by `PoissonLogLikelihoodWithLinearModelForMeanAndProjData` for the value via `STIR/src/buildblock/recon_array_functions.cxx` (lines 367 to 371 in 12bfa87), where the arguments are computed in `STIR/src/recon_buildblock/PoissonLogLikelihoodWithLinearModelForMeanAndProjData.cxx` (lines 1451 to 1467 in 12bfa87).
Gradient code
For the gradient we use the optimisation that any multiplicative factors drop out; see `STIR/src/recon_buildblock/PoissonLogLikelihoodWithLinearModelForMeanAndProjData.cxx` (lines 1407 to 1433 in 12bfa87).
Unfortunately, the `divide_and_truncate` method (which implements the division [4]) does not know about the multiplicative factors, and therefore uses a different threshold than the logL value.

Gradient formulas
Writing the full forward projection as $e = P\lambda + b = m(G\lambda + a)$, and the forward projection without $m$ as $f = G\lambda + a$, the current gradient computation is
`G.back_project((f >= y/Q ? y/f : 0) - m)`

This is equivalent to

`P.back_project((e >= m*y/Q ? y/e : 0) - 1)`
Conclusion

There are 2 inconsistencies:

- `divide_and_truncate` and `accumulate_likelihood`,
- `Q=10000`, such that the quotient in the gradient computation is not consistent with the log-term in the logL value wherever only one of those reaches the threshold, so if