
cost function mentioned in main_OSEM.py #71

Closed

gschramm opened this issue Jul 21, 2024 · 9 comments

@gschramm

NB: It should be `sum(prompts * log(acq_model.forward(self.x)) - self.x * sensitivity)` across all subsets.

The cost function in main_OSEM.py is `sum(prompts * log(acq_model.forward(self.x)) - self.x * sensitivity)`.

Shouldn't that be
`sum(prompts * log(acq_model.forward(self.x)) - acq_model.forward(self.x))` (plus prior)?
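For concreteness, a minimal numpy sketch (toy stand-ins for the acquisition model, additive term, and data; not PETRIC code) comparing the two candidates for an affine forward model:

```python
# Toy numpy model standing in for acq_model in main_OSEM.py (hypothetical shapes).
import numpy as np

rng = np.random.default_rng(0)
A = rng.uniform(0.1, 1.0, size=(20, 10))   # linear part of the forward model
b = rng.uniform(0.1, 0.5, size=20)         # additive term (scatter/randoms)
x = rng.uniform(0.5, 2.0, size=10)         # toy image
prompts = rng.poisson(A @ x + b)           # toy measured data

ybar = A @ x + b                           # affine forward projection
sensitivity = A.T @ np.ones(20)            # back-projection of ones

obj_code = np.sum(prompts * np.log(ybar)) - np.sum(x * sensitivity)
obj_logL = np.sum(prompts * np.log(ybar)) - np.sum(ybar)

# The two differ by sum(b), which does not depend on the image x:
print(np.isclose(obj_code - obj_logL, np.sum(b)))  # True
```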

@KrisThielemans
Member

OSEM does NOT implement a MAP recon.

NB: OSEM does not use `data.prior` and thus does not converge to the MAP reference used in PETRIC.

This example is provided to show yet another alternative way to implement a familiar algorithm, as well as to give the opportunity to compare OSEM metrics with MAP metrics.

Apologies if this isn't clear in the documentation; suggestions welcome (or maybe you didn't read it :smile:)

@gschramm
Author

Indeed. But the cost function mentioned there also does not match the Poisson logL as far as I can see, which could confuse people if they want to evaluate the cost (as the sum of data fidelity + prior) themselves.

@KrisThielemans
Member

> But the cost function mentioned there also does not match the Poisson logL as far as I can see, which could confuse people if they want to evaluate the cost (as the sum of data fidelity + prior) themselves.

Are you talking about the difference between
`sum(prompts * log(acq_model.forward(self.x)) - self.x * sensitivity)` and `sum(prompts * log(m * acq_model.forward(self.x)) - self.x * sensitivity)` (where the latter is the one mentioned on the wiki)? That is indeed different. However, the difference is an image-independent constant, so it should not affect any optimisation (and doesn't affect our metrics). One of the reasons I implemented main_OSEM.py that way is that it is faster (one multiplication and one division fewer per gradient).
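A quick numeric sanity check of that claim (plain numpy, hypothetical toy sizes; `q` stands in for `acq_model.forward(self.x)` and `m` for the multiplicative factors):

```python
# Check that including the multiplicative factor m only shifts the
# objective by a constant that does not depend on the image.
import numpy as np

rng = np.random.default_rng(1)
q = rng.uniform(1.0, 5.0, size=100)   # stands in for acq_model.forward(self.x)
m = rng.uniform(0.5, 2.0, size=100)   # multiplicative factors (norm/attenuation)
prompts = rng.poisson(m * q)          # toy measured data

with_m    = np.sum(prompts * np.log(m * q))
without_m = np.sum(prompts * np.log(q))

# The gap is sum(prompts * log(m)): independent of the image,
# so optimisation (and the ranking of iterates) is unaffected.
print(np.isclose(with_m - without_m, np.sum(prompts * np.log(m))))  # True
```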

Aside from thresholding, the former is an implementation of [6] in UCL/STIR#1472 (comment), while the latter is one of [5]. Unfortunately, I'm planning to switch to [6] in STIR soon, which will indeed confuse people.

@gschramm
Author

Indeed. After writing it down, I see that the difference is only the sum of the known additive contamination, so indeed image-independent.

main_OSEM.py is much appreciated and very helpful :)

@KrisThielemans
Member

> I see that the difference is only the sum of the known additive contamination, so indeed image-independent.

The multiplicative term $m$, really: the difference is `sum(prompts * log(m))`. The additive term cannot be taken out of the log.

By the way, this "trick" is our reason to write the acquisition model as $D(m)(Ax + a)$.
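A tiny numpy illustration of that asymmetry (toy vectors, not the SIRF/STIR API): the multiplicative factor splits out of the log, the additive term does not.

```python
import numpy as np

rng = np.random.default_rng(2)
q = rng.uniform(1.0, 5.0, size=50)   # stands in for (Ax)_i, image-dependent
m = rng.uniform(0.5, 2.0, size=50)   # multiplicative factors
a = rng.uniform(0.1, 0.5, size=50)   # additive term

# m factors out of the log as an image-independent constant ...
print(np.allclose(np.log(m * q), np.log(m) + np.log(q)))  # True
# ... but there is no analogous split for the additive term:
print(np.allclose(np.log(q + a), np.log(q) + np.log(a)))  # False
```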

@gschramm
Author

Hm, now I am confused.

If we have an affine forward model:

$$ \bar{y} = Ax + b $$

and emission data $y$, the Poisson logL up to constant terms is:

$$ \log L = \sum_i y_i \log{\bar{y}_i} - \sum_i \bar{y}_i $$

If I understand your comment in the code correctly, you calculate

$$ \log L = \sum_i y_i \log{\bar{y}_i} - \sum_j x_j s_j = \sum_i y_i \log{\bar{y}_i} - \sum_j x_j \sum_i a_{ij} $$

which after swapping the last two sums is

$$ \log L = \sum_i y_i \log{\bar{y}_i} - \sum_i \sum_j a_{ij} x_j $$

which is

$$ \log L = \sum_i y_i \log{\bar{y}_i} - \sum_i (\bar{y}_i - b_i) $$

isn't it?
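A numeric check of the sum-swapping step above (plain numpy, toy sizes; `A`, `x`, `b` are hypothetical stand-ins): $\sum_j x_j s_j$ with $s_j = \sum_i a_{ij}$ equals $\sum_i (Ax)_i = \sum_i (\bar{y}_i - b_i)$.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.uniform(0.1, 1.0, size=(30, 12))  # toy system matrix
x = rng.uniform(0.5, 2.0, size=12)        # toy image
b = rng.uniform(0.1, 0.5, size=30)        # additive contamination

s = A.T @ np.ones(30)   # sensitivity: back-projection of ones
ybar = A @ x + b        # affine forward model

print(np.isclose(x @ s, np.sum(ybar - b)))  # True
```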

@KrisThielemans
Member

KrisThielemans commented Jul 22, 2024

Nope... First of all, our wiki notation is different from yours. We use

$$ \hat{y} = D(m)(Ax + a) $$

so your $A$ is not ours... (and your $\bar{y}$ is our $\hat{y}$)

Sticking to our notation, main_OSEM.py suggests computing
$O = \sum_i y_i \log{[A(x+a)]_i} - \sum_i \sum_j a_{ij} x_j$

which is (unless I'm making a typo!), with $b = m \cdot a$:

$$ O = \sum_i y_i (\log{\hat{y}_i} - \log{m_i}) - \sum_i (\hat{y}_i - b_i) = \log L + \text{consts} $$
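A numpy sketch verifying this identity (toy sizes; it assumes the sensitivity in main_OSEM.py is the back-projection of ones through the full linear part $D(m)A$, i.e. $s_j = \sum_i m_i a_{ij}$):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.uniform(0.1, 1.0, size=(30, 12))  # toy system matrix
x = rng.uniform(0.5, 2.0, size=12)        # toy image
m = rng.uniform(0.5, 2.0, size=30)        # multiplicative factors
a = rng.uniform(0.1, 0.5, size=30)        # additive term

yhat = m * (A @ x + a)   # \hat{y} = D(m)(Ax + a)
b = m * a
y = rng.poisson(yhat)    # toy measured data

s = A.T @ m              # sensitivity: (D(m)A)^T 1
O    = np.sum(y * np.log(A @ x + a)) - x @ s
logL = np.sum(y * np.log(yhat)) - np.sum(yhat)

# O and logL differ only by image-independent constants:
print(np.isclose(O - logL, -np.sum(y * np.log(m)) + np.sum(b)))  # True
```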

@gschramm
Author

  1. In the first definition of $O$, the operator $A$ is applied to $(x + a)$ (image + additive contamination in data space) which I guess is not correct? I think it should be $[Ax + a]_i$
  2. I find it slightly confusing to use $a$ for the contaminations, if we use $a_{ij}$ for the matrix elements of $A$.

@KrisThielemans
Member

> 1. In the first definition of $O$, the operator $A$ is applied to $(x + a)$ (image + additive contamination in data space) which I guess is not correct? I think it should be $[Ax + a]_i$

Indeed. Sorry:

$O = \sum_i y_i \log{[Ax + a]_i} - \sum_i \sum_j a_{ij} x_j$

> 2. I find it slightly confusing to use $a$ for the contaminations, if we use $a_{ij}$ for the matrix elements of $A$.

Apologies. We called it $a$ for "additive". I have not found any agreement on what to call this. (SIRF uses "background" for $b$.)
