
Chapter 7: done... ugggghhhhh #12

Open
oharac opened this issue Aug 27, 2018 · 1 comment

Comments


oharac commented Aug 27, 2018

Finished the unending chapter. I struggled to figure out how to approach 7.7, and I'm not at all sure I got it "correct," though I got something that resembles the basic concepts.

Where I struggled:

  • For a given dataset, the total NLL is the sum over all observations of the negative log-likelihoods NLLi, where each NLLi measures how likely a given observation is to occur given the parameters.
  • To find the best set of parameters for the dataset, just find the params that minimize sum(NLLi).

So far so good I think?
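That idea can be sketched on a toy example (hypothetical data and variable names are mine; scipy stands in for whatever optimizer you'd actually use): the total NLL is the sum of per-observation negative log-likelihoods, and the best-fit parameter is whatever minimizes that sum.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import poisson

# Hypothetical counts, assumed Poisson with unknown mean lam
obs = np.array([2, 4, 3, 5, 4])

def total_nll(lam):
    # Sum of per-observation NLL_i = -log L_i
    return -poisson.logpmf(obs, lam).sum()

# Minimizing the summed NLL gives the maximum-likelihood estimate
# (for a constant-mean Poisson, that's just the sample mean)
fit = minimize_scalar(total_nll, bounds=(0.1, 20), method="bounded")
```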

Here our parameters are r, p, and q, used to calculate an index of abundance: basically the expected number of observed individuals Iobs for an actual population D. So the deterministic index is Idet = max[0, (p + qD)/(1 + rD)].

  • This Idet can be zero if p is negative and qD is small.
  • The per-observation NLLi (the Poisson negative log-likelihood, judging by the factorial term) should be NLLi = Idet,i − Iobs,i * log(Idet,i) + log(Iobs,i!).
  • If Idet,i = 0, then log(Idet,i) blows up and NLLi comes out NA.
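A minimal sketch of the bullets above (variable names and example numbers are mine, not the book's): compute Idet from p, q, r, then the Poisson NLLi per observation, and watch the log(0) term go non-finite when Idet hits zero.

```python
import numpy as np
from scipy.special import gammaln  # gammaln(k + 1) == log(k!)

def idet(D, p, q, r):
    # Deterministic index: I_det = max[0, (p + qD)/(1 + rD)]
    return np.maximum(0.0, (p + q * D) / (1.0 + r * D))

def nll_i(i_obs, i_det):
    # Per-observation Poisson NLL: I_det - I_obs*log(I_det) + log(I_obs!)
    with np.errstate(divide="ignore", invalid="ignore"):
        return i_det - i_obs * np.log(i_det) + gammaln(i_obs + 1)

D = np.array([2.0, 50.0, 200.0])   # hypothetical population sizes
i_obs = np.array([3, 20, 80])      # hypothetical observed counts
vals = nll_i(i_obs, idet(D, p=-3.0, q=1.0, r=0.03))
# At D = 2, p + qD = -1 < 0, so I_det = 0 and that NLL_i is non-finite
```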

So here's the problem: even if we fix p = -3 and r = 0.03 and let q vary around 1, different values of q*D will result in different numbers of NAs. The same happens if we let p and r vary, but the first case is easier to see.

  • Then adding up all the NLLi values, excluding NAs, means we are summing different numbers of elements for different parameter sets.
  • So more NAs means a lower sum, simply because fewer terms enter the sum.
  • So the sums aren't really comparable! mean() might help some, though since Idet depends on D, mean() doesn't do a great job either...
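To make that concrete (toy numbers of my own): dropping the NAs rewards the parameter set with more impossible observations. One common workaround, which I'm assuming rather than taking from the chapter, is to return +Inf for the whole parameter set whenever any term is non-finite, so that set simply loses every comparison.

```python
import numpy as np

def total_nll(nll_vals, drop_na=False):
    if drop_na:
        # Sum only the finite terms: fewer terms -> artificially lower total
        return nll_vals[np.isfinite(nll_vals)].sum()
    # Workaround: any non-finite NLL_i means the data are impossible under
    # these parameters, so the whole set gets +Inf and can never "win"
    return nll_vals.sum() if np.all(np.isfinite(nll_vals)) else np.inf

a = np.array([5.0, 7.0, np.inf])   # parameter set with one impossible obs
b = np.array([6.0, 6.5, 6.8])      # parameter set with all-finite terms
# drop_na makes a (12.0) beat b (19.3); the +Inf rule correctly rejects a
```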

Anyway, I got something that vaguely resembled Figure 7.9, but only vaguely. So maybe I'm making some unfounded assumption that f's it all up? Let me know what you think when you get to pseudocode 7.7...

Cheers


oharac commented Aug 27, 2018

An option to avoid this problem entirely is to ignore p = -3 and just constrain p >= 0 (and then your testing range over p for 7.7B should stay in the positive range too). The plots look better!
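That constraint is easy to build into the fit. A sketch under my own assumptions (hypothetical data, and scipy's box-constrained optimizer standing in for whatever you actually used): with p, q, and r all bounded below by 0, Idet stays positive for D > 0, so no NA terms arise during the fit.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

# Hypothetical stand-ins for the chapter's D / I_obs data
D = np.array([10.0, 30.0, 60.0, 120.0, 250.0])
i_obs = np.array([6, 15, 25, 32, 35])

def total_nll(params):
    p, q, r = params
    i_det = (p + q * D) / (1.0 + r * D)
    if np.any(i_det <= 0):
        return np.inf              # reject impossible parameter sets outright
    return np.sum(i_det - i_obs * np.log(i_det) + gammaln(i_obs + 1))

# Box constraints keep p, q, r >= 0 (matching the positive testing range
# suggested for 7.7B)
fit = minimize(total_nll, x0=[1.0, 1.0, 0.01],
               bounds=[(0, None), (0, None), (0, None)],
               method="L-BFGS-B")
```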

