Split up epichains classes #107
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #107      +/-   ##
==========================================
+ Coverage   98.63%   98.90%   +0.27%
==========================================
  Files           8        8
  Lines         511      549      +38
==========================================
+ Hits          504      543      +39
+ Misses          7        6       -1
Force-pushed from 1e568e2 to f037500.
Looks good to me. I'm still slightly concerned if it's confusing to have both simulate_tree and simulate_summary. The main reason for simulate_summary to exist is that it's faster than simulate_tree and thus it's useful to have internally, especially when approximating likelihoods where it may be run ~1,000,000s of times within e.g. an MCMC. But for the user I wonder if there is ever a situation where this would matter, and where we could otherwise have just a simulate function that returns the tree.
But I'm not sure, and I'm not sure what the best solution is. Perhaps something to come out when others review the package.
If sticking with the current setup (which may well be best) then I think it would be a good idea for |
I have a few thoughts on alternatives.
I'm inclined to go with this option 1. |
I agree. One thing I've been wondering is how much of a speed difference there actually is. Perhaps we could add |
Definitely much faster. See #114 (comment). |
Interesting! Another option would be
Still doesn't help with the code duplication unless there is some of it which could be turned into a function. |
How about
|
Am I wrong for interpreting #107 (comment) to mean option 4? 🤔 As in, "simulate()" returns an Also, will we be masking |
Yes, they're basically the same (+rename)
Ah yes. We could extend it (as it's a generic) but we'd have to construct an object to simulate from first. Given all of this my inclination would be to just go ahead with #107 (comment) and keep |
Agreed. Looking at the I will go ahead with #107 (comment). Thanks for helping to brainstorm. |
We could leave simulate_summary() but add the clarification, "The main reason for simulate_summary to exist is that it's faster than simulate_tree and thus it's useful to have internally especially when approximating likelihoods where it may be run ~1,000,000s of times within e.g. an MCMC". Downside is #44.
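To make the speed argument concrete, here is a minimal sketch in base R. It is not the epichains implementation: the function names `sim_tree_sketch()` and `sim_summary_sketch()`, the Poisson offspring distribution, and the `max_size` stopping rule are all assumptions made for illustration. The tree version grows a data frame with one row per case, while the summary version only keeps running counts, which is one plausible reason a summary simulation is cheaper when called many times.

```r
# Hypothetical helpers for illustration only; not the epichains API.
# Both simulate a single chain with Poisson(lambda) offspring and stop
# adding generations once the chain holds at least `max_size` cases.

# Tree version: stores one row per case (id, infector, generation).
sim_tree_sketch <- function(lambda, max_size = 1000) {
  tree <- data.frame(id = 1L, infector = NA_integer_, generation = 1L)
  current <- 1L  # ids of the cases in the latest generation
  gen <- 1L
  while (nrow(tree) < max_size) {
    n_offspring <- rpois(length(current), lambda)
    n_new <- sum(n_offspring)
    if (n_new == 0) break  # chain has gone extinct
    gen <- gen + 1L
    new_ids <- nrow(tree) + seq_len(n_new)
    tree <- rbind(
      tree,
      data.frame(
        id = new_ids,
        infector = rep(current, times = n_offspring),
        generation = gen
      )
    )
    current <- new_ids
  }
  tree
}

# Summary version: tracks only the chain size and chain length.
sim_summary_sketch <- function(lambda, max_size = 1000) {
  size <- 1L
  len <- 1L
  n_current <- 1L  # number of cases in the latest generation
  while (size < max_size) {
    n_new <- sum(rpois(n_current, lambda))
    if (n_new == 0) break
    size <- size + n_new
    len <- len + 1L
    n_current <- n_new
  }
  c(size = size, length = len)
}
```

Because both sketches draw the same random numbers in the same order, calling `set.seed()` with the same value before each one should give a tree whose row count and maximum generation match the summary's `size` and `length`. Timing the two (for example with `system.time()`) would be one way to check how large the gap is for a given offspring distribution.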
Just looking into this PR as the instigator of #79. As an external developer, I haven't been able to understand the difference between simulate_tree() and simulate_summary(). Perhaps I'm confused because they both return objects with some shared inheritance, and I'm conditioned by R syntax to assume that a 'summary' is a condensed version of another object.
Would it help to rename simulate_summary() to, say, sample_tree_metrics()? This would then reserve 'simulate' for code that gives the tree structure. However, running summary(simulate_tree()) should also give the same-ish output as sample_tree_metrics() for the same ntrees and offspring_dist.
Also, would it help to pick between 'tree' and 'chain' and use one, or do they mean different things?
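As a rough illustration of the "summary is a condensed version of the tree" expectation, the sketch below reduces a small hand-written tree table to per-chain sizes and lengths using base R. The column names (`chain`, `id`, `infector`, `generation`) are assumptions for this example and not necessarily the schema that simulate_tree() returns.

```r
# Toy tree data for two chains; the column names are hypothetical.
tree <- data.frame(
  chain      = c(1L, 1L, 1L, 1L, 2L, 2L),
  id         = c(1L, 2L, 3L, 4L, 1L, 2L),
  infector   = c(NA, 1L, 1L, 2L, NA, 1L),
  generation = c(1L, 2L, 2L, 3L, 1L, 2L)
)

# Chain size: number of cases in each chain.
chain_size <- tapply(tree$id, tree$chain, length)

# Chain length: number of generations reached by each chain.
chain_length <- tapply(tree$generation, tree$chain, max)
```

Under these assumptions, a summary() method on the tree object could return values like `chain_size` or `chain_length`, which is the sense in which the two outputs would be "same-ish".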
It's not really sampling. The current |
Co-authored-by: Sebastian Funk <[email protected]>
Force-pushed from 0c89a08 to febd504.
This PR closes #66, closes #78, and closes #79.
It does this by:
- Splitting the <epichains> class into <epichains_tree> and <epichains_summary>, which inherit from <data.frame> and <vector> respectively.
- Removing the <aggregate_epichains_df> class as it is no longer deemed necessary.
- print() and summary() methods.