Accumulators stage 2 #925

Draft · wants to merge 5 commits into base: breaking

Conversation

@mhauru (Member) commented May 21, 2025

I'm making various changes to accumulators as I integrate Turing.jl with them. This isn't ready for review yet, but @penelopeysm, can I ask you for your thoughts on the design of the proposed changes to LogDensityFunction? It would now have a new field, given as the second constructor argument, called getlogdensity. By default it's getlogjoint, but you can set it to getlogprior or getloglikelihood or any other function that takes an AbstractVarInfo. Its return value will be the return value of logdensity_and_gradient etc. The default VarInfo, if one isn't given by the user, is also now set based on getlogdensity, to make sure it has the right accumulators.
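To make the proposed interface concrete, here is a minimal usage sketch, assuming the constructor signature described above (the exact names and behaviour may still change in this PR):

```julia
using DynamicPPL, Distributions

@model function demo(x)
    μ ~ Normal()
    x ~ Normal(μ, 1)
end

model = demo(1.5)

# Default behaviour, as before: evaluates the log joint.
ldf_joint = LogDensityFunction(model)

# Proposed: pass a getter as the second positional argument. The default
# VarInfo is then constructed with only the accumulators this getter needs.
ldf_prior = LogDensityFunction(model, getlogprior)

# LogDensityProblems.logdensity(ldf_prior, θ) would then return
# getlogprior(varinfo) after evaluating the model at θ.
```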

github-actions bot (Contributor) commented May 21, 2025

Benchmark Report for Commit 175b633

Computer Information

Julia Version 1.11.5
Commit 760b2e5b739 (2025-04-14 06:53 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 4 × AMD EPYC 7763 64-Core Processor
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)

Benchmark Results

|                 Model | Dimension |  AD Backend |      VarInfo Type | Linked | Eval Time / Ref Time | AD Time / Eval Time |
|-----------------------|-----------|-------------|-------------------|--------|----------------------|---------------------|
| Simple assume observe |         1 | forwarddiff |             typed |  false |                  8.6 |                 1.8 |
|           Smorgasbord |       201 | forwarddiff |             typed |  false |                690.4 |                35.9 |
|           Smorgasbord |       201 | forwarddiff | simple_namedtuple |   true |                450.5 |                45.4 |
|           Smorgasbord |       201 | forwarddiff |           untyped |   true |               1271.2 |                27.3 |
|           Smorgasbord |       201 | forwarddiff |       simple_dict |   true |               8242.7 |                19.6 |
|           Smorgasbord |       201 | reversediff |             typed |   true |               1546.8 |                26.0 |
|           Smorgasbord |       201 |    mooncake |             typed |   true |               1012.4 |                 4.9 |
|    Loop univariate 1k |      1000 |    mooncake |             typed |   true |               5831.2 |                 3.9 |
|       Multivariate 1k |      1000 |    mooncake |             typed |   true |               1039.8 |                 8.6 |
|   Loop univariate 10k |     10000 |    mooncake |             typed |   true |              64818.4 |                 3.7 |
|      Multivariate 10k |     10000 |    mooncake |             typed |   true |               8737.8 |                 9.8 |
|               Dynamic |        10 |    mooncake |             typed |   true |                145.0 |                11.8 |
|              Submodel |         1 |    mooncake |             typed |   true |                 13.1 |                 6.4 |
|                   LDA |        12 | reversediff |             typed |   true |               1199.1 |                 1.8 |

codecov bot commented May 21, 2025

Codecov Report

Attention: Patch coverage is 65.51724% with 10 lines in your changes missing coverage. Please review.

Please upload report for BASE (breaking@d4ef1f2). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|--------------------------|---------|-------|
| src/logdensityfunction.jl | 69.23% | 4 Missing ⚠️ |
| src/simple_varinfo.jl | 0.00% | 4 Missing ⚠️ |
| src/test_utils/ad.jl | 0.00% | 2 Missing ⚠️ |
Additional details and impacted files
```
@@             Coverage Diff             @@
##             breaking     #925   +/-   ##
===========================================
  Coverage            ?   81.11%
===========================================
  Files               ?       37
  Lines               ?     4035
  Branches            ?        0
===========================================
  Hits                ?     3273
  Misses              ?      762
  Partials            ?        0
```

@penelopeysm (Member) commented May 21, 2025

Hmmm, in terms of design, the other potential option I see is to make the new getlogdensity a keyword argument (which means that varinfo can't depend on it)

| user specifies...        | getlogdensity as positional argument  | getlogdensity as keyword argument              |
|--------------------------|---------------------------------------|------------------------------------------------|
| +getlogdensity +varinfo  | have to make sure they are consistent | have to make sure they are consistent          |
| +getlogdensity -varinfo  | generates the most efficient varinfo  | uses default varinfo, potentially inefficient  |
| -getlogdensity +varinfo  | not possible                          | preserves current behaviour                    |
| -getlogdensity -varinfo  | preserves current behaviour           | preserves current behaviour                    |

So it seems the main tradeoff is that having it as a positional argument allows the most efficient varinfo (one without a likelihood accumulator) to be generated if you do e.g. LogDensityFunction(model, getlogprior), but in return you can't call LogDensityFunction(model, varinfo); you have to do LogDensityFunction(model, getlogjoint, varinfo).

I think my first instinct is that I prefer the latter, and you could either insert a check in the inner constructor to see if the accumulators are consistent (i.e. warn if you have extra accs, or error if you don't have enough accs), or just straight-up call setaccs in the inner constructor to force consistency. I'm generally not a huge fan of Julia's optional positional arguments because they aren't truly optional – you can only omit them starting from the end (unless you start declaring 2^N methods).
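A rough sketch of the "force consistency via setaccs" option (the accumulators_for helper is hypothetical, purely to illustrate the idea, and the accumulator type names are assumptions based on the discussion in this thread):

```julia
# Hypothetical helper mapping a getter to the accumulators it needs.
accumulators_for(::typeof(getlogjoint)) = (LogPriorAccumulator(), LogLikelihoodAccumulator())
accumulators_for(::typeof(getlogprior)) = (LogPriorAccumulator(),)
accumulators_for(::typeof(getloglikelihood)) = (LogLikelihoodAccumulator(),)

# Inside the inner constructor, overwrite whatever accumulators the user's
# varinfo carries, so the getter and the varinfo can never disagree.
function force_consistent_accs(getlogdensity, varinfo::AbstractVarInfo)
    return setaccs!!(varinfo, accumulators_for(getlogdensity))
end
```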

Do you prefer the former? I'll mull over it a bit more too and see if any ideas come up.

@mhauru (Member, Author) commented May 21, 2025

I wouldn't even bother checking for consistency if the user provides both getlogdensity and varinfo. If you're getting that hands-on with your LogDensityFunction, I assume you know what you're doing, and if not, that's on you. If there's a mismatch, you'll probably just get an error saying "no accumulator X available in your VarInfo".

I would be happy to give up the case of LogDensityFunction(model, varinfo) and force people to do LogDensityFunction(model, getlogjoint, varinfo), because I think passing the varinfo argument means you're dealing with some pretty low-level stuff, most likely either trying to use a non-standard subtype of AbstractVarInfo or a linked VarInfo. If you're thinking about those sorts of things, I think having to spend a few seconds and one more line of code to specify which log probability you are after (a much more "user-level"/statistical concern) seems fine to me. (Similarly, this is why I think the context should be the last argument, as it is now, because that's getting even deeper into the internals of DPPL.)

There's also something about type dispatch treating keyword args differently. Sometimes methods are specialised on them and sometimes they are not, and I don't understand the details.

@penelopeysm (Member) left a comment

I thought about it and I think I would prefer to be able to keep the new getlogdensity argument optional by declaring the extra method:

```julia
function LogDensityFunction(
    model::Model,
    varinfo::AbstractVarInfo,
    context::AbstractContext=DefaultContext();
    adtype::Union{ADTypes.AbstractADType,Nothing}=nothing
)
    return LogDensityFunction(model, getlogjoint, varinfo, context; adtype)
end
```

This is the 2^N method problem, but in this case N = 1 so I'm still fine with it. The main reason is that (in my experience working with this codebase) LogDensityFunction(model, varinfo) is used far more commonly than LogDensityFunction(model, varinfo, PriorContext()) (or LikelihoodContext), so it's a nice convenience.

Also I think LogDensityFunction is likely to become more user-facing so that would likely be useful for many people.

@mhauru (Member, Author) commented May 27, 2025

It's not the worst, but I'm not a big fan of the behaviour of LogDensityFunction(model, x) being very different for different types of x. It means that there's no answer to the question of "what are the semantics of the second argument".

> Also I think LogDensityFunction is likely to become more user-facing so that would likely be useful for many people.

If this is true, then I agree, but I'm not sure it is. How often do users want to customise their VarInfos?

If we were making this from scratch, what would we do? LogDensityFunction(model) makes an LDF for a Turing model; that makes plenty of sense. All the other arguments are there because there are different flavours of LDFs:

  1. They can evaluate different notions of log density, e.g. log joint or log likelihood.
  2. They can use different types of AbstractVarInfos internally. This includes specifying that you are using a linked varinfo.
  3. They can use a different evaluation context, which is to say the user may customise the model's behaviour post-construction in arbitrary ways.
  4. They can use different AD backends internally.

Of these, I think 1. is the one that makes sense to a statistician/modeller, and should thus be more prominent than the others. EDIT: It's also the only one that changes the output of logdensity, unless you're using a weird context. The others are only there because of implementation details and quirks of Turing/Julia, except debatably the linked varinfo case. I might make 1. an optional positional argument and the others keyword arguments, or have them all as optional positionals in the order above. What would your blank-slate design be?
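For what it's worth, a sketch of what that signature could look like with 1. as an optional positional and 2.–4. as keywords. The keyword names, defaults, and the default_varinfo_for helper are assumptions for illustration, not what this PR implements:

```julia
# Sketch of an in-package constructor (inside DynamicPPL, so no imports shown).
function LogDensityFunction(
    model::Model,
    getlogdensity::Function=getlogjoint;  # 1. the statistical choice, most prominent
    # 2. hypothetical helper building a varinfo with the right accumulators
    varinfo::AbstractVarInfo=default_varinfo_for(model, getlogdensity),
    context::AbstractContext=DefaultContext(),              # 3.
    adtype::Union{ADTypes.AbstractADType,Nothing}=nothing,  # 4.
)
    # Forward to the full positional constructor proposed in this PR.
    return LogDensityFunction(model, getlogdensity, varinfo, context; adtype)
end
```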

@penelopeysm (Member) commented May 27, 2025

> It means that there's no answer to the question of "what are the semantics of the second argument".

Yeah, that's fair -- I don't really like it either, and I feel like I'm kind of stumbling around trying to find what I really want...

> What would your blank-slate design be?

...I thought about it, and I think my real hang-up is not so much convenience; rather, it's that the information about which part of logp is to be calculated is being specified twice: once in the function, and once in the VarInfo's accumulators. I don't like that there's potential inconsistency between the two sources of information. Although we can resolve it by declaring the function to be the source of truth, this isn't obvious from the interface, and future readers will have to look into this file to figure it out.*

In my ideal world,† I would want there to be a single source of truth (this would necessarily be the varinfo, since you can't create an LDF without a varinfo) and for the user to specify this information (by setting the accumulators themselves). I.e., instead of

```julia
LogDensityFunction(model, getlogprior, varinfo, ...)
```

they would call

```julia
LogDensityFunction(model, setaccs!!(VarInfo(model), (LogPriorAccumulator(),)), ...)
```

and to make that easier I'd probably make setaccs!!(...) a convenience function, in the same way that contextualize used to be a convenience function back when this was specified using contexts. Or maybe, VarInfo(model, ...) could take an extra argument to specify which logp accs to include.
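The convenience could then look something like this (a sketch only; the helper names are hypothetical, and LogLikelihoodAccumulator is an assumed accumulator name):

```julia
# Hypothetical convenience helpers, playing the role `contextualize` used to:
# build a VarInfo carrying only the accumulators for the logp term you want.
prior_varinfo(model) = setaccs!!(VarInfo(model), (LogPriorAccumulator(),))
likelihood_varinfo(model) = setaccs!!(VarInfo(model), (LogLikelihoodAccumulator(),))

# so that the LDF is constructed from the varinfo alone:
# ldf = LogDensityFunction(model, prior_varinfo(model))
```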


* It's still overall better than the previous world where the context was magically doing things to modify how varinfo was being changed.

† Okay, in my ideal-ideal world everything would be a keyword argument, but Julia probably doesn't like this world.

@mhauru (Member, Author) commented May 28, 2025

> Or maybe, VarInfo(model, ...) could take an extra argument to specify which logp accs to include.

A bit off topic, but I want to do something like this regardless of what we do with LDF. So far, though, I've struggled with the fact that VarInfo constructors are already quite overloaded, leaving little syntax available. Ideas very welcome.

I see what you mean by a single source of truth. The way I explain this to myself, which is definitely imperfect, is that getlogdensity defines what is to be gotten from the varinfo and returned, and which accumulators the varinfo carries is more like a technicality of "make sure you've packed everything you need if that's where you want to go". We could make it such that the LDF constructor overwrites the accumulators of the provided varinfo, but that seems like a restriction we might come to regret at some point, and it really starts to beg the question of why we are giving LDF an AbstractVarInfo in the first place if we discard both the parameter values and the accumulators in it.

Note that accumulators and possible getlogdensity functions aren't necessarily one-to-one. For instance, getlogjoint needs two accumulators. I was considering having the getter function be determined somehow from the accumulators but the thing that made me go with having them separate was having to, for MAP/MLE optimisation, implement LogPriorWithoutJacobianAccumulator, and then also have a getlogjoint_withoutjacobian(vi) = getlogprior_withoutjacobian(vi) + getloglikelihood(vi).

EDIT: Maybe a better example of getlogdensity and accumulators not being one-to-one would be a getter equivalent to MiniBatchContext, which returns getlogprior(x) + weight * getloglikelihood(x) where weight is a parameter of the getter.
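A sketch of such a getter as a closure (illustrative only; the helper name is made up):

```julia
# Illustrative: a getter equivalent to the old MiniBatchContext, where `weight`
# rescales the likelihood term (e.g. total number of data points / batch size).
make_minibatch_getter(weight) = vi -> getlogprior(vi) + weight * getloglikelihood(vi)

# The closure takes an AbstractVarInfo, so it could be passed as getlogdensity:
# LogDensityFunction(model, make_minibatch_getter(10.0))
```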
