Integration with InferenceObjects.jl #381

sethaxen · 2022-08-19T10:26:53Z

On Twitter, @yebai suggested adding integration with InferenceObjects to MCMCChains: https://twitter.com/Hong_Ge2/status/1560343482216103938. I'm opening this issue for further discussion.

InferenceObjects.InferenceData is the storage format for Monte Carlo draws used by ArviZ.jl. Along with Python's arviz.InferenceData, it follows the cross-language InferenceData schema. PyMC uses Python's implementation as its official sample storage format. InferenceData can be serialized to NetCDF to standardize communicating results of Bayesian analyses across languages and PPLs. In Julia, it is built on DimensionalData. See example usage and plotting examples (using the Tables interface).

@yebai's suggestion is ultimately to deprecate Chains in favor of InferenceData. I see several upsides of this approach:

Chains is based on the somewhat outdated AxisArrays, while DimensionalData is more modern.
Chains flattens all draws and sampling statistics into a single 3D float array, which discards a lot of the structure of the sampled types (which may themselves be multidimensional or have non-float eltypes, such as Int or even Cholesky).
InferenceData's features are a superset of Chains's. It can get closer to the original structure of the user's samples with named dimensions, but it also supports storing other metadata and can store prior, predictive, log-likelihood, and warmup draws, as well as the original data.
InferenceObjects is a relatively light dependency (~0.12-0.2s load time on Julia v1.7-1.8, vs. 1.7-3.6s for MCMCChains), so it would not add much to MCMCChains's load time.
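
To make the flattening point above concrete, here is a minimal, language-agnostic sketch in plain Python. It does not use the MCMCChains or InferenceObjects APIs; `flatten_draw` and the example draw are hypothetical stand-ins that only mimic the idea of turning structured draws into one row of floats with generated column names.

```python
# Illustrative only: plain-Python stand-ins, not the MCMCChains or InferenceObjects APIs.

def flatten_draw(draw):
    """Flatten one draw (dict of name -> scalar or nested list) into
    parallel lists of column names and float values."""
    names, values = [], []

    def visit(name, value, index):
        if isinstance(value, list):
            # Recurse into each axis, using 1-based indices as Julia does.
            for i, item in enumerate(value, start=1):
                visit(name, item, index + (i,))
        else:
            label = name if not index else f"{name}[{','.join(map(str, index))}]"
            names.append(label)
            values.append(float(value))  # every eltype is coerced to float

    for name, value in draw.items():
        visit(name, value, ())
    return names, values


# Hypothetical draw: a float, an integer, and a 2x2 matrix-valued parameter.
draw = {"mu": 0.5, "k": 3, "L": [[1.0, 0.0], [0.2, 0.9]]}
names, values = flatten_draw(draw)
print(names)   # ['mu', 'k', 'L[1,1]', 'L[1,2]', 'L[2,1]', 'L[2,2]']
print(values)  # [0.5, 3.0, 1.0, 0.0, 0.2, 0.9]
```

After flattening, the integer `k` has become a float and the shape of `L` survives only in the column names, which is the kind of structure loss described in the list above.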

Currently ArviZ.jl has a converter from_mcmcchains, which is used to convert Chains to InferenceData. Integration between Chains and InferenceData might look like the following steps:

1. Move ArviZ.from_mcmcchains here (with a better name).
2. Make InferenceData a supported chain_type for AbstractMCMC.sample (https://beta.turing.ml/AbstractMCMC.jl/dev/api/#Chains), which would bypass Chains's flattening entirely. I'm not sure this should live here, but it should not live in InferenceObjects.

sethaxen · 2022-08-19T12:53:45Z

> Make InferenceData a supported chain_type for AbstractMCMC.sample (https://beta.turing.ml/AbstractMCMC.jl/dev/api/#Chains), which would bypass Chains's flattening entirely. I'm not sure this should live here, but it should not live in InferenceObjects.

It looks like this would involve implementing AbstractMCMC.bundle_samples for specific samplers. For Chains this is done in Turing itself, not here, e.g. https://github.com/TuringLang/Turing.jl/blob/9f8a9c4c476095d45246b4924d5cd542f3f8d506/src/inference/Inference.jl#L320-L391

sethaxen · 2022-10-28T20:32:22Z

Since there have been no objections to these steps, I'm going to move forward with opening a PR for Step 1.

cpfiffer · 2022-10-28T22:50:54Z

Okay, thank you!

sethaxen · 2022-11-24T10:41:39Z

I wonder, actually, if this is going about it the wrong way. MCMCChains destructively flattens draws into one large array, and a converter then needs to infer from the variable names how to unflatten the draws, whereas Turing can return NamedTuples containing unflattened draws. It might be better to directly implement the DynamicPPL/AbstractMCMC interfaces for InferenceData storage, so that instead of converting a Chains to an InferenceData, anyone sampling with those interfaces can build the InferenceData directly.

I've opened a draft PR at TuringLang/Turing.jl#1913
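
As a rough illustration of what "inferring from the variable names how to unflatten" involves, here is a minimal plain-Python sketch. It is not the ArviZ.from_mcmcchains implementation; `unflatten_draw`, `NAME_PATTERN`, and the assumed `name[i,j]` naming convention are hypothetical and chosen only for this example.

```python
import re

# Assumed naming convention: "base" for scalars, "base[i,j,...]" (1-based) for arrays.
NAME_PATTERN = re.compile(r"^(?P<base>[^\[\]]+)(?:\[(?P<idx>\d+(?:,\s*\d+)*)\])?$")


def build(cells, shape, prefix):
    """Rebuild a nested list of the given shape from an index -> value mapping."""
    if len(prefix) == len(shape):
        return cells[prefix]
    size = shape[len(prefix)]
    return [build(cells, shape, prefix + (i,)) for i in range(1, size + 1)]


def unflatten_draw(names, values):
    """Group flat columns by base name and rebuild nested lists from the indices."""
    entries = {}
    for name, value in zip(names, values):
        match = NAME_PATTERN.match(name)
        if match is None:
            raise ValueError(f"cannot parse variable name: {name!r}")
        idx = match.group("idx")
        index = tuple(int(i) for i in idx.split(",")) if idx else ()
        entries.setdefault(match.group("base"), {})[index] = value

    draw = {}
    for base, cells in entries.items():
        if () in cells:  # scalar parameter, no indices in the name
            draw[base] = cells[()]
            continue
        ndim = len(next(iter(cells)))
        # The shape is guessed from the largest index seen along each axis.
        shape = tuple(max(ix[d] for ix in cells) for d in range(ndim))
        draw[base] = build(cells, shape, ())
    return draw


names = ["mu", "k", "L[1,1]", "L[1,2]", "L[2,1]", "L[2,2]"]
values = [0.5, 3.0, 1.0, 0.0, 0.2, 0.9]
restored = unflatten_draw(names, values)
print(restored)  # {'mu': 0.5, 'k': 3.0, 'L': [[1.0, 0.0], [0.2, 0.9]]}
```

The array shape comes back, but only by guessing from the names; the original element types (an integer `k`, or a structured matrix type for `L`) cannot be recovered, which is why building the structured storage directly from unflattened draws is attractive.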

This was referenced Oct 28, 2022

Adding output_format options for InferenceObjects StanJulia/StanSample.jl#60

Closed

Moving converters to other packages arviz-devs/ArviZ.jl#239

Open

sethaxen self-assigned this Oct 29, 2022

devmotion mentioned this issue Nov 21, 2022

Changes to dimension ordering TuringLang/MCMCDiagnosticTools.jl#49

Closed

sethaxen mentioned this issue Nov 24, 2022

Add InferenceObjects as a chain_type TuringLang/Turing.jl#1913

Closed

sethaxen mentioned this issue Feb 17, 2023

InferenceObjects integration TuringLang/DynamicPPL.jl#464

Open

goedman closed this as completed Apr 6, 2023
