inferencedata errors when model contains matrix parameters #75

Open
sethaxen opened this issue Nov 8, 2023 · 4 comments

@sethaxen
Contributor

sethaxen commented Nov 8, 2023

julia> using StanSample, InferenceObjects

julia> model = """
       parameters {
         matrix[2, 3] x;
       }
       model {
         for (i in 1:2)
           x[i,:] ~ std_normal();
       }
       """;

julia> sm = SampleModel("foo", model);

julia> rc = stan_sample(sm);

julia> inferencedata(sm)
ERROR: ArgumentError: no valid permutation of dimensions
Stacktrace:
  [1] permutedims(B::Array{Float64, 4}, perm::Tuple{Int64, Int64, Int64})
    @ Base ./multidimensional.jl:1596
  [2] extract(chns::Array{Float64, 3}, cnames::Vector{String}; permute_dims::Bool)
    @ StanSample ~/.julia/packages/StanSample/tYGEA/src/utils/namedtuples.jl:40
  [3] extract
    @ StanSample ~/.julia/packages/StanSample/tYGEA/src/utils/namedtuples.jl:7 [inlined]
  [4] convert_a3d(a3d_array::Array{Float64, 3}, cnames::Vector{String}, ::Val{:permuted_namedtuples})
    @ StanSample ~/.julia/packages/StanSample/tYGEA/src/utils/namedtuples.jl:106
  [5] read_csv_files(m::SampleModel, output_format::Symbol; include_internals::Bool, chains::UnitRange{…}, start::Int64, kwargs::@Kwargs{})
    @ StanSample ~/.julia/packages/StanSample/tYGEA/src/stansamples/read_csv_files.jl:116
  [6] read_csv_files
    @ ~/.julia/packages/StanSample/tYGEA/src/stansamples/read_csv_files.jl:23 [inlined]
  [7] #read_samples#10
    @ ~/.julia/packages/StanSample/tYGEA/src/stansamples/read_samples.jl:93 [inlined]
  [8] read_samples
    @ ~/.julia/packages/StanSample/tYGEA/src/stansamples/read_samples.jl:84 [inlined]
  [9] inferencedata(m::SampleModel; include_warmup::Bool, log_likelihood_var::Nothing, posterior_predictive_var::Nothing, predictions_var::Nothing, kwargs::@Kwargs{})
    @ InferenceObjectsExt ~/.julia/packages/StanSample/tYGEA/ext/InferenceObjectsExt.jl:85
 [10] inferencedata(m::SampleModel)
    @ InferenceObjectsExt ~/.julia/packages/StanSample/tYGEA/ext/InferenceObjectsExt.jl:76
 [11] top-level scope
    @ REPL[25]:1
Some type information was truncated. Use `show(err)` to see complete types.
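
Reading frame [1] of the trace, the immediate cause appears to be that `permutedims` received a 4-dimensional array together with a 3-element permutation. The sketch below reproduces the same kind of `ArgumentError` using Base Julia only. The array sizes are made up for illustration, and this is not the StanSample code.

```julia
# Illustrative sizes only (assumption): 4 draws, 2 chains, and a 2x3 matrix parameter.
B = zeros(4, 2, 2, 3)

# A permutation must list every dimension exactly once,
# so a 3-tuple cannot be valid for a 4-dimensional array.
err = try
    permutedims(B, (3, 1, 2))
    nothing
catch e
    e
end
println(typeof(err))

# A permutation whose length matches ndims(B) is accepted.
P = permutedims(B, (3, 4, 1, 2))
println(size(P))   # (2, 3, 4, 2)
```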

Environment

julia> versioninfo()
Julia Version 1.10.0-rc1
Commit 5aaa9485436 (2023-11-03 07:44 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, tigerlake)
  Threads: 11 on 8 virtual cores
Environment:
  JULIA_CMDSTAN_HOME = /home/sethaxen/software/cmdstan/2.33.1/
  JULIA_NUM_THREADS = auto
  JULIA_EDITOR = code

(jl_WirtqY) pkg> st
Status `/tmp/jl_WirtqY/Project.toml`
  [b5cf5a8d] InferenceObjects v0.3.13
  [c1514b29] StanSample v7.4.5
@goedman
Collaborator

goedman commented Nov 9, 2023

Hi Seth,

Thanks for filing the issue and including an MWE. I can confirm it's definitely not working.

I will take a look ASAP. Hopefully later today, but more likely tomorrow.

Best, Rob

@goedman
Collaborator

goedman commented Nov 10, 2023

StanSample.jl v7.5.0 contains a fix for this issue. I've added a limited test for a matrix variable (as in your posted issue) but would like to test this for arrays in general as well.
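
The kind of reshaping involved for a matrix variable can be sketched as follows. This is not the actual v7.5.0 change: `a3d`, `cnames`, and the target layout (parameter dimensions first, then draws and chains) are assumptions for illustration. The point of the sketch is that the permutation is built from `ndims`, so it stays valid for arrays of any rank.

```julia
# Hypothetical flat layout (assumption): draws x columns x chains, with Stan-style
# column names for `matrix[2, 3] x` listed in column-major order.
cnames = ["x.1.1", "x.2.1", "x.1.2", "x.2.2", "x.1.3", "x.2.3"]
ndraws, nchains = 4, 2
a3d = reshape(collect(1.0:(ndraws * length(cnames) * nchains)),
              ndraws, length(cnames), nchains)

# Recover the parameter shape from the largest index in each position of the names.
idx = [parse.(Int, split(n, '.')[2:end]) for n in cnames]
dims = Tuple(maximum(i[k] for i in idx) for k in 1:length(first(idx)))   # (2, 3)

# Move the column axis last, then split it into the parameter dimensions.
tmp = permutedims(a3d, (1, 3, 2))             # draws x chains x columns
x = reshape(tmp, ndraws, nchains, dims...)    # draws x chains x 2 x 3

# Build the permutation from ndims(x) instead of hard-coding its length.
perm = (3:ndims(x)..., 1, 2)
xp = permutedims(x, perm)                     # 2 x 3 x draws x chains
```

With this layout, `xp[i, j, d, c]` holds the value of `x[i, j]` for draw `d` of chain `c`.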

You've probably seen Brian's suggestion to move Stan-related I/O to a separate package. I'm still considering the pros and cons of such an effort, but a huge pro would be the chance to clean up code that has been updated for many, many years.

It would also be a good opportunity to add support for complex variables in JSON input files and to handle tuple (and complex?) outputs in generated CSV files. I will probably try these out in the current setup first.

@sethaxen
Contributor Author

> StanSample.jl v7.5.0 contains a fix for this issue.

Thanks! Indeed, it works for me!

> You've probably seen Brian's suggestion to move Stan-related I/O to a separate package.

Thanks for the pointer; I hadn't seen that yet. From the ArviZ perspective, it's a bit tricky to support variables that cannot be trivially flattened into an array of reals. There are effectively three useful representations of draws:

  • Something close to the data structure the user created. If the variable was represented as a tuple of arrays, then the draws would be an array of tuples of arrays.
  • Something useful for analysis and long-term storage. Virtually all standard analyses require real marginals or tables. Same with plots. So the most useful representation here is flattening all data structures to real numbers or arrays.
  • Something like MonteCarloMeasurements.jl or posterior's rvar, where the marginal draws are packed into something representing a real number, which again allows for data structures that mimic what the user created in the PPL.

From the perspective of ArviZ.jl, the 2nd is by far the most useful. But for Julia PPLs, where draws can technically be arbitrary Julia types, it would be useful to support the 1st option as well, along with interconversion between the two. This was low priority in the past, but Turing now has Cholesky objects as recommended variables, so we need to decide how to support this. Stan's tuple support also makes this a high priority. I haven't decided how to do this yet, but something like arviz-devs/InferenceObjects.jl#27 is a possibility.
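
A minimal sketch of interconversion between the 1st and 2nd representations, using Base Julia only. The helpers `flatten` and `unflatten` are hypothetical names made up for this illustration, not part of InferenceObjects.jl or any other package, and the draws are toy values. The 3rd representation is not covered.

```julia
# Representation 1: draws kept close to the user's structure
# (here a vector of named tuples of arrays; values are made up).
draws = [(a = [1.0, 2.0], b = [3.0 4.0; 5.0 6.0]) for _ in 1:3]

# Representation 2: every draw flattened to a row of reals (column-major within each array).
flatten(d) = vcat((vec(v) for v in values(d))...)
flat = permutedims(reduce(hcat, map(flatten, draws)))   # 3 draws x 6 reals

# Going back requires remembering the shape of each variable.
shapes = map(size, first(draws))                        # (a = (2,), b = (2, 2))

function unflatten(row, shapes::NamedTuple)
    offset = 0
    arrays = []
    for sz in values(shapes)
        n = prod(sz)
        push!(arrays, reshape(row[offset+1:offset+n], sz))
        offset += n
    end
    return NamedTuple{keys(shapes)}(Tuple(arrays))
end

restored = unflatten(flat[1, :], shapes)
```

The flat form is what standard diagnostics and plots consume, while the shape metadata is the extra piece that has to travel with it for the round trip to work.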

@goedman
Collaborator

goedman commented Nov 25, 2023

Thanks Seth,

Your 2nd point is spot on (maybe a key reason why I always seem to switch back to DataFrames in the end).

My current goal for StanIO.jl is to flesh out the output_format=:nesteddataframe option (which is trivial to convert to a NamedTuple). Complex vars are easy to deal with given the .imag and .real name extensions. Arrays are also fairly easy.

Pure tuples are also OK; tuples with arrays mixed in (and vice versa) are a bit more complex.
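
As a rough illustration of why the .real and .imag extensions make complex variables easy, the sketch below pairs up such columns by name. The header, the values, and the helper `combine_complex` are hypothetical and are not StanIO.jl code. `chopsuffix` requires Julia 1.8 or later.

```julia
# Hypothetical header and one row of draws (assumption): a complex scalar `z` and a real `y`.
cnames = ["z.real", "z.imag", "y"]
row = [1.5, -2.0, 0.3]

function combine_complex(cnames, row)
    lookup = Dict(zip(cnames, row))
    out = Dict{String, Number}()
    for name in cnames
        if endswith(name, ".real")
            # Pair each ".real" column with the matching ".imag" column.
            base = String(chopsuffix(name, ".real"))
            out[base] = complex(lookup[name], lookup[base * ".imag"])
        elseif !endswith(name, ".imag")
            # Plain real-valued columns pass through unchanged.
            out[name] = lookup[name]
        end
    end
    return out
end

res = combine_complex(cnames, row)
```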

Rob
