Skip to content

Latest commit

 

History

History
280 lines (192 loc) · 10.6 KB

README.md

File metadata and controls

280 lines (192 loc) · 10.6 KB

Coverage Status DOI

CumulantsUpdates.jl

Updates following statistics of n-variate data

  • d'th moment tensor,
  • an array of moment tensors of order 1,2,...,d.

Given t realisations of n-variate data: X ∈ ℜᵗˣⁿ, and the update X + ∈ ℜᵘˣⁿ returns array of updated cumulant tensors of order 1,2,...,d.

To store symmetric tensors uses a SymmetricTensors type, requires SymmetricTensors.jl. To convert to array, run:

julia> Array(st::SymmetricTensors{T, d})

to convert back, run:

julia>  SymmetricTensor(a::Array{T,d})

Requires Cumulants.jl.

As of 01/01/2017 kdomino is the lead maintainer of this package.

Installation

Within Julia, run

pkg> add CumulantsUpdates

to install the files. Julia 1.0 or later is required.

Parallel computation

For parallel computation just run

julia> addprocs(n)
julia> @everywhere using CumulantsUpdates

Functions

Data update

To update data X ∈ ℜᵗˣⁿ by the update X+ ∈ ℜᵘˣⁿ in the observation window of size t and get ℜᵗˣⁿ ∋ X' = vcat(X,X+)[1+u:end, :] run:

julia> dataupdat(X::Matrix{T}, Xplus::Matrix{T}) where T<:AbstractFloat

the condition u < t must be fulfilled.

julia> a = ones(4,4)
4×4 Array{Float64,2}:
 1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0

julia> b = zeros(2,4)
2×4 Array{Float64,2}:
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0

julia> dataupdat(a,b)
4×4 Array{Float64,2}:
 1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0

Moment update

To update the moment tensor M::SymmetricTensor{T, d} for data X ∈ ℜᵗˣⁿ, given the update X+ ∈ ℜᵘˣⁿ where u < t run

julia> momentupdat(M::SymmetricTensor{T, d}, X::Matrix{T}, Xplus::Matrix{T}) where {T<:AbstractFloat, d}

Returns a SymmetricTensor{T, d} of the moment tensor of updated multivariate data: ℜᵗˣⁿ ∈ X' = dataupdat(X,X+), i.e.: M = moment(X, d), momentupdat(M, X, X+) = moment(X', d). If u ≪ t momentupdat() is much faster than a recalculation.

julia> x = ones(6, 2);

julia> SymmetricTensor{Float64,3}(Union{Nothing, Array{Float64,3}}[[1.0 1.0; 1.0 1.0]

[1.0 1.0; 1.0 1.0]], 2, 1, 2, true)


julia> y = 2*ones(2,2);

julia> momentupdat(m, x, y)
SymmetricTensor{Float64,3}(Union{Nothing, Array{Float64,3}}[[3.33333 3.33333; 3.33333 3.33333]

[3.33333 3.33333; 3.33333 3.33333]], 2, 1, 2, true)

Function momentarray(X, d) can be used to compute an array of moments of order 1, ..., d of data X ∈ ℜᵗˣⁿ

julia> momentarray(X::Matrix{T}, d::Int, b::Int) where {T<:AbstractFloat, d}

b - is a SymmetricTensor parameter, a block size.

One can update an array of moments by calling:

julia> momentupdat(M::Array{SymmetricTensor{T, d}}, X::Matrix{T}, Xplus::Matrix{T}) where {T<:AbstractFloat, d}

Returns an Array{SymmetricTensor{T, d}} of moment tensors of order 1, ..., d of updated multivariate data: ℜᵗˣⁿ ∋ X' = dataupdat(X,X+), i.e. Mₐᵣ = momentarray(X, d), momentupdat(Mₐᵣ, X, X+) = momentarray(X', d).

Cumulants update

Presented functions are design for sequent update of multivariate cumulant tensors. Hence it can be applied in a data streaming scheme. Suppose one has data X ∈ ℜᵗˣⁿ and subsequent coming updates X+ ∈ ℜᵘˣⁿ such that u ≪ t. Suppose one want to compute cumulant tensors in an observation window of size t each time the update comes. To store data amd moments we use the following structure mutable struct DataMoments{T <: AbstractFloat} with following fields: X - data, d - maximal order of a moment series, b - a block size, M - moment series. To initialise, we can use the following constructor:

julia> DataMoments(X::Matrix{T}, d::Int, b::Int) where T <: AbstractFloat

here an array of moments will be computed. To update a DataMoments structure and compute updated cumulants run:

julia> cumulantsupdate!(dm::DataMoments{T}, Xplus::Matrix{T}) where T <: AbstractFloat

The function updates a DataMoment structure and returns a series of cumulants of order 1, ..., dm.d for updated data. See:

julia> x = ones(10,2);

julia> s = DataMoments(x, 4, 2);

julia> y = zeros(4,2);

julia> cumulantsupdate!(s,y)[4]
SymmetricTensor{Float64,4}(Union{Nothing, Array{Float64,4}}[[-0.1056 -0.1056; -0.1056 -0.1056]

[-0.1056 -0.1056; -0.1056 -0.1056]

[-0.1056 -0.1056; -0.1056 -0.1056]

[-0.1056 -0.1056; -0.1056 -0.1056]], 2, 1, 2, true)

To save the DataMoments structure use:

julia> savedm(dm::DataMoments, dir::String)

To load it use

julia> loaddm(dir::String)

returns a DataMoments structure stored in a given directory.

The p-norm

julia> norm(st::SymmetricTensor{T, d}, p::Union{Float64, Int}) where {T<:AbstractFloat, d}

Returns a p-norm of the tensor stored as SymmetricTensors, supported for k ≠ 0. The output of norm(st, p) = norn(convert(Array, st),p). However norm(st::SymmetricTensor, p) uses the block structure implemented in SymmetricTensors, hence is faster and decreases the computer memory requirement.

julia> te = [-0.112639 0.124715 0.124715 0.268717 0.124715 0.268717 0.268717 0.046154];

julia> st = SymmetricTensor(reshape(te, (2,2,2)));

julia> norm(st)
0.5273572868359742

julia> norm(st, 2.5)
0.4468668679541424

julia> norm(st, 1)
1.339089

Convert cumulants to moments and moments to cumulants

Given M a vector of moments of order 1, ..., d to change it to a vector of cumulants of order 1, ..., d using

julia> function moms2cums!(M::Vector{SymmetricTensor{T}}) where T <: AbstractFloat

One can convert a vector of cumulants c to a vector of moments by running

julia> function cums2moms(c::Vector{SymmetricTensor{T}}) where T <: AbstractFloat
julia> m = momentarray(ones(20,3), 3, 2)
3-element Array{SymmetricTensor{Float64,N} where N,1}:
 SymmetricTensor{Float64,1}(Union{Nothing, Array{Float64,1}}[[1.0, 1.0], [1.0]], 2, 2, 3, false)                                                                                                                              
 SymmetricTensor{Float64,2}(Union{Nothing, Array{Float64,2}}[[1.0 1.0; 1.0 1.0] [1.0; 1.0]; nothing [1.0]], 2, 2, 3, false)                                                                                                   
 SymmetricTensor{Float64,3}(Union{Nothing, Array{Float64,3}}[[1.0 1.0; 1.0 1.0]
[1.0 1.0; 1.0 1.0] nothing; nothing nothing]
Union{Nothing, Array{Float64,3}}[[1.0 1.0; 1.0 1.0] [1.0; 1.0]; nothing [1.0]], 2, 2, 3, false)


julia> moms2cums!(m)

julia> m
3-element Array{SymmetricTensor{Float64,N} where N,1}:
 SymmetricTensor{Float64,1}(Union{Nothing, Array{Float64,1}}[[1.0, 1.0], [1.0]], 2, 2, 3, false)                                                                                                                          
 SymmetricTensor{Float64,2}(Union{Nothing, Array{Float64,2}}[[0.0 0.0; 0.0 0.0] [0.0; 0.0]; #undef [0.0]], 2, 2, 3, false)                                                                                                
 SymmetricTensor{Float64,3}(Union{Nothing, Array{Float64,3}}[[0.0 0.0; 0.0 0.0]
[0.0 0.0; 0.0 0.0] #undef; #undef #undef]
Union{Nothing, Array{Float64,3}}[[0.0 0.0; 0.0 0.0] [0.0; 0.0]; #undef [0.0]], 2, 2, 3, false)


julia> cums2moms(m)
3-element Array{SymmetricTensor{Float64,N} where N,1}:
 SymmetricTensor{Float64,1}(Union{Nothing, Array{Float64,1}}[[1.0, 1.0], [1.0]], 2, 2, 3, false)                                                                                                                          
 SymmetricTensor{Float64,2}(Union{Nothing, Array{Float64,2}}[[1.0 1.0; 1.0 1.0] [1.0; 1.0]; #undef [1.0]], 2, 2, 3, false)                                                                                                
 SymmetricTensor{Float64,3}(Union{Nothing, Array{Float64,3}}[[1.0 1.0; 1.0 1.0]
[1.0 1.0; 1.0 1.0] #undef; #undef #undef]
Union{Nothing, Array{Float64,3}}[[1.0 1.0; 1.0 1.0] [1.0; 1.0]; #undef [1.0]], 2, 2, 3, false)


Performance tests

To analyse the computational time of cumulants updates vs Cumulants.jl recalculation, we supply the executable script comptimes.jl. The script saves computational times to the res/*.jld file. The scripts accept following parameters:

  • -d (Int): cumulant's maximum order, by default d = 4,
  • -n (vararg Int): numbers of marginal variables, by default n = 40,
  • -t (Int): number of realisations of random variable, by default t = 500000,
  • -u (vararg Int): number of realisations of update, by default u = 10000, 15000, 20000,
  • -b (Int): blocks size, by default b = 4,
  • -p (Int): numbers of processes, by default p = 3.

To analyse the computational time of cumulants updates for different block sizes 1 < b ≤ Int(√n)+2, we supply the executable script comptimesblocks.jl. The script saves computational times to the res/*.jld file. The scripts accept following parameters:

  • -d (Int): cumulant's order, by default d = 4,
  • -n (Int): numbers of marginal variables, by default n = 48,
  • -u (vararg Int): number of realisations of the update, by default u = 10000, 20000.
  • -p (Int): numbers of processes, by default p = 3.

To analyse the computational time of cumulants updates for different number of workers, we supply the executable script comptimesprocs.jl. The script saves computational times to the res/*.jld file. The scripts accept following parameters:

  • -d (Int): cumulant's order, by default d = 4,
  • -n (Int): numbers of marginal variables, by default n = 48,
  • -u (vararg Int): number of realisations of the update, by default u = 10000, 20000,
  • -b (Int): blocks size, by default b = 4,
  • -p (Int): maximal numbers of processes, by default p = 6.

To plot computational times run executable res/plotcomptimes.jl on chosen *.jld file.

Citing this work

Krzysztof Domino, Piotr Gawron, An algorithm for arbitrary–order cumulant tensor calculation in a sliding window of data streams, International Journal of Applied Mathematics and Computer Science, vol. 29, issue 1, pp. 206, 2019 (https://doi.org/10.2478/amcs-2019-0015 )

This project was partially financed by the National Science Centre, Poland – project number 2014/15/B/ST6/05204.