Tensor Trains, mostly as probability distributions



⚠️ This package is still a work in progress; breaking changes should be expected.

What is a Tensor Train?

A Tensor Train is a type of tensor factorization involving the product of 3-index tensors organized on a one-dimensional chain. In the context of function approximation and probability, a function of $L$ discrete variables is in Tensor Train format if it is written as

$$f(x^1, x^2, \ldots, x^L) = \sum_{a^1,a^2,\ldots,a^{L-1}} [A^1(x^1)]_{a^1}[A^2(x^2)]_{a^1,a^2}\cdots [A^{L-1}(x^{L-1})]_{a^{L-2},a^{L-1}}[A^L(x^L)]_{a^{L-1}}$$

where, for every choice of $x^l$, $A^l(x^l)$ is a real-valued matrix and the sizes of consecutive matrices must be compatible. The first matrix must have 1 row and the last matrix must have 1 column, so that the product is a scalar.
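To make the definition concrete, here is a minimal sketch in plain Julia that evaluates $f$ as a product of matrices. It does not use this package's API: the storage layout (one 3-index array per site, with `cores[l][:, :, x]` playing the role of $A^l(x)$) and the name `evaluate_chain` are assumptions made for illustration only.

```julia
# Assumed layout for this sketch: cores[l] has size (rows, cols, q)
# and cores[l][:, :, x] is the matrix A^l(x).
function evaluate_chain(cores, x)
    M = cores[1][:, :, x[1]]              # 1 × d matrix
    for l in 2:length(cores)
        M = M * cores[l][:, :, x[l]]      # multiply left to right
    end
    return only(M)                        # the final 1 × 1 matrix holds the scalar
end

q, d = 2, 3                               # values per variable, bond dimension
sizes = [(1, d), (d, d), (d, d), (d, 1)]  # first core has 1 row, last has 1 column
cores = [rand(r, c, q) for (r, c) in sizes]
x = [1, 2, 2, 1]
f = evaluate_chain(cores, x)
```

The cost is $L-1$ small matrix products, whereas writing out the sum over all virtual indices explicitly would involve $d^{L-1}$ terms.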

The Tensor Train factorization can be used to parametrize probability distributions, which is the main focus of this package. In this case, $f$ must be non-negative for every input and sum to one over all configurations.

Tensor Trains with Periodic Boundary Conditions

A slight generalization, useful to describe systems with periodic boundary conditions, is the following:

$$f(x^1, x^2, \ldots, x^L) = \sum_{a^1,a^2,\ldots,a^{L}} [A^1(x^1)]_{a^1,a^2}[A^2(x^2)]_{a^2,a^3}\cdots [A^{L-1}(x^{L-1})]_{a^{L-1},a^{L}}[A^L(x^L)]_{a^{L},a^1}$$

In other words, to evaluate $f$ one takes the trace of the product of matrices.
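As an illustration of the trace formula, the following plain-Julia sketch evaluates a periodic Tensor Train. The storage layout (`cores[l][:, :, x]` standing for $A^l(x)$, all matrices square) and the name `evaluate_periodic` are assumptions of the sketch, not the package's interface.

```julia
using LinearAlgebra   # provides tr

# Assumed layout: cores[l] has size (d, d, q) and cores[l][:, :, x] is A^l(x).
function evaluate_periodic(cores, x)
    M = cores[1][:, :, x[1]]
    for l in 2:length(cores)
        M = M * cores[l][:, :, x[l]]
    end
    return tr(M)      # closing the chain into a ring amounts to taking the trace
end

L, q, d = 4, 2, 3
cores = [rand(d, d, q) for _ in 1:L]
x = [2, 1, 1, 2]
f = evaluate_periodic(cores, x)
```

Because the trace is invariant under cyclic permutations, shifting the sites and the indices together around the ring leaves the value unchanged.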

Notation and terminology

Tensor Trains are the most basic type of Tensor Network. Tensor networks are a large family of tensor factorizations which are often best represented in diagrammatic notation. For this reason, the terms bond and index are used interchangeably. The indices $a^1,a^2,\ldots,a^{L-1}$ are usually called the virtual indices, while $x^1, x^2, \ldots, x^L$ are the physical indices.

Tensor Trains are used to parametrize wavefunctions in many-body quantum physics. The resulting quantum state is called a Matrix Product State. In that context, the entries are generally complex numbers, and the probability of a given state is obtained by taking the squared absolute value of the wavefunction.

In this package we focus on the "classical" case where the Tensor Train directly represents a probability distribution $p(x^1, x^2, \ldots, x^L)$.

Efficient computation

Given a Tensor Train, some simple recursive strategies can be employed to perform the following operations in time $\mathcal O (L)$:

Compute the normalization

$$Z = \sum_{x^1, x^2, \ldots, x^L} \sum_{a^1,a^2,\ldots,a^{L-1}} [A^1(x^1)]_{a^1}[A^2(x^2)]_{a^1,a^2}\cdots [A^{L-1}(x^{L-1})]_{a^{L-2},a^{L-1}}[A^L(x^L)]_{a^{L-1}}$$

such that

$$\begin{aligned} 1&=\sum_{x^1, x^2, \ldots, x^L}p(x^1, x^2, \ldots, x^L)\\&=\sum_{x^1, x^2, \ldots, x^L}\frac1Z \sum_{a^1,a^2,\ldots,a^{L-1}} [A^1(x^1)]_{a^1}[A^2(x^2)]_{a^1,a^2}\cdots [A^{L-1}(x^{L-1})]_{a^{L-2},a^{L-1}}[A^L(x^L)]_{a^{L-1}} \end{aligned}$$
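The linear cost comes from exchanging the two sums: summing each $A^l(x^l)$ over its physical index first gives one matrix per site, and $Z$ is the product of those $L$ matrices. A minimal sketch in plain Julia follows; the layout `cores[l][:, :, x]` and the names `site_sum` and `normalization` are assumptions for illustration, and the package's `normalize!` may work differently.

```julia
# Assumed layout: cores[l][:, :, x] is the matrix A^l(x).
# Summing over the third (physical) index leaves a single matrix per site.
site_sum(core) = dropdims(sum(core; dims=3); dims=3)

function normalization(cores)
    M = site_sum(cores[1])
    for l in 2:length(cores)
        M = M * site_sum(cores[l])
    end
    return only(M)
end

q, d = 2, 3
cores = [rand(1, d, q), rand(d, d, q), rand(d, d, q), rand(d, 1, q)]
Z = normalization(cores)
cores[1] ./= Z     # rescaling a single core is enough to make the total sum one
```

A brute-force sum would visit all $q^L$ configurations, while this version performs $L-1$ matrix products.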

Compute marginals

Single-variable

$$p(x^l=x) = \sum_{x^1, x^2, \ldots, x^L} p(x^1, x^2, \ldots, x^L) \delta(x^l,x)$$

and two-variable

$$p(x^l=x, x^m=x') = \sum_{x^1, x^2, \ldots, x^L} p(x^1, x^2, \ldots, x^L) \delta(x^l,x)\delta(x^m,x')$$
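Both kinds of marginal can be obtained with the same matrix-product trick: sites that are summed over contribute $\sum_x A^l(x)$, while clamped sites contribute the single matrix $A^l(x)$. The sketch below is plain Julia under an assumed layout (`cores[l][:, :, x]` is $A^l(x)$); `site_matrix`, `clamped_weight` and `marginal` are hypothetical names, not functions of this package.

```julia
# Matrix contributed by one site: A^l(x) if the site is clamped to x,
# otherwise the sum of A^l(x) over all values of x.
site_matrix(core, x) = x === nothing ? dropdims(sum(core; dims=3); dims=3) : core[:, :, x]

# Unnormalized weight of the event {x^l = fixed[l] for every l in keys(fixed)}
function clamped_weight(cores, fixed::Dict{Int,Int})
    M = site_matrix(cores[1], get(fixed, 1, nothing))
    for l in 2:length(cores)
        M = M * site_matrix(cores[l], get(fixed, l, nothing))
    end
    return only(M)
end

# Dividing by the weight with nothing clamped (that is, Z) gives a probability
marginal(cores, fixed) = clamped_weight(cores, fixed) / clamped_weight(cores, Dict{Int,Int}())

q, d = 2, 3
cores = [rand(1, d, q), rand(d, d, q), rand(d, d, q), rand(d, 1, q)]
p2 = [marginal(cores, Dict(2 => x)) for x in 1:q]                        # p(x^2 = x)
p13 = [marginal(cores, Dict(1 => x, 3 => y)) for x in 1:q, y in 1:q]     # p(x^1 = x, x^3 = y)
```

Each call costs $\mathcal O(L)$ matrix products. This sketch recomputes the partial products on every call for the sake of brevity; caching them would avoid the repeated work when many marginals are needed.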

Extract samples

Via hierarchical sampling

$$p(x^1, x^2, \ldots, x^L) = p(x^1)p(x^2|x^1)p(x^3|x^1,x^2)\cdots p(x^L|x^1,x^2,\ldots,x^{L-1})$$

by first sampling $x^1\sim p(x^1)$, then $x^2\sim p(x^2|x^1)$ and so on.
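The conditionals above never need to be built explicitly: at site $l$, the weight of each value is the product of the matrices already fixed on the left, the candidate matrix $A^l(x^l)$, and the summed-out matrices on the right. A minimal sketch in plain Julia, assuming non-negative weights and the layout `cores[l][:, :, x]` for $A^l(x)$; `sample_chain` and `draw` are hypothetical helper names and the package's own sampler may be organized differently.

```julia
using Random

site_sum(core) = dropdims(sum(core; dims=3); dims=3)

# Draw an index with probability proportional to the non-negative weights w
function draw(rng, w)
    r = rand(rng) * sum(w)
    c = 0.0
    for (i, wi) in enumerate(w)
        c += wi
        r <= c && return i
    end
    return length(w)
end

function sample_chain(rng, cores)
    L = length(cores)
    # right[l] = product of the summed-out matrices of sites l+1, ..., L
    right = Vector{Matrix{Float64}}(undef, L)
    right[L] = ones(1, 1)
    for l in L-1:-1:1
        right[l] = site_sum(cores[l+1]) * right[l+1]
    end
    x = zeros(Int, L)
    left = ones(1, 1)                      # product of the matrices sampled so far
    for l in 1:L
        q = size(cores[l], 3)
        # weights proportional to p(x^l | x^1, ..., x^{l-1})
        w = [only(left * cores[l][:, :, xl] * right[l]) for xl in 1:q]
        x[l] = draw(rng, w)
        left = left * cores[l][:, :, x[l]]
    end
    return x
end

q, d = 2, 3
cores = [rand(1, d, q), rand(d, d, q), rand(d, d, q), rand(d, 1, q)]
x = sample_chain(MersenneTwister(0), cores)
```

The Tensor Train does not need to be normalized beforehand, since each conditional is normalized on the fly inside `draw`.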

What can this package do?

This small package provides some utilities for creating, manipulating and evaluating Tensor Trains interpreted as functions, with a focus on the probabilistic side. Each variable $x^l$ is assumed to be multivariate. Whenever performing some probability-related operation, it is the responsibility of the user to make sure that the Tensor Train always represents a valid probability distribution.

Common operations are:

  • evaluate a Tensor Train at a given set of indices
  • orthogonalize_left!, orthogonalize_right!: bring a Tensor Train to left/right orthogonal form
  • compress! a Tensor Train using SVD-based truncations
  • normalize! a Tensor Train in the probability sense (not in the $L_2$ norm sense!), see above
  • sample from a Tensor Train interpreted as a probability distribution, see above
  • +,-: take the sum/difference of two TensorTrains
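To give a feel for how such operations act on the matrices, here is one standard way to represent the sum of two Tensor Trains: stack the matrices of the two operands block-diagonally, so that the bond dimensions add up. This is a plain-Julia sketch under an assumed layout (`A[l][:, :, x]` is $A^l(x)$, at least two sites); `tt_sum` and `evaluate_chain` are hypothetical names, and the package's `+` is not claimed to be implemented this way.

```julia
function evaluate_chain(cores, x)
    M = cores[1][:, :, x[1]]
    for l in 2:length(cores)
        M = M * cores[l][:, :, x[l]]
    end
    return only(M)
end

# Sum of two open-boundary Tensor Trains with the same length (≥ 2) and the same q
function tt_sum(A, B)
    L = length(A)
    map(1:L) do l
        (ra, ca, q) = size(A[l])
        (rb, cb, _) = size(B[l])
        rows = l == 1 ? 1 : ra + rb       # the first core keeps a single row
        cols = l == L ? 1 : ca + cb       # the last core keeps a single column
        C = zeros(rows, cols, q)
        C[1:ra, 1:ca, :] .= A[l]                              # A block: top-left
        C[rows-rb+1:rows, cols-cb+1:cols, :] .= B[l]          # B block: bottom-right
        return C
    end
end

q = 2
A = [rand(1, 2, q), rand(2, 2, q), rand(2, 1, q)]   # bond dimension 2
B = [rand(1, 3, q), rand(3, 3, q), rand(3, 1, q)]   # bond dimension 3
C = tt_sum(A, B)                                    # bond dimension 2 + 3 = 5
```

Since sums inflate the bond dimension in this representation, they are typically followed by a compression step.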


Let's construct and initialize at random a Tensor Train of the form

$$f\left((x^1,y^1), (x^2,y^2), (x^3,y^3)\right) = \sum_{a^1,a^2} [A^1(x^1,y^1)]_{a^1}[A^2(x^2,y^2)]_{a^1,a^2}[A^3(x^3,y^3)]_{a^2}$$

where $x^l\in\{1,2\}, y^l\in\{1,2,3\}$.

using TensorTrains
L = 3        # length
q = (2, 3)   # number of values taken by x, y
d = 5        # bond dimension
A = rand_tt(d, L, q...)    # construct Tensor Train with random positive entries
xy = [[rand(1:qi) for qi in q] for _ in 1:L]    # random set of indices
p = evaluate(A, xy)    # evaluate `A` at `xy`
compress!(A; svd_trunc = TruncThresh(1e-8));    # compress `A` to reduce the bond dimension
pnew = evaluate(A, xy)
ε = abs( (p - pnew)/p )


Related packages

  • TensorTrains.jl: conceived for the application of Tensor Train decomposition to elliptic PDEs, does not cover anything related to probability
  • Tensor-Train-Julia: less lightweight, mostly designed for quantum applications, still WIP
  • ITensors.jl: a full-fledged Tensor Network library, mostly designed for quantum applications. Its interface is more intuitive, but it is likely less efficient if all you need is simple operations on 1D Tensor Networks