diff --git a/_posts/2024-01-20-flow-matching.md b/_posts/2024-01-20-flow-matching.md
index cebbb94..af1a456 100644
--- a/_posts/2024-01-20-flow-matching.md
+++ b/_posts/2024-01-20-flow-matching.md
@@ -96,6 +96,10 @@ draft: true
 color: red;
 text-decoration: line-through;
 }
+
+main .image-container .caption {
+ text-align: center;
+}
@@ -139,7 +143,13 @@ $$
 \nonumber
 $$
-# Flow matching
+# Table of contents
+{:.no_toc}
+
+1. placeholder
+{:toc}
+
+# Introduction
@@ -147,6 +157,7 @@ $$
 ## Generative Modelling
+{:.no_toc}
 Let's assume we have data samples $x_1, x_2, \ldots, x_n$ from a distribution of interest $q_1(x)$, which density is unknown. We're interested in using these samples to learn a probabilistic model approximating $q_1$. In particular, we want efficient generation of new samples (approximately ) distributed from $q_1$. This task is referred to as **generative modelling**.
@@ -186,10 +197,11 @@ The advancement in generative modelling methods over the past decade has been no
 ## Outline
+{:.no_toc}
 Flow Matching (FM) models are in nature most closely related to (Continuous) Normalising Flows (CNFs). Therefore, we start this blogpost by briefly recapping the core concepts behind CNFs. We then continue by discussing the difficulties of CNFs and how FM models address them.
-## Basics: Normalising Flows
+# Normalising Flows
 Let $\phi: \mathbb{R}^d \rightarrow \mathbb{R}^d$ be a continuously differentiable function which transforms elements of $\mathbb{R}^d$, with a continously differentiable inverse $\phi^{-1}: \mathbb{R}^d \to \mathbb{R}^d$.
 Let $q_0(x)$ be a density on $\mathbb{R}^d$ and let $p_1(\cdot)$ be the density induced by the following sampling procedure
@@ -208,12 +220,13 @@ $$
 \begin{align}
 \label{eq:changevar}
 p_1(y) &= q_0(\phi^{-1}(y)) \abs{\det\left[\frac{\partial \phi^{-1}}{\partial y}(y)\right]} \\
+\label{eq:changevar-alt}
 &= \frac{q_0(x)}{\abs{\det\left[\frac{\partial \phi}{\partial x}(x)\right]}} \quad \text{with } x = \phi^{-1}(y)
 \end{align}
 $$
 where the last equality can be seen from the fact that $\phi \circ \phi^{-1} = \Id$ and a simple application of the chain rule[^chainrule]. The quantity $\frac{\partial \phi^{-1}}{\partial y}$ is the Jacobian of the inverse map. It is a matrix of size $d\times d$ containing $J_{ij} = \frac{d\phi^{-1}_i}{dx_j}$.
-Depending on the task at hand, evaluation of likelihood or sampling, one of the two formulation of $\eqref{eq:changevar}$ is preferred.
+Depending on the task at hand, evaluation of likelihood or sampling, the formulation in $\eqref{eq:changevar}$ or $\eqref{eq:changevar-alt}$ is preferred (Friedman, 1987; Chen & Gopinath, 2000).
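
As a quick sanity check on the two forms of the change-of-variables formula touched by this hunk, here is a minimal one-dimensional sketch in plain Python. It is not taken from the post: the choice $\phi(x) = e^x$, the standard normal base density $q_0$, and all function names (`q0`, `phi`, `phi_inv`, `p1_inverse_form`, `p1_forward_form`, `midpoint_mass`) are assumptions made purely for illustration. In one dimension the Jacobian determinant reduces to an ordinary derivative, so both expressions can be evaluated directly and compared.

```python
import math


def q0(x):
    """Base density: standard normal on R (an assumed choice for this sketch)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def phi(x):
    """Assumed invertible map phi(x) = exp(x), sending R onto (0, inf)."""
    return math.exp(x)


def phi_inv(y):
    """Inverse map phi^{-1}(y) = log(y), defined for y > 0."""
    return math.log(y)


def dphi_dx(x):
    """Derivative of phi; the 1-D analogue of the Jacobian of phi."""
    return math.exp(x)


def dphi_inv_dy(y):
    """Derivative of phi^{-1}; the 1-D analogue of the Jacobian of the inverse."""
    return 1.0 / y


def p1_inverse_form(y):
    """First form: q0(phi^{-1}(y)) * |d phi^{-1} / dy (y)|."""
    return q0(phi_inv(y)) * abs(dphi_inv_dy(y))


def p1_forward_form(y):
    """Second form: q0(x) / |d phi / dx (x)| with x = phi^{-1}(y)."""
    x = phi_inv(y)
    return q0(x) / abs(dphi_dx(x))


def midpoint_mass(density, lo, hi, n):
    """Midpoint-rule approximation of the integral of `density` over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(density(lo + (i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    for y in (0.1, 1.0, 2.0, 5.0):
        print(y, p1_inverse_form(y), p1_forward_form(y))
    print("approximate total mass:", midpoint_mass(p1_inverse_form, 0.0, 100.0, 100_000))
```

The first form only needs $\phi^{-1}$ and its derivative, while the second needs the derivative of $\phi$ at the pre-image $x$. Which one is cheaper depends on which direction of the map is readily available, which is the trade-off the changed sentence alludes to.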