Student Perspectives: An introduction to normalising flows

A post by Dan Ward, PhD student on the Compass programme.

Normalising flows are black-box approximators of continuous probability distributions that facilitate both efficient density evaluation and sampling. They work by learning a bijective transformation that maps between a complex target distribution and a simple distribution of matching dimension, such as a standard multivariate Gaussian.

Transforming distributions

Before introducing normalising flows, it is useful to introduce the idea of transforming distributions more generally. Let's say we have two uniform random variables, u \sim \text{Uniform}(0, 1) and x \sim \text{Uniform}(0, 2). In this case, it is straightforward to define the bijective transformation that maps between these two distributions: T(u) = 2u.

If we wished to sample from p(x), but could not do so directly, we could instead sample u \sim p(u) and then apply the transformation x = T(u). Similarly, if we wished to evaluate the density of p(x), but could not do so directly, we could rewrite p(x) in terms of p(u):
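As a quick sketch of the sampling side of this idea (the function name `T` matches the post's notation; everything else is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def T(u):
    # Bijective map from Uniform(0, 1) to Uniform(0, 2):
    # stretching by a factor of 2.
    return 2.0 * u

# Sample u ~ p(u), then push the samples through T to obtain x ~ p(x).
u = rng.uniform(0.0, 1.0, size=100_000)
x = T(u)

# The transformed samples lie in (0, 2) with mean approximately 1,
# consistent with Uniform(0, 2).
```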

p(x)= p(u=T^{-1}(x)) \cdot 2^{-1},

where p(u=T^{-1}(x)) is the density of the corresponding point in the u space, and dividing by 2 accounts for the fact that the transformation T stretches the space by a factor of 2, “diluting” the probability mass. The key thing to notice here is that we can describe the sampling and density evaluation operations of one distribution, p(x), via a bijective transformation of another, potentially easier-to-work-with distribution, p(u).
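The density formula above can be checked numerically. The sketch below hard-codes the inverse map T^{-1}(x) = x/2 and the Uniform(0, 1) density; the function names are illustrative, not from the post:

```python
import numpy as np

def T_inv(x):
    # Inverse of T(u) = 2u.
    return x / 2.0

def p_u(u):
    # Density of Uniform(0, 1): 1 inside [0, 1], 0 outside.
    return np.where((u >= 0.0) & (u <= 1.0), 1.0, 0.0)

def p_x(x):
    # Change of variables: p(x) = p(u = T^{-1}(x)) * 2^{-1}.
    # The 1/2 factor corrects for T stretching the space by 2.
    return p_u(T_inv(x)) / 2.0

# Inside (0, 2) the density is 1/2, as expected for Uniform(0, 2);
# outside it is 0.
densities = p_x(np.array([0.5, 1.5, 3.0]))  # → [0.5, 0.5, 0.0]
```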
