A post by Ollie Baker, PhD student on the Compass programme.
Introduction
A random geometric graph is a model of a network with a geometric embedding, or a latent space embedding. They were originally introduced by Gilvert in 1961, who called them ‘random planar networks’ [3]. If we are studying a high dimensional dataset, then we may be interested in modelling it using some form of network structure which contains information about similarity of data points in the latent space. This can be represented as points in a high-dimensional space connected based on their closeness according to some distance metric. What can we say about the distribution of these random graph models when the dimension of the space gets large? The answer to this question has applications in clustering, and graph compression once we consider the information theory of these network ensembles.

Let $(\Omega, \rho)$ be a metric space, where $\Omega \subset \mathbb{R}^d$ is compact. In this post, we will only consider $\Omega = [0,1]^d$ with $\rho = \|\cdot\|$ the Euclidean metric, and $\Omega = [0,1]^d$ with $\rho = \rho_t$, where $\rho_t(x,y) = \left(\sum_{i=1}^d\min(|x_i-y_i|, 1-|x_i-y_i|)^2 \right)^{\frac{1}{2}}$. This second metric space is known as the unit $d$-torus, and is denoted $\mathbb{T}^d$. To form a Random Geometric Graph (RGG) $G$, we distribute $n$ points at random according to a probability density $\nu$ in $\Omega$ to form the vertex set, and connect nodes $i$, $j$ if they have a mutual distance of at most $r_0$. That is, if $X_1,…,X_n$ are our points in $\Omega$, $i\sim j$ in $G$ if and only if $\rho(X_i,X_j) \leq r_0$. Figure 1 shows an example realisation of such a graph. We are interested in densities of the form:
\begin{equation}
\label{general_distribution}
\nu(\{x_i\}_{i=1}^d) = \prod_{i=1}^d \pi(x_i)
\end{equation}
That is, the coordinates of a point $x$ are i.i.d. with common marginal density $\pi$.
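To make the construction concrete, here is a minimal sketch in Python (numpy only; the parameter values are purely illustrative) that samples an RGG in $[0,1]^d$ with i.i.d. uniform coordinates and builds its adjacency matrix:

```python
import numpy as np

def sample_rgg(n, d, r0, seed=None):
    """Sample an RGG in [0,1]^d with i.i.d. uniform coordinates:
    connect i ~ j iff ||X_i - X_j|| <= r0."""
    rng = np.random.default_rng(seed)
    X = rng.random((n, d))                    # n points, coordinates i.i.d. U[0,1]
    diff = X[:, None, :] - X[None, :, :]      # pairwise coordinate differences
    dist = np.sqrt((diff ** 2).sum(axis=-1))  # Euclidean pair distances
    A = (dist <= r0).astype(int)              # adjacency matrix
    np.fill_diagonal(A, 0)                    # no self-loops
    return X, A

# For uniform coordinates E[(U-V)^2] = 1/6, so squared distances concentrate
# around d/6 and r0 must scale like sqrt(d) for the graph to be non-trivial.
X, A = sample_rgg(n=50, d=100, r0=np.sqrt(100 / 6), seed=0)
print("edges:", A.sum() // 2)
```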
We are interested in computing the probability of a graph $G$ with adjacency matrix $A=\{a_{ij}\}_{i,j=1}^n$:
\begin{equation}
\mathbb{P}(G) = \int_{[0,D]^{\binom{n}{2}}} f_{\vec{R}}(\vec{r})\prod_{i<j} \left(a_{ij}\mathbb{I}(r_{ij}\leq r_0)+(1-a_{ij})\mathbb{I}(r_{ij}>r_0)\right)d\vec{r}
\end{equation}
where $D$ is the diameter of (largest possible distance between two points in) $\Omega$, $f_{\vec{R}}$ is the joint density of pair distances in $\Omega$ (which is well defined if $d>n+2$), and $d\vec{r}=\prod_{i<j}dr_{ij}$. Clearly, this is intractable for most choices of $\Omega$ and $n$. However, we can simplify things if we take the limit $d\rightarrow\infty$. An \textit{ensemble} $\mathcal{G}$ of RGGs is the set of all possible RGGs that can be constructed with a fixed $r_0, n, \Omega, \nu$ and $\rho$. A sequence of ensembles $\{\mathcal{G}_d\}$ (with all parameters except $n$ dependent on $d$) equipped with probability measures $\mathbb{P}_d$ \textit{converges in distribution} to another ensemble $\mathcal{G}$ as $d\rightarrow\infty$ if
\begin{equation}
\mathbb{P}_d(g) \rightarrow \mathbb{P}(g)
\end{equation}
for all graphs $g$ with $n$ nodes.
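For very small $n$, the measure $\mathbb{P}_d$ can be estimated by brute-force Monte Carlo: sample many point configurations, record the labelled graph each one produces, and tally frequencies. A minimal sketch under the same uniform-cube setup as above (all parameters illustrative):

```python
import numpy as np
from collections import Counter

def graph_distribution(n, d, r0, trials=20000, seed=0):
    """Estimate P_d(g) for every labelled graph g on n nodes by simulation."""
    rng = np.random.default_rng(seed)
    iu = np.triu_indices(n, k=1)
    counts = Counter()
    for _ in range(trials):
        X = rng.random((n, d))
        diff = X[:, None, :] - X[None, :, :]
        dist = np.sqrt((diff ** 2).sum(-1))
        counts[tuple((dist[iu] <= r0).astype(int))] += 1  # upper-triangle edge pattern
    return {g: c / trials for g, c in counts.items()}

# n = 3 gives 2^3 = 8 labelled graphs, keyed by their edge patterns.
for g, p in sorted(graph_distribution(3, d=50, r0=np.sqrt(50 / 6)).items()):
    print(g, round(p, 3))
```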
Central Limit Theorem for High-Dimensional Distance
We can use a central limit theorem (CLT) to prove that the vector of suitably normalised pair distances in $\Omega$ converges in distribution to a multivariate Gaussian as $d\rightarrow\infty$; this generalises the approach of [2].
Theorem 1:
Let $(\Omega, \rho)$ be a metric space, let $\nu$ be a node density as described above, and let $X_1,…,X_n$ be random vectors distributed according to $\nu$. Define $R_{ij}^{(k)} = \rho(X_i^{(k)}, X_j^{(k)})^2$, the squared distance between $X_i$ and $X_j$ in coordinate $k$, and $\mu := \mathbb{E}[R_{12}^{(1)}]$. Let
\begin{equation}
q_{ij} := \frac{1}{\sqrt{d}}\sum_{k=1}^d (R_{ij}^{(k)}-\mu)
\end{equation}
Then, as $d\rightarrow\infty$ the vector $\vec{q} = \{q_{ij}\}_{1\leq i<j\leq n} \in \mathbb{R}^{\binom{n}{2}}$ satisfies
\begin{equation}
\vec{q} \rightarrow Z \sim N(0_{\binom{n}{2}},\Sigma)
\end{equation}
where $\Sigma$ is the covariance matrix indexed by multi-indices $(i,j)$ given by $\Sigma_{(i,j),(i,j)} = \alpha := \mathbb{E}[(\rho(X_i^{(1)},X_j^{(1)})^2-\mu)^2]$, $\Sigma_{(i,j),(j,k)} = \beta := \mathbb{E}[(\rho(X_i^{(1)},X_j^{(1)})^2-\mu)(\rho(X_j^{(1)},X_k^{(1)})^2-\mu)]$, and $\Sigma_{(i,j),(k,l)} = 0$ whenever $i,j,k,l$ are all distinct.
Essentially, this means that as $d\rightarrow\infty$, we can replace the intractable joint density $f_{\vec{R}}$ with a multivariate Gaussian density, which makes our calculations much easier.
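Theorem 1 is easy to check numerically. A sketch for the uniform cube, where $\mu = 1/6$ and (by a direct moment calculation) $\alpha = 7/180$ and $\beta = 1/180$: estimate the variance of $q_{12}$ and its covariance with $q_{13}$, which shares node 1.

```python
import numpy as np

def q_samples(d, trials=10000, seed=0):
    """Sample the pair (q_12, q_13) for three i.i.d. uniform points in [0,1]^d."""
    rng = np.random.default_rng(seed)
    X = rng.random((trials, 3, d))
    mu = 1 / 6                          # E[(U-V)^2] for U, V ~ U[0,1]
    R12 = (X[:, 0] - X[:, 1]) ** 2      # per-coordinate squared distances
    R13 = (X[:, 0] - X[:, 2]) ** 2
    q12 = (R12 - mu).sum(axis=1) / np.sqrt(d)
    q13 = (R13 - mu).sum(axis=1) / np.sqrt(d)
    return q12, q13

q12, q13 = q_samples(d=200)
print("mean (should be ~0):     ", q12.mean())
print("alpha (should be ~7/180):", q12.var())
print("beta (should be ~1/180): ", np.cov(q12, q13)[0, 1])
```

Note that $\beta \neq 0$ here: for the uniform cube the limiting Gaussian has correlated components, which foreshadows Theorem 2 below.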
Erdős-Rényi Random Graphs
The Erdős-Rényi (ER) random graph, also denoted $G(n,p)$, is arguably the simplest model of a random graph. We take $n$ nodes, and connect each pair with fixed probability $p$. The probability of an ER graph $G$ is given by the binomial probability
\begin{equation}
\mathbb{P}(G) = p^{k}(1-p)^{\binom{n}{2}-k}
\end{equation}
where $k = \sum_{i<j} a_{ij}$ is the number of edges in $G$. Clearly, this is a non-spatial network model, which is not good for modelling networks with a latent space structure! Therefore, we would like to guarantee that our model does not converge to $G(n,p)$ as $d\rightarrow\infty$; otherwise we would lose information about the spatial or latent space correlation of our data.
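Sampling from $G(n,p)$ and evaluating this binomial probability takes only a few lines; a sketch for later comparison (numpy; parameters illustrative):

```python
import numpy as np
from math import comb

def sample_er(n, p, seed=0):
    """Sample G(n,p): each of the C(n,2) possible edges appears independently."""
    rng = np.random.default_rng(seed)
    A = np.zeros((n, n), dtype=int)
    iu = np.triu_indices(n, k=1)
    A[iu] = rng.random(len(iu[0])) < p
    return A + A.T                        # symmetrise the upper triangle

def er_probability(A, p):
    """P(G) = p^k (1 - p)^(C(n,2) - k), with k the number of edges."""
    n, k = A.shape[0], A.sum() // 2
    return p ** k * (1 - p) ** (comb(n, 2) - k)

A = sample_er(10, 0.5)
print(er_probability(A, 0.5))             # every graph has probability 2^{-45}
```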
When Does the RGG Converge to $G(n,p)$?
For the main results, we will provide a condition on the distribution of nodes in $\Omega$ under which we see convergence in distribution to $G(n,p)$. For an RGG with connection range $r_0$ in the metric space $(\Omega, \rho)$ and $\mu = \mathbb{E}[\rho(X_i^{(1)},X_j^{(1)})^2]$, define the \textit{normalised connection range} $t = \frac{r_0^2}{\sqrt{d}} - \mu \sqrt{d}$. Recall that the probability of a graph is given by
\begin{equation}
\mathbb{P}(G) = \int_{[0,D]^{\binom{n}{2}}} f_{\vec{R}}(\vec{r})\prod_{i<j} \left(a_{ij}\mathbb{I}(r_{ij}\leq r_0)+(1-a_{ij})\mathbb{I}(r_{ij}>r_0)\right)d\vec{r}
\end{equation}
If the random distances converge to a Gaussian, then we have (after some algebra)
\begin{equation}
\mathbb{P}(G) \rightarrow \int_{\mathcal{A}} N(0,\Sigma)(\vec{q})d\vec{q}
\end{equation}
where $\mathcal{A} = \bigotimes_{i<j} A_{ij}$, with $\bigotimes$ denoting the Cartesian product of sets, and $A_{ij}$ is the set:
\begin{equation}
A_{ij} := \begin{cases}
(-\infty, t] & a_{ij} =1 \\
(t,\infty) & a_{ij} = 0
\end{cases}
\end{equation}
If $\Sigma$ is diagonal, then the integral above splits up into its marginals, and
\begin{equation}
\mathbb{P}(G) \rightarrow \bar{p}(t)^k(1-\bar{p}(t))^{\binom{n}{2}-k}
\end{equation}
with $\bar{p}(t) := \int_{-\infty}^t N(0, \alpha)(q)dq$, where $\alpha$ is the common diagonal entry of $\Sigma$ and $k$ is the number of edges as before. Note this is exactly the probability of a graph in $G(n, \bar{p}(t))$. If there are non-zero off-diagonal elements, then there are correlations between edges, and we do not have convergence to $G(n,p)$.
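Although $\Sigma$ is not diagonal for the uniform cube, the \textit{marginal} probability of any single edge still converges to $\bar{p}(t) = \Phi(t/\sqrt{\alpha})$, with $\Phi$ the standard normal CDF. A sketch comparing this prediction with the empirical edge density of one high-dimensional sample (uniform cube, so $\mu = 1/6$ and $\alpha = 7/180$ as before; parameters illustrative):

```python
import numpy as np
from math import erf, sqrt

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

mu, alpha = 1 / 6, 7 / 180        # E[(U-V)^2] and Var[(U-V)^2] for U, V ~ U[0,1]
d, n, t = 1000, 100, 0.1          # t is the normalised connection range
r0_sq = mu * d + t * sqrt(d)      # invert t = (r0^2 - mu*d)/sqrt(d)

rng = np.random.default_rng(0)
X = rng.random((n, d))
sq = (X ** 2).sum(axis=1)
dist2 = sq[:, None] + sq[None, :] - 2 * X @ X.T   # squared pair distances
pairs = dist2[np.triu_indices(n, k=1)]

print("empirical edge density:      ", (pairs <= r0_sq).mean())
print("Gaussian prediction p_bar(t):", phi(t / sqrt(alpha)))
```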
The RGG in $[0,1]^d$
Suppose now that $(\Omega, \rho) = ([0,1]^d, \|\cdot\|)$. We will need our node distribution $\nu$ to be of the form we described earlier, where the marginals $\pi$ have a kurtosis greater than 1. It can be shown that the only distributions with unit kurtosis are the Bernoulli distribution with parameter $1/2$, and constant distributions.
Theorem 2 [1]
Suppose $\mathcal{G}$ is an ensemble of RGGs in $[0,1]^d$ with nodes distributed according to $\nu$, then, provided the kurtosis of the marginals $\pi$ is greater than 1, $\mathcal{G}$ does not converge to the ER ensemble as $d\rightarrow\infty$.
Sketch Proof:
The proof is direct. We set $\beta = 0$, which for the Euclidean distance metric means (after some rearranging),
\begin{equation}
\mathbb{E}[(X_i-\mu)^4] - \mathbb{E}[(X_i-\mu)^2]^2 = 0
\end{equation}
or equivalently, the kurtosis of the coordinate distribution is 1 (here $\mu$ denotes the mean of a single coordinate $X_i$, not the mean squared distance).
This means that, for any ‘sufficiently nice’ distribution of nodes, we do not converge to $G(n,p)$, and we maintain spatial properties which can be exploited in data analysis.
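The relevant kurtosis convention is $\mathbb{E}[(X-m)^4]/\mathbb{E}[(X-m)^2]^2$ (no subtraction of 3), so Bernoulli($1/2$) sits exactly at the boundary value 1. A quick empirical check on a few example marginals:

```python
import numpy as np

def kurtosis(x):
    """Kurtosis E[(X-m)^4] / E[(X-m)^2]^2, estimated from samples."""
    c = x - x.mean()
    return (c ** 4).mean() / (c ** 2).mean() ** 2

rng = np.random.default_rng(0)
N = 10 ** 6
marginals = {
    "Bernoulli(1/2)": rng.integers(0, 2, N).astype(float),  # kurtosis exactly 1
    "Uniform[0,1]":   rng.random(N),                        # kurtosis 9/5
    "Beta(5,5)":      rng.beta(5, 5, N),                    # another density on [0,1]
}
for name, sample in marginals.items():
    print(f"{name:>15}: {kurtosis(sample):.3f}")
```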
The RGG in $\mathbb{T}^d$
In the torus, the following theorem shows that if the distribution of nodes is uniform, then we will in fact see convergence to $G(n,p)$. However, for any other distribution, we maintain the spatial correlation.
Theorem 3 [1]
Let $\mathcal{G}$ be an ensemble of RGGs on $\mathbb{T}^d$ with nodes distributed according to $\nu$. Then as $d\rightarrow\infty$, $\mathcal{G}$ converges in distribution to $G(n,p)$ if and only if $\nu$ is the uniform distribution.
Sketch Proof
The tactic for the proof is the same as for the cube. We will find a condition for which $\beta = 0$. In the torus, we have
\begin{equation}
\beta = 0 \iff \mathbb{E}_X[(\mathbb{E}_Y[\rho_t(X,Y)^2])^2] = \mathbb{E}_X[\mathbb{E}_Y[\rho_t(X,Y)^2]]^2
\end{equation}
From this we can deduce that $\mathbb{E}_Y[\rho_t(x,Y)^2]$ is constant $\pi$-almost-everywhere. This implies
\begin{equation}
\int_0^1 \rho_t(x,y)^2\pi(y)dy = \mu
\end{equation}
for $\pi$-almost every $x$. We can rewrite the left hand side above as the convolution of two periodic functions, so taking a Fourier transform of both sides simplifies the problem. By equating Fourier modes, we find that the Fourier transform $\hat{\pi}$ of $\pi$ must be zero everywhere except at $0$. This means that the original function $\pi$ must be constant on $[0,1]$.
So, if we are using a toroidal distance metric, then if our data is uniformly distributed, we will lose the latent space embedding as $d\rightarrow\infty$.
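This dichotomy is easy to see numerically. A sketch estimating the edge-correlation parameter $\beta = \mathrm{Cov}\left(\rho_t(X_1,X_2)^2, \rho_t(X_2,X_3)^2\right)$ for a single toroidal coordinate, comparing the uniform marginal with an (arbitrarily chosen) Beta$(2,2)$ marginal:

```python
import numpy as np

def torus_dist2(x, y):
    """Squared toroidal distance on the circle [0,1)."""
    diff = np.abs(x - y)
    return np.minimum(diff, 1 - diff) ** 2

def beta_estimate(sampler, N=10 ** 6, seed=0):
    """Estimate beta = Cov(rho_t(X1,X2)^2, rho_t(X2,X3)^2), X1,X2,X3 i.i.d. ~ pi."""
    rng = np.random.default_rng(seed)
    x1, x2, x3 = sampler(rng, N), sampler(rng, N), sampler(rng, N)
    return np.cov(torus_dist2(x1, x2), torus_dist2(x2, x3))[0, 1]

uniform = lambda rng, N: rng.random(N)
beta22 = lambda rng, N: rng.beta(2, 2, N)       # an example non-uniform marginal

print("beta, uniform marginal:  ", beta_estimate(uniform))   # ~ 0: edges decorrelate
print("beta, Beta(2,2) marginal:", beta_estimate(beta22))    # > 0: geometry survives
```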
Example
Here we plot the distribution of edge counts in the high-dimension limit for RGGs with uniformly (Figure 2) and Gaussian (Figure 3) distributed nodes, to illustrate the difference that changing the node distribution can make.
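A sketch along the lines used to generate such figures (I am assuming here that the Gaussian nodes are wrapped onto the torus with an illustrative spread of $0.15$, and I summarise the edge-count distribution by its mean and standard deviation rather than a full histogram). With $t = 0$ the limiting edge probability is $1/2$, so convergence to $G(n,p)$ predicts a Binomial$(\binom{n}{2}, 1/2)$ edge count; correlated edges inflate the spread.

```python
import numpy as np
from math import comb, sqrt

def pair_dist2(X):
    """Pairwise squared toroidal distances between the rows of X in [0,1)^d."""
    diff = np.abs(X[:, None, :] - X[None, :, :])
    return (np.minimum(diff, 1 - diff) ** 2).sum(-1)

def edge_count_stats(sampler, n=30, d=400, graphs=1000, seed=0):
    """Mean and std of the edge count over many sampled RGGs, at t = 0."""
    rng = np.random.default_rng(seed)
    u, v = sampler(rng, 10 ** 6), sampler(rng, 10 ** 6)
    w = np.minimum(np.abs(u - v), 1 - np.abs(u - v))
    r0_sq = (w ** 2).mean() * d              # estimate mu, then set r0^2 = mu*d
    iu = np.triu_indices(n, k=1)
    ks = [(pair_dist2(sampler(rng, (n, d)))[iu] <= r0_sq).sum()
          for _ in range(graphs)]
    return float(np.mean(ks)), float(np.std(ks))

uniform = lambda rng, shape: rng.random(shape)
wrapped_gauss = lambda rng, shape: rng.normal(0.5, 0.15, shape) % 1.0  # assumed setup

m = comb(30, 2)
print("binomial prediction: mean", m / 2, "std", round(sqrt(m) / 2, 2))
print("uniform nodes:      ", edge_count_stats(uniform))
print("wrapped Gaussian:   ", edge_count_stats(wrapped_gauss))
```

For uniform nodes the spread should approach the binomial value as $d$ grows, while for the wrapped Gaussian it stays strictly larger, reflecting the surviving edge correlations.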


Conclusion
In this blog post, we have defined the random geometric graph (RGG) with general node distributions, and proved that most of the time, the spatial correlations between edges are preserved as the dimension of the underlying geometry tends to $\infty$. In the $d$-cube, geometry is preserved as long as the distribution of our nodes is neither constant nor Bernoulli with parameter $1/2$, and in the $d$-torus, geometry is preserved provided our distribution is not uniform. The result for the torus is especially interesting, since it challenges ideas in the literature about how we should model high-dimensional RGGs. In real-world data, the coordinates are unlikely to be uniformly distributed, yet the majority of theoretical high-dimensional random geometric graph studies use uniform distributions on the torus; this is precisely the one case in which the torus behaves like a $G(n,p)$ graph. For a more in-depth explanation of this work, and extensions to the concepts of graph entropy, see our recent preprint [1].
References
[1] O. Baker and C. P. Dettmann. “Entropy of Random Geometric Graphs in High and Low Dimensions”. arXiv preprint arXiv:2503.11418 (2025).
[2] V. Erba et al. “Random geometric graphs in high dimension”. Phys. Rev. E 102.1 (2020), 012306.
[3] E. N. Gilbert. “Random Plane Networks”. Journal of the Society for Industrial and Applied Mathematics 9.4 (1961), 533–543.