A post by Rachel Wood, PhD student on the Compass programme.

# Introduction

As our online lives expand, more data is collected than we can reasonably consider at once. Much of it is sparse and noisy, so we need methods that can recover the information encoded in it. Networks are one example of this kind of dataset. In this blog post, I explain how we can use such methods to identify changes between networks observing the same subjects (e.g. snapshots of the same graph over time).

# Problem Set-Up

We consider two undirected graphs, represented by their adjacency matrices $\mathbf{A}^{(1)}, \mathbf{A}^{(2)} \in \{0,1\}^{n \times n}$. As we can see below, there are two clusters (pink nodes form one, the yellow and blue nodes form another) in the first graph but in the second graph the blue nodes change behaviour to become a distinct third cluster.

Our question becomes, how can we detect this change without prior knowledge of the labels?

We could simply look at the adjacency matrices, but these are often sparse, noisy and computationally expensive to work with. Using dimensionality reduction, we can “denoise” the matrices to obtain a $d$-dimensional latent representation of each node, which provides a natural measure of node behaviour and a simple space in which to measure change.

# Graph Embeddings

There is an extensive body of research investigating graph embeddings; here we will focus on spectral methods.

Specifically we will compare the approaches of Unfolded Adjacency Spectral Embedding (UASE) presented in [1] and CLARITY presented in [2]. Both of these are explained in more detail below.

## UASE

UASE takes as input the unfolded adjacency matrix $\mathbf{A} = \left[ \mathbf{A}^{(1)}\big| \mathbf{A}^{(2)}\right] \in \{0,1\}^{n \times 2n}$ and performs a $d$-dimensional truncated SVD [3] to obtain a $d$-dimensional static and a $d$-dimensional dynamic representation.

Mathematically we can write this as:

\begin{equation*}

\mathbf{A} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^T = \mathbf{U}_{\mathbf{A}} \boldsymbol{\Sigma}_{\mathbf{A}} \mathbf{V}_{\mathbf{A}}^T + \mathbf{U}_{\perp} \boldsymbol{\Sigma}_{\perp} \mathbf{V}_{\perp}^T \approx \mathbf{U}_{\mathbf{A}} \boldsymbol{\Sigma}_{\mathbf{A}} \mathbf{V}_{\mathbf{A}}^T = \mathbf{X} \mathbf{Y}^T

\end{equation*}

where $\mathbf{U}_{\mathbf{A}}, \mathbf{V}_{\mathbf{A}}$ are the first $d$ columns of $\mathbf{U}$ and $\mathbf{V}$ respectively and $\boldsymbol{\Sigma}_{\mathbf{A}}$ is the diagonal matrix which forms the $d \times d$ upper left block of $\boldsymbol{\Sigma}$. This gives a static embedding $\mathbf{X} \in \mathbb{R}^{n \times d}$ and a time-evolving embedding $\mathbf{Y} \in \mathbb{R}^{2n \times d}$.
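The decomposition above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation from [1]: the function name `uase` is hypothetical, and splitting $\boldsymbol{\Sigma}_{\mathbf{A}}$ evenly between the two factors (so that $\mathbf{X} = \mathbf{U}_{\mathbf{A}} \boldsymbol{\Sigma}_{\mathbf{A}}^{1/2}$ and $\mathbf{Y} = \mathbf{V}_{\mathbf{A}} \boldsymbol{\Sigma}_{\mathbf{A}}^{1/2}$) is an assumption, since the equation only fixes the product $\mathbf{X}\mathbf{Y}^T$.

```python
import numpy as np


def uase(adjacency_matrices, d):
    """Sketch of UASE: truncated SVD of the unfolded matrix [A1 | A2 | ...].

    Returns the static embedding X (n x d) and a list with one dynamic
    embedding (n x d) per input graph.
    """
    n = adjacency_matrices[0].shape[0]
    # Unfold: place the n x n matrices side by side (n rows, T*n columns).
    A = np.hstack(adjacency_matrices)
    # np.linalg.svd returns singular values sorted in decreasing order.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_d, s_d, V_d = U[:, :d], s[:d], Vt[:d, :].T
    # Assumption: share the singular values equally between X and Y.
    X = U_d * np.sqrt(s_d)
    Y = V_d * np.sqrt(s_d)
    # Split the (T*n) x d dynamic embedding into one block per graph.
    Ys = [Y[t * n:(t + 1) * n] for t in range(len(adjacency_matrices))]
    return X, Ys
```

Calling `X, (Y1, Y2) = uase([A1, A2], d=2)` on two adjacency matrices would give one two-dimensional position per node in each graph, which is what the scatter plots of the embedding show.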

The general approach in the UASE literature is to measure change by comparing latent positions, which is supported by [4]. This paper gives a theoretical demonstration of longitudinal and cross-sectional stability in UASE, i.e. if node $i$ at time $s$ and node $j$ at time $t$ behave similarly, their latent positions should be approximately the same: $\hat Y_i^{(s)} \approx \hat Y_j^{(t)}$.
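As a rough sketch of what "comparing latent positions" could look like in practice, we can score each node by how far its embedding moves between the two time points. The function name `latent_position_change` is hypothetical and the Euclidean distance is an assumption; other distances could be used in the same way.

```python
import numpy as np


def latent_position_change(Y1, Y2):
    """Euclidean distance between each node's latent position at two times.

    Y1 and Y2 are (n x d) arrays whose i-th rows embed the same node.
    """
    Y1 = np.asarray(Y1, dtype=float)
    Y2 = np.asarray(Y2, dtype=float)
    # One distance per node (row); larger values suggest more change.
    return np.linalg.norm(Y1 - Y2, axis=1)
```

A node whose behaviour is stable should receive a score close to zero, while a node that changes cluster should receive a larger one.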

Going back to our example graphs, we apply UASE to the unfolded adjacency matrix and visualise the first two dimensions of the embedding for each of the graphs:

As we can see above, the pink nodes have retained their positions, the yellow nodes have moved a little and the blue nodes have moved the most.

## CLARITY

Clarity takes a different approach: it estimates $\mathbf{A}^{(2)}$ from $\mathbf{A}^{(1)}$. An illustration of how this is done is shown below:

Again we provide a mathematical explanation of the method. First we perform a $d$-dimensional truncated eigendecomposition on $\mathbf{A}^{(1)}$:

\begin{equation*}

\mathbf{A}^{(1)} = \mathbf{U}^{(1)} \boldsymbol{\Sigma}^{(1)} \mathbf{U}^{(1)T} + \mathbf{U}_{\perp} \boldsymbol{\Sigma}_{\perp} \mathbf{U}_{\perp}^T \approx \mathbf{U}^{(1)} \boldsymbol{\Sigma}^{(1)} \mathbf{U}^{(1)T} = \hat{\mathbf{A}}^{(1)}

\end{equation*}

where $\mathbf{U}^{(1)} \in \mathbb{R}^{n \times d}$ is a matrix of the first $d$ eigenvectors and $\boldsymbol{\Sigma}^{(1)} \in \mathbb{R}^{d \times d}$ is a diagonal matrix with the first $d$ eigenvalues.

Then we estimate $\mathbf{A}^{(2)}$ as

\begin{equation*}

\hat{\mathbf{A}}^{(2)} = \mathbf{U}^{(1)} \boldsymbol{\Sigma }^{(2)} \mathbf{U}^{(1)T} \hspace{1cm} \text{where} \hspace{1cm} \boldsymbol{\Sigma}^{(2)} = \mathbf{U}^{(1)T} \mathbf{A}^{(2)} \mathbf{U}^{(1)}

\end{equation*}
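The two steps above can be sketched as follows. This is an illustrative NumPy version, not the reference implementation from [2]: the function name `clarity_estimate` is hypothetical, and taking the "first $d$" eigenvalues to mean the $d$ largest in absolute value is an assumption.

```python
import numpy as np


def clarity_estimate(A1, A2, d):
    """Sketch of the Clarity estimates of A1 and A2 using A1's eigenvectors."""
    # eigh is appropriate because an undirected graph has a symmetric matrix.
    evals, evecs = np.linalg.eigh(np.asarray(A1, dtype=float))
    # Assumption: keep the d eigenvalues that are largest in absolute value.
    top = np.argsort(np.abs(evals))[::-1][:d]
    U1 = evecs[:, top]            # n x d
    Sigma1 = np.diag(evals[top])  # d x d, diagonal
    # Sigma2 = U1^T A2 U1 is d x d but not diagonal in general.
    Sigma2 = U1.T @ np.asarray(A2, dtype=float) @ U1
    A1_hat = U1 @ Sigma1 @ U1.T
    A2_hat = U1 @ Sigma2 @ U1.T
    return A1_hat, A2_hat
```

Because $\boldsymbol{\Sigma}^{(2)}$ is free to take any values, the estimate can change how strongly the structures found in $\mathbf{A}^{(1)}$ relate to one another, but it cannot introduce structure outside the span of $\mathbf{U}^{(1)}$.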

As opposed to UASE, Clarity examines change between $\mathbf{A}^{(1)}$ and $\mathbf{A}^{(2)}$ through quantities called persistences. These are defined as

\begin{equation*}

\mathbf{P}_i = \sum_{j =1}^{n}\left( \mathbf{A}_{ij}^{(2)} -\hat{\mathbf{A}}_{ij}^{(2)} \right)

\end{equation*}

The intuition here is that the persistences will capture structure in $\mathbf{A}^{(2)}$ that is not present in or explained by $\mathbf{A}^{(1)}$.
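Given $\mathbf{A}^{(2)}$ and its estimate, the persistences as defined above are just row sums of the residual matrix. A minimal sketch, with the hypothetical function name `persistence`:

```python
import numpy as np


def persistence(A2, A2_hat):
    """Row sums of the residuals A2 - A2_hat, one value per node."""
    residuals = np.asarray(A2, dtype=float) - np.asarray(A2_hat, dtype=float)
    # P_i = sum over j of (A2_ij - A2_hat_ij).
    return residuals.sum(axis=1)
```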

Returning to our example problem, we can see heatmaps of $\mathbf{A}^{(1)}$ and $\mathbf{A}^{(2)}$ alongside their Clarity estimates:

Looking at the figure above, we can see that the Clarity estimate of ${\mathbf{A}^{(2)}}$ does not capture the third cluster that appears in the second graph, so the persistences should identify these nodes as anomalies.

## Comparison

We can use receiver operating characteristic (ROC) curves to assess the success of our two methods. Given a score (in our case either the distance between latent positions or the persistences), a ROC curve plots the true positive rate against the false positive rate for a sequence of thresholds. We can see the ROC curves below for $d = 2,3,4,5,6$.
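To make the construction concrete, here is a small sketch of how the points of a ROC curve could be computed from per-node scores and ground-truth labels of which nodes changed. The function names `roc_curve_points` and `area_under_curve` are hypothetical, and for simplicity each node's score is used as its own threshold, so tied scores are not grouped.

```python
import numpy as np


def roc_curve_points(scores, is_changed):
    """False and true positive rates as the threshold is lowered node by node."""
    scores = np.asarray(scores, dtype=float)
    is_changed = np.asarray(is_changed, dtype=bool)
    # Sort nodes from highest to lowest score.
    labels = is_changed[np.argsort(-scores)]
    true_pos = np.cumsum(labels)
    false_pos = np.cumsum(~labels)
    # Prepend the (0, 0) point, where no node is flagged as changed.
    tpr = np.concatenate([[0.0], true_pos / labels.sum()])
    fpr = np.concatenate([[0.0], false_pos / (~labels).sum()])
    return fpr, tpr


def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))
```

A score that ranks every changed node above every unchanged node gives an area of 1, while a score that ranks them at random gives an area of about 0.5.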

We can see that in lower dimensions UASE outperforms Clarity, but its performance degrades as the dimension increases. This is a common problem in real-world applications, where the best choice of $d$ is unknown. Clarity, on the other hand, does not have the same power as UASE but is more robust to the choice of dimension. Another difference between the two methods is that, by allowing changes in relationship in the model, Clarity is designed to cope with the entire graph changing a little bit.

# Conclusion

We have now introduced two methods for identifying change and compared their performance in a simple example. One method produces stronger results overall but is much more sensitive to the choice of dimension than the other. My current research looks to investigate why Clarity succeeds in this area when many other methods fail, with the ultimate goal of using this knowledge to modify more powerful methods to also have this feature.

[1] Jones, A., & Rubin-Delanchy, P. (2020). The multilayer random dot product graph. *arXiv preprint arXiv:2007.10455*.

[2] Lawson, D. J., Solanki, V., Yanovich, I., Dellert, J., Ruck, D., & Endicott, P. (2021). CLARITY: comparing heterogeneous data using dissimilarity. *Royal Society Open Science*, *8*(12), 202182.

[3] Wikipedia contributors. (2024, June 11). Singular value decomposition. In *Wikipedia, The Free Encyclopedia*. Retrieved 09:54, July 1, 2024, from https://en.wikipedia.org/w/index.php?title=Singular_value_decomposition&oldid=1228566091

[4] Gallagher, I., Jones, A., & Rubin-Delanchy, P. (2021). Spectral embedding for dynamic networks with stability guarantees. *Advances in Neural Information Processing Systems*, *34*, 10158-10170.