Our third cohort of Compass students have confirmed their PhD projects for the next three years and are establishing the direction of their own research within the CDT. (more…)

# Category: Students

## Compass Conference 2022

Our first Compass Conference was held on Tuesday 13th September 2022, hosted in the newly refurbished Fry Building, home to the School of Mathematics. (more…)

## Compass student publishes article in Frontiers

Compass student Dan Milner and his academic supervisors have published an article in Frontiers, one of the largest and most-cited research publishers in the world. Dan’s work is funded in collaboration with ILRI (International Livestock Research Institute). (more…)

## Student Perspectives: An introduction to normalising flows

A post by Dan Ward, PhD student on the Compass programme.

Normalising flows are black-box approximators of continuous probability distributions that facilitate both efficient density evaluation and sampling. They function by learning a bijective transformation that maps between a complex target distribution and a simple distribution of matching dimension, such as a standard multivariate Gaussian distribution. (more…)
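As a toy illustration of the change-of-variables idea behind flows (not one of the models discussed in the post), here is a one-dimensional "flow" with a fixed affine bijection; the parameters `mu` and `sigma` are invented stand-ins for what a real flow would learn.

```python
import numpy as np

# A normalising flow pairs a bijection f with a simple base density p_Z.
# Density of x under the flow: p_X(x) = p_Z(f(x)) * |df/dx|  (change of variables)
# Sampling: draw z ~ p_Z and push it through the inverse bijection.

rng = np.random.default_rng(0)

# Hypothetical "learned" parameters of a 1-D affine bijection f(x) = (x - mu) / sigma.
mu, sigma = 2.0, 0.5

def base_log_density(z):
    """Standard normal log-density for the base distribution."""
    return -0.5 * (z**2 + np.log(2 * np.pi))

def flow_log_density(x):
    """log p_X(x) = log p_Z(f(x)) + log|df/dx|, and here df/dx = 1/sigma."""
    z = (x - mu) / sigma
    return base_log_density(z) - np.log(sigma)

def flow_sample(n):
    """Sample z from the base, then apply the inverse bijection x = mu + sigma * z."""
    z = rng.standard_normal(n)
    return mu + sigma * z

samples = flow_sample(10_000)
print(samples.mean(), samples.std())  # close to mu and sigma
```

A learned flow replaces the affine map with a deep, invertible network, but the two operations above (log-density via the Jacobian, sampling via the inverse) are exactly the ones it supports.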

## Student perspectives: Neural Point Processes for Statistical Seismology

A post by Sam Stockman, PhD student on the Compass programme.

### Introduction

Throughout my PhD I aim to bridge a gap between advances made in the machine learning community and the age-old problem of earthquake forecasting. In this cross-disciplinary work with Max Werner from the School of Earth Sciences and Dan Lawson from the School of Mathematics, I hope to create more powerful, efficient and robust models for forecasting, that can make earthquake prone areas safer for their inhabitants.

For years seismologists have sought to model the structure and dynamics of the earth in order to make predictions about earthquakes. They have mapped out the structure of fault lines and conducted laboratory experiments in which they subject rock to great forces in order to simulate plate tectonics on a small scale. Yet when trying to forecast earthquakes on a short time scale (that’s hours and days, not tens of years), these models based on knowledge of the underlying physics are regularly outperformed by statistically motivated models. In statistical seismology we seek to make predictions by looking at the distributions of earthquake times, locations and magnitudes, and using them to forecast the future.

## Ed Davis wins poster competition

Congratulations to Ed Davis who won a poster award as part of the Jean Golding Institute’s Beauty of Data competition.

*This visualisation, entitled “The World Stage”, gives a new way of representing the positions of countries. Instead of placing them based on their geographical position, they have been placed based on their geopolitical alliances. Countries have been positioned so as to minimise the distance to their allies and maximise the distance to their non-allies, based on 40 different alliances involving 161 countries. This representation was achieved by embedding the alliance network using Node2Vec, followed by principal component analysis (PCA) to reduce it to 2D.*
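To illustrate the embed-then-project pipeline behind the poster, here is a small NumPy sketch. The "alliance network" below is invented, and a spectral embedding of the adjacency matrix stands in for Node2Vec (which would need an external package); the PCA step matches the poster's final reduction to 2D.

```python
import numpy as np

# Toy "alliance network": entry (i, j) is 1 if two countries share an alliance.
# The country labels and ties are illustrative only, not the poster's data.
countries = ["A", "B", "C", "D", "E"]
adj = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 1, 1, 0],
], dtype=float)

# Stand-in node embedding: top eigenvectors of the adjacency matrix, scaled
# by their eigenvalues (the poster used Node2Vec; this plays the same role).
vals, vecs = np.linalg.eigh(adj)
embedding = vecs[:, -3:] * vals[-3:]  # 3-D embedding per country

# PCA to 2-D: centre the embedding, then project onto the top two
# principal components found by the SVD.
centred = embedding - embedding.mean(axis=0)
_, _, vt = np.linalg.svd(centred, full_matrices=False)
coords_2d = centred @ vt[:2].T

for name, (px, py) in zip(countries, coords_2d):
    print(f"{name}: ({px:+.2f}, {py:+.2f})")
```

The 2-D coordinates can then be plotted directly, with allied countries tending to land near one another.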

## Student perspectives: Sampling from Gaussian Processes

A post by Anthony Stephenson, PhD student on the Compass programme.

### Introduction

The general focus of my PhD research is in some sense to produce models with the following characteristics:

- Well-calibrated (uncertainty estimates from the predictive process reflect the true variance of the target values)
- Non-linear
- Scalable (i.e. we can run it on large datasets)

At a high level, we can achieve any two of those three requirements without *too* much difficulty, but adding the third causes trouble. For example, Bayesian linear models are well-calibrated and scalable but (as the name suggests) fail at modelling non-linear functions. Similarly, neural networks are famously good at modelling non-linear functions, and much work has gone into improving their efficiency and scalability, but producing well-calibrated predictions is a complex additional feature. I am approaching the problem from the angle of Gaussian Processes, which provide well-calibrated non-linear models, at the expense of scalability.

### Gaussian Processes (GPs)

See Conor’s blog post for a more detailed introduction to GPs; here I will provide a basic summary of the key facts we need for the rest of the post.

The functional view of GPs is that we define a distribution over functions:

$$f(\cdot) \sim \mathcal{GP}\left(m(\cdot),\, k(\cdot, \cdot)\right)$$

where $m(\cdot)$ and $k(\cdot,\cdot)$ are the mean function and kernel function respectively, which play analogous roles to the usual mean and covariance of a Gaussian distribution.

In practice, we only ever observe some finite collection of points, corrupted by noise, which (taking $m(\cdot) = 0$ for simplicity, as is standard) we can hence view as a draw from some multivariate normal distribution:

$$\mathbf{y}_n \sim \mathcal{N}\left(\mathbf{0}_n,\, K_{nn} + \sigma^2 I_n\right)$$

where

$$\mathbf{y}_n = f(X_n) + \boldsymbol{\varepsilon}_n$$

with $\varepsilon_i \sim \mathcal{N}(0, \sigma^2)$.

(Here the subscript $n$ denotes the dimensionality of the vector or matrix.)

When we use GPs to generate predictions at some new test point $\mathbf{x}_*$, we use the following equations, which I will not derive here (see [1]), for the predicted mean and variance respectively:

$$\mu(\mathbf{x}_*) = \mathbf{k}_*^\top \left(K_{nn} + \sigma^2 I_n\right)^{-1} \mathbf{y}_n$$

$$\sigma^2(\mathbf{x}_*) = k(\mathbf{x}_*, \mathbf{x}_*) - \mathbf{k}_*^\top \left(K_{nn} + \sigma^2 I_n\right)^{-1} \mathbf{k}_*$$

where $\mathbf{k}_* = k(X_n, \mathbf{x}_*)$. The key point here is that both predictive equations involve the inversion of an $n \times n$ matrix at a cost of $\mathcal{O}(n^3)$.
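As a concrete illustration, here is a minimal NumPy sketch of GP prediction with a squared-exponential kernel; the data, kernel parameters and test point are invented for the example, and the $\mathcal{O}(n^3)$ cost lives in the `np.linalg.solve` calls on the $n \times n$ matrix.

```python
import numpy as np

rng = np.random.default_rng(1)

def rbf(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel k(x, x') = v * exp(-(x - x')^2 / (2 l^2))."""
    d = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

# Noisy observations of an invented latent function (n = 50 points).
x = np.linspace(0, 5, 50)
noise = 0.1
y = np.sin(x) + noise * rng.standard_normal(x.size)

# Build the n x n matrix K + sigma^2 I; solving against it is the O(n^3) step.
K = rbf(x, x) + noise**2 * np.eye(x.size)
x_star = np.array([2.5])
k_star = rbf(x, x_star)                     # vector k_* of shape (n, 1)

alpha = np.linalg.solve(K, y)               # (K + sigma^2 I)^{-1} y
mean_star = k_star.T @ alpha                # predictive mean at x_*
var_star = rbf(x_star, x_star) - k_star.T @ np.linalg.solve(K, k_star)

print(mean_star, var_star)                  # mean near sin(2.5); small variance
```

Doubling $n$ roughly multiplies the cost of the solve by eight, which is exactly the scalability problem the post goes on to discuss.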


## Student Perspectives: Application of Density Ratio Estimation to Likelihood-Free problems

A post by Jack Simons, PhD student on the Compass programme.

### Introduction

I began my PhD with my supervisors, Dr Song Liu and Professor Mark Beaumont with the intention of combining their respective fields of research; Density Ratio Estimation (DRE), and Simulation Based Inference (SBI):

- DRE is a rapidly growing paradigm in machine learning which (broadly) provides efficient methods of comparing densities without the need to compute each density individually. For a comprehensive yet accessible overview of DRE in Machine Learning see [1].
- SBI is a group of methods which seek to solve Bayesian inference problems when the likelihood function is intractable. If you wish for a concise overview of the current work, as well as motivation then I recommend [2].
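One widely used DRE idea can be shown in a few lines: with equal-sized samples from two densities, the logit of a probabilistic classifier trained to tell them apart estimates their log-ratio, with no need to evaluate either density individually. Below is a minimal NumPy sketch (not the paper's method) using two Gaussians chosen so the true log-ratio is known in closed form.

```python
import numpy as np

rng = np.random.default_rng(2)

# Samples from the two densities whose ratio we want.
n = 5000
xp = rng.normal(1.0, 1.0, n)   # p = N(1, 1)
xq = rng.normal(0.0, 1.0, n)   # q = N(0, 1)

# Stack the samples and label them: 1 if drawn from p, 0 if drawn from q.
X = np.concatenate([xp, xq])
labels = np.concatenate([np.ones(n), np.zeros(n)])
feats = np.stack([np.ones_like(X), X], axis=1)   # bias + linear feature

# Logistic regression by plain gradient descent (deliberately minimal).
w = np.zeros(2)
for _ in range(2000):
    p_hat = 1.0 / (1.0 + np.exp(-feats @ w))
    w -= 0.1 * feats.T @ (p_hat - labels) / labels.size

# For these Gaussians, log p(x)/q(x) = x - 0.5, so the learned logit
# feats @ w should recover weights close to (-0.5, 1.0).
print(w)
```

The learned logit `w[0] + w[1] * x` is then a density-ratio estimate at any `x`, which is the sense in which DRE avoids computing each density individually.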

Last year we released a paper, *Variational Likelihood-Free Gradient Descent* [3], which combined these fields. This blog post seeks to condense, and make more accessible, the contents of the paper.

### Motivation: Likelihood-Free Inference

Let’s begin by introducing likelihood-free inference. We wish to do inference on the posterior distribution of parameters $\theta$ for a specific observation $x_o$, i.e. we wish to infer $p(\theta \mid x_o)$, which can be decomposed via Bayes’ rule as

$$p(\theta \mid x_o) = \frac{p(x_o \mid \theta)\, p(\theta)}{p(x_o)}.$$

The likelihood-free setting is one in which, in addition to the usual intractability of the normalising constant in the denominator, the likelihood, $p(x_o \mid \theta)$, is also intractable. Instead, we have an implicit likelihood which describes the relation between data and parameters in the form of a forward model/simulator (hence *simulation* based inference!). (more…)
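To make the implicit-likelihood idea concrete, here is a rejection-ABC sketch, one of the simplest SBI methods (not the paper's method): a simulator generates data given $\theta$, and prior draws are kept whenever their simulated summary lands close to the observed one. The Gaussian simulator, prior and tolerance are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(3)

def simulator(theta, n=20):
    """Implicit likelihood: we can sample data given theta,
    but we pretend we cannot evaluate p(x | theta)."""
    return rng.normal(theta, 1.0, n)

x_obs = simulator(2.0)   # pretend this is the observed dataset

# Rejection ABC: draw theta from the prior, simulate, and keep the draw
# if the simulated sample mean is within eps of the observed sample mean.
prior_draws = rng.uniform(-5, 5, 50_000)
eps = 0.1
accepted = [t for t in prior_draws
            if abs(simulator(t).mean() - x_obs.mean()) < eps]

posterior = np.array(accepted)
print(posterior.mean(), posterior.std())   # concentrated near the observed mean
```

Note that the posterior is approximated using only forward simulations, which is the defining move of SBI; more sophisticated methods (including the density-ratio approach above) replace the crude accept/reject step.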

## Compass students attending APTS Week in Durham

**Between the 4th and 8th of April 2022, Compass CDT students are attending APTS Week 2 in Durham.**

**Academy for PhD Training in Statistics** (APTS) organises, through a collaboration between major UK statistics research groups, four residential weeks of training each year for first-year PhD students in statistics and applied probability nationally. Compass students attend all four APTS courses hosted by prestigious UK Universities.

For their **APTS Week in Durham** Compass students will be attending the following modules:

- Applied Stochastic Processes (Nicholas Georgiou and Matt Roberts): This module will introduce students to two important notions in stochastic processes — reversibility and martingales — identifying the basic ideas, outlining the main results and giving a flavour of some of the important ways in which these notions are used in statistics.
- Statistical Modelling (Helen Ogden): The aim of this module is to introduce important aspects of statistical modelling, including model selection, various extensions to generalised linear models, and non-linear models.

## Student Perspectives: Multi-agent sequential decision making

A post by Conor Newton, PhD student on the Compass programme.

### Introduction

My research focuses on designing decentralised algorithms for the multi-agent variant of the Multi-Armed Bandit problem. This research is jointly supervised by Henry Reeve and Ayalvadi Ganesh.

(Image credit: Microsoft Research)

Many real-world optimisation problems involve repeated rather than one-off decisions. A decision maker (who we refer to as an agent) is required to repeatedly perform actions from a set of available options. After taking an action, the agent will receive a reward based on the action performed. The agent can then use this feedback to inform later decisions. Some examples of such problems are:

- Choosing advertisements to display on a website each time a page is loaded to maximise click-through rate.
- Calibrating the temperature to maximise the yield from a chemical reaction.
- Distributing a budget between university departments to maximise research output.
- Choosing the best route to commute to work.

In each case there is a fundamental trade-off between *exploitation* and *exploration*. On the one hand, the agent should act in ways which exploit the knowledge they have accumulated to promote their short-term reward, whether that’s the yield of a chemical process or the click-through rate on advertisements. On the other hand, the agent should explore new actions in order to increase their understanding of the environment in ways which may translate into future rewards. (more…)
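A minimal sketch of this trade-off is the classical UCB1 algorithm for the single-agent bandit (the setting underlying the multi-agent variant above); the three "adverts" and their click-through rates are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical click-through rates for three adverts (unknown to the agent).
true_means = np.array([0.3, 0.5, 0.7])
n_arms, horizon = len(true_means), 5000

counts = np.zeros(n_arms)   # how many times each arm has been played
sums = np.zeros(n_arms)     # total reward collected per arm

for t in range(1, horizon + 1):
    if t <= n_arms:
        arm = t - 1         # play each arm once to initialise
    else:
        # UCB1 index: empirical mean (exploitation) + confidence bonus
        # (exploration); the bonus shrinks as an arm is played more.
        ucb = sums / counts + np.sqrt(2 * np.log(t) / counts)
        arm = int(np.argmax(ucb))
    reward = float(rng.random() < true_means[arm])   # Bernoulli click
    counts[arm] += 1
    sums[arm] += reward

print(counts)   # the best arm (index 2) dominates the play counts
```

The confidence bonus is what forces occasional exploration of apparently worse arms; without it, a single unlucky early reward could lock the agent onto a suboptimal advert forever.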