Student Perspectives: Change in the air: Tackling Bristol’s nitrogen oxide problem

A post by Dom Owens, PhD student on the Compass programme.



“Air pollution kills an estimated seven million people worldwide every year” – World Health Organisation

Many particulates and chemicals are present in the air in urban areas like Bristol, and these pose a serious risk to our respiratory health. It is difficult to model how their concentrations behave over time because of the complex physical, environmental, and economic factors they depend on, but identifying if and when abrupt changes occur is crucial for designing and evaluating public policy measures, as outlined in the local Air Quality Annual Status Report. Using a novel change point detection procedure that accounts for dependence in time and space, we provide an interpretable model for nitrogen oxide (NOx) levels in Bristol, telling us when these structural changes occur and describing the dynamics in between.

Model and Change Point Detection

We model the data with a piecewise-stationary vector autoregression (VAR) model.

In between change points the time series \boldsymbol{Y}_{t}, a d-dimensional vector, depends on itself linearly over p \geq 1 previous time steps through parameter matrices \boldsymbol{A}_i^{(j)}, i=1, \dots, p with intercepts \boldsymbol{\mu}^{(j)}, but at unknown change points k_j, j = 1, \dots, q the parameters switch abruptly. The errors \{ \boldsymbol{\varepsilon}_{t} \in \mathbb{R}^d : t \geq 1 \} are white noise, and we have n observations.
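
Written out, and assuming the boundary conventions k_0 = 0 and k_{q+1} = n (these are not stated above), this description corresponds to a model of the form:

\boldsymbol{Y}_t = \boldsymbol{\mu}^{(j)} + \sum_{i=1}^{p} \boldsymbol{A}_i^{(j)} \boldsymbol{Y}_{t-i} + \boldsymbol{\varepsilon}_t, \quad k_{j-1} < t \leq k_j, \quad j = 1, \dots, q+1.

Under this reading, the q change points split the n observations into q + 1 stationary segments, each with its own intercept and autoregressive matrices.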

(more…)

Student perspectives: Wessex Water Industry Focus Lab

A post by Michael Whitehouse, PhD student on the Compass programme.

Introduction

September saw the first of an exciting new series of Compass industry focus labs; with this came the chance to make use of the extensive skill sets acquired throughout the course and an opportunity to provide solutions to pressing issues of modern industry. The partner for the first focus lab, Wessex Water, posed the following question: given time series data on water flow levels in pipes, can we detect if new leaks have occurred? Given the inherent value of clean water available at the point of use and the detriments of leaking this vital resource, the challenge of ensuring an efficient system of delivery is of great importance. Hence, finding an answer to this question has the potential to provide huge economic, political, and environmental benefits for a large base of service users.

Data and Modelling

The dataset provided by Wessex Water consisted of water flow data from around 760 pipes. After the data was cleaned and processed, some useful series were extracted, such as minimum nightly flow (MNF) and average daily flow (ADF). Preliminary analysis carried out by our collaborators at Wessex Water concluded that certain types of change in the structure of water flow data are good indications that a leak has occurred. From this one can postulate that detecting a leak amounts to detecting these structural changes in the data. Using this principle, we began to build a framework for solutions: detect the change; detect a new leak.

Change point detection is a well-researched discipline that provides efficient methods for detecting statistically significant changes in the distribution of a time series, and hence a toolbox with which to tackle the problem. Indeed, we at Compass have our very own active member of the change point detection research community in the shape of Dom Owens.

The preliminary analysis identified three types of structural change in water flow series that indicate a leak: a change in the mean of the MNF, a change in the trend of the MNF, and a change in the variance of the difference between the MNF and ADF. In order to detect these changes with an algorithm, we needed to transform the given series so that the original change in distribution corresponded to a change in the mean of the transformed series. These transforms included calculating generalised additive model (GAM) residuals and analysing their distribution. An example of such a GAM is given by:

\mathbb{E}[\text{flow}_t] = \beta_0 + \sum_{i=1}^m f_i(x_i).

where the x_i's are features we want to use to predict the flow, such as the time of day or current season. The principle behind this analysis is that any change in the residual distribution corresponds to a violation of the assumption that the residuals are independent and identically distributed and hence, in turn, to a deviation from the original structure we fitted our GAM to. (more…)
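
To make the "transform, then detect a change in mean" idea concrete, here is a minimal Python sketch. It is not the method used in the focus lab: the CUSUM statistic stands in for whichever detection algorithm was applied, and the two transforms (squared deviations for a variance change, first differences for a trend change) are simple illustrative assumptions rather than the GAM-residual transforms described above. All function names are hypothetical.

```python
import random


def cusum_change_point(x):
    """Locate a single change in mean with the CUSUM statistic.

    Returns (k, stat), where x[:k] and x[k:] are the two estimated segments.
    """
    n = len(x)
    total = sum(x)
    best_k, best_stat = 0, 0.0
    running = 0.0
    for k in range(1, n):
        running += x[k - 1]
        # Equals sqrt(k (n - k) / n) * |mean(x[:k]) - mean(x[k:])|
        stat = abs(running - k * total / n) / (k * (n - k) / n) ** 0.5
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat


def squared_deviations(x):
    """Assumed transform: a change in variance becomes a change in mean."""
    mean = sum(x) / len(x)
    return [(v - mean) ** 2 for v in x]


def first_differences(x):
    """Assumed transform: a change in linear trend becomes a change in mean."""
    return [b - a for a, b in zip(x, x[1:])]


if __name__ == "__main__":
    random.seed(1)
    # Hypothetical series: noise whose standard deviation jumps from 1 to 3 at t = 200.
    series = [random.gauss(0, 1) for _ in range(200)]
    series += [random.gauss(0, 3) for _ in range(200)]
    k, stat = cusum_change_point(squared_deviations(series))
    print(f"estimated variance change at index {k} (CUSUM = {stat:.2f})")
```

The sketch only returns the most likely location of a single change. Deciding whether that change is statistically significant would require comparing the statistic, scaled by an estimate of the noise level, against a threshold, which is omitted here.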
