Video: The Data Science behind COVID Modelling

We are excited to share the latest video from Dr Daniel Lawson (Compass CDT Co-Director), in which he explains the data science behind Bristol’s COVID modelling.

Mathematics has had a hidden role in predicting how we can best fight COVID-19. How is mathematics used with data science and machine learning? Why is modelling epidemics such a hard problem? How can we do it better next time? What will data science be able to do in the future, and how do you become a part of it?

Student Research Topics for 2020/21

This month, the Cohort 2 Compass students have started work on their mini projects and are establishing the direction of their own research within the CDT.

Supervised by the Institute for Statistical Science:

Anthony Stevenson will be working with Robert Allison on a project entitled Fast Bayesian Inference at Extreme Scale.  This project is in partnership with IBM Research.

Conor Crilly will be working with Oliver Johnson on a project entitled Statistical models for forecasting reliability. This project is in partnership with AWE.

Euan Enticott will be working with Matteo Fasiolo and Nick Whiteley on a project entitled Scalable Additive Models for Forecasting Electricity Demand and Renewable Production.  This project is in partnership with EDF.

Annie Gray will be working with Patrick Rubin-Delanchy and Nick Whiteley on a project entitled Exploratory data analysis of graph embeddings: exploiting manifold structure.

Ed Davis will be working with Dan Lawson and Patrick Rubin-Delanchy on a project entitled Graph embedding: time and space.  This project is in partnership with LV Insurance.

Conor Newton will be working with Henry Reeve and Ayalvadi Ganesh on a project entitled Decentralised sequential decision making and learning.

The following projects are supervised in collaboration with the Institute for Statistical Science (IfSS) and our other internal partners at the University of Bristol:

Dan Ward will be working with Matteo Fasiolo (IfSS) and Mark Beaumont from the School of Biological Sciences on a project entitled Agent-based model calibration via regression-based synthetic likelihood. This project is in partnership with Improbable.

Jack Simons will be working with Song Liu (IfSS) and Mark Beaumont (Biological Sciences) on a project entitled Novel Approaches to Approximate Bayesian Inference.

Georgie Mansell will be working with Haeran Cho (IfSS) and Andrew Dowsey from the School of Population Health Sciences and Bristol Veterinary School on a project entitled Statistical learning of quantitative data at scale to redefine biomarker discovery.  This project is in partnership with Sciex.

Shannon Williams will be working with Anthony Lee (IfSS) and Jeremy Phillips from the School of Earth Sciences on a project entitled Use and Comparison of Stochastic Simulations and Weather Patterns in probabilistic volcanic ash hazard assessments.

Sam Stockman will be working with Dan Lawson (IfSS) and Maximillian Werner from the School of Geographical Sciences on a project entitled Machine Learning and Point Processes for Insights into Earthquakes and Volcanoes.

What to know before studying Data Science

by Dr Daniel Lawson, Senior Lecturer in Data Science, University of Bristol and Compass CDT Co-Director 

For the first time in history, data is abundant and everywhere. This has created a new era for how we understand the world. Modern Data Science is new and changing the world, but it is rooted in cleverness throughout history.

What is Data Science used for today?

Data Science is ubiquitous today. Many choices about what to buy, what to watch, what news to read – these are either directly or indirectly influenced by recommender systems that match our history with that of others to show us something we might want. Machine Learning has revolutionised computer vision, automation has revolutionised industry and distribution, whilst self-driving cars are at least close. Knowledge is increasingly distributed, with distributed learning ranging from Wikipedia to spam detection.


JGI event: Data Science Seminars

The Jean Golding Institute runs an annual series of Data Science Seminars.

Upcoming seminars (if you are interested in attending you can sign up with Eventbrite using the links below):

Data Science for Vikings

Using mathematics from Dr Daniel Lawson’s group at the University of Bristol, the world’s largest ever DNA sequencing of Viking skeletons reveals that they weren’t all Scandinavian. (Link to Paper.)

Invaders, pirates, warriors – the history books taught us that Vikings were brutal predators who travelled by sea from Scandinavia to pillage and raid their way across Europe and beyond.


Compass supervisors appointed as Heilbronn Data Science Chairs

Congratulations to Professor Anthony Lee – Unit Director for Statistical Computing 1 in the Compass CDT programme – and Professor Nick Whiteley – Compass CDT Director – who have been appointed to the position of Heilbronn Chairs in Data Science. Anthony and Nick have distinguished themselves as internationally outstanding leaders in their field and these appointments support our position as one of the top centres for statistical and data science in the UK.

Professor Anthony Lee and Professor Nick Whiteley are both Compass CDT supervisors

Harvard Prof delivers guest lectures to Compass students

We are delighted to welcome Pierre Jacob, Associate Professor of Statistics at Harvard University, to the University of Bristol in March.

Pierre will be delivering lectures for the COMPASS CDT students but all staff in the School of Maths are welcome to attend.

Title: Couplings and Monte Carlo

In his lectures, Pierre will cover couplings, total variation and optimal transport. He will describe the use of couplings in Monte Carlo methods, such as coupling from the past, diagnostics of convergence, and bias removal.

This event is sponsored by COMPASS – EPSRC Centre for Doctoral Training in Computational Statistics and Data Science.

Compass students take part in electricity demand forecasting hackathon

Dr Jethro Browell, Research Fellow at the University of Strathclyde, and Dr Matteo Fasiolo, Lecturer at the University of Bristol, ran a regional electricity demand forecasting hackathon for students in the COMPASS Centre for Doctoral Training yesterday.

Visiting Research Fellow Dr Browell gave students an overview of how the Great Britain electricity transmission network has changed during the last decade, with particular focus on the consequences of the increased production from small-scale renewable sources, which appear as “negative demand”.

Dr Fasiolo then introduced a dataset containing electricity demand and weather-related variables, such as wind speed and solar irradiation, from 14 regions covering the whole of Great Britain. He proposed an initial forecasting solution based on a simple Generalized Additive Model (GAM), which he used to forecast the demand in each region.

Once the hackathon started, the “Jim” team was the first to propose an improved solution, based on a more sophisticated GAM, which beat the initial GAM in terms of forecasting accuracy.

The “AGang” team then produced an even more sophisticated GAM, which took them to the top of the ranking. In the meantime, the “D&D” team was struggling to make their random forest work, and submitted a couple of poor forecasts. Toward the end of the event, “AGang” produced a couple of improved GAM solutions, which further strengthened their lead.

While Dr Fasiolo and Dr Browell were wrapping up the event and preparing to award the winners, the “D&D” team caught everyone by surprise by submitting a forecast which beat all others by some margin in terms of forecasting accuracy. Their random forest was far better than the GAMs at predicting demand in Scotland, where wind production is an important factor and the dynamics are quite different relative to the other regions.

Congratulations to the top three teams:

  1. D&D:  Doug Corbin and Dom Owens
  2. AGang: Andrea Becsek, Alex Modell and Alessio Zakaria
  3. Jim:  Michael Whitehouse, Daniel Williams and Jake Spiteri

Winning team “D&D” said: “Given physical measurements, such as wind speeds and precipitation, as well as calendar data, we first performed a minor amount of feature engineering. Given the complex nature of the interactions between the variables, and the large amount of data available, we opted to fit random forest models. These performed feature selection for us and provided some robustness to outlying observations.

“However, the models took a long time to fit. Despite parallelising the model fitting across the regions, we only just got our predictions in before the deadline. Thankfully, our model consistently outperformed the other approaches.

“Everyone taking part had a great time learning about the challenges of energy modelling, and we thrived under the pressure of friendly competition.”

Dr Browell added: “Computational statistics and data science is driving innovation in the energy sector and the technologies they enable will play a huge role in the decarbonisation. I was pleased to be able to expose the COMPASS cohort to this application and hope that they will be inspired to apply their expertise to energy and climate problems in the future.”

Member of French Academy of Sciences presents mini-series of lectures

Eric Moulines from Ecole Polytechnique is visiting the University of Bristol and the School of Mathematics in January 2020. He will present a mini-series of lectures.

Convex optimization for machine learning

The purpose of this course is to give an introduction to convex optimization and its applications in statistical learning.

In the first part of the course, I will recall the importance of convex optimisation in statistical learning. I will briefly introduce some useful results of convex analysis. I will then analyse gradient descent algorithms for strongly convex and then convex smooth functions. I will take this opportunity to establish some results on complexity lower bounds for such problems. I will show that the gradient descent algorithm is suboptimal and does not reach the optimal possible speed of convergence. I will then present a strategy to accelerate gradient descent algorithms in order to obtain optimal speeds.
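As a rough illustration of the gap between plain and accelerated gradient descent, the sketch below runs both on a small ill-conditioned quadratic. The test problem, the step size 1/L, the constant momentum coefficient, and all function names are illustrative assumptions and are not taken from the lectures.

```python
import math

# Hypothetical test problem: f(x) = 0.5 * sum(a_i * x_i^2).
# Smoothness constant L = max(a), strong-convexity constant mu = min(a).
CURVATURES = [1.0, 10.0, 100.0]


def objective(x):
    return 0.5 * sum(a * xi * xi for a, xi in zip(CURVATURES, x))


def gradient(x):
    return [a * xi for a, xi in zip(CURVATURES, x)]


def gradient_descent(x0, steps):
    """Plain gradient descent with constant step size 1/L."""
    L = max(CURVATURES)
    x = list(x0)
    for _ in range(steps):
        g = gradient(x)
        x = [xi - gi / L for xi, gi in zip(x, g)]
    return x


def accelerated_gradient_descent(x0, steps):
    """Nesterov-style acceleration for strongly convex problems (constant momentum)."""
    L, mu = max(CURVATURES), min(CURVATURES)
    root_kappa = math.sqrt(L / mu)
    beta = (root_kappa - 1.0) / (root_kappa + 1.0)
    x = list(x0)
    y = list(x0)  # extrapolated point at which the gradient is evaluated
    for _ in range(steps):
        g = gradient(y)
        x_next = [yi - gi / L for yi, gi in zip(y, g)]
        y = [xn + beta * (xn - xi) for xn, xi in zip(x_next, x)]
        x = x_next
    return x


if __name__ == "__main__":
    start = [1.0, 1.0, 1.0]
    print("plain:      ", objective(gradient_descent(start, 50)))
    print("accelerated:", objective(accelerated_gradient_descent(start, 50)))
```

On this example the accelerated iterates reach a much smaller objective value in the same number of steps, because the momentum term speeds up progress along the low-curvature direction that plain gradient descent contracts only slowly.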

In the second part of the course, I will focus on non-smooth optimisation problems. I will introduce the proximal operator, of which I will establish some essential properties. I will then study the proximal gradient algorithms and their accelerated versions.
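For readers unfamiliar with the proximal operator, a minimal sketch may help. It uses the absolute value as the non-smooth term, whose proximal operator is the well-known soft-thresholding map, and applies the proximal gradient iteration to a small separable problem. The objective, parameter values, and function names are illustrative assumptions, not material from the lectures.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.| evaluated at the scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def proximal_gradient(a, b, lam, steps=200):
    """Minimise 0.5 * sum(a_i * (x_i - b_i)^2) + lam * sum(|x_i|).

    Each iteration takes a gradient step on the smooth quadratic part,
    then applies the proximal operator of the non-smooth L1 part.
    """
    step = 1.0 / max(a)  # 1/L, where L is the smoothness constant of the quadratic
    x = [0.0] * len(b)
    for _ in range(steps):
        z = [xi - step * ai * (xi - bi) for xi, ai, bi in zip(x, a, b)]
        x = [soft_threshold(zi, step * lam) for zi in z]
    return x


if __name__ == "__main__":
    # Hypothetical data: the second coordinate is small enough to be set exactly to zero.
    print(proximal_gradient(a=[1.0, 4.0], b=[3.0, 0.1], lam=1.0))
```

Because this toy problem is separable, its minimiser is available in closed form as `soft_threshold(b_i, lam / a_i)` in each coordinate, which gives a simple check on the iteration.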

In a third part, I will look at stochastic versions of these algorithms.

The lectures will take place at the following times:

Tuesday 28th January 11:00-12:00
Thursday 30th January 13:00-14:00
Friday 31st January 10:00-11:00
