Student Perspectives: Contemporary Ideas in Statistical Philosophy

A post by Alessio Zakaria, PhD student on the Compass programme.

Introduction

Probability theory is a branch of mathematics centred around the abstract manipulation and quantification of uncertainty and variability. It forms a basic unit of the theory and practice of statistics, enabling us to tame the complex nature of observable phenomena into meaningful information. It is through this reliance that the debate over the true (or more correct) underlying nature of probability theory has profound effects on how statisticians do their work. The current opposing sides of the debate in question are the Frequentists and the Bayesians. Frequentists believe that probability is intrinsically linked to the numeric regularity with which events occur, i.e. their frequency. Bayesians, however, believe that probability is an expression of someones degree of belief or confidence in a certain claim. In everyday parlance we use both of these concepts interchangeably: I estimate one in five of people have Covid; I was 50% confident that the football was coming home. It should be noted that the latter of the two is not a repeatable event per se. We cannot roll back time to check what the repeatable sequence would result in.

(more…)

Student perspectives: How can we do data science without all of our data?

A post by Daniel Williams, Compass PhD student.

Imagine that you are employed by Chicago’s city council, and are tasked with estimating where the mean locations of reported crimes are in the city. The data that you are given only goes up to the city’s borders, even though crime does not suddenly stop beyond this artificial boundary. As a data scientist, how would you estimate these centres within the city? Your measurements are obscured past a very complex border, so regular methods such as maximum likelihood would not be appropriate.

Chicago Homicides
Figure 1: Homicides in the city of Chicago in 2008. Left: locations of each homicide. Right: a density estimate of the same crimes, highlighting where the ‘hotspots’ are.

This is an example of a more general problem in statistics named truncated probability density estimation. How do we estimate the parameters of a statistical model when data are not fully observed, and are cut off by some artificial boundary? (more…)

Video: The Data Science behind COVID Modelling

We are excited to share Dr Daniel Lawson’s (Compass CDT Co-Director) latest video where he will tell you about the Data Science behind Bristol’s COVID Modelling.

Mathematics has had a hidden role in predicting how we can best fight COVID-19. How is mathematics used with data science and machine learning? Why is modelling epidemics such a hard problem? How can we do it better next time? What will data science be able to do in the future, and how do you become a part of it?

Skip to toolbar