From Unknown Unknowns to Known Unknowns

Unravelling the beauty of uncertainty

Uncertainty is ubiquitous in everyday life. We usually accept it as part of life, but there are some domains in which it cannot be dismissed. For instance, in applications where missing information cannot be tolerated, uncertainty must be accounted for explicitly to ensure that there is enough margin for error.

There are two types of uncertainty: aleatoric and epistemic. Aleatoric uncertainty is the uncertainty that cannot be reduced, an example being the toss of a fair coin. There is a 50% chance that the coin lands heads and a 50% chance that it lands tails; there is simply no way to reduce this uncertainty, however many times the coin is tossed. Epistemic uncertainty, on the other hand, can be reduced by collecting more data. It stems from incomplete knowledge, for example when too few observations are available to see past the noise introduced in the data collection process. Think about COVID tests: you feel well, but somehow you test positive. In this situation you would probably take another test, and this time the result is negative. To convince yourself further, you would take a third test, and its result would probably settle your COVID status. What you are subconsciously doing here is collecting data to remove the noise in testing. Similarly, in many other scientific applications, collecting more data reduces the noise that would otherwise result in significant epistemic uncertainty.
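To make the distinction concrete, here is a minimal sketch based on the coin example. It assumes a uniform Beta(1, 1) prior over the coin's heads probability; the function name `coin_uncertainties` and the choice of prior are illustrative assumptions, not something taken from a particular library. The uncertainty about the coin's bias (epistemic) shrinks as more tosses are observed, while the spread of a single toss (aleatoric) stays close to 0.5.

```python
import numpy as np

def coin_uncertainties(n_tosses, p_true=0.5, seed=0):
    """Return (epistemic_std, aleatoric_std) after observing n_tosses flips."""
    rng = np.random.default_rng(seed)
    heads = int(rng.binomial(n_tosses, p_true))
    # Assumed Beta(1, 1) prior + binomial likelihood -> Beta(a, b) posterior.
    a, b = 1 + heads, 1 + n_tosses - heads
    p_mean = a / (a + b)
    # Epistemic: how unsure we still are about the heads probability itself.
    epistemic_std = np.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    # Aleatoric: spread of a single toss outcome at our best estimate of p.
    aleatoric_std = np.sqrt(p_mean * (1 - p_mean))
    return epistemic_std, aleatoric_std

for n in (10, 100, 10_000):
    e, a = coin_uncertainties(n)
    print(f"n={n:>6}  epistemic={e:.4f}  aleatoric={a:.4f}")
```

More tosses tell us more about the coin, but never make the next toss any more predictable.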

In machine learning, uncertainty is often used to decide, sequentially and algorithmically, which new data to label, especially in domains where data labelling is expensive. This is commonly known as active learning in the literature: a trained machine learning model identifies the regions in which it is least confident, and an oracle then labels some data in those regions before a new model is trained. By continually adding newly labelled data, the model can keep learning and improving over time in a cost-effective manner. This is a classic example of how understanding epistemic uncertainty can be put to good use, but a question remains, one that is an active research area in the machine learning community: how do we quantify epistemic uncertainty?
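The loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: the model is a Bayesian linear regression on radial basis features (chosen because its uncertainty has a closed form), the `oracle` is a stand-in sine function, and every name and hyperparameter here (`CENTRES`, `alpha`, `beta`, and so on) is hypothetical.

```python
import numpy as np

CENTRES = np.linspace(0.0, 2 * np.pi, 12)  # hypothetical RBF feature centres

def oracle(x):
    # Stand-in for an expensive labelling step (assumed for illustration).
    return np.sin(x)

def features(x, width=0.6):
    return np.exp(-0.5 * ((x[:, None] - CENTRES[None, :]) / width) ** 2)

def fit(x, y, alpha=1.0, beta=100.0):
    """Bayesian linear regression; alpha = prior precision, beta = noise precision."""
    phi = features(x)
    cov = np.linalg.inv(alpha * np.eye(phi.shape[1]) + beta * phi.T @ phi)
    mean = beta * cov @ phi.T @ y
    return mean, cov

def predict(model, x):
    """Return the predicted mean and its (epistemic) standard deviation."""
    mean, cov = model
    phi = features(x)
    return phi @ mean, np.sqrt(np.einsum("ij,jk,ik->i", phi, cov, phi))

def active_learning(n_rounds=10, seed=0):
    rng = np.random.default_rng(seed)
    pool = np.linspace(0.0, 2 * np.pi, 200)
    # Start with three labels from one corner of the input space.
    labelled_x = rng.choice(pool[:40], size=3, replace=False)
    labelled_y = oracle(labelled_x)
    history = []
    for _ in range(n_rounds):
        model = fit(labelled_x, labelled_y)      # train on current labels
        _, std = predict(model, pool)
        history.append(std.max())
        pick = pool[np.argmax(std)]              # least confident region
        labelled_x = np.append(labelled_x, pick)
        labelled_y = np.append(labelled_y, oracle(pick))  # ask the oracle
    return labelled_x, labelled_y, history

labelled_x, labelled_y, history = active_learning()
print(f"max std: first round {history[0]:.3f}, last round {history[-1]:.3f}")
```

Each round spends one label where the model is least confident, so the largest remaining uncertainty falls as the rounds go on.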

While we are not going to dive deep into the different methods of uncertainty quantification in machine learning, it is nonetheless important to highlight that no method is perfect and that this is very much an area of active research. There are different considerations when devising an uncertainty quantification method, ranging from efficacy to environmental impact to cost. For example, the most direct way of quantifying uncertainty in neural networks is to model distributions over the weights, an approach commonly known as Bayesian Neural Networks (BNNs)*. However, as you would imagine, doing so requires a lot of computational power, and as such it is not widely used. An alternative approach that is more commonly adopted in the literature is Monte Carlo dropout, where dropout is left on during inference and the standard deviation across multiple forward passes is taken as the uncertainty of the model. This approach, although it requires less computational power, has a known weakness: the in-between uncertainty, that is, the uncertainty in regions lying between clusters of training data, is underestimated (see the figure below). A widely known approach, which forms the foundation of many of the neural-network-based uncertainty quantification methods mentioned above, is the Gaussian Process. It is a timeless, Bayesian-inspired method that has a strong mathematical basis and is often reliable in capturing uncertainty. However, this method has seen dwindling adoption because it scales poorly with data: the cost of exact inference grows cubically with the number of training points.
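The mechanics of Monte Carlo dropout can be sketched as follows. This is a minimal illustration, not a reference implementation: the network is tiny, its weights are random and untrained (in practice they would come from a trained model), and all names and settings such as `mc_dropout_predict` and `drop_rate` are assumptions made for this example.

```python
import numpy as np

init_rng = np.random.default_rng(0)
# Hypothetical, untrained weights; a real model would be trained first.
W1, b1 = init_rng.normal(size=(1, 64)), np.zeros(64)
W2, b2 = init_rng.normal(size=(64, 1)) / 8.0, np.zeros(1)

def forward(x, drop_rate, rng):
    """One forward pass with inverted dropout on the hidden layer."""
    hidden = np.maximum(x @ W1 + b1, 0.0)           # ReLU
    keep = rng.random(hidden.shape) >= drop_rate    # random dropout mask
    hidden = hidden * keep / (1.0 - drop_rate)      # rescale kept units
    return hidden @ W2 + b2

def mc_dropout_predict(x, n_samples=200, drop_rate=0.2, seed=1):
    """Keep dropout on at inference; the spread of passes is the uncertainty."""
    rng = np.random.default_rng(seed)
    samples = np.stack([forward(x, drop_rate, rng) for _ in range(n_samples)])
    return samples.mean(axis=0), samples.std(axis=0)

x = np.linspace(-2.0, 2.0, 5).reshape(-1, 1)
mean, std = mc_dropout_predict(x)
print(np.round(std.ravel(), 3))
```

With `drop_rate=0.0` every pass is identical and the reported uncertainty collapses to zero, which shows that the estimate comes entirely from the randomness of the dropout masks.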

The plot on the left shows the uncertainty obtained with Monte Carlo dropout. The in-between uncertainty, shown by the blue band, is severely underestimated. The plot on the right shows that this problem is alleviated when a Bayesian-based approach is used. In both plots, the black dots are the training data, and the red sinusoidal wave is the function from which the black dots are sampled.
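To illustrate what well-behaved in-between uncertainty looks like, here is a small Gaussian Process regression sketch. It is not the method used to produce the figure; the kernel, its hyperparameters, the noise level and the data layout (two clusters sampled from a sine wave with a gap in between) are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise_std=0.1):
    """Exact GP regression; returns the posterior mean and standard deviation."""
    k_tt = rbf_kernel(x_train, x_train) + noise_std**2 * np.eye(len(x_train))
    k_ts = rbf_kernel(x_train, x_test)
    k_ss = rbf_kernel(x_test, x_test)
    # The Cholesky factorisation is the step whose cost grows cubically
    # with the number of training points.
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_ts)
    mean = k_ts.T @ alpha
    var = np.diag(k_ss) - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

rng = np.random.default_rng(0)
# Two clusters of training points with a gap between them.
x_train = np.concatenate([np.linspace(-4, -2, 15), np.linspace(2, 4, 15)])
y_train = np.sin(x_train) + 0.1 * rng.normal(size=x_train.shape)
x_test = np.array([-3.0, 0.0, 3.0])  # inside, between, inside the clusters
mean, std = gp_posterior(x_train, y_train, x_test)
print(np.round(std, 3))
```

The standard deviation at `x = 0`, in the gap between the two clusters, comes out much larger than at the two points inside the clusters, which is the behaviour one would hope for from an uncertainty estimate.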

In a nutshell, uncertainty should be understood in many decision-making processes, especially in critical applications, and choosing the right approach to modelling the right uncertainty is just as important: underestimating it could lead to disastrous outcomes, while overestimating it adds unnecessary cost. The notion of uncertainty makes us uneasy at times, but converting unknown unknowns into known unknowns is, after all, the beauty of science, isn't it?

* There are different variants of BNNs, and the article here provides an amazing introduction to these methods.

Yi Heng, Research Engineer
Yi Heng has been working on cutting-edge machine learning research with the team since the company's seed investment. He holds a bachelor's and a master's degree in Information Engineering from the University of Cambridge.
