When is a model a black box?

A question that comes up frequently in mathematical modelling is whether a model is a “black box”. A model based on machine learning, for example, is not something you can analyse just by peering under the hood. It is a black box even to its designers.

For this reason, many people feel more comfortable with mechanistic models which are based on causal descriptions of underlying processes. But these come with their problems too.

For example, a model of a growing tumour might incorporate a description of individual cells, their growth dynamics, their interactions with each other and the environment, their access to nutrients such as oxygen, their response to drugs, and so on. A 3D model of a heart has to incorporate additional effects such as fluid dynamics and electrophysiology. In principle, all of these processes can be written out as mathematical equations, combined into a huge mathematical model, and solved. But that doesn’t make these models transparent.

One problem is that each component of the model – say an equation for the response of a cell to a particular stimulus – is usually based on approximations and is almost impossible to test accurately. In fact, there is no reason to think that complex natural phenomena can be fit by simple equations at all – what works for something like gravity does not necessarily work in biology. So the fact that something has been written out as a plausible mechanistic process does not tell us much about its accuracy.

Another problem is that any such model will have a huge number of adjustable parameters. This makes the model very flexible: you can adjust the parameters to get the answer you want. Such models are therefore very good at fitting past data, but they often do less well at predicting the future.
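To make the point about flexibility concrete, here is a minimal sketch in Python. It is a toy illustration, not a biological model: the "true" process is assumed to be a straight line with some noise, and the two candidate models are polynomials with 2 and 10 adjustable coefficients. The function names, the noise level and the choice of polynomials are all assumptions made for the example.

```python
import numpy as np
from numpy.polynomial import Polynomial


def rmse(model, x, y):
    """Root-mean-square error of a fitted model on the points (x, y)."""
    return float(np.sqrt(np.mean((model(x) - y) ** 2)))


def compare_models(seed=0, noise=0.1):
    """Fit a simple and a flexible model to 'past' data, then test both on 'future' data."""
    rng = np.random.default_rng(seed)

    # Hypothetical "real" process: a straight line (an assumption for this toy example).
    def truth(x):
        return 2.0 * x + 1.0

    # Past observations cover 0 to 1; the future lies beyond that range.
    x_past = np.linspace(0.0, 1.0, 12)
    x_future = np.linspace(1.1, 1.5, 5)
    y_past = truth(x_past) + rng.normal(0.0, noise, x_past.size)
    y_future = truth(x_future) + rng.normal(0.0, noise, x_future.size)

    results = {}
    for name, degree in [("simple", 1), ("flexible", 9)]:
        # Least-squares fit; a degree-d polynomial has d + 1 adjustable parameters.
        model = Polynomial.fit(x_past, y_past, degree)
        results[name] = {
            "past": rmse(model, x_past, y_past),
            "future": rmse(model, x_future, y_future),
        }
    return results


results = compare_models()
for name, errors in results.items():
    print(f"{name:>8}: past error {errors['past']:.3g}, future error {errors['future']:.3g}")
```

The flexible model can never fit the past data worse than the simple one, because it contains the straight line as a special case. Its extra parameters end up fitting the noise, though, which is why its error on the future points tends to be far larger.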

A complex mechanistic model is therefore a black box of another sort. Although we can look under its hood, and see all the working parts, that isn’t very useful, because these models are so huge – often with hundreds of equations and parameters – that it is impossible to spot errors or really understand how they work.

Of course, there is another kind of black box model, which is a model that is deliberately kept inside a black box – think for example of the trading algorithms used by hedge funds. Here the model may be quite simple, but it is kept secret for commercial reasons. The fact that it is a closely guarded secret probably just means that it works.