COVID-19 Outbreak Models: Algorithmic vs. Analytic

April 29, 2020|AI, Data Analytics, Pandemic

Today is the 35th day of Colorado’s state-ordered ‘stay at home’ to fight the SARS-CoV-2 virus. Our house has been skipped by the COVID-19 fairy, or maybe not – given asymptomatic infection. More on this later. Let’s explore the models behind the COVID-19 curves.

We have been working from home – and a major focus has been developing and tuning a COVID-19 epidemic spread model. We have been part of a small working group drawn from healthcare, university, county, and private sector resources. In the process of developing this model (the educational version is available from the GunderFish Website) we have spent a lot of time evaluating and discussing the nature of models. One focus has been on the differences between an analytic model and an algorithmic model, and what each is good for.

Analytic Models

An analytic model is defined as a mathematical model that has a closed form solution, i.e. the solution to the equations used to describe changes in a system can be expressed as a mathematical analytic function.

So, basically, if you can write down an equation that describes the model it is an analytic model. In the time of COVID, one equation that people are thinking about is exponential growth. One person infects two people this week, those two people infect two each next week so four are infected. Four people infect eight the next week, and so on. If you want to know how many people get infected in week ‘x’, you calculate 2^x and you are done.
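As a minimal sketch, that closed form fits in a one-line Python function. The function name is ours, chosen only for illustration, and the code assumes the same doubling rule as the example above.

```python
def new_infections_analytic(week: int) -> int:
    """Closed-form doubling model: new infections in a given week.

    Assumes, as in the text, that one person infects two people in week 1
    and every newly infected person infects two more the following week.
    """
    if week < 1:
        raise ValueError("week must be 1 or greater")
    return 2 ** week


# New infections for weeks 1 through 5
print([new_infections_analytic(w) for w in range(1, 6)])  # [2, 4, 8, 16, 32]
```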

Voila! Your first analytic model!

Algorithmic Models

So if that’s all there is to an analytic model, what is an algorithmic model? To start with a definition: an algorithmic model is a model that relies on a set of carefully defined instructions that take a set of inputs, manipulate them, and produce some output. A recipe is an informal example of this. It takes a set of inputs, called ingredients, manipulates them in specific ways, and produces a (usually!) edible output. One way you could solve the contagion model above is to use an algorithm like the one I used to describe the spread:

To get the number of people infected in week x:

Step 1: Start with one person ‘p’ (p=1), and set week ‘w’ = 0
Step 2: Multiply p times 2 (p=p*2)
Step 3: Add one to w (w=w+1); ‘p’ is now the number of new infections in week ‘w’
Step 4: Does ‘w’ equal ‘x’?
- Step 4a: If yes, report that ‘p’ people were infected in week ‘x’
- Step 4b: If not, repeat from Step 2

Now you have a model of how many people will be infected for any week you want. You get the amount by following a set of steps – an algorithm. This is an algorithmic model that describes the same situation as the analytic model.
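The steps above can be sketched as a loop in Python. The function name is ours, and starting the week counter at zero (so that the result lines up with the 2^x formula) is an assumption of this sketch.

```python
def new_infections_algorithmic(x: int) -> int:
    """Follow the doubling steps one week at a time until week x."""
    if x < 1:
        raise ValueError("x must be 1 or greater")
    p = 1  # Step 1: start with one infected person
    w = 0  # no weeks simulated yet (assumption: counter starts at zero)
    while True:
        p = p * 2  # Step 2: every infected person infects two more
        w = w + 1  # Step 3: p now holds the new infections for week w
        if w == x:  # Step 4: have we reached the week we asked about?
            return p  # Step 4a: report the result
        # Step 4b: otherwise go around the loop again


# The loop and the closed form 2**x agree for every week checked
print(all(new_infections_algorithmic(x) == 2 ** x for x in range(1, 11)))  # True
```

The loop does more work than the formula for the same answer, which is the trade-off discussed next.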

But the Analytic model was so much simpler!

And the analytic solution didn’t involve looping or questions, so why don’t we just use that? Well, it turns out that not everything we want to model has a closed form equation to describe it. Or sometimes we just don’t know what the complete equation is, so we only have an approximate equation to use. The power of the algorithmic model is that you can model things for which we have no equation. The exponential growth model we just built is really simple: it assumes an infinite population and unrestricted access to new uninfected people. But the real world isn’t like that.

Real World Concerns

In the real world, the number of uninfected people starts to drop as more people get infected, so the growth starts to slow. This is the ‘herd immunity’ that we are hoping for. While it is true that there is a closed form equation for this part, people get in the way of a complete closed form model. We all know people who stopped going out months ago, long before we talked about social distancing. And there are the people who started working from home more and more. Then there were the social distancing orders – close bars and restaurants, no groups over 50, then 25, then 10. Close the schools, close the churches. These all affect the number of opportunities that an infected person has to infect someone new. (Is this beginning to sound like some kind of Zombie movie with hundreds of the infected roaming the streets looking for brains?) In short, sometimes the closed form equation is too simple to be useful.
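To illustrate the slowdown, here is a small step-by-step sketch in Python that adds a finite population to the doubling rule. It is not the code from our model. The function name, the population of 1,000, and the rule that only the still-uninfected share of contacts can be infected are all assumptions made for this example.

```python
from typing import List


def simulate_finite_population(population: int, weeks: int) -> List[int]:
    """Weekly new infections when the pool of uninfected people is limited."""
    susceptible = population - 1  # everyone except the first infected person
    newly_infected = 1
    history = []
    for _ in range(weeks):
        # Each infected person still reaches two others, but only contacts
        # who are still uninfected can become new cases (assumed rule).
        attempts = 2 * newly_infected
        new_cases = min(susceptible, round(attempts * susceptible / population))
        susceptible -= new_cases
        newly_infected = new_cases
        history.append(new_cases)
    return history


history = simulate_finite_population(population=1000, weeks=20)
print(history)
```

In the printed list, the weekly counts start out doubling, then peak and fall away as uninfected people become harder to find.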

Modeling processes where people both affect the model and change their behavior as the situation evolves is a tricky business. This is exactly what happens during a pandemic. Someone hears about a friend or neighbor who is sick, so they go out less, wash their hands more, flatten the curve. Someone else has been cooped up for a week – they see the most recent case count. It is still going up, so they conclude that social isolation has no effect and they start shopping more, and skip the mask because it is uncomfortable. Each of these choices is made tens or hundreds of thousands of times a day, and each one contributes to how many cases we will see next week, next month.

Complex equations can be hard to find and solve

Finding a closed form equation for these kinds of interactions is next to impossible, and even if we had it, we couldn’t find the numbers to put in to solve the equation. There are attempts to model the aggregate behavior by adding a ‘knob’ to the basic equation. A sort of volume control on ‘social distancing’ – but how do you set it? Suppose a state like Colorado decided to lift the ‘stay at home’ requirement for non-vulnerable people, and to allow 50% of the workforce to go back to work. Does that turn the knob from 3 to 4 or does it turn the knob from 3 to 11? We don’t know, so the closed form equations rapidly become a guessing game.
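A rough Python sketch shows why the knob setting matters so much. Everything here is hypothetical: the `contact_scale` parameter stands in for the knob, the population of 100,000 and the 30-week horizon are arbitrary, and the update rule is a simple illustration rather than the equation any particular model uses.

```python
def total_infected(population: int, weeks: int, contact_scale: float) -> float:
    """Total infections after `weeks`, with a single social distancing knob.

    contact_scale is a hypothetical knob: 1.0 leaves the doubling rule
    unchanged, smaller values mean fewer contacts per infected person.
    """
    susceptible = float(population - 1)
    infectious = 1.0
    total = 1.0
    for _ in range(weeks):
        # Two contacts per infected person, scaled by the knob, and only
        # the still-uninfected share of contacts can be infected.
        pressure = 2.0 * contact_scale * infectious * susceptible / population
        new_cases = min(susceptible, pressure)
        susceptible -= new_cases
        total += new_cases
        infectious = new_cases
    return total


for knob in (0.4, 0.6, 0.8, 1.0):
    print(knob, round(total_infected(100_000, 30, knob)))
```

In this toy version the lowest setting lets the outbreak fizzle out after a handful of cases, while the highest setting reaches a large share of the population. Picking the wrong value for the knob changes the answer by orders of magnitude.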

So, with the algorithmic model you can create individual groups of people that react in the same way; we call them Cohorts. Each Cohort tracks its own information – when it arrived (so that we can model people moving around the country), when it got infected, began to show symptoms, went to the hospital, and eventually resolved. Suppose in a given Cohort, some get sick but others do not. We can simply split the Cohort into two smaller Cohorts, and track each one individually. The algorithmic model has great flexibility.
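As a minimal sketch of the idea, a Cohort could be represented in Python as a small record with a split operation. This is not the data structure from our model. The field names, the use of day numbers, and the `split` method are assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Cohort:
    """A group of people who share the same history (hypothetical fields)."""
    size: int
    arrived_day: int
    infected_day: Optional[int] = None
    symptomatic_day: Optional[int] = None
    hospitalized_day: Optional[int] = None
    resolved_day: Optional[int] = None

    def split(self, count: int) -> Tuple["Cohort", "Cohort"]:
        """Split off `count` people; both parts keep the shared history."""
        if not 0 < count < self.size:
            raise ValueError("count must be between 1 and size - 1")
        return replace(self, size=count), replace(self, size=self.size - count)


# Some members of an uninfected Cohort get sick on day 12, the rest do not.
uninfected = Cohort(size=1000, arrived_day=0)
exposed, still_uninfected = uninfected.split(30)
newly_infected = replace(exposed, infected_day=12)
print(newly_infected)
print(still_uninfected)
```

Making the record immutable means every change produces a new Cohort, which keeps each group's trajectory easy to follow at the price of creating more objects.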

The algorithmic approach gives the modeler several advantages, but they come at a cost. The biggest is the potentially large computational cost associated with running the algorithm. In our COVID-19 model for the state of Colorado, we begin with a single, large Cohort of uninfected people, and a pair of small Cohorts for the infected populations. These three Cohorts soon explode into tens of thousands of Cohorts, each having followed its own trajectory as the outbreak expands. Each of these Cohorts must be evaluated for changes and split and split again to accurately track the components that make up the model. So, algorithmic models can be very flexible – but they can also be very expensive.
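To get a feel for how quickly the bookkeeping grows, here is a toy Python sketch that only counts Cohorts. The splitting rule (a fixed fraction of every Cohort changes state each day), the starting sizes, and the function name are invented for this example and are not the rules used in our Colorado model.

```python
from typing import List


def count_cohorts(initial_sizes: List[int], days: int,
                  split_fraction: float = 0.1) -> List[int]:
    """Return the number of Cohorts at the start and after each day."""
    cohorts = list(initial_sizes)
    counts = [len(cohorts)]
    for _ in range(days):
        next_cohorts = []
        for size in cohorts:
            moved = int(size * split_fraction)  # people who change state today
            if moved == 0:
                next_cohorts.append(size)  # too small to split any further
            else:
                next_cohorts.extend([moved, size - moved])
        cohorts = next_cohorts
        counts.append(len(cohorts))
    return counts


# One large uninfected Cohort and a pair of small infected Cohorts
print(count_cohorts([5_000_000, 50, 50], days=15))
```

Even with this crude rule, three starting Cohorts turn into thousands within about two weeks of simulated time, and each of them has to be visited on every step.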

Wrapping it up

In this post, we looked at two different approaches to modeling real-world phenomena: analytic and algorithmic models. Each has costs and benefits, and each can be applied with greater or lesser effect to almost any subject that one wishes to model. In general, if there is a known (or knowable) closed form solution, an analytic model will provide very accurate, very precise results at a low cost. On the other hand, if the subject has no known closed form, or has complex interactive aspects, the best solution may be an algorithmic model.