Model definition
Introduction
How can we mathematically model the dynamics of disease transmission? The most common approach is to use compartmental models, in which we stratify the population into subgroups (compartments) and describe the flows of individuals into and out of each compartment.
Although there are numerous ways to set up compartments, let us start with the SIR (Susceptible-Infectious-Recovered) model, which is one of the most common models and has the simplest structure. In the SIR model, the population is stratified into three compartments:
- S-compartment: susceptible to new infections
- I-compartment: currently infected
- R-compartment: have recovered from infection and acquired immunity
Various types of models
There are variations in how we mathematically represent or computationally implement the SIR model. Here are some of the perspectives along which models can be classified.
Deterministic vs. Stochastic
- Deterministic models: The results are deterministic, meaning that the same values are obtained every time the calculation is repeated. Useful for understanding the general dynamics of an epidemic. Typically represented by difference or differential equations.
- Stochastic models: Incorporate uncertainty/randomness. Even with the same parameter values, an epidemic could either take off or go extinct due to stochasticity. If we run a sufficiently large number of simulations and take their mean, the result should be close to that obtained from the deterministic model.
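To illustrate the stochastic case, here is a minimal sketch assuming a chain-binomial formulation (a choice made here for illustration, not necessarily the one used elsewhere in this material). In each step, every susceptible individual is infected with probability `beta * i / n`, and every infected individual recovers with probability `gamma`. The function name and all parameter values are hypothetical.

```python
import random

def run_stochastic_outbreak(beta, gamma, n, i_init, rng):
    """Run one chain-binomial outbreak and return the number ever infected."""
    s, i, r = n - i_init, i_init, 0
    while i > 0:
        # Assumed per-step infection probability for each susceptible individual.
        p_infect = min(1.0, beta * i / n)
        new_infections = sum(rng.random() < p_infect for _ in range(s))
        new_recoveries = sum(rng.random() < gamma for _ in range(i))
        s = s - new_infections
        i = i + new_infections - new_recoveries
        r = r + new_recoveries
    return r  # everyone who was ever infected has recovered by now

# Identical parameters in every run; only the random draws differ.
rng = random.Random(42)
final_sizes = [run_stochastic_outbreak(0.4, 0.2, 200, 1, rng) for _ in range(100)]
small = sum(size < 20 for size in final_sizes)
print(f"{small} of 100 runs ended with fewer than 20 cases")
```

Runs that end with only a handful of cases correspond to epidemics that went extinct early, while the remaining runs took off, even though the parameters never changed.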
Discrete time vs. Continuous time
- Discrete time model: Time advances in discrete steps (e.g., $t = 0, 1, 2, \dots$).
- Continuous time model: The model assumes that the changes in the system occur continuously over time.
Mixing assumptions
- Homogeneous mixing model: The model assumes that all individuals have an equal chance/probability of contacting each other.
- Heterogeneous mixing model: Unlike homogeneous mixing, contact patterns differ quantitatively/qualitatively between individuals. For example, some people may meet 20 people a day, whereas others meet only 3. This is what is often called the "network model," which we will cover in a later section.
Discrete time SIR model
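At each discrete time step $t$ (e.g., $t = 0, 1, 2, \dots$), the SIR model is represented by the following set of difference equations:

$$
\begin{aligned}
S_{t+1} &= S_t - \beta \frac{S_t I_t}{N} \\
I_{t+1} &= I_t + \beta \frac{S_t I_t}{N} - \gamma I_t \\
R_{t+1} &= R_t + \gamma I_t
\end{aligned}
$$

where the symbols represent:
- $S_t$: Number of susceptible individuals at time $t$
- $I_t$: Number of infected individuals at time $t$
- $R_t$: Number of recovered individuals at time $t$
- $N$: Total population
- $\beta$: Infection rate per contact
- $\gamma$: Recovery rate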
This is a deterministic, discrete-time model with the assumption of homogeneous mixing.
Note that the symbols used for the parameters $\beta$ and $\gamma$ may differ depending on the literature. We'll stick to this notation in this section unless otherwise noted.
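The discrete-time model can be iterated directly in code. The following is a minimal sketch, assuming the standard update in which $\beta S_t I_t / N$ individuals move from S to I and $\gamma I_t$ individuals move from I to R at each step. The function name and the parameter values are illustrative assumptions.

```python
def simulate_discrete_sir(beta, gamma, s_init, i_init, r_init, n_steps):
    """Iterate the discrete-time SIR difference equations for n_steps steps."""
    n = s_init + i_init + r_init  # total population N, constant over time
    s, i, r = float(s_init), float(i_init), float(r_init)
    history = [(s, i, r)]
    for _ in range(n_steps):
        new_infections = beta * s * i / n  # flow from S to I
        new_recoveries = gamma * i         # flow from I to R
        s = s - new_infections
        i = i + new_infections - new_recoveries
        r = r + new_recoveries
        history.append((s, i, r))
    return history

# Illustrative values only.
trajectory = simulate_discrete_sir(beta=0.4, gamma=0.2, s_init=999, i_init=1,
                                   r_init=0, n_steps=100)
peak_step = max(range(len(trajectory)), key=lambda t: trajectory[t][1])
print(f"peak infected: {trajectory[peak_step][1]:.1f} at step {peak_step}")
```

Because every individual who leaves one compartment enters another, the three values sum to $N$ at every step, which is a convenient sanity check.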
Continuous time SIR model
Let us think of the continuous time version of the SIR model, which can be derived from the set of difference equations above.
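For a small time step $\Delta t$, the change in a variable can be approximated by the difference between successive time steps. For any variable $X$, the following approximation holds:

$$
\frac{dX}{dt} \approx \frac{X_{t+\Delta t} - X_t}{\Delta t}
$$

If we assume that the change over a step of length $\Delta t$ is proportional to $\Delta t$, the difference equations become:

$$
\begin{aligned}
\frac{S_{t+\Delta t} - S_t}{\Delta t} &= -\beta \frac{S_t I_t}{N} \\
\frac{I_{t+\Delta t} - I_t}{\Delta t} &= \beta \frac{S_t I_t}{N} - \gamma I_t \\
\frac{R_{t+\Delta t} - R_t}{\Delta t} &= \gamma I_t
\end{aligned}
$$

Taking the limit as $\Delta t \to 0$, the left-hand side of each equation becomes a derivative. Denoting the number of individuals in each compartment at continuous time $t$ as $S(t)$, $I(t)$, and $R(t)$, respectively, we get a set of differential equations:

$$
\begin{aligned}
\frac{dS}{dt} &= -\beta \frac{S I}{N} \\
\frac{dI}{dt} &= \beta \frac{S I}{N} - \gamma I \\
\frac{dR}{dt} &= \gamma I
\end{aligned}
$$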
This is a deterministic, continuous-time model with the assumption of homogeneous mixing.
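The limiting argument can also be explored numerically. The sketch below assumes forward-Euler integration (chosen here for simplicity; a production analysis would more likely use a dedicated ODE solver) and shrinks the step `dt`. With `dt=1.0` the update coincides with the discrete-time model, and smaller steps approach the continuous-time solution. Function names and parameter values are illustrative assumptions.

```python
def sir_derivatives(s, i, r, beta, gamma, n):
    """Right-hand side of the SIR differential equations (dS/dt, dI/dt, dR/dt)."""
    ds = -beta * s * i / n
    di = beta * s * i / n - gamma * i
    dr = gamma * i
    return ds, di, dr

def integrate_sir_euler(beta, gamma, s_init, i_init, r_init, t_end, dt):
    """Advance the SIR system from time 0 to t_end with forward-Euler steps of size dt."""
    n = s_init + i_init + r_init
    s, i, r = float(s_init), float(i_init), float(r_init)
    for _ in range(round(t_end / dt)):
        ds, di, dr = sir_derivatives(s, i, r, beta, gamma, n)
        s, i, r = s + ds * dt, i + di * dt, r + dr * dt
    return s, i, r

# Illustrative values only; compare the final state for decreasing step sizes.
for dt in (1.0, 0.1, 0.01):
    s, i, r = integrate_sir_euler(0.4, 0.2, 999, 1, 0, t_end=200, dt=dt)
    print(f"dt={dt}: final recovered = {r:.1f}")
```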
Solving SIR model
The SIR model is a nonlinear dynamical system and in general cannot be solved analytically. However, for the very initial phase of an outbreak, an approximate solution can be derived analytically under specific assumptions, as follows.
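In the early phase of an epidemic, only a small number of individuals are infected. Thus, the following approximation holds:

$$
S(t) \approx N
$$

where $N$ is the total population.
With this assumption on the susceptible population, the equation for $I(t)$ becomes:

$$
\frac{dI}{dt} \approx \beta I - \gamma I = (\beta - \gamma) I
$$

Since this is a first-order linear differential equation, we can solve it by separating variables and integrating:

$$
\ln I(t) = (\beta - \gamma) t + C
$$

where $C$ is a constant of integration. Taking the exponential of both sides,

$$
I(t) = e^{C} e^{(\beta - \gamma) t}
$$

Setting $t = 0$ gives $e^{C} = I(0)$. Thus we get

$$
I(t) = I_0 e^{(\beta - \gamma) t} \tag{3}
$$

where $I_0 = I(0)$ is the initial number of infected individuals.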
Note that this approximation only holds
- during the initial phase of the epidemic
- when almost the entire population is susceptible
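To see where the approximation breaks down, the sketch below compares the early-phase solution $I(t) = I_0 e^{(\beta - \gamma) t}$ with $I(t)$ from the full SIR equations. The full model is integrated with small forward-Euler steps, which is an assumption made here for simplicity; the function names and parameter values are illustrative.

```python
import math

def early_phase_infected(i_init, beta, gamma, t):
    """Early-phase approximation I(t) = I_0 * exp((beta - gamma) * t)."""
    return i_init * math.exp((beta - gamma) * t)

def full_model_infected(i_init, n, beta, gamma, t, dt=0.001):
    """I(t) from the full SIR equations, using small forward-Euler steps."""
    s, i = float(n - i_init), float(i_init)
    for _ in range(round(t / dt)):
        new_infections = beta * s * i / n * dt
        s, i = s - new_infections, i + new_infections - gamma * i * dt
    return i

# Illustrative values only: one initial case in a population of 100,000.
for t in (5, 20, 60):
    approx = early_phase_infected(1, 0.4, 0.2, t)
    full = full_model_infected(1, 100_000, 0.4, 0.2, t)
    print(f"t={t}: approximation={approx:.1f}, full model={full:.1f}")
```

At early times the two values agree closely. Later the exponential keeps growing without bound, eventually exceeding the population size, while the full model slows down as susceptibles are depleted.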
Epidemic threshold
Equation (3) is an exponential function, and its behaviour depends on the sign of its exponent:
- Exponential Growth: If $\beta - \gamma > 0$, the infection grows exponentially.
- Exponential Decay: If $\beta - \gamma < 0$ (or $\beta / \gamma < 1$), the infection decays (decreases) exponentially.
Because this condition determines whether the epidemic grows, we call the following relationship the epidemic threshold:

$$
\frac{\beta}{\gamma} > 1
$$

The fractional form $\beta / \gamma$ is also known as the basic reproduction number $R_0$.
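The threshold behaviour can be checked with a few lines of code. This is a minimal sketch, assuming the early-phase solution $I(t) = I_0 e^{(\beta - \gamma) t}$; the function name and the parameter values are illustrative assumptions.

```python
import math

def describe_threshold(beta, gamma, i_init=1.0, t=10.0):
    """Return (beta/gamma, early-phase I(t), regime) for the given parameters."""
    ratio = beta / gamma
    infected = i_init * math.exp((beta - gamma) * t)
    if ratio > 1:
        regime = "growth"
    elif ratio < 1:
        regime = "decay"
    else:
        regime = "constant"
    return ratio, infected, regime

# Two illustrative parameter sets, one on each side of the threshold.
for beta, gamma in ((0.4, 0.2), (0.1, 0.2)):
    ratio, infected, regime = describe_threshold(beta, gamma)
    print(f"beta={beta}, gamma={gamma}: ratio={ratio:.2f}, I(10)={infected:.2f}, {regime}")
```

The sign of $\beta - \gamma$ and the comparison of $\beta / \gamma$ with 1 always give the same classification, since $\gamma > 0$.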
Basic reproduction number
The basic reproduction number $R_0$ (pronounced "R-naught") is the average number of secondary infections caused by a single infected case when the population is fully susceptible. If the offspring distribution is given or inferred, $R_0$ can be calculated as the first moment (mean) of the distribution.
In the homogeneous SIR model, $R_0$ is expressed as:

$$
R_0 = \frac{\beta}{\gamma}
$$
For emerging respiratory diseases including SARS-CoV-2, the offspring distribution is known to be modeled well by the negative binomial distribution, where $k$ is the dispersion parameter. The dispersion parameter for SARS-CoV-2 has been estimated to be very small (well below 1), which indicates a highly heterogeneous distribution (i.e., overdispersion) of the number of secondary transmissions.
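The effect of a small dispersion parameter can be illustrated by sampling. The sketch below draws from a negative binomial offspring distribution via its gamma-Poisson mixture representation, using only the standard library. The mean of 2.0 and the dispersion of 0.1 are illustrative values chosen here, not estimates taken from this text, and the function names are hypothetical.

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw from Poisson(lam) by multiplying uniforms until the product drops below exp(-lam)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def sample_offspring(mean, dispersion, rng):
    """Negative binomial draw: a Poisson count whose rate is gamma-distributed."""
    # gammavariate(shape, scale): shape = dispersion, scale = mean / dispersion,
    # so the individual rate has expectation equal to `mean`.
    individual_rate = rng.gammavariate(dispersion, mean / dispersion)
    return sample_poisson(individual_rate, rng)

rng = random.Random(0)
draws = [sample_offspring(mean=2.0, dispersion=0.1, rng=rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
share_zero = sum(d == 0 for d in draws) / len(draws)
print(f"mean offspring: {sample_mean:.2f}, share with no secondary cases: {share_zero:.2f}")
```

With a small dispersion parameter, most cases cause no secondary infections while a few cause many, yet the mean of the distribution still equals the chosen reproduction number.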
Also note that the formula for $R_0$ is different when we assume heterogeneous mixing (as in the network model).
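```python
def calc_R0_homogeneous(beta, gamma):
    """Basic reproduction number of the homogeneous SIR model: beta / gamma."""
    return beta / gamma

calc_R0_homogeneous(beta=0.4, gamma=0.2)  # returns 2.0
```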