
Model definition

Introduction

How can we mathematically model the dynamics of disease transmission? The most common approach is to use compartmental models, in which we stratify the population into subgroups (compartments) and describe the flow of individuals into and out of each compartment.

Although there are numerous ways to set up compartments, let us start with the SIR (Susceptible-Infectious-Recovered) model, which is one of the most common models and has the simplest structure. In the SIR model, the population is stratified into three compartments:

  • S-compartment: susceptible to new infections
  • I-compartment: currently infected
  • R-compartment: have recovered from infection and acquired immunity

Various types of models

There are variations in how we mathematically represent or computationally implement the SIR model. Here are some of the perspectives along which models can be classified.

Deterministic vs. Stochastic

  • Deterministic models: The results are deterministic, meaning that the same values are obtained every time the calculation is repeated. Useful for understanding the general dynamics of an epidemic. Typically represented by difference or differential equations.
  • Stochastic models: Incorporate uncertainty/randomness. Even with the same parameter values, the epidemic may either take off or go extinct due to stochasticity. If we run a sufficiently large number of simulations and take their mean, the result is typically close to the value obtained from the deterministic model, especially in large populations.
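To make the contrast concrete, here is a minimal sketch of a stochastic run. It assumes a chain-binomial formulation, which is only one of several ways to add randomness: in each time step every susceptible individual is infected with probability `beta * I / N` and every infected individual recovers with probability `mu`. The function name and all parameter values are hypothetical and chosen purely for illustration.

```python
import random

def simulate_stochastic_sir(N, I0, beta, mu, steps, rng):
    """One stochastic run of a discrete-time SIR model (chain-binomial sketch)."""
    S, I, R = N - I0, I0, 0
    history = [(S, I, R)]
    for _ in range(steps):
        # Assumed per-susceptible infection probability in this step
        p_inf = min(1.0, beta * I / N)
        # Each susceptible / infected individual is an independent Bernoulli trial
        new_inf = sum(rng.random() < p_inf for _ in range(S))
        new_rec = sum(rng.random() < mu for _ in range(I))
        S, I, R = S - new_inf, I + new_inf - new_rec, R + new_rec
        history.append((S, I, R))
    return history

rng = random.Random(1)  # fixed seed so the sketch is reproducible
# Number recovered after 100 steps in each of 50 runs with identical parameters
final_sizes = [simulate_stochastic_sir(200, 1, 0.4, 0.2, 100, rng)[-1][2]
               for _ in range(50)]
print(sorted(final_sizes))
```

Although every run uses the same parameters, the recovered counts differ between runs; runs that end with only a handful of recovered individuals correspond to outbreaks that went extinct early.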

Discrete time vs. Continuous time

  • Discrete time model: Time is represented in discrete steps ($t = 0, 1, 2, \cdots$).
  • Continuous time model: The model assumes that changes in the system occur continuously over time.

Mixing assumptions

  • Homogeneous mixing model: The model assumes that all individuals have an equal chance/probability of coming into contact with each other.
  • Heterogeneous mixing model: Unlike homogeneous mixing, contact patterns differ quantitatively/qualitatively between individuals. For example, some people may meet 20 people a day, whereas others meet only 3. This is what is often called the "network model," which we will cover in a later section.

Discrete time SIR model

At discrete time steps (e.g., $t = 0, 1, 2, \cdots$), the SIR model is represented using a set of difference equations as follows,

\begin{equation*} \left\{ \begin{aligned} S_{t+1} &= S_t - \beta \frac{I_t}{N} S_t \\ I_{t+1} &= I_t + \beta \frac{I_t}{N} S_t - \mu I_t \\ R_{t+1} &= R_t + \mu I_t \end{aligned} \right. \tag{1} \end{equation*}

where the symbols denote:

  • $S_t$: Number of susceptible individuals at time $t$
  • $I_t$: Number of infected individuals at time $t$
  • $R_t$: Number of recovered individuals at time $t$
  • $N = S_t + I_t + R_t$: Total population
  • $\beta$: Infection (transmission) rate per unit time
  • $\mu$: Recovery rate per unit time

This is a deterministic, discrete-time model with the assumption of homogeneous mixing.

Note that the symbols used for the parameters $\beta$ and $\mu$ may differ across the literature (for example, the recovery rate is often written as $\gamma$). We'll stick to this notation in this section unless otherwise noted.
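The difference equations (1) translate almost line by line into code. The following is a minimal sketch, not a reference implementation; the function name, the argument `R0_init` (the initial number of recovered individuals, named this way to avoid confusion with the basic reproduction number), and the parameter values are all assumptions made for illustration.

```python
def simulate_discrete_sir(S0, I0, R0_init, beta, mu, steps):
    """Iterate the difference equations (1) and return the S, I, R trajectories."""
    N = S0 + I0 + R0_init  # total population, constant over time
    S, I, R = [float(S0)], [float(I0)], [float(R0_init)]
    for t in range(steps):
        new_infections = beta * I[t] / N * S[t]  # flow from S to I
        new_recoveries = mu * I[t]               # flow from I to R
        S.append(S[t] - new_infections)
        I.append(I[t] + new_infections - new_recoveries)
        R.append(R[t] + new_recoveries)
    return S, I, R

# Hypothetical example: 1 infected individual in a population of 1000
S, I, R = simulate_discrete_sir(S0=999, I0=1, R0_init=0, beta=0.4, mu=0.2, steps=100)
peak_day = max(range(len(I)), key=lambda t: I[t])
print(f"peak at t={peak_day} with about {I[peak_day]:.0f} infected")
```

Because every individual leaving one compartment enters another, `S[t] + I[t] + R[t]` stays equal to `N` at every step, which is a convenient sanity check for any implementation.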

Continuous time SIR model

Let us now consider the continuous time version of the SIR model, which can be derived from the set of difference equations above.

For a small time step $\Delta t$, the rate of change of a variable can be approximated by the difference between successive time steps divided by $\Delta t$. For any variable $X$, the following approximation holds:

$$
\frac{dX}{dt} \approx \frac{X_{t+1} - X_t}{\Delta t}.
$$

The difference equations (1) correspond to a time step of $\Delta t = 1$. Treating $\beta$ and $\mu$ as rates per unit time, and letting $t+1$ denote the next step after a general small interval $\Delta t$, the equations become:

$$
\begin{aligned} \frac{S_{t+1} - S_t}{\Delta t} &\approx -\beta \frac{I_t}{N} S_t \\ \frac{I_{t+1} - I_t}{\Delta t} &\approx \beta \frac{I_t}{N} S_t - \mu I_t \\ \frac{R_{t+1} - R_t}{\Delta t} &\approx \mu I_t \end{aligned}
$$

Taking the limit as $\Delta t \to 0$, the left-hand sides of the equations become derivatives. Denoting the number of individuals in each compartment at continuous time $t$ as $S(t)$, $I(t)$, and $R(t)$, respectively, we get a set of differential equations:

\begin{equation*} \left\{ \begin{aligned} \frac{dS(t)}{dt} &= -\beta \frac{I(t)}{N}S(t) \\ \frac{dI(t)}{dt} &= \beta \frac{I(t)}{N}S(t) - \mu I(t) \\ \frac{dR(t)}{dt} &= \mu I(t) \end{aligned} \right. \tag{2} \end{equation*}

This is a deterministic, continuous-time model with the assumption of homogeneous mixing.
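The link between the two formulations can be checked numerically. The sketch below integrates equations (2) with the forward Euler method, which is simply the difference-quotient approximation above applied with a chosen step `dt`; it is the simplest scheme, used here only for illustration, and higher-order solvers are normally preferred in practice. The function name and parameter values are hypothetical.

```python
def integrate_sir_euler(S0, I0, R0_init, beta, mu, t_end, dt):
    """Forward Euler integration of equations (2); returns (S, I, R) at t_end."""
    N = S0 + I0 + R0_init
    S, I, R = float(S0), float(I0), float(R0_init)
    n_steps = round(t_end / dt)
    for _ in range(n_steps):
        # Right-hand sides of equations (2)
        dS = -beta * I / N * S
        dI = beta * I / N * S - mu * I
        dR = mu * I
        # X(t + dt) is approximated by X(t) + dX/dt * dt
        S, I, R = S + dS * dt, I + dI * dt, R + dR * dt
    return S, I, R

# dt = 1 reproduces the difference equations (1); smaller dt approaches the ODE
for dt in (1.0, 0.1, 0.01):
    S_end, I_end, R_end = integrate_sir_euler(999, 1, 0, 0.4, 0.2, t_end=100, dt=dt)
    print(f"dt={dt}: R(100) is about {R_end:.1f}")
```

With `dt=1.0` each Euler step is exactly one application of equations (1), so the discrete time model can be read as a coarse numerical approximation of the continuous one.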

Solving SIR model

The SIR model is a nonlinear dynamical system and in general has no closed-form solution. However, for the very initial phase of an outbreak, an approximate solution can be derived analytically as follows.

In the early phase of an epidemic, only a small number of individuals are infected. Thus, the following approximation holds:

$$
S(t) \approx S(0) \approx N
$$

where $N$ is the total population.

With the assumption above on the susceptible population, the equation for $I(t)$ becomes:

$$
\begin{aligned} \frac{dI(t)}{dt} &= \beta \frac{I(t)}{N}S(t) - \mu I(t) \\ &\approx \beta \frac{I(t)}{N}N - \mu I(t) \\ &= (\beta - \mu) I(t) \end{aligned}
$$

Since this is a first-order linear differential equation, we can solve it by separation of variables:

$$
\begin{aligned} \frac{1}{I(t)}\frac{dI(t)}{dt} &= (\beta - \mu) \\ \int_{I(0)}^{I(t)} \frac{1}{I}\,dI &= \int_0^t (\beta - \mu)\,du \\ \bigg[ \ln I \bigg]_{I(0)}^{I(t)} &= \bigg[ (\beta - \mu) u \bigg]_0^t \\ \ln I(t) - \ln I(0) &= (\beta - \mu) t \\ \ln \frac{I(t)}{I(0)} &= (\beta - \mu) t \end{aligned}
$$

Taking the exponential of both sides,

$$
\frac{I(t)}{I(0)} = \exp \bigg( (\beta - \mu) t \bigg)
$$

Thus we get

$$
I(t) = I(0) \exp \bigg( (\beta - \mu) t \bigg) \tag{3}
$$

where $I(0)$ is the initial number of infected individuals.

Note that this approximation only holds

  • in the initial phase of the epidemic
  • when almost the entire population is susceptible
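These limits can be seen by comparing equation (3) with a numerical solution of equations (2). The sketch below uses a simple forward Euler integrator as the numerical reference; the function names, the step size, and the parameter values are assumptions made for illustration only.

```python
import math

def early_phase_infected(I0, beta, mu, t):
    """Equation (3): I(t) = I(0) * exp((beta - mu) * t)."""
    return I0 * math.exp((beta - mu) * t)

def infected_euler(N, I0, beta, mu, t_end, dt=0.001):
    """Forward Euler integration of equations (2); returns I(t_end)."""
    S, I = float(N - I0), float(I0)
    for _ in range(round(t_end / dt)):
        new_inf = beta * I / N * S * dt  # infections occurring during this step
        S, I = S - new_inf, I + new_inf - mu * I * dt
    return I

# Hypothetical setting: one initial case in a population of one million
N, I0, beta, mu = 1_000_000, 1, 0.4, 0.2
for t in (10, 30, 60, 100):
    approx = early_phase_infected(I0, beta, mu, t)
    numeric = infected_euler(N, I0, beta, mu, t)
    print(f"t={t}: approximation={approx:.3g}, numerical={numeric:.3g}")
```

At early times the two values should agree closely. Later, the exponential keeps growing without bound (it eventually exceeds the total population), while the numerical solution levels off as susceptible individuals are depleted.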

Epidemic threshold

Equation (3) is an exponential function, and its behaviour depends on the sign of the exponent:

  • Exponential Growth: If $\beta - \mu > 0$ (or $\frac{\beta}{\mu} > 1$), the infection grows exponentially.
  • Exponential Decay: If $\beta - \mu < 0$ (or $\frac{\beta}{\mu} < 1$), the infection decays (decreases) exponentially.

Because this condition determines whether the epidemic grows, we call the following relationship the epidemic threshold:

$$
\beta - \mu > 0 \iff R_0 = \frac{\beta}{\mu} > 1
$$

The ratio $\frac{\beta}{\mu}$ is also known as the basic reproduction number $R_0$.
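As a quick numerical illustration of the threshold, consider two hypothetical parameter sets (chosen only for convenience) and evaluate equation (3) at $t = 10$:

```latex
\begin{aligned}
\beta = 0.4,\ \mu = 0.2 &: \quad \frac{\beta}{\mu} = 2 > 1, \quad \frac{I(10)}{I(0)} = e^{(0.4 - 0.2) \times 10} = e^{2} \approx 7.39 \\
\beta = 0.1,\ \mu = 0.2 &: \quad \frac{\beta}{\mu} = 0.5 < 1, \quad \frac{I(10)}{I(0)} = e^{(0.1 - 0.2) \times 10} = e^{-1} \approx 0.37
\end{aligned}
```

In the first case the number of infected individuals is roughly seven times larger after 10 time units; in the second it has shrunk to roughly a third.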

Basic reproduction number

The basic reproduction number $R_0$ (pronounced "R-naught") is the average number of secondary infections caused by a single infected case when the population is fully susceptible. If the offspring distribution is given or inferred, $R_0$ can be calculated as the first moment (mean) of the distribution.

In the homogeneous SIR model, $R_0$ is expressed as:

$$
R_0 = \frac{\beta}{\mu}
$$

For emerging respiratory diseases including SARS-CoV-2, the offspring distribution is often modeled well by the negative binomial distribution $\textrm{NegBin}(R_0, k)$, where $R_0$ is the mean and $k$ is the dispersion parameter. The dispersion parameter $k$ for SARS-CoV-2 has been estimated to be very small (around $k \approx 0.1$), which indicates a highly heterogeneous distribution (i.e., overdispersion) of the number of secondary transmissions.
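To get a feel for what a small $k$ means, the sketch below draws secondary-case counts from a negative binomial distribution with mean `R0` and dispersion `k`. It relies on the standard gamma-Poisson mixture representation (each case has its own gamma-distributed expected number of secondary cases, and the actual count is Poisson around it). The helper names are hypothetical, the Poisson sampler is a simple textbook method included only to keep the example self-contained, and `R0 = 2.5` is an illustrative value rather than an estimate for any specific pathogen.

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's multiplication method (fine for moderate lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def sample_offspring(R0, k, rng):
    """Secondary cases from NegBin(mean=R0, dispersion=k) via a gamma-Poisson mixture."""
    # gammavariate(shape, scale): shape k and scale R0/k give a mean of R0
    individual_rate = rng.gammavariate(k, R0 / k)
    return sample_poisson(individual_rate, rng)

rng = random.Random(42)
R0, k = 2.5, 0.1  # illustrative values only
offspring = [sample_offspring(R0, k, rng) for _ in range(20000)]
mean = sum(offspring) / len(offspring)
share_zero = sum(x == 0 for x in offspring) / len(offspring)
print(f"mean={mean:.2f}, share with no secondary cases={share_zero:.2f}")
```

Under these assumptions the theoretical share of cases with no secondary infections is $(1 + R_0/k)^{-k} \approx 0.72$, so most cases infect no one while the mean is sustained by a small number of cases with many secondary infections.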

Also note that the formula for $R_0$ is different when we assume heterogeneous mixing (as in network models).

def calc_R0_homogeneous(beta, gamma):
    # beta: infection rate; gamma: recovery rate (written as mu in the text)
    return beta / gamma

calc_R0_homogeneous(beta=0.4, gamma=0.2)
# Output: 2.0