Skip to main content

Generating random graphs from HSCM with Pareto Distribution

Introduction

Let ρ=Pareto(kˉ,γ)\rho=\textrm{Pareto}(\bar{k}, \gamma) for a HSCM(n,ρ\rho). A random graph in such HSCM can be generated using multiple methods, and this note describes the derivation process from the kk-generator to the uu-generator (inverse CDF method). See chapter 9 of the class notes from Network Science 2 for more details.

Pareto Distribution

PDF

The probability density function (PDF) of the Pareto distribution is given by:

P(k)=αkminαk(α+1)\begin{align*} P(k) &= \alpha {k_{\min}}^{\alpha} k^{-(\alpha+1)} \\ \end{align*}

where kmink_{\min} is the minimum value of kk, α>0\alpha > 0 is the shape parameter.

CDF

The cumulative distribution function (CDF) of the Pareto distribution can be derived from the PDF as follows:

C(k)=kminkP(x)dx=kminkαkminαx(α+1)dx=αkminαkminkx(α+1)dx=1ααkminαxα=kminα[xα]kmink=kminα(kαkminα)=kminαkα+1=1(kmink)α\begin{align*} C(k) &= \int_{k_{\min}}^{k} P(x) dx \\ &= \int_{k_{\min}}^{k} \alpha {k_{\min}}^{\alpha} x^{-(\alpha+1)} dx \\ &= \alpha {k_{\min}}^{\alpha} \int_{k_{\min}}^{k} x^{-(\alpha+1)} dx \\ &= -\frac{1}{\alpha} \alpha {k_{\min}}^{\alpha} x^{-\alpha} \\ &= - {k_{\min}}^{\alpha} \bigg[ x^{-\alpha} \bigg]_{k_{\min}}^{k} \\ &= - {k_{\min}}^{\alpha} ( k^{-\alpha} - k_{\min}^{-\alpha} ) \\ &= - {k_{\min}}^{\alpha} k^{-\alpha} + 1 \\ &= 1 - \bigg( \frac{k_{\min}}{k} \bigg) ^{\alpha} \\ \end{align*}

Note that the CDF takes values in the interval [0,1][0, 1] for kkmink \geq k_{\min}.

Mean

The mean of the Pareto distribution kˉ\bar{k} can be calculated as follows:

kˉ=kminxP(x)dx=kminxαkminαx(α+1)dx=kminαkminαxαdx=αkminα[1α+1xα+1]kmin=(αkminα)(1α+1kminα+1)=αα+1kmin \begin{align*} \bar{k} &= \int_{k_{\min}}^{\infty} x P(x) dx \\ &= \int_{k_{\min}}^{\infty} x \cdot \alpha {k_{\min}}^{\alpha} x^{-(\alpha+1)} dx \\ &= \int_{k_{\min}}^{\infty} \alpha {k_{\min}}^{\alpha} x^{-\alpha} dx \\ &= \alpha {k_{\min}}^{\alpha} \bigg[ \frac{1}{-\alpha+1} x^{-\alpha+1} \bigg]_{k_{\min}}^{\infty} \\ &= \bigg( \alpha {k_{\min}}^{\alpha} \bigg) \bigg( - \frac{1}{-\alpha+1} {k_{\min}}^{-\alpha+1} \bigg)\\ &= - \frac{\alpha}{-\alpha+1} k_{\min} \ \end{align*}

Since γ=α+1\gamma = \alpha + 1, i.e., α=γ1\alpha = \gamma - 1, we can rewrite the above equation for kˉ\bar{k} as follows:

kˉ=γ1γ+2kmin=kmin(γ1γ2)\begin{align*} \bar{k} &= - \frac{\gamma-1}{-\gamma+2} k_{\min} \\ &= k_{\min} \bigg( \frac{\gamma-1}{\gamma-2} \bigg) \\ \end{align*}

Rearranging the above equation, we can express kmink_{\min} in terms of kˉ\bar{k} and γ\gamma as follows:

kmin=kˉ(γ2γ1)\begin{align*} k_{\min} &= \bar{k} \bigg( \frac{\gamma-2}{\gamma-1} \bigg) \tag{1} \end{align*}

kk-generator

If we could directly sample from the Pareto distribution, we could generate a random graph by connecting nodes ii and jj independently with probability pk(ki,kj)p_k(k_i, k_j):

pk(ki,kj)=1kˉnkikj+1\begin{align*} p_k(k_i, k_j) = \frac{1}{\frac{\bar{k}n}{k_i k_j} + 1} \tag{2} \end{align*}

kik_i is independently sampled from the Pareto distribution as

kiPareto[kˉ,γ](k)\begin{align*} k_i \sim \text{Pareto}[\bar{k}, \gamma](k) \end{align*}

where

ki[kmin,]\begin{align*} k_i \in [k_{\min}, \infty] \end{align*}

uu-generator: Inverse CDF sampling

Overview

Practically, we often use inverse CDF method to sample from the Pareto distribution. The inverse CDF method involves the following steps:

  1. Generate a uniform random number uUniform(0,1)u \sim \text{Uniform}(0, 1). Note that 1uUniform(0,1)1-u \sim \text{Uniform}(0, 1) as well, so we can use uu or 1u1-u interchangeably.
  2. Use the inverse function of the CDF to find the corresponding value of kk from the uniform random number uu.

Derivation of pu(ui,uj)p_u(u_i, u_j)

The CDF can be expressed with uu as a function of kk as follows:

u=1(kmink)α\begin{align*} u = 1 - \bigg( \frac{k_{\min}}{k} \bigg) ^{\alpha} \end{align*}

Solving for kk, we get the inverse CDF as follows:

(kmink)α=1ukmink=(1u)1αk=kmin(1u)1α \begin{align*} \bigg( \frac{k_{\min}}{k} \bigg) ^{\alpha} &= 1 - u \\ \frac{k_{\min}}{k} &= (1 - u)^{\frac{1}{\alpha}} \\ k &= k_{\min} (1 - u)^{-\frac{1}{\alpha}} \end{align*}

Plugging equation (3) into equation (2), the probability of connecting two nodes ii and jj with degrees kik_i and kjk_j respectively can be expressed in terms of uiu_i and uju_j as follows:

pk(ki,kj)=1kˉnkikj+1    pu(ui,uj)=1kˉnkmin(1ui)1αkmin(1uj)1α+1=1kˉnkmin2(1ui)1α(1uj)1α+1=1kˉnkmin2((1ui)(1uj))1α+1=1kˉnkmin2(1ui)(1uj)1α+1\begin{align*} p_k(k_i, k_j) &= \frac{1}{\frac{\bar{k}n}{k_i k_j}+1} \\ \iff p_u(u_i, u_j) &= \frac{1}{\frac{\bar{k}n}{k_{\min} (1 - u_i)^{-\frac{1}{\alpha}} \cdot k_{\min} (1 - u_j)^{-\frac{1}{\alpha}}}+1} \\ &= \frac{1}{\frac{\bar{k}n}{{k_{\min}}^2 (1 - u_i)^{-\frac{1}{\alpha}} (1 - u_j)^{-\frac{1}{\alpha}}}+1} \\ &= \frac{1}{\frac{\bar{k}n}{{k_{\min}}^2 \big( (1 - u_i)(1 - u_j) \big)^{-\frac{1}{\alpha}}}+1} \\ &= \frac{1} {\frac{\bar{k}n}{{k_{\min}}^2} (1 - u_i)(1 - u_j)^{\frac{1}{\alpha}}+ 1} \tag{3} \end{align*}

Now, let ν=kˉkmin2\nu = \frac{\bar{k}}{{k_{\min}}^2}. Since kmin=kˉγ2γ1k_{\min} = \bar{k}\frac{\gamma-2}{\gamma-1} from equation (1),

ν=kˉkmin2=kˉ(kˉγ2γ1)2=kˉkˉ2(γ2γ1)2=1kˉ(γ1γ2)2\begin{align*} \nu &= \frac{\bar{k}}{{k_{\min}}^2} \\ &= \frac{\bar{k}}{\bigg( \bar{k} \cdot \frac{\gamma-2}{\gamma-1} \bigg)^2} \\ &= \bar{k} \cdot \bar{k}^{-2} \cdot \bigg( \frac{\gamma-2}{\gamma-1} \bigg)^{-2} \\ &= \frac{1}{\bar{k}} \bigg( \frac{\gamma-1}{\gamma-2} \bigg)^{2} \\ \end{align*}

Recalling that uu is interchangable with 1u1-u, we can rewrite the equation for pu(ui,uj)p_u(u_i, u_j) as follows:

pu(ui,uj)=1ν((1ui)(1uj))1α+1=1ν(uiuj)1α+1\begin{align*} p_u(u_i, u_j) &= \frac{1}{\nu \big( (1 - u_i)(1 - u_j) \big)^{\frac{1}{\alpha}} + 1} \\ &= \frac{1}{\nu ( u_i u_j )^{\frac{1}{\alpha}} + 1} \tag{4} \end{align*}