Generating random graphs from HSCM with Pareto Distribution

Introduction

Let $\rho=\textrm{Pareto}(\bar{k}, \gamma)$ for a HSCM(n, $\rho$ ). A random graph in such HSCM can be generated using multiple methods, and this note describes the derivation process from the $k$ -generator to the $u$ -generator (inverse CDF method). See chapter 9 of the class notes from Network Science 2 for more details.

Pareto Distribution

PDF

The probability density function (PDF) of the Pareto distribution is given by:

\begin{align*} P(k) &= \alpha {k_{\min}}^{\alpha} k^{-(\alpha+1)} \\ \end{align*}

where $k_{\min}$ is the minimum value of $k$ , $\alpha > 0$ is the shape parameter.

CDF

The cumulative distribution function (CDF) of the Pareto distribution can be derived from the PDF as follows:

\begin{align*} C(k) &= \int_{k_{\min}}^{k} P(x) dx \\ &= \int_{k_{\min}}^{k} \alpha {k_{\min}}^{\alpha} x^{-(\alpha+1)} dx \\ &= \alpha {k_{\min}}^{\alpha} \int_{k_{\min}}^{k} x^{-(\alpha+1)} dx \\ &= -\frac{1}{\alpha} \alpha {k_{\min}}^{\alpha} x^{-\alpha} \\ &= - {k_{\min}}^{\alpha} \bigg[ x^{-\alpha} \bigg]_{k_{\min}}^{k} \\ &= - {k_{\min}}^{\alpha} ( k^{-\alpha} - k_{\min}^{-\alpha} ) \\ &= - {k_{\min}}^{\alpha} k^{-\alpha} + 1 \\ &= 1 - \bigg( \frac{k_{\min}}{k} \bigg) ^{\alpha} \\ \end{align*}

Note that the CDF takes values in the interval $[0, 1]$ for $k \geq k_{\min}$ .

Mean

The mean of the Pareto distribution $\bar{k}$ can be calculated as follows:

\begin{align*} \bar{k} &= \int_{k_{\min}}^{\infty} x P(x) dx \\ &= \int_{k_{\min}}^{\infty} x \cdot \alpha {k_{\min}}^{\alpha} x^{-(\alpha+1)} dx \\ &= \int_{k_{\min}}^{\infty} \alpha {k_{\min}}^{\alpha} x^{-\alpha} dx \\ &= \alpha {k_{\min}}^{\alpha} \bigg[ \frac{1}{-\alpha+1} x^{-\alpha+1} \bigg]_{k_{\min}}^{\infty} \\ &= \bigg( \alpha {k_{\min}}^{\alpha} \bigg) \bigg( - \frac{1}{-\alpha+1} {k_{\min}}^{-\alpha+1} \bigg)\\ &= - \frac{\alpha}{-\alpha+1} k_{\min} \ \end{align*}

Since $\gamma = \alpha + 1$ , i.e., $\alpha = \gamma - 1$ , we can rewrite the above equation for $\bar{k}$ as follows:

\begin{align*} \bar{k} &= - \frac{\gamma-1}{-\gamma+2} k_{\min} \\ &= k_{\min} \bigg( \frac{\gamma-1}{\gamma-2} \bigg) \\ \end{align*}

Rearranging the above equation, we can express $k_{\min}$ in terms of $\bar{k}$ and $\gamma$ as follows:

\begin{align*} k_{\min} &= \bar{k} \bigg( \frac{\gamma-2}{\gamma-1} \bigg) \tag{1} \end{align*}

$k$ -generator

If we could directly sample from the Pareto distribution, we could generate a random graph by connecting nodes $i$ and $j$ independently with probability $p_k(k_i, k_j)$ :

\begin{align*} p_k(k_i, k_j) = \frac{1}{\frac{\bar{k}n}{k_i k_j} + 1} \tag{2} \end{align*}

$k_i$ is independently sampled from the Pareto distribution as

\begin{align*} k_i \sim \text{Pareto}[\bar{k}, \gamma](k) \end{align*}

where

\begin{align*} k_i \in [k_{\min}, \infty] \end{align*}

$u$ -generator: Inverse CDF sampling

Overview

Practically, we often use inverse CDF method to sample from the Pareto distribution. The inverse CDF method involves the following steps:

Generate a uniform random number $u \sim \text{Uniform}(0, 1)$ . Note that $1-u \sim \text{Uniform}(0, 1)$ as well, so we can use $u$ or $1-u$ interchangeably.
Use the inverse function of the CDF to find the corresponding value of $k$ from the uniform random number $u$ .

Derivation of $p_u(u_i, u_j)$

The CDF can be expressed with $u$ as a function of $k$ as follows:

\begin{align*} u = 1 - \bigg( \frac{k_{\min}}{k} \bigg) ^{\alpha} \end{align*}

Solving for $k$ , we get the inverse CDF as follows:

\begin{align*} \bigg( \frac{k_{\min}}{k} \bigg) ^{\alpha} &= 1 - u \\ \frac{k_{\min}}{k} &= (1 - u)^{\frac{1}{\alpha}} \\ k &= k_{\min} (1 - u)^{-\frac{1}{\alpha}} \end{align*}

Plugging equation (3) into equation (2), the probability of connecting two nodes $i$ and $j$ with degrees $k_i$ and $k_j$ respectively can be expressed in terms of $u_i$ and $u_j$ as follows:

\begin{align*} p_k(k_i, k_j) &= \frac{1}{\frac{\bar{k}n}{k_i k_j}+1} \\ \iff p_u(u_i, u_j) &= \frac{1}{\frac{\bar{k}n}{k_{\min} (1 - u_i)^{-\frac{1}{\alpha}} \cdot k_{\min} (1 - u_j)^{-\frac{1}{\alpha}}}+1} \\ &= \frac{1}{\frac{\bar{k}n}{{k_{\min}}^2 (1 - u_i)^{-\frac{1}{\alpha}} (1 - u_j)^{-\frac{1}{\alpha}}}+1} \\ &= \frac{1}{\frac{\bar{k}n}{{k_{\min}}^2 \big( (1 - u_i)(1 - u_j) \big)^{-\frac{1}{\alpha}}}+1} \\ &= \frac{1} {\frac{\bar{k}n}{{k_{\min}}^2} (1 - u_i)(1 - u_j)^{\frac{1}{\alpha}}+ 1} \tag{3} \end{align*}

Now, let $\nu = \frac{\bar{k}}{{k_{\min}}^2}$ . Since $k_{\min} = \bar{k}\frac{\gamma-2}{\gamma-1}$ from equation (1),

\begin{align*} \nu &= \frac{\bar{k}}{{k_{\min}}^2} \\ &= \frac{\bar{k}}{\bigg( \bar{k} \cdot \frac{\gamma-2}{\gamma-1} \bigg)^2} \\ &= \bar{k} \cdot \bar{k}^{-2} \cdot \bigg( \frac{\gamma-2}{\gamma-1} \bigg)^{-2} \\ &= \frac{1}{\bar{k}} \bigg( \frac{\gamma-1}{\gamma-2} \bigg)^{2} \\ \end{align*}

Recalling that $u$ is interchangable with $1-u$ , we can rewrite the equation for $p_u(u_i, u_j)$ as follows:

\begin{align*} p_u(u_i, u_j) &= \frac{1}{\nu \big( (1 - u_i)(1 - u_j) \big)^{\frac{1}{\alpha}} + 1} \\ &= \frac{1}{\nu ( u_i u_j )^{\frac{1}{\alpha}} + 1} \tag{4} \end{align*}

Introduction​

Pareto Distribution​

PDF​

CDF​

Mean​

kkk-generator​

uuu-generator: Inverse CDF sampling​

Overview​

Derivation of pu(ui,uj)p_u(u_i, u_j)pu​(ui​,uj​)​