Maximum-Entropy Ensembles of Graphs
Graph Ensembles
State Space
A graph ensemble is a probability distribution over a set of graphs, where each graph is a microstate. This statistical-mechanics view of networks is the starting point for maximum-entropy network models [1][2].
If we think of a graph ensemble consisted of 3 nodes, there are three possible undirected edges and therefore graphs in the ensemble.
Graph Observables
Observables are functions of a graph. Examples include edge count , node degrees , triangle count, clustering, or any other statistic we may want an ensemble to preserve.
Probability Distributions on Graphs
The probability distributions on the graph must satisfy
If normalization is the only constraint, the maximum-entropy distribution is uniform over all graphs. For , that means and the entropy is bits.
Maximum Entropy
Unconstrained Maximum Entropy
Without constraints other than normalization, the entropy-maximizing distribution is uniform. This is the global maximum over the whole probability simplex.
For the eight graphs on three nodes,
Constrained Maximum Entropy
The maximum-entropy principle says to preserve only the facts we choose as constraints and otherwise use the distribution with the largest entropy [5]. For graph ensembles, those facts are often expected observables such as the expected total edge count and the degree distribution:
Among all graph distributions satisfying those equations, choose the one with largest Shannon entropy:
The result is the least-committed distribution consistent with the constraints.
Lagrange Multipliers
We maximize entropy over the probabilities of all graphs: . The constraints are equations that those probabilities must satisfy:
The Lagrangian combines the entropy objective with terms enforcing normalization and the observable constraints:
At the maximum, changing one graph's probability while holding the others fixed should not improve the constrained objective. Thus gives
Solving this equation for gives . The normalization constant is the partition function .
Lagrange multipliers turn the constrained optimization into the Gibbs-Boltzmann form:
The multipliers are fixed by the constraint equations . In this page, the multipliers are best understood as fields conjugate to graph observables.
Canonical Ensembles
Canonical ensembles use soft constraints: observables are fixed in expectation, not exactly graph by graph.
Example: Expected Edge Count
Here, we use expected edge count as the running example of a canonical ensemble. The constrained observable is the number of edges, and the constraint is soft:
The Hamiltonian is
An ensemble tuned to still assigns probability to graphs with several possible realized edge counts. Only the average is pinned.
Partition Function
Because , the partition function factorizes over possible edges:
where is the number of possible undirected edges.
Step-by-step
Write a graph by its adjacency matrix. Let be the entry:
For a simple undirected graph on labeled nodes, there are
possible edges. The edge count is the sum of these indicators:
With the edge-count Hamiltonian , the partition function is
The exponential of a sum becomes a product:
Therefore the sum over all graphs is a sum over all binary choices for all possible edges:
Because the product separates by edge, the sums factorize:
Each one-edge sum has only two terms:
Since the same factor appears once for each of the possible edges,
Erdos-Renyi G(N,p)
The factorization means every possible edge is an independent Bernoulli trial:
Equivalently, a graph with present edges has probability
Step-by-step
Now substitute the factorized partition function back into the Gibbs distribution:
Using and ,
This shows that the probability of a whole graph is a product of one-edge probabilities. For a single edge,
Define
Then
so each edge probability can be written as
Thus every possible edge is an independent Bernoulli trial:
For a graph with present edges and absent edges, the graph probability becomes
This is exactly the Gilbert random graph model : each of the possible edges is included independently with probability .
The maximum-entropy calculation determines the exponential form of the distribution, but the constraint value determines the numerical value of . For the edge-count ensemble, with , so a target implies and therefore
Thus is the log-odds against an edge: positive values make sparse graphs more likely, negative values make dense graphs more likely, and gives the uniform ensemble.
This is Gilbert's model: every possible edge is present independently with the same probability [3]. For , there are three trials, so and .
Edge-Count Distribution
A particular graph with present edges has probability
Collapsing the eight graph probabilities by edge count gives a binomial distribution:
Microcanonical Ensembles
Microcanonical ensembles use hard constraints: an observable is fixed exactly on every graph in the support. Again, microcanonical says how the constraint is enforced; the observable says what is fixed.
Exact Edge Count
For edge count, the hard constraint is
This gives the fixed-edge model , the original Erdos-Renyi random graph model [4].
Uniform Distribution on a Constraint Set
The maximum-entropy solution is uniform over all graphs satisfying the hard constraint:
For and , the constraint set has three graphs, so the entropy is bits.
Canonical vs Microcanonical
A chosen constraint value gives one entropy-maximizing distribution. Varying the constraint value gives a family of maximum-entropy distributions.
For one global edge-count constraint, the canonical model and microcanonical model become equivalent as when and are matched by edge density. For small , they are visibly different: a canonical model tuned to still leaves probability on , while the microcanonical model puts all mass on .
Degree-Constrained Ensembles
The same maximum-entropy framework can constrain node degrees instead of only the total edge count. That gives the configuration-model family: the soft configuration model fixes expected degrees, while the hard configuration model fixes the degree sequence exactly. See Configuration model for that version.
Ensemble Equivalence
Global Constraints
Equivalence is most natural for a small number of global, self-averaging constraints. Edge count is the basic example. In , the realized edge count fluctuates around . Equivalently, the realized edge density fluctuates around , but those fluctuations shrink as grows because grows.
Local Constraints
Degree-sequence constraints are local and extensive: there is one constraint per node. The canonical ensemble allows each degree to fluctuate around its target; the microcanonical ensemble forbids those fluctuations exactly.
Canonical-Microcanonical Non-Equivalence
Canonical and microcanonical ensembles can fail to become equivalent when the number of constraints grows with the system. Graph ensembles with a fixed number of edges are equivalent in the large-graph limit, while ensembles with a fixed degree sequence need not be [7].
This is why and are usually interchangeable asymptotically, while soft and hard configuration models can remain measurably different.
Interactive Example for N=3
The smallest nontrivial case is . There are three possible undirected edges, so the full graph space has only graphs. That makes it possible to show the entire probability distribution at once.
Use the tabs to compare three ways of assigning probabilities to the same eight graphs. The edge-count mode is the canonical ensemble: one expected edge count fixes one probability shared by all edges. The degree mode replaces that global constraint with three local expected-degree constraints. Free play lets you draw an arbitrary distribution, so you can see what is lost when the distribution no longer has the maximum-entropy form.
Interactive maximum-entropy graph ensemble on three nodes. Choose a constraint on total edge count or on individual node degrees, or edit the distribution freely.
;
Probability Distribution of Graphs
The first chart shows the probability of each individual graph. The graphs are grouped by their realized edge count . In edge-count mode, all graphs with the same have the same probability because the Hamiltonian only contains .
Edge Count Distribution
The second chart collapses those eight microstates onto the observable . This is where degeneracy appears: the levels contain graphs. The control pins the expectation , not the realized value of , so the observed edge count still fluctuates.
For the edge-count ensemble, those fluctuations follow the binomial law for three Bernoulli trials:
Entropy Ceiling
The third chart compares entropy against the maximum possible entropy at the same expected edge count. The blue curve is
where is the binary entropy in bits. This is the constrained maximum-entropy envelope: for each value of , it gives the entropy of the canonical distribution with that expected edge count.
The binary entropy function is
For example, when ,
so
This is the peak of the envelope: , every graph has probability , and the entropy reaches the unconstrained global maximum bits.
Any extra structure beyond the mean edge count lowers entropy. In the node-degree mode, fixing three expected degrees also fixes , but it adds node-specific information. The coral gap is the entropy spent to buy that extra structure. In free play, the same gap appears whenever the hand-drawn distribution has more structure than the maximum-entropy distribution at the same .
References
- Park, J., & Newman, M. E. J. (2004). Statistical mechanics of networks. Physical Review E, 70, 066117.
- Squartini, T., & Garlaschelli, D. (2017). Maximum-Entropy Networks: Pattern Detection, Network Reconstruction and Graph Combinatorics. Springer.
- Gilbert, E. N. (1959). Random graphs. Annals of Mathematical Statistics, 30(4), 1141-1144.
- Erdos, P., & Renyi, A. (1959). On random graphs I. Publicationes Mathematicae, 6, 290-297.
- Jaynes, E. T. (1957). Information theory and statistical mechanics. Physical Review, 106, 620-630.
- Newman, M. E. J., Strogatz, S. H., & Watts, D. J. (2001). Random graphs with arbitrary degree distributions and their applications. Physical Review E, 64, 026118.
- Squartini, T., de Mol, J., den Hollander, F., & Garlaschelli, D. (2015). Breaking of ensemble equivalence in networks. Physical Review Letters, 115, 268701.
- Snijders, T. A. B., Pattison, P. E., Robins, G. L., & Handcock, M. S. (2006). New specifications for exponential random graph models. Sociological Methodology, 36(1), 99-153.