3  Components and the Graph Laplacian

Open the live notebook in Google Colab here.

In the last lecture, we discovered that the notion of connectivity could tell us something important about graph structure.

Components

Many networks have parts that are disconnected from each other. These parts are called components. As we saw in an example from the previous lecture, there is no path between any pair of nodes in different components of a network.

Definition 3.1 Two nodes \(i\) and \(j\) are path-connected if there exists a path between \(i\) and \(j\). The maximal set of nodes \(j\) such that \(i\) is path-connected to \(j\) is called the connected component of \(i\).

A network is connected if it has only one connected component. Otherwise, we say the network is disconnected.

We say maximal in this definition because we want a connected component to be the biggest subset with this property. Another way to say this is that there are no additional nodes and edges we could include in the set without breaking the path-connected property.

If we have nodes that are singletons — that is, nodes that have no edges connecting to them — then we consider that node a connected component.

As we saw with degree, the definition of connected components requires a bit more subtlety if we consider directed networks.

Definition 3.2 In a directed network, two nodes \(i\) and \(j\) are strongly connected if there exists a path from \(i\) to \(j\) and a path from \(j\) to \(i\). A maximal subset of nodes such that all pairs of nodes are strongly connected is called the strongly connected component.

If we relax the requirement to consider the maximal subset of node pairs \((i,j)\) such that there exists either a path from \(i\) to \(j\) or from \(j\) to \(i\), then this is a weakly connected component.

Exercise

Identify the strongly connected components and the weakly connected components in the network below.

Show code
from matplotlib import pyplot as plt
import networkx as nx
plt.style.use('seaborn-v0_8-whitegrid')

DG = nx.DiGraph()
DG.add_edges_from([(1, 2), (2, 3), (2,5), (2,6), (3, 4), (3, 7), (4, 3), (4, 8), (5, 1), (5, 6), (6, 7), (7, 6), (8, 4), (8, 7)])

nx.draw(DG, with_labels = True, arrowsize = 20, font_color = 'white', font_weight = 'bold')
Figure 3.1: Identify the strongly connected components in this directed network.

In directed networks, we can also define individual node properties that describe all nodes that could reach or be reached by our node of interest.

Definition 3.3 An in-component of node \(i\) is the set of nodes \(j\) such that there is a directed path from \(j\) to \(i\).

An out-component of node \(i\) is the set of nodes \(j\) such that there is a directed path from \(i\) to \(j\).

We include the node \(i\) itself as a member of its own in- and out-components.

The Graph Laplacian

The graph Laplacian is another important matrix representation of a network. It’s useful in studying random walks and dynamics, for clustering and data analysis, for graph visualization, partitioning, and more!

There are mutiple matrices that use this name; the one we introduce here is sometimes called the Combinatorial Graph Laplacian.

Definition 3.4 The (combinatorial) graph Laplacian \(L\) of an undirected graph with adjacency matrix \(A\) is

\[ {\bf L} = {\bf D} - {\bf A} \, \]

where \({\bf D}\) is the diagonal matrix whose diagonal entries \(D_{ii} = k_i\) contain the degree of node \(i\).

The definition given above generalizes in a straightforward way for weighted networks with positive weights. There are also variants for directed graphs (including using in-degree or out-degree matrices to build an in-degree Laplacian or an out-degree Laplacian), but some properties may not be preserved with this approach.

Theorem 3.1 (Properties of the graph Laplacian) Consider the combinatorial graph Laplacian \({\bf L}\) as defined above for an undirected network. Let \({\bf 1}\) be the vector containing all ones. The matrix \({\bf L}\) has the following properties:

  • \({\bf L}\) is real and symmetric.
  • \({\bf L}{\bf 1} = {\bf 0}.\) That is, every row sums to 0.
  • The eigenvalues of \({\bf L}\) are real and nonnegative.
  • The Laplacian always has at least one zero eigenvalue with corresponding eigenvector \({\bf 1}.\)
  • The Laplacian is not invertible.
  • A network has \(c\) components if and only if its graph Laplacian has exactly \(c\) zero eigenvalues (that is, the eigenvalue \(\lambda = 0\) has algebraic multiplicity \(c\).)

We will save the proofs of these properties for homework.

The Graph Laplacian as a Diffusion Operator

One of the many important properties of the graph Laplacian is that it describes many spreading or diffusion processes that take place on networks. Here’s an example: suppose that we “heat up” a single node on the network, and then allow heat to flow along the network edges. The Laplacian matrix gives a concise description of how this heat spreads over the network. Let \(\mathbf{x} \in \mathbb{R}^n\) be the vector whose \(i\)th entry gives the amount of heat currently on node \(i\). Then, the vector \(\delta \mathbf{x} = -\mathbf{Lx}\) is proportional to rate of change of heat at each node. If we imagine that heat moves in discrete time, our update would be

\[ \begin{aligned} \mathbf{x} \gets \mathbf{x} -\alpha\mathbf{Lx} \,, \end{aligned} \]

where \(\alpha\) is some constant that describes the rate of heat transfer. Let’s see how this looks:

Show code
from matplotlib import pyplot as plt
import networkx as nx
import numpy as np

fig, axarr = plt.subplots(1, 4, figsize = (8, 1.7))

# create a network for visualization and set up a layout
n = 50
rad = 0.25
G = nx.random_geometric_graph(n, rad, seed = 1234)

layout = nx.kamada_kawai_layout(G)

# construct the Laplacian matrix
A = nx.to_numpy_array(G)
D = np.diag(np.sum(A, axis = 1))
L = D - A

# rate of heat transfer
rate = 0.05

# initial condition: all heat on a single node
x = np.zeros(n)
x[20] = 1

# main loop
for i, ax in enumerate(axarr.flatten()):
    nx.draw(G, ax = ax, node_size = 20, edge_color = 'gray', node_color = np.log(x + 1e-4), width = 0.5, pos = layout, cmap = "coolwarm", vmax = 0.5)
    ax.set_title("$t = " + str(i) + "$")

    # Laplacian dynamical update
    x -= rate*L@x

Snapshots of heat diffusion on a network. Colors are shown on a logarithmic scale for visualization purposes.

The Laplacian operator also has many other applications in network science, many of which we will study later in these notes.

References



© Heather Zinn Brooks and Phil Chodrow, 2024