1  Networks and their representations

Open the live notebook in Google Colab here.

Networks encode relationships among entities. A network is a collection of objects (agents, individuals, actors, sites, particles, vertices, nodes, etc.) and their relationships (interactions, links, bonds, ties, connections, edges, etc.)

We can represent networks mathematically with graphs.

Definition 1.1 (Networks as graphs) A graph is an ordered pair \(G = (V, E)\), where \(V\) is the set of vertices or nodes and \(E \subseteq V \times V\) is the set of edges between nodes.

Note that the number of nodes in the graph is the same as the cardinality of the set \(V\), that is, if our graph represents a network with \(n\) agents then \(\vert V \vert = n\).

Show code
from matplotlib import pyplot as plt
import networkx as nx
plt.style.use('seaborn-v0_8-whitegrid')

G = nx.karate_club_graph()
nx.draw(G)

print('This graph has', G.number_of_nodes(), 'nodes and', G.number_of_edges(), 'edges.')
This graph has 34 nodes and 78 edges.
Figure 1.1: This is a famous graph called the Zachary Karate Club Network from Zachary (1977).

Introductory Graph Terminology

We’ll begin by introducing some terminology for types of graphs we may encounter.

Simple graphs

Definition 1.2 Edges that connect nodes to themselves are called self-edges or self-loops.

If there is more than one edge between the same two nodes, this is called a multiedge. A network with multiedges is called a multigraph.

Show code
from matplotlib import pyplot as plt
import networkx as nx
plt.style.use('seaborn-v0_8-whitegrid')

# This is left as a coding exercise in the live notes.


G = nx.Graph()
edgelist = [(1, 4), (2, 5), (3, 5), (4, 5), (5, 6)]
G.add_edges_from(edgelist)
nx.draw(G)
Figure 1.2: A simple graph with 6 nodes and 5 edges.

Definition 1.3 A network that has neither self-edges nor multiedges is called a simple graph or simple network.

Many of the networks we’ll study this semester will be simple graphs.

Show code
from matplotlib import pyplot as plt
import networkx as nx
plt.style.use('seaborn-v0_8-whitegrid')

# This is left as a coding exercise in the live notes.


G = nx.Graph()
G.add_edges_from([(1, 4), (2, 4), (3, 3), (3, 4)])
nx.draw(G)
Figure 1.3: A graph with self-edges.

Planar graphs

Definition 1.4 A planar graph is a graph that can be embedded in the plane without having any edges cross.

Planar graphs are commonly studied objects in graph theory and there are many cool theorems about them (see, for example, the four-color theorem or Kuratowski’s theorem). This means that a network that can be represented with a planar graph can leverage this theory.

Directed graphs

In this course, we will spend much of our time focused on the analysis of undirected graphs, where an edge between nodes \(i\) and \(j\) is an unordered pair \(\{i,j\}\). However, there is an extension that can be important in many modeling contexts where there may be an edge from \(j\) to \(i\), but no edge from \(i\) to \(j\). This is called a directed graph. Informally, each edge in a directed graph has a direction, pointing from one node to another. Directed edges are usually represented by lines with arrows.

Definition 1.5 A directed graph (also called a directed network or digraph) is a graph in which each edge is an ordered pair \((j, i)\), which indicates an edge from node \(j\) to node \(i\). Such edges are called directed edges (or arcs).

Show code
from matplotlib import pyplot as plt
import networkx as nx
plt.style.use('seaborn-v0_8-whitegrid')

# This is left as a coding exercise in the live notes.


DG = nx.DiGraph()
DG.add_edges_from([(1, 2), (1, 3), (3, 4), (4, 1)])

nx.draw(DG, with_labels = True, arrowsize = 20, font_color = 'white', font_weight = 'bold')
Figure 1.4: A directed graph. We’ve chosen some arbitrary node labels for mathematical encoding later on in the notes.

Matrix Representations of Graphs

One reason that graphs are particularly useful mathematical representations of networks is that they can be encoded with matrices. This is a huge advantage, because we’ll be able to leverage a lot of theory we know from linear algebra.

For our definitions below, we’ll suppose we have a graph \(G\) with \(n\) vertices.

We take the convention that the directed edge \((j, i)\) is an edge from \(j\) to \(i\). Notice that \((j, i)\) is represented in the \(i\)th row and the \(j\)th column of the adjacency matrix. As with so many things in math, this notation is a choice made for mathematical convenience. This choice also allows us to align with both the Newman (2018) textbook and NetworkX syntax. Be aware that different authors and sources might make a different choice!

Adjacency matrix

Definition 1.6 The adjacency matrix \(A\) of a graph \(G = (V, E)\) is an \(n \times n\) matrix where \(n = \vert V \vert\). Its entries are \[\begin{align*} A_{ij} = \begin{cases} 1 & (j, i) \in E \,, \\ 0 & \text{otherwise.} \end{cases} \end{align*}\]

We can make a few observations about how the graph structure relates to the structure of the adjacency matrix.

  • If there are no self-edges, the diagonal elements are all zero.
  • If the network is undirected, then an edge between \(i\) and \(j\) implies the existence of an edge between \(j\) and \(i\). This means that \(A_{ji} = A_{ij}\) and thus the adjacency matrix is symmetric.
  • Similarly, if the adjacency matrix is not symmetric, the network cannot be undirected.

Example. The adjacency matrix for the graph in Figure 1.4 is \[ \begin{pmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \]

If we have a graph with self-edges, then \(\{i,i\} \in E\) for some \(i.\) If the graph is undirected, we represent this in the adjacency matrix by setting \(A_{ii} = 2.\) If the graph is directed, we set \(A_{ii} = 1.\)

This is another convention that will make the mathematics easier. An intuitive way to understand this choice is that, in undirected graphs, every edge “shows up” twice in the adjacency matrix, whereas in directed graphs every edge “shows up” once.

If we have a graph with multiedges, then we can set the corresponding matrix element \(A_{ij}\) equal to the multiplicity of the edge. For example, a double edge between nodes 2 and 3 in an undirected graph is represented \(A_{23} = A_{32} = 2.\)

But why stop there? Instead of requiring an integer number of edges between two nodes, we could extend this idea to form weighted networks with real-valued edge weights. Sometimes it is useful to represent edges as having a strength, weight, or value. In this situation, we set the value of \(A_{ij}\) equal to the weight of the corresponding edge \((j, i)\). For example, weights in an airport network could be used represent a distance between two airports, or weights in a social media network could represent the number of messages sent between two individuals.

Many more matrices …

There are LOTS of matrices that can be associated to networks. There’s no “right” one — some are more useful than others for certain jobs. Throughout this course, we’ll see examples of matrices that are well-suited to certain specific tasks, like ranking or clustering. If you’re interested in searching around a bit, some other fun matrices are:

  • The graph Laplacian matrix and its variants.
  • The nonbacktracking or Hashimoto matrix.
  • The modularity matrix.
  • The random-walk transition matrix.
  • The PageRank matrix.
  • The node-edge incidence matrix.

And the list goes on!

Graphs in NetworkX

In this class, we will also be studying networks from a computational perspective. We will be using the NetworkX package in Python.

Creating networks

We can create an empty undirected graph \(G\) using G = nx.Graph(). You could instead create an empty directed graph using nx.DiGraph() or an empty multigraph using nx.MultiGraph().

There are multiple ways to grow your graph in NetworkX.

Adding nodes or edges manually

You can add one node at a time. For example,

G.add_node(1)

will add a node with the label 1. We use an integer here, but a node can be any hashable Python object. You can add one edge at a time using tuples of nodes

G.add_edge(1, 2)

If you create an edge connecting to a node that’s not in your graph yet, the node gets created automatically.

In most cases it’s pretty inefficient to add one node or edge at a time. Fortunately, you can also add nodes and edges from a list or any other iterable container:

G.add_nodes_from(nodelist)

G.add_edges_from(edgelist)

There are corresponding methods to remove nodes and edges: G.remove_node(), G.remove_edge(), G.remove_nodes_from(), G.remove_edges_from().

Creating a graph dictionary

You can also build your graph using a dictionary that maps nodes to neighbors. The code below creates a graph with 3 nodes, where nodes 1 and 3 are both connected to node 2, but not to each other.

Show code
from matplotlib import pyplot as plt
import networkx as nx
plt.style.use('seaborn-v0_8-whitegrid')

graph_dict = {1: [2], 2: [1, 3], 3:[2]}
G = nx.Graph(graph_dict) 
nx.draw(G)
Figure 1.5: A network built using a dictionary that maps nodes to neighbors.

Using an adjacency matrix

You can directly create a graph using a numpy array that encodes your adjacency matrix representation for your graph. The three-node example above has the adjacency matrix \[ \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} \]

which can be encoded after importing the numpy package as np as follows:

A = np.array([[0, 1, 0], [1, 0, 0], [0, 1, 0]])

You can then quickly build an undirected graph \(G\) with

G = nx.from_numpy_array(A)

For a directed graph, you need to make two additional changes to the code above. You need specify that you want a directed graph with create_using. Additionally, you need to be very careful about the adjacency matrix convention for in and out edges. The default convention for from_numpy_array points edges in the opposite direction to what we defined above (it assumes \(A_{ij} = 1\) if \((i,j) \in E\)), so if you want to match that convention, you’ll need to take the transpose with np.transpose.

G = nx.from_numpy_array(np.transpose(A), create_using = nx.DiGraph)

Visualizing your networks

You can use the matplotlib plot interface to draw your network. If you have created a graph \(G\), you can quickly visualize it using

nx.draw(G)

There are lots of fun ways to customize your figure, including changing colors, sizes, adding labels, and more. See the documentation for details.

Practice with the following exercises:

  • Using NetworkX to reproduce Figure 1.2, Figure 1.3, Figure 1.4. Use at least two of the different methods described above.
  • Draw the network represented by the adjacency matrix \[ \begin{pmatrix} 0 & 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \end{pmatrix} \,. \]
  • Create your own network.
Show code
import numpy as np
from matplotlib import pyplot as plt
import networkx as nx
plt.style.use('seaborn-v0_8-whitegrid')


# Use NetworkX to reproduce the figures from the notes.
# Then, create and visualize the network represented by the given adjacency matrix.
# Finally, create your own network.
# This is left as a coding exercise in the live notes.


A = np.array([[0, 1, 0, 0, 1], [1, 0, 0, 1, 1], [0, 0, 0, 1, 0], [1, 0, 0, 0, 0], [1, 0, 0, 0, 0]])
G = nx.from_numpy_array(np.transpose(A), create_using = nx.DiGraph)
nx.draw(G, with_labels = True, arrowsize = 20, font_color = 'white', font_weight = 'bold')
Figure 1.6: The network corresponding to the adjacency matrix given in the exercise.

References

Newman, Mark. 2018. Networks. Oxford University Press.
Zachary, Wayne W. 1977. “An Information Flow Model for Conflict and Fission in Small Groups.” Journal of Anthropological Research 33 (4): 452–73.



© Heather Zinn Brooks and Phil Chodrow, 2024