Graphs
Introduction to graphs
Graphs are a very versatile data structure. They can be used to represent game states, positions on a map with routes, people in a social network, and so on. Here we will consider graphs that represent positions on a map with routes and that represent friendships in a social network.
Let’s start with maps. Consider a minimal example of two towns in Michigan, Ann Arbor and Ypsilanti.
Here, in the language of graphs, we have two vertices (a.k.a., nodes) representing the towns, Ann Arbor and Ypsilanti, and an edge connecting the vertices, indicating that a route exists between them. We can travel along this route from Ann Arbor to Ypsilanti, and from Ypsilanti to Ann Arbor. Notice the symmetry. This is because the edge between Ann Arbor and Ypsilanti is undirected, which is reasonable, since the highway that connects them allows traffic in both directions.1
We refer to vertices which share a common edge as neighbors. We also refer to vertices which share a common edge as adjacent. So in the example above, Ann Arbor and Ypsilanti are neighbors. Ann Arbor is adjacent to Ypsilanti, and Ypsilanti is adjacent to Ann Arbor.
Here’s a little more elaborate example:
In this example, Burlington is adjacent to St Albans, Montpelier and Middlebury. Rutland is adjacent to Middlebury and Bennington. (Yes, we’re leaving out a lot of detail for simplicity’s sake.)
So the question arises: how do we represent a graph in our code? There are several ways, but what we’ll demonstrate here is what’s called the adjacency list representation.
We’ll use a dictionary, with town names as keys and the values will be a list of all towns adjacent to the key.
For example, Montpelier is adjacent to St Johnsbury, White River Junction and Burlington, so the dictionary entry for Montpelier would look like this:
= {'Montpelier': ['Burlington', 'White River Junction',
ROUTES 'St Johnsbury']}
A complete representation of adjacencies, given the map above is:
= {
ROUTES 'Burlington': ['St Albans', 'Montpelier', 'Middlebury'],
'Montpelier': ['Burlington', 'White River Junction',
'St Johnsbury'],
'White River Junction': ['Montpelier', 'Brattleboro',
'St Johnsbury'],
'Brattleboro': ['White River Junction'],
'Newport': ['St Johnsbury'],
'St Albans': ['Burlington', 'Swanton'],
'St Johnsbury': ['Montpelier', 'Newport',
'White River Junction'],
'Swanton': ['St Albans'],
'Middlebury': ['Burlington', 'Rutland'],
'Rutland': ['Middlebury', 'Bennington'],
'Bennington': ['Rutland']
}
Notice that Montpelier, St Johnsbury, and White River Junction are all neighbors with each other. This is called a cycle. If a graph doesn’t have any cycles, it is called an acyclic graph. Similarly, if a graph contains at least one cycle, then it is called a cyclic graph. So, this is a cyclic graph.
Searching a graph: breadth-first search
It is often the case that we wish to search such a structure. A common approach is to use what is called breadth-first search (BFS). Here’s how it works (in the current context):
We keep a list of towns that we’ve visited, and we keep a queue of towns yet to visit. Both the list of towns and the queue are of type list
. We choose a starting point (here it doesn’t matter which one), and we add it to the list of visited towns, and to the queue. This is how we begin.
Then, in a while loop, as long as there are elements in the queue:
- pop a town from the front of the queue
- for each neighboring town, if we haven’t already visited the town:
- we append the neighboring town to the list of visited towns
- we append the neighboring town to the back of the queue
At some point, the queue is exhausted (once we’ve popped the last one off), and we only add unvisited towns to the queue. So this algorithm will terminate.
Once the algorithm has terminated, we have a list of the visited towns, in the order they were visited.
Needless to say, this isn’t a very sophisticated approach. For example, we don’t consider the distances traveled, and we have a very simple graph. But this suffices for a demonstration of BFS.
A worked example
Here’s a complete, worked example of breadth-first search. It might help for you to go over this while checking the map (above).
Say we choose St Johnsbury as a starting point. Thus, the list of visited towns will be St Johnsbury, and the queue will contain only St Johnsbury. Then, in a while
loop\ldots
First, we pop St Johnsbury from the front of the queue, and we check its neighbors. Its neighbors are Montpelier, Newport, and White River Junction so we append Montpelier, Newport, and White River Junction to the list of visited towns, and to the queue. At this point, the queue looks like this:
'Montpelier', 'Newport', 'White River Junction'] [
At the next iteration, we pop Montpelier from the front of the queue. Now, when we check Montpelier’s neighbors, we find Burlington, White River Junction, and St Johnsbury. St Johnsbury has already been visited and so has White River Junction, so we leave them be. However, we have not visited Burlington, so we append it to the list of visited towns and to the queue. At this point, the queue looks like this:
'Newport', 'White River Junction', 'Burlington'] [
At the next iteration, we pop Newport from the front of the queue. We check Newport’s neighbors and find only St Johnsbury, so there’s nothing to append to the queue. At this point, the queue looks like this:
'White River Junction', 'Burlington'] [
At the next iteration we pop White River Junction from the front of the queue. White River Junction is adjacent to Montpelier (already visited), Brattleboro, and St Johnsbury (already visited). So we append Brattleboro to visited list and to the queue. At this point, the queue looks like this:
'Burlington', 'Brattleboro'] [
At the next iteration, Burlington is popped from the queue. Now we check Burlington’s neighbors, and we find St Albans, Montpelier (already visited), and Middlebury. Montpelier we’ve already visited, but St Albans and Middlebury haven’t been visited yet, so we append them to the list of visited towns and to the queue. At this point, the queue looks like this:
'Brattleboro', 'St Albans', 'Middlebury'] [
At the next iteration, Brattleboro is popped from the front of the queue. Brattleboro is adjacent to White River Junction (already visited). At this point, the queue looks like this:
'St Albans', 'Middlebury'] [
At the next iteration, we pop St Albans from the front of the queue. We check St Alban’s neighbors. These are Burlington (already visited) and Swanton. So we append Swanton to the visited list and to the queue. At this point, the queue looks like this:
'Middlebury', 'Swanton'] [
At the next iteration, we pop Middlebury from the queue, and we check its neighbors. Its neighbors are Burlington (already visited) and Rutland. Rutland is new, so we append it to the visited list and to the queue. At this point, the queue looks like this:
'Swanton', 'Rutland'] [
At the next iteration, we pop Swanton from the front of the queue. Swanton’s only neighbor is St Albans and that’s already visited so we do nothing. At this point, the queue looks like this:
'Rutland'] [
At the next iteration we pop Rutland from the front of the queue. Rutland’s neighbors are Bennington and Middlebury. We’ve already visited Middlebury, but we haven’t visited Bennington, so we add it to the list of visited towns and to the queue. At this point, the queue looks like this:
'Bennington'] [
At the next iteration we pop Bennington. Bennington’s sole neighbor is Rutland (already visited), so we do nothing. At this point, the queue looks like this:
[]
With nothing added to an empty queue, the queue remains empty, and the while loop terminates. (Remember: an empty list is falsey.) At this point we have a complete list of all the visited towns in the order they were visited:
St Johnsbury, Montpelier, Newport, White River Junction, Burlington, Brattleboro, St Albans, Middlebury, Swanton, Rutland, and Bennington.
Why don’t we just iterate over the entire route map?
In the example above, the graph is connected—meaning that every town can be reached from every other town by some route. However, this is not always the case. One reason we don’t just iterate over all keys in a dictionary representing routes is that if we did so, we’d discover all towns on the map, rather than only those that can be reached from a given starting point.
Consider highways connecting towns and cities in the Hawai’ian islands. You can’t drive from Hilo to Honolulu—they’re on different islands.
Sometimes we have graphs in which some vertices are not reachable from all other vertices. This can occur in directed graphs (graphs in which the edges have directions, like one-way streets), or in disconnected graphs (graphs that contain vertices or components that aren’t connected to other components).
Applications
Search algorithms on graphs have abundant applications, including solving mazes and finding routes, web crawlers, games like tic-tac-toe and various board games, and almost anything involving a network of some kind.
Supplemental reading
Copyright © 2023–2025 Clayton Cafiero
No generative AI was used in producing this material. This was written the old-fashioned way.
Footnotes
There are what are called directed edges, like one-way streets, but we’ll only deal with undirected edges in this text.↩︎