Graphs 鈥� Clayton Cafiero

Graphs

Author

Clayton Cafiero

Published

2025-01-05

Introduction to graphs

Graphs are a very versatile data structure. They can be used to represent game states, positions on a map with routes, people in a social network, and so on. Here we will consider graphs that represent positions on a map with routes and that represent friendships in a social network.

Let鈥檚 start with maps. Consider a minimal example of two towns in Michigan, Ann Arbor and Ypsilanti.

Here, in the language of graphs, we have two vertices (a.k.a., nodes) representing the towns, Ann Arbor and Ypsilanti, and an edge connecting the vertices, indicating that a route exists between them. We can travel along this route from Ann Arbor to Ypsilanti, and from Ypsilanti to Ann Arbor. Notice the symmetry. This is because the edge between Ann Arbor and Ypsilanti is undirected, which is reasonable, since the highway that connects them allows traffic in both directions.¹

We refer to vertices which share a common edge as neighbors. We also refer to vertices which share a common edge as adjacent. So in the example above, Ann Arbor and Ypsilanti are neighbors. Ann Arbor is adjacent to Ypsilanti, and Ypsilanti is adjacent to Ann Arbor.

Here鈥檚 a little more elaborate example:

In this example, Burlington is adjacent to St Albans, Montpelier and Middlebury. Rutland is adjacent to Middlebury and Bennington. (Yes, we鈥檙e leaving out a lot of detail for simplicity鈥檚 sake.)

So the question arises: how do we represent a graph in our code? There are several ways, but what we鈥檒l demonstrate here is what鈥檚 called the adjacency list representation.

We鈥檒l use a dictionary, with town names as keys and the values will be a list of all towns adjacent to the key.

For example, Montpelier is adjacent to St Johnsbury, White River Junction and Burlington, so the dictionary entry for Montpelier would look like this:

ROUTES = {'Montpelier': ['Burlington', 'White River Junction', 
                         'St Johnsbury']}

A complete representation of adjacencies, given the map above is:

ROUTES = {
    'Burlington': ['St Albans', 'Montpelier', 'Middlebury'],
    'Montpelier': ['Burlington', 'White River Junction', 
                   'St Johnsbury'],
    'White River Junction': ['Montpelier', 'Brattleboro', 
                             'St Johnsbury'],
    'Brattleboro': ['White River Junction'],
    'Newport': ['St Johnsbury'],
    'St Albans': ['Burlington', 'Swanton'],
    'St Johnsbury': ['Montpelier', 'Newport', 
                     'White River Junction'],
    'Swanton': ['St Albans'],
    'Middlebury': ['Burlington', 'Rutland'],
    'Rutland': ['Middlebury', 'Bennington'],
    'Bennington': ['Rutland']
}

Notice that Montpelier, St Johnsbury, and White River Junction are all neighbors with each other. This is called a cycle. If a graph doesn鈥檛 have any cycles, it is called an acyclic graph. Similarly, if a graph contains at least one cycle, then it is called a cyclic graph. So, this is a cyclic graph.

Searching a graph: breadth-first search

It is often the case that we wish to search such a structure. A common approach is to use what is called breadth-first search (BFS). Here鈥檚 how it works (in the current context):

We keep a list of towns that we鈥檝e visited, and we keep a queue of towns yet to visit. Both the list of towns and the queue are of type list. We choose a starting point (here it doesn鈥檛 matter which one), and we add it to the list of visited towns, and to the queue. This is how we begin.

Then, in a while loop, as long as there are elements in the queue:

pop a town from the front of the queue
for each neighboring town, if we haven鈥檛 already visited the town:
- we append the neighboring town to the list of visited towns
- we append the neighboring town to the back of the queue

At some point, the queue is exhausted (once we鈥檝e popped the last one off), and we only add unvisited towns to the queue. So this algorithm will terminate.

Once the algorithm has terminated, we have a list of the visited towns, in the order they were visited.

Needless to say, this isn鈥檛 a very sophisticated approach. For example, we don鈥檛 consider the distances traveled, and we have a very simple graph. But this suffices for a demonstration of BFS.

A worked example

Here鈥檚 a complete, worked example of breadth-first search. It might help for you to go over this while checking the map (above).

Say we choose St Johnsbury as a starting point. Thus, the list of visited towns will be St Johnsbury, and the queue will contain only St Johnsbury. Then, in a while loop\ldots

First, we pop St Johnsbury from the front of the queue, and we check its neighbors. Its neighbors are Montpelier, Newport, and White River Junction so we append Montpelier, Newport, and White River Junction to the list of visited towns, and to the queue. At this point, the queue looks like this:

['Montpelier', 'Newport', 'White River Junction']

At the next iteration, we pop Montpelier from the front of the queue. Now, when we check Montpelier鈥檚 neighbors, we find Burlington, White River Junction, and St Johnsbury. St Johnsbury has already been visited and so has White River Junction, so we leave them be. However, we have not visited Burlington, so we append it to the list of visited towns and to the queue. At this point, the queue looks like this:

['Newport', 'White River Junction', 'Burlington']

At the next iteration, we pop Newport from the front of the queue. We check Newport鈥檚 neighbors and find only St Johnsbury, so there鈥檚 nothing to append to the queue. At this point, the queue looks like this:

['White River Junction', 'Burlington']

At the next iteration we pop White River Junction from the front of the queue. White River Junction is adjacent to Montpelier (already visited), Brattleboro, and St Johnsbury (already visited). So we append Brattleboro to visited list and to the queue. At this point, the queue looks like this:

['Burlington', 'Brattleboro']

At the next iteration, Burlington is popped from the queue. Now we check Burlington鈥檚 neighbors, and we find St Albans, Montpelier (already visited), and Middlebury. Montpelier we鈥檝e already visited, but St Albans and Middlebury haven鈥檛 been visited yet, so we append them to the list of visited towns and to the queue. At this point, the queue looks like this:

['Brattleboro', 'St Albans', 'Middlebury']

At the next iteration, Brattleboro is popped from the front of the queue. Brattleboro is adjacent to White River Junction (already visited). At this point, the queue looks like this:

['St Albans', 'Middlebury']

At the next iteration, we pop St Albans from the front of the queue. We check St Alban鈥檚 neighbors. These are Burlington (already visited) and Swanton. So we append Swanton to the visited list and to the queue. At this point, the queue looks like this:

['Middlebury', 'Swanton']

At the next iteration, we pop Middlebury from the queue, and we check its neighbors. Its neighbors are Burlington (already visited) and Rutland. Rutland is new, so we append it to the visited list and to the queue. At this point, the queue looks like this:

['Swanton', 'Rutland']

At the next iteration, we pop Swanton from the front of the queue. Swanton鈥檚 only neighbor is St Albans and that鈥檚 already visited so we do nothing. At this point, the queue looks like this:

['Rutland']

At the next iteration we pop Rutland from the front of the queue. Rutland鈥檚 neighbors are Bennington and Middlebury. We鈥檝e already visited Middlebury, but we haven鈥檛 visited Bennington, so we add it to the list of visited towns and to the queue. At this point, the queue looks like this:

['Bennington']

At the next iteration we pop Bennington. Bennington鈥檚 sole neighbor is Rutland (already visited), so we do nothing. At this point, the queue looks like this:

[]

With nothing added to an empty queue, the queue remains empty, and the while loop terminates. (Remember: an empty list is falsey.) At this point we have a complete list of all the visited towns in the order they were visited:

St Johnsbury, Montpelier, Newport, White River Junction, Burlington, Brattleboro, St Albans, Middlebury, Swanton, Rutland, and Bennington.

Why don鈥檛 we just iterate over the entire route map?

In the example above, the graph is connected鈥攎eaning that every town can be reached from every other town by some route. However, this is not always the case. One reason we don鈥檛 just iterate over all keys in a dictionary representing routes is that if we did so, we鈥檇 discover all towns on the map, rather than only those that can be reached from a given starting point.

Consider highways connecting towns and cities in the Hawai鈥檌an islands. You can鈥檛 drive from Hilo to Honolulu鈥攖hey鈥檙e on different islands.

Sometimes we have graphs in which some vertices are not reachable from all other vertices. This can occur in directed graphs (graphs in which the edges have directions, like one-way streets), or in disconnected graphs (graphs that contain vertices or components that aren鈥檛 connected to other components).

Applications

Search algorithms on graphs have abundant applications, including solving mazes and finding routes, web crawlers, games like tic-tac-toe and various board games, and almost anything involving a network of some kind.

Supplemental reading

No generative AI was used in producing this material. This was written the old-fashioned way.

Footnotes

There are what are called directed edges, like one-way streets, but we鈥檒l only deal with undirected edges in this text.鈫╋笌

抖阴探探