Sets and programming languages
Sets in the context of programming languages
Sets aren’t quite the same in the context of programming languages. For example, in most commonly used programming languages with a set datatype, two sets containing the same elements are considered equal (equivalent in terms of elements) but not necessarily identical (the same object in memory). This distinction arises because sets are typically treated as separate objects, even if they contain identical data. However, there are scenarios or specialized languages where sets containing the same elements might be treated as identical due to specific constraints or optimization mechanisms.
Many imperative programming languages support sets (of a sort). For example, in Python, two sets are treated as equivalent if they contain the same objects. (Sets are written enclosed in curly braces in Python.)
>>> a = {1, 2, 3}
>>> b = {1, 2, 3}
>>> a == b # a and b are equivalent...
True
>>> a is b # but not identical!
False
In this case, a
and b
are Python sets containing the same objects (the int
literals 1
, 2
, and 3
). When we test for equivalence with the ==
operator, this yields True
—a
and b
are equivalent. However, they are not identical. If we test for identity using the is
operator, this yields False
—a
and b
are different objects, despite the fact that they contain the same elements. This is quite different from mathematical sets where, as we have seen, if A and B contain exactly the same elements then A and B are identical.
In a functional programming language like Haskell, we can compare two sets for equivalence in similar fashion.
> import qualified Data.Set as Set
ghci> let set1 = Set.fromList [1, 2, 3]
ghci> let set2 = Set.fromList [1, 2, 3]
ghci> set1 == set2
ghciTrue
> ghci
However, testing set identity is neither common nor straightforward in Haskell.
In some cases, the compiler may reuse equivalent objects—especially if they are immutable (as they would be in Haskell). Again, this is not always straightforward. The point is, that mathematical sets and “sets” in programming languages are rather different.
That said, let’s use Python to demonstrate certain set operations.
Creating a set
As we’ve seen, we can create a set in Python using curly braces.
>>> a = {1, 2, 3}
We can confirm the type using Python’s built-in isinstance
:
>>> isinstance(a, set)
True
Python sets cannot include duplicate values. If we try to assign a set literal containing repeated elements, Python ignores the repetitions.
>>> a = {1, 1, 2, 3}
>>> a
1, 2, 3} {
In Python, sets are mutable, so we can add (or remove) elements from a set. We add an element using the .add()
method.
>>> a = {1, 2, 3}
>>> a.add(4)
>>> a
1, 2, 3, 4} {
To remove an element, we can use the .discard()
method.
>>> a = {1, 2, 3, 4}
>>> a.discard(2)
>>> a
1, 3, 4} {
Notice that both of these methods modify the underlying set, and return None
.
If you’ve seen Python dictionaries before (which also use curly braces), you’ll recognize that the set literal {1, 2, 3}
is different from a dict literal, in that it has no key/value pairs (separated by :
). However, the literal {}
is treated as a dict in Python, and not an empty set. (Yes, unlike in mathematics, in Python we can have more than one empty set.) To create an empty set, we must use the set()
constructor.
>>> i_am_empty = set()
Set comprehensions
Python supports comprehensions for lists, tuples, and sets. This is a convenient shorthand.
>>> a = {n for n in range(10)}
>>> a
0, 1, 2, 3, 4, 5, 6, 7, 8, 9} {
In this form, a comprehension doesn’t do any more than the set()
constructor would.
>>> a = set(range(10))
>>> a
0, 1, 2, 3, 4, 5, 6, 7, 8, 9} {
Both of these are equivalent to \{n \: | \: n \in \mathbb{N}\text{, and } 0 \leq n \leq 9\} (on the condition that 0 is included among the natural numbers). If we understand the universe to be \mathbb{N} then we can exclude that predicate from the set builder:
\{n \: | \: 0 \leq n \leq 9\}.
However, we can construct sets in a manner similar to set builder notation by applying a predicate in the comprehension.
>>> a = {n for n in range(10) if n % 2} # odd integers between 0 and 10
>>> a
1, 3, 5, 7, 9} {
This is equivalent to \{n \: | \: n \equiv 1 \mod 2\text{, and } 0 < n < 10\}. We could also have a little fun using the definition of odd and do it this way:
>>> a = {2 * n + 1 for n in range(5)} # same
>>> a
1, 3, 5, 7, 9} {
Infinite sets
In programming, we cannot construct infinite sets. This is a significant difference between the world of programming and the world of mathematics. We could try, I suppose, but eventually we’d run out of memory.
>>> s = set()
>>> i = 0
>>> while True:
... s.add(i)+= 1
... i ...
If you try this, be prepared to wait a long time before getting an error message.
Finding the cardinality of a set
To find the cardinality of a set, we use the Python built-in len()
.
>>> a = {'c', 'a', 't'}
>>> len(a)
3
>>> len(set())
0
Testing for subset inclusion
Python allows us to test for subset inclusion using the .issubset()
method, but it also overloads comparison operators <
(strict subset) and <=
(subset or equal). The overloaded operators are analogous to \subset and \subseteq.
>>> a = {'gardenia', 'petunia'}
>>> b = {'petunia', 'begonia', 'gardenia'}
>>> a.issubset(b) # is `a` a subset of `b`?
True
>>> a <= b # is `a` a subset of `b`?
True
>>> a < b # is `a` a strict subset of `b`?
True
>>> b <= a # is `b` a subset of `a`?
False
Membership
We can test for membership in a set with the Python’s in
.
>>> s = {'gouda', 'cheddar', 'brie', 'romano'}
>>> 'cheddar' in s # is 'cheddar' an element of s?
True
>>> 'wombat' in s # is 'wombat' an element of s?
False
We can test for membership in the complement of a set with not in
.
>>> s = {'gouda', 'cheddar', 'brie', 'romano'}
>>> 'cheddar' not in s # is 'cheddar' in the complement of s?
False
>>> 'wombat' not in s # is 'wombat' in the complement of s?
True
Set difference
Set difference in Python is straightforward, as the -
operator is overloaded.
>>> a = {1, 2, 3, 4}
>>> b = {2, 4, 6, 8}
>>> a - b # a \ b
1, 3} {
This is equivalent to A \setminus B. Note that just as subtraction of real numbers is not commutative, neither is set difference.
Set union and intersection
Python does not overload the +
operator for set union. Instead, it provides the .union()
method.
>>> a = {1, 2, 3, 4}
>>> b = {2, 4, 6, 8}
>>> a.union(b)
1, 2, 3, 4, 6, 8} {
This is equivalent to A \cup B.
Python does overload the |
operator for set union, but tbh, that’s a little peculiar.
>>> a = {1, 2, 3, 4}
>>> b = {2, 4, 6, 8}
>>> a | b
1, 2, 3, 4, 6, 8} {
I don’t know about you, but I’ll stick with .union()
.
For intersection, Python provides the .intersection()
method.
>>> a = {1, 2, 3, 4}
>>> b = {2, 4, 6, 8}
>>> a.intersection(b)
2, 4} {
This is equivalent to A \cap B.
Both .union()
and .intersection()
can take multiple sets as arguments.
>>> a = {1, 2, 3, 4}
>>> b = {2, 4, 6, 8}
>>> c = {1, 3, 5, 7}
>>> a.union(b, c)
1, 2, 3, 4, 5, 6, 7, 8}
{>>> a.intersection(b, c)
set()
Here set()
is the empty set, returned by the .intersection()
method.
Notice that both methods return the result of the operation—they do not modify the underlying set(s).
Symmetric difference
Recall that symmetric difference between A and B is the set of all elements that are in A or in B, but not in both. We can implement this in Python by taking the union and subtracting the intersection:
>>> a = {1, 2, 3, 4}
>>> b = {2, 4, 6, 8}
>>> a.union(b) - a.intersection(b)
8, 1, 3, 6} {
We can also implement this by taking the union of the differences:
>>> a = {1, 2, 3, 4}
>>> b = {2, 4, 6, 8}
>>> (a - b).union(b - a)
8, 1, 3, 6} {
Notice that (A \cup B) \setminus (A \cap B) and (A \setminus B) \cup (B \setminus A) (as above) are equivalent.
Summing (numeric) elements of a set
If a Python set consists only of numeric elements (int
or float
), we can sum the elements of a set with Python’s built-in sum()
.
>>> a = {1, 2, 3, 4}
>>> sum(a)
10
Exercises
Write Python set comprehensions for the following sets (with universe \mathbb{N})
- \{n \: | \: n < 100 \}
- \{n \: | \: n \equiv 1 \mod 2\text{, and } n < 100 \}
- \{n \: | \: n \equiv 0 \mod 2\text{, and } n \equiv 0 \mod 3\text{, and } n < 100 \}
In Python, take the union \{1, 2, 3\} \cup \emptyset. What is the result?
In Python, take the intersection \{1, 2, 3\} \cap \emptyset. What is the result?
Write a Python function which takes two sets as arguments and returns their symmetric difference.
Test your function from #4 with the following inputs, A and B, and verify that it returns the correct values for A \oplus B.
A | B | A \oplus B |
---|---|---|
\{1, 2, 3\} | \{3, 4, 5\} | \{1, 2, 4, 5\} |
\{1, 2, 3\} | \{4, 5, 6\} | \{1, 2, 3, 4, 5, 6\} |
\{1, 2, 3\} | \{1, 2, 3\} | \emptyset |
\{1, 2, 3\} | \emptyset | \{1, 2, 3\} |
Copyright © 2024–2025 Clayton Cafiero