Using the csv
module
The csv
module
There’s a very common format in use for tabular data, the CSV or comma separated value format. Many on-line data sources publish data in this format, and all spreadsheet software can read from and write to this format. The idea is simple: columns of data are separated by commas. That’s it!
Here’s an example of some tabular data:
Year | FIFA champion |
---|---|
2018 | France |
2014 | Germany |
2010 | Spain |
2006 | Italy |
2002 | Brazil |
Here’s how it might be represented in CSV format:
Year,FIFA champion
2018,France
2014,Germany
2010,Spain
2006,Italy
2002,Brazil
Pretty simple.
What happens if we have commas in our data? Usually numbers don’t include comma separators when in CSV format. Instead, commas are added only when data are displayed. So, for example, we might have data like this (using format specifiers):
Country | 2018 population |
---|---|
China | 1,427,647,786 |
India | 1,352,642,280 |
USA | 327,096,265 |
Indonesia | 267,670,543 |
Pakistan | 212,228,286 |
Brazil | 209,469,323 |
Nigeria | 195,874,683 |
Bangladesh | 161,376,708 |
Russia | 145,734,038 |
and the CSV data would look like this:
Country,2018 population
China,1427647786
India,1352642280
USA,327096265
Indonesia,267670543
Pakistan,212228286
Brazil,209469323
Nigeria,195874683
Bangladesh,161376708
Russia,145734038
But what if we really wanted commas in our data?
Building | Address |
---|---|
Waterman | 85 Prospect St, Burlington, VT |
Innovation | 82 University Pl, Burlington, VT |
We’d probably break this into additional columns.
Building,Street,City,State
Waterman,85 Prospect St,Burlington,VT
Innovation,82 University Pl,Burlington,VT
What if we really, really had to have commas in our data? Oh, OK. Here are cousin David’s favorite bands of all time:
Band | Rank |
---|---|
Lovin’ Spoonful | 1 |
Sly and the Family Stone | 2 |
Crosby, Stills, Nash and Young | 3 |
Earth, Wind and Fire | 4 |
Herman’s Hermits | 5 |
Iron Butterfly | 6 |
Blood, Sweat & Tears | 7 |
The Monkees | 8 |
Peter, Paul & Mary | 9 |
Ohio Players | 10 |
Now there’s no way around commas in the data. For this we wrap the data including commas in quotation marks.
Band,Rank
Lovin' Spoonful,1
Sly and the Family Stone,2
"Crosby, Stills, Nash and Young",3
"Earth, Wind and Fire",4
Herman's Hermits,5
Iron Butterfly,6
"Blood, Sweat & Tears",7
The Monkees,8
"Peter, Paul & Mary",9
Ohio Players,10
(We’ll save the case of data which includes commas and quotation marks for another day.)
We can read data like this using Python’s csv
module.
import csv
with open('bands.csv', newline='') as csvfile:
= csv.reader(csvfile)
reader for row in reader:
print(row)
This prints:
'Band', 'Rank']
["Lovin' Spoonful", '1']
['Sly and the Family Stone', '2']
['Crosby, Stills, Nash and Young', '3']
['Earth, Wind and Fire', '4']
["Herman's Hermits", '5']
['Iron Butterfly', '6']
['Blood, Sweat & Tears', '7']
['The Monkees', '8']
['Peter, Paul & Mary', '9']
['Ohio Players', '10'] [
Notice that we have to create a special object, a CSV reader. We instantiate this object by calling the constructor function, csv.reader()
, and we pass to this function the file object we wish to read. Notice also that we read each row of our data file into a list, where the columns are separated by commas. That’s very handy!
We can write data to a CSV file as well.
import csv
= [['Deerhoof', 1],
bands 'Lightning Bolt', 2],
['Radiohead', 3],
['Big Thief', 4],
['King Crimson', 5],
['French for Rabbits', 6],
['Yak', 7],
['Boygenius', 8],
['Tipsy', 9],
['My Bloody Valentine', 10]]
[
with open('bands.csv', 'w', newline='') as csvfile:
= csv.writer(csvfile)
writer for item in bands:
writer.writerow(item)
This writes
Deerhoof,1
Lightning Bolt,2
Radiohead,3
Big Thief,4
King Crimson,5
French for Rabbits,6
Yak,7
Boygenius,8
Tipsy,9
My Bloody Valentine,10
to the file.
The newline=''
keyword argument
If you’re using a Mac or a Linux machine, the newline=''
keyword argument may not be strictly necessary when opening a file for use with a csv
reader or writer. However, omitting it could cause problems on a Windows machine and so it’s probably best to include it for maximum portability. The Python documentation recommends using it.
Iterating CSV reader objects
Unlike a list, tuple, or string, a CSV reader object can only be iterated once. This is because a CSV reader object is a type of iterator (which is different from an iterable). An iterator can be iterated only once. Accordingly, if you wish to iterate over a CSV file more than once, you’ll need to create a new CSV reader object each time.
Copyright © 2023–2025 Clayton Cafiero
No generative AI was used in producing this material. This was written the old-fashioned way.