Programming and the Python interpreter - Clayton Cafiero

Programming and the Python interpreter

Author

Clayton Cafiero

Published

2025-01-05

Why learn a programming language?

Computers are powerful tools. Computers can perform all manner of tasks: communication, computation, managing and manipulating data, modeling natural phenomena, and creating images, videos, and music, just to name a few. However, computers don’t read minds (yet), and thus we have to provide instructions to computers so they can perform these tasks.

Computers don’t speak natural languages (yet); they only understand binary code. Binary code is unreadable by humans.

For example, a portion of an executable program might look like this (in binary):

0110101101101011 1100000000110101 1011110100100100
1010010100100100 0010100100010011 1110100100010101 
1110100100010101 0001110110000000 1110000111100000 
0000100000000001 0100101101110100 0000001000101011
0010100101110000 0101001001001001 1010100110101000

This is unintelligible. It’s bad enough to try to read it, and it would be even worse if we had to write our computer programs in this fashion.
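To get a feel for how quickly binary becomes unreadable, here is a small sketch that shows the bits behind a short piece of text. Note that this displays text data, not machine instructions, and the helper name to_bits is made up for this illustration.

```python
def to_bits(text):
    """Return the UTF-8 bytes of text as space-separated groups of 8 bits."""
    return " ".join(format(byte, "08b") for byte in text.encode("utf-8"))


bits = to_bits("Hi")
print(bits)  # 01001000 01101001
```

Even a two-letter word takes sixteen binary digits, and nothing about those digits suggests the letters they stand for.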

Computers don’t speak human language, and humans don’t speak computer language. That’s a problem. The solution is programming languages.

Programming languages allow us, as humans, to write instructions in a form we can understand and reason about, and then have these instructions converted into a form that a computer can read and execute.

There is a tremendous variety of programming languages. Some languages are low-level, like assembly language, where there’s roughly a one-to-one correspondence between machine instructions and assembly language instructions. Here’s a “Hello World!” program in assembly language (for ARM64 architecture):¹

.equ STDOUT, 1
.equ SVC_WRITE, 64
.equ SVC_EXIT, 93
 
.text
.global _start
 
_start:
    stp x29, x30, [sp, -16]!
    mov x0, #STDOUT
    ldr x1, =msg
    mov x2, 13
    mov x8, #SVC_WRITE
    mov x29, sp
    svc #0 // write(stdout, msg, 13);
    ldp x29, x30, [sp], 16
    mov x0, #0
    mov x8, #SVC_EXIT
    svc #0 // exit(0);
 
msg:    .ascii "Hello World!\n"
.align 4

Now, while this is a lot better than a string of zeros and ones, it’s not so easy to read, write, and reason about code in assembly language.

Fortunately, we have high-level languages. Here’s the same program in C++:

#include <iostream>
 
int main () {
  std::cout << "Hello World!" << std::endl;
}

Much better, right?

In Python, the same program is even more succinct:

print('Hello World!')

Notice that as we progress from machine code to Python, we’re increasing abstraction. Machine code is the least abstract. These are the actual instructions executed on your computer. Assembly code uses human-readable symbols, but still retains (for the most part) a one-to-one correspondence between assembly instructions and machine instructions. In the case of C++, we’re using a library, iostream, to provide us with an abstraction of an output stream, std::cout, and we’re just sending strings to that stream. In the case of Python, we simply say “print this string” (more or less). This is the most abstract of these examples: we needn’t concern ourselves with low-level details.
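Similar layers of abstraction can be seen within Python itself. The sketch below prints the same message three ways, each a little closer to what the assembly listing does. It assumes that file descriptor 1 refers to standard output, as STDOUT does in the assembly example; the variable names are chosen only for illustration.

```python
import os
import sys

message = "Hello World!\n"

# Most abstract: print takes care of the newline and the output stream.
print("Hello World!")

# One level down: write the text to the standard output stream ourselves.
sys.stdout.write(message)
sys.stdout.flush()  # make sure buffered text is sent before the raw write below

# Lower still: hand raw bytes to the operating system.
# Assumption: file descriptor 1 is standard output.
data = message.encode("ascii")
written = os.write(1, data)  # os.write returns the number of bytes written
```

The message encodes to 13 bytes, the same length that the assembly program passes to its write call.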

Now, you may be wondering: How is it that we can write programs in such languages when computers only understand zeros and ones? There are programs which convert high-level code into machine code for execution. There are two main approaches when dealing with high-level languages: compilation and interpretation.

Compilation and interpretation

Generally speaking, compilation is a process whereby source code in some programming language is converted into binary code for execution on a particular architecture. The program which performs this conversion is called a compiler. The compiler takes source code (in some programming language) as an input, and yields binary machine code as an output.

Interpreted languages work a little differently. Python is an interpreted language. In the case of Python, intermediate code is generated, and then this intermediate code is read and executed by another program. The intermediate code is called bytecode.

While the difference between compilation and interpretation is not quite as clear-cut as suggested here, these descriptions will serve for the present purposes.
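To make the idea of intermediate code a little more concrete, here is a minimal sketch using Python’s built-in compile and eval functions. It assumes CPython, the standard Python implementation; the name "<example>" is only a placeholder label standing in for a filename.

```python
source = "6 * 7"

# Translate the source text into a code object, which holds the bytecode.
code = compile(source, "<example>", "eval")

# The bytecode itself is a sequence of bytes. Its exact contents differ
# between Python versions, so we only look at its type here.
bytecode = code.co_code
print(type(bytecode).__name__)  # bytes

# Hand the code object back to Python for execution.
result = eval(code)
print(result)  # 42
```

The source text is translated once, and it is the translated form, not the original text, that gets executed.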

The Python interpreter

Python is an interpreted language with intermediate bytecode. While you don’t need to understand all the details of this process, it’s helpful to have a general idea of what’s going on.

Say you have written this program and saved it as hello_world.py.

print('Hello World!')

You may run this program from the terminal (command prompt), thus:

$ python hello_world.py

where $ indicates a command prompt (your prompt may vary). When this runs, the following is printed to the console:

Hello World!

When we run this program, Python first reads the source code, then produces the intermediate bytecode, then executes each instruction in the bytecode.

1. By issuing the command python hello_world.py, we invoke the Python interpreter and tell it to read and execute the program hello_world.py (.py is the file extension used for Python files).
2. The Python interpreter reads the file hello_world.py.
3. The Python interpreter produces an intermediate, bytecode representation of the program in hello_world.py.
4. The bytecode is executed by the Python Virtual Machine.
5. This results in the words “Hello World!” being printed to the console.
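The steps above can be roughly imitated from within Python. This is only a sketch of the idea, not how the interpreter is actually implemented: the function name run_like_the_interpreter is made up for this example, the program is written to a temporary directory so that the example is self-contained, and the output is captured in a string rather than sent straight to the console.

```python
import contextlib
import dis
import io
import tempfile
from pathlib import Path

SOURCE = "print('Hello World!')\n"


def run_like_the_interpreter(path):
    """Read a Python file, compile it to bytecode, and execute it."""
    source = Path(path).read_text()            # read the file
    code = compile(source, str(path), "exec")  # produce bytecode
    # Collect the names of the bytecode instructions, for inspection.
    opnames = [ins.opname for ins in dis.get_instructions(code)]
    captured = io.StringIO()
    with contextlib.redirect_stdout(captured):
        exec(code, {"__name__": "__main__"})   # execute the bytecode
    return opnames, captured.getvalue()


with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "hello_world.py"
    script.write_text(SOURCE)
    opnames, output = run_like_the_interpreter(script)

print(opnames)          # instruction names vary between Python versions
print(output, end="")   # Hello World!
```

The list of instruction names gives a glimpse of the bytecode. The standard library’s dis module can display it in more detail.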

So you see, there’s a lot going on behind the scenes when we run a Python program.² However, this allows us to write programs in a high-level language that we as humans can understand.

Supplemental reading

Whetting Your Appetite, from The (Official) Python Tutorial.³

No generative AI was used in producing this material. This was written the old-fashioned way.

Footnotes

Assembly language code sample from Rosetta Code.
Actually, there’s quite a bit more going on behind the scenes, but this should suffice for our purposes. If you’re curious and wish to learn more, ask!
