Representation error
Representation error of numeric types
Representation error occurs when we try to represent, using a finite number of bits or digits, a number that cannot be represented exactly in the chosen system. For example, in our familiar decimal system:
| number | decimal representation | representation error |
|---|---|---|
| 1 | 1 | 0 |
| 1/3 | 0.3333333333333333 | 0.0000000000000000333\ldots |
| 1/7 | 0.1428571428571428 | 0.0000000000000000571428\ldots |
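Where do the values in the error column come from? Here's a quick check using Python's fractions module (the string passed to Fraction is just the 16-digit approximation from the table):

>>> from fractions import Fraction
>>> Fraction(1, 3) - Fraction("0.3333333333333333")
Fraction(1, 30000000000000000)

That difference, written out in decimal, is exactly the 0.0000000000000000333\ldots shown above.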
Natural numbers, integers, rational numbers, and real numbers
You probably know that the set of all natural numbers
\mathbb{N} = \{0, 1, 2, 3, \ldots\}
is infinite.
From there it’s not a great leap to see that the set of all integers
\mathbb{Z} = \{\ldots, -2, -1, 0, 1, 2, \ldots\}
is infinite too.
The set of all rational numbers, \mathbb{Q}, and the set of all real numbers, \mathbb{R}, are infinite too.
This fact—that these sets are of infinite size—has implications for numeric representation and numeric calculations on computers.
When we work with computers, numbers are given integer or floating-point representations. For example, in Python we have distinct types, int and float, for holding integer and floating-point numbers, respectively.
I won’t get into too much detail about how these are represented in binary, but here’s a little bit of information.
Integers
Representation of integers is relatively straightforward. Integers are represented as binary numbers with a position set aside for the sign. So, 12345 would be represented as 00110000 00111001. That's
0 \times 2^{15} + 0 \times 2^{14} + 1 \times 2^{13} + 1 \times 2^{12}
+ \; 0 \times 2^{11} + 0 \times 2^{10} + 0 \times 2^{9} + 0 \times 2^{8}
+ \; 0 \times 2^{7} + 0 \times 2^{6} + 1 \times 2^{5} + 1 \times 2^{4}
+ \; 1 \times 2^{3} + 0 \times 2^{2} + 0 \times 2^{1} + 1 \times 2^{0}
This works out to
8192 + 4096 + 32 + 16 + 8 + 1 = 12345.
Negative integers are a little different. If you’re curious about this, see the Wikipedia article on Two’s Complement.
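We can verify the expansion above from the Python shell. Here's a quick sketch using the built-in format function (the 16-bit width, and the & 0xFFFF mask used to view the two's-complement pattern, are choices made for this illustration, not anything Python requires):

>>> format(12345, '016b')
'0011000000111001'
>>> format(-12345 & 0xFFFF, '016b')
'1100111111000111'

The second line shows the 16-bit two's-complement pattern for -12345. (Python's own ints are arbitrary-precision, so the fixed 16-bit width here is purely illustrative.)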
Floating-point numbers and IEEE 754
Floating-point numbers are a little tricky. Stop and think for a minute: How would you represent floating-point numbers? (It’s not as straightforward as you might think.)
Floating-point numbers are represented using the IEEE 754 standard (IEEE stands for “Institute of Electrical and Electronics Engineers”).1 There are three parts to this representation: the sign, the exponent, and the fraction (also called the mantissa or significand), and all of these must be expressed in binary. IEEE 754 uses either 32 or 64 bits for representing floating-point numbers. The issue with representation lies in the fact that there's a fixed number of bits available: in the 32-bit (single-precision) format, one bit for the sign, eight bits for the exponent, and the remaining 23 bits for the fractional portion; the 64-bit (double-precision) format uses one sign bit, 11 exponent bits, and 52 fraction bits. Either way, with a finite number of bits, there's a finite number of values that can be represented without error.
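If you'd like to see these three fields for yourself, here's a minimal sketch using Python's struct module to view the bits of a 64-bit double (the slicing below assumes the 64-bit layout just described: 1 sign bit, 11 exponent bits, 52 fraction bits):

>>> import struct
>>> (bits,) = struct.unpack('>Q', struct.pack('>d', 0.1))
>>> s = f'{bits:064b}'
>>> s[0], s[1:12], s[12:]
('0', '01111111011', '1001100110011001100110011001100110011001100110011010')

Look at the fraction field: the pattern 1001 repeats and is then rounded off in the final bits. That's our first hint that 0.1 has no exact binary representation.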
Examples of representation error
Now we have some idea of how integers and floating-point numbers are represented in a computer. Consider this: We have some fixed number of bits set aside for these representations.2 So we have a limited number of bits we can use to represent numbers on a computer. Do you see the problem?
The set of all integers, \mathbb{Z}, is infinite. The set of all real numbers, \mathbb{R}, is infinite. Any computer we can manufacture is finite. Now do you see the problem?
There exist infinitely more integers, and infinitely more real numbers, than we can represent in any system with a fixed number of bits. Let that soak in.
For any given machine or finite representation scheme, there are infinitely many numbers that cannot be represented in that system! This means that many numbers are represented by an approximation only.
Let’s return to the example of 1/3 in our decimal system. We can never write down enough digits to the right of the decimal point so that we have the exact value of 1/3.
0.333333333333333333333\ldots
No matter how far we extend this expansion, the value will only be an approximation of 1/3. Notice, however, that the non-terminating expansion is a consequence of our choice of base (10).
What if we were to represent the same number in base 3? There, 1/3 is simply 0.1: easy to represent exactly!
Of course our computers use binary, and so in that system (base 2) there are some numbers that can be represented accurately, and an infinite number that can only be approximated.
Here’s the canonical example, in Python:
>>> 0.1
0.1
>>> 0.2
0.2
>>> 0.1 + 0.2
0.30000000000000004
Wait! What? Yup. Something strange is going on. Python rounds values when displaying in the shell. Here’s proof:
>>> print(f'{0.1:.56f}')
0.10000000000000000555111512312578270211815834045410156250
>>> print(f'{0.2:.56f}')
0.20000000000000001110223024625156540423631668090820312500
>>> print(f'{0.1 + 0.2:.56f}')
0.30000000000000004440892098500626161694526672363281250000
The last, 0.1 + 0.2, is an example of representation error that accumulates to the point that it is no longer hidden by Python's automatic rounding, hence
>>> 0.1 + 0.2
0.30000000000000004
Remember, in binary we only work with powers of two. Since the binary expansions of 0.1 and 0.2 are non-terminating, there's no way to represent these numbers exactly with a fixed number of bits.
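In fact, Python will show you the exact rational value it stores for a float literal, via the fractions module:

>>> from fractions import Fraction
>>> Fraction(0.1)
Fraction(3602879701896397, 36028797018963968)
>>> Fraction(0.25)
Fraction(1, 4)

The denominator 36028797018963968 is 2^{55}: this fraction is the closest 53-bit binary value to 1/10. By contrast, 0.25 is a power of two, so it is represented exactly.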
What’s the point?
- The subset of real numbers that can be accurately represented within a given positional system depends on the base chosen (1/3 cannot be represented without error in the decimal system, but it can be in base 3).
- It’s important that we understand that no finite machine can represent all real numbers without error.
- Most numbers that we provide to the computer and which the computer provides to us in the form of answers are only approximations.
- Perhaps most important from a practical standpoint, representation error can accumulate with repeated calculations (see the sketch after this list).
- Understanding representation error can prevent you from chasing bugs when none exist.
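Here's a small sketch of that accumulation, along with math.fsum, one standard-library tool that mitigates it by tracking partial sums:

>>> total = 0.0
>>> for _ in range(10):
...     total += 0.1
...
>>> total
0.9999999999999999
>>> total == 1.0
False
>>> from math import fsum
>>> fsum([0.1] * 10)
1.0

This is also why comparing floats with == is risky; prefer a tolerance-based comparison such as math.isclose.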
For more, see:
- Floating Point Arithmetic: Issues and Limitations: https://docs.python.org/3/tutorial/floatingpoint.html
Copyright © 2023–2025 Clayton Cafiero
No generative AI was used in producing this material. This was written the old-fashioned way.