Equality and Identity

Understanding structural and referential equality

Sébastien Boisgérault
Associate Professor, ITN Mines Paris – PSL

Equality

The expression x == y determines if the objects x and y are equal:

>>> 0 == 0
True
>>> 0 == 1
False

>>> "Hello!" == "Hello!"
True
>>> "Hello" == "World"
False

>>> [1, 2, 3] == [1, 2, 3]
True
>>> [1, 2, 3] == [4, 5, 6]
False

Equality tests in Python depend on the types of the compared objects: there is no universal interpretation of ==, so you must refer to the documentation of the types involved. You can also decide what equality means for the types you define yourself.
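As a minimal sketch of a user-defined equality, the class below implements the special method __eq__, which Python calls to evaluate ==. The Point class and its attributes are hypothetical and serve only as an illustration; here we choose to call two points equal when their coordinates are equal.

```python
class Point:
    """Hypothetical 2D point, used only to illustrate a user-defined ==."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Returning NotImplemented lets Python fall back to its default
        # behavior when the other object is not a Point.
        if not isinstance(other, Point):
            return NotImplemented
        # Our choice: two points are equal when their coordinates are equal.
        return (self.x, self.y) == (other.x, other.y)


p = Point(1, 2)
q = Point(1.0, 2.0)
same_coordinates = (p == q)             # True: compared by coordinates
different_points = (p == Point(0, 0))   # False: coordinates differ
print(same_coordinates, different_points)
```

Without the __eq__ method, p == q would be False here, since the default equality of user-defined classes compares identities.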

Numbers

Equality tests between numbers hold few surprises, apart from some properties of special floating-point numbers.

Note that the equality tests between numbers are permissive enough to allow comparing numbers of different types:

>>> 1 == True
True
>>> 1 == 1.0
True
>>> 1 == 1 + 0j
True

Floating-point number equality

Python floating-point numbers (float) have finite precision. Consequently, rounding errors in calculations can cause equality tests to fail. For example:

>>> 0.1 + 0.2 == 0.3
False

The test fails because the calculation introduces a (small) rounding error:

>>> 0.1 + 0.2
0.30000000000000004
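When an exact match is not required, one common workaround is to test whether two floats are close rather than equal. The sketch below uses math.isclose from the standard library with its default relative tolerance; the right tolerance depends on the application, so take this as an illustration rather than a rule.

```python
from math import isclose

total = 0.1 + 0.2

exact_match = (total == 0.3)         # False: the rounding error breaks ==
close_enough = isclose(total, 0.3)   # True: equal up to a relative tolerance
print(exact_match, close_enough)
```

Note that the default settings compare relative differences only; to compare values near zero, isclose also accepts an abs_tol argument.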

The IEEE 754 standard governs the representation of floating-point numbers and the calculations on them. It introduces special numbers: for instance, there are two distinct zeros ($0^+$ and $0^-$), which are nevertheless considered equal:

>>> +0.0
0.0
>>> -0.0
-0.0
>>> +0.0 == -0.0
True
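Since == cannot tell the two zeros apart, another tool is needed when the sign matters. A possible approach, sketched here with math.copysign from the standard library, is to transfer the sign of the zero onto 1.0:

```python
from math import copysign

positive_zero = 0.0
negative_zero = -0.0

zeros_equal = (positive_zero == negative_zero)    # True: == ignores the sign
sign_of_positive = copysign(1.0, positive_zero)   # 1.0
sign_of_negative = copysign(1.0, negative_zero)   # -1.0
print(zeros_equal, sign_of_positive, sign_of_negative)
```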

More surprisingly, the “not-a-number” nan is a special value… that is not equal to itself! (All “not-a-numbers” are considered different.)

>>> from math import nan
>>> nan == nan
False

You need to use the isnan function to test whether a value is a not-a-number:

>>> from math import isnan
>>> isnan(nan)
True
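A typical place where this matters is filtering. In the sketch below (the measurements list is made-up data), a comparison with nan silently keeps every value, whereas isnan removes the not-a-numbers as intended:

```python
from math import isnan, nan

# Hypothetical data containing two not-a-number values.
measurements = [1.5, nan, 2.0, float("nan")]

# x != nan is True for every x, including nan itself: nothing is removed.
wrong = [x for x in measurements if x != nan]

# isnan identifies the not-a-numbers reliably.
valid = [x for x in measurements if not isnan(x)]

print(len(wrong), valid)
```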

Collections

Collections (lists, tuples, dictionaries, sets, etc.) delegate the equality test to the elements that compose them, recursively if these elements are themselves collections. Thus:

>>> [] == [0]
False
>>> [0] == [0]
True
>>> [0] == [1]
False
>>> [0] == [0, 0]
False
>>> [0] == [0.0]
True
>>> [[0]] == [[0.0]]
True

For dictionaries:

>>> {"a": 1, "b": 2} == {"a": 1, "b": 2}
True

And:

>>> {"a": 1, "b": 2} == {"a": 1.0, "b": 2.0}
True

The order of key-value pairs does not matter:

>>> {"a": 1, "b": 2} == {"b": 2, "a": 1}
True

But a single key or value that differs between the two dictionaries is enough to make them unequal:

>>> {"a": 1, "b": 2} == {"a": 1, "b": 2, "c": 3}
False
>>> {"a": 1, "b": 2} == {"a": 2, "b": 1}
False
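The same element-by-element rule can be checked on the other collection types mentioned above. The short sketch below also shows a point that is easy to overlook: a list and a tuple with the same elements are not equal, because the types of the collections themselves differ.

```python
# Tuples compare element by element, like lists.
tuples_equal = (1, 2, 3) == (1.0, 2.0, 3.0)               # True

# Sets ignore the order in which the elements are written.
sets_equal = {1, 2, 3} == {3, 2, 1}                       # True

# A list is never equal to a tuple, even with the same elements.
list_vs_tuple = [1, 2, 3] == (1, 2, 3)                    # False

# The test is recursive: nested collections are compared the same way.
nested = {"a": [1, (2, 3)]} == {"a": [1.0, (2.0, 3.0)]}   # True

print(tuples_equal, sets_equal, list_vs_tuple, nested)
```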

Strings

String comparison behaves as expected most of the time:

>>> "Hello" == "Hello"
True
>>> "Hello" == "Halo"
False

The exception is that the Unicode standard sometimes offers several ways to produce the same visible character. There is a “Latin small letter e with acute” character:

>>> "\xe9"
'é'

But also a “combining acute accent” symbol that can be combined with “e”:

>>> "e\u0301"
'é'

The two code point sequences are different, so the two strings are considered different:

>>> "\xe9" == "e\u0301"
False

This result is much more surprising when the strings are written with literal characters, since the two forms look identical on screen:

>>> "é" == "é"
False
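If strings should be compared by what they display rather than by their code points, a possible remedy is to normalize them first. The sketch below uses unicodedata.normalize from the standard library with the NFC form, which composes "e" and the combining accent into a single code point; whether NFC is the right form depends on the application.

```python
import unicodedata

composed = "\xe9"        # one code point: e with acute
decomposed = "e\u0301"   # two code points: "e" + combining acute accent

raw_equal = (composed == decomposed)   # False: the code points differ

# After normalization, both strings use the same code point sequence.
nfc_equal = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)                                      # True

lengths = (len(composed), len(decomposed))   # (1, 2)
print(raw_equal, nfc_equal, lengths)
```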

Identity

The expression x is y determines if x and y have the same identity:

x is y

The negation of == is !=; the negation of is is the operator is not:

x != y

x is not y

The identity x is y means that the variables x and y refer to the same Python object: the data is at the same address in memory. A perfect copy of an object therefore has a different identity from the original, even though it is considered equal to it. Conversely, if two objects are identical (that is, they have the same identity and are one and the same object), then they are almost always equal; the nan value seen earlier, which is not equal to itself, is a rare exception.

Equality and Identity

As an example, let’s consider the three lists a, b and c:

>>> a = [1, 2, 3]
>>> b = [1, 2, 3]
>>> c = b

The lists a and b are equal, as are b and c. However, a and b are not identical: they do not designate the same object (in memory). The variables b and c, on the other hand, designate the same object:

>>> a == b
True
>>> b == c
True
>>> a is b
False
>>> b is c
True

We can verify that b and c designate the same object by evaluating the identifier of these objects (an integer) with the function id; the exact values vary from one run to the next:

>>> id(a)
140636096399680
>>> id(b)
140636098130688
>>> id(c)
140636098130688
>>> id(a) == id(b)
False
>>> id(b) == id(c)
True

One important consequence of this distinction: modifications to the list (designated by) b will affect the list c (which is the same object), but not the list a (which is a distinct object):

>>> b.append(4)
>>> b
[1, 2, 3, 4]
>>> c
[1, 2, 3, 4]
>>> a
[1, 2, 3]
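When this sharing is not wanted, the usual remedy is to work on a copy instead of a second name for the same list. A minimal sketch, using the copy method of lists (the variable d is introduced here for illustration):

```python
b = [1, 2, 3]
c = b           # c is another name for the same list
d = b.copy()    # d is a new list, equal to b but distinct from it

b.append(4)

print(c)   # the change is visible through c
print(d)   # d is unaffected
```

Note that copy produces a shallow copy: the new list is distinct, but its elements are the same objects as those of the original. This is harmless for immutable elements such as integers.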

To be or not to be

Although composed of two keywords separated by a space, is not is an operator in its own right. The expression x is not y is equivalent to not (x is y) … but more readable! If you need to use is and not as distinct operators, to mean x is (not y), you should keep the parentheses. Thus, with

>>> x = 1
>>> y = True

we have

>>> x is y
False
>>> x is not y
True
>>> not (x is y)
True

but

>>> not y
False
>>> x is (not y)
False