8.2. String Interning
Similarly, Python reuses objects to represent identical string literals in your code rather than making separate copies of the same string. To see this in practice, enter the following into the interactive shell:
>>> spam = 'cat'
>>> eggs = 'cat'
>>> spam is eggs
True
>>> id(spam), id(eggs)
(140426210659696, 140426210659696)
Python notices that the 'cat' string literal assigned to eggs is the same as the 'cat' string literal assigned to spam, so instead of making a second, redundant string object, it just assigns eggs a reference to the same string object that spam uses. This explains why the IDs of their strings are the same.
This optimization is called string interning, and like the preallocated integers, it's nothing more than a CPython implementation detail. You should never write code that relies on it. Also, this optimization won't catch every possible identical string. Trying to identify every instance in which you can use an optimization often takes up more time than the optimization would save. For example, try creating the 'cat' string from 'c' and 'at' in the interactive shell; you'll notice that CPython creates the final 'cat' string as a new string object rather than reusing the string object made for spam:
>>> bacon = 'c'
>>> bacon += 'at'
>>> spam is bacon
False
>>> id(spam), id(bacon)
(140426210659696, 140426105391024)
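Because interning can't be relied on, the == operator is the safe way to check whether two strings hold the same text. The following is a minimal sketch rather than a recommended pattern: the build_cat() helper is a hypothetical name introduced only for this illustration, while sys.intern() is a standard library function that returns the interned copy of a string. It shows that two equal strings built in different ways compare equal with ==, and that explicitly interning both yields a single shared object.

```python
import sys


def build_cat():
    # Hypothetical helper: assembles 'cat' at runtime instead of
    # writing it as a literal in the source code.
    return ''.join(['c', 'at'])


spam = 'cat'
bacon = build_cat()

# == compares the characters in the strings, so it does not depend
# on whether the interpreter happened to reuse an object.
values_match = spam == bacon

# sys.intern() returns the interned copy for a given string value,
# so equal strings passed through it refer to one object.
interned_spam = sys.intern(spam)
interned_bacon = sys.intern(bacon)
same_object = interned_spam is interned_bacon

print(values_match)
print(same_object)
```

Whether spam is bacon evaluates to True before the sys.intern() calls depends on the interpreter and its version, which is why the sketch doesn't test it.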
String interning is an optimization technique that interpreters and compilers use for many different languages. You’ll find further details at https://en.wikipedia.org/wiki/String_interning.