Underscores, dunders and everything nice

Pavlin Gergov

Dec 4, 2017

Categories:Python

We’re going to talk about underscores, dunders, encapsulation, and magic methods in Python.

Motivation

Python is an easy-to-learn language that provides a stepping-stone into the world of programming, but some of its features are confusing for beginners and advanced developers alike. By the end of this article, you’ll know when and how to use underscores, dunders, magic methods, and encapsulation in Python.

Introduction

Single _ and double __ leading or trailing underscores have different meanings in Python. Most of the time it’s just a convention (hint to the programmer), but there are cases when they’re enforced by the Python interpreter. We’re going to talk about:

Underscore: _
Single trailing underscore: foo_
Single leading underscore: _ham
Double leading underscores: __spam
Double leading and trailing underscores: __eggs__

Dunders

Double underscores are referred to as dunders because they appear quite often in Python code, and the shortened “dunder” is easier to say than “double underscore”.

Unused Variables

A single stand-alone underscore is used to indicate that a variable is temporary or insignificant. This meaning is a convention only and doesn’t trigger any special behavior in the Python interpreter. A single underscore is just a valid variable name that’s used for this purpose. Let’s see a couple of examples:

If you’re iterating, and you are not using the yielded value from the iterator, you can use a single underscore to indicate that it’s just a temporary value:

>>> for _ in range(3):
...     print('Zen of Python')
... 
Zen of Python
Zen of Python
Zen of Python

If you’re unpacking values from a tuple, but you don’t need some of its values, you can use a single underscore to mark them as insignificant:

>>> foo, _ = ('bar', 42)
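The same idea extends to longer sequences. The following is a small sketch (the `record` and `pairs` values are made up for illustration) that uses starred unpacking to discard several values at once, and an underscore to skip one position while iterating:

```python
# Keep only the first and last fields; *_ collects everything in between.
record = ('bar', 42, 3.14, None, 'baz')
first, *_, last = record
print(first, last)  # bar baz

# Ignore the second element of each pair while iterating.
pairs = [('a', 1), ('b', 2), ('c', 3)]
letters = [letter for letter, _ in pairs]
print(letters)  # ['a', 'b', 'c']
```

Keep in mind that `_` is still an ordinary name here: after the first unpacking it simply holds the list of discarded values.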

In a Python REPL the single underscore is a special variable that represents the result of the last evaluated expression:

>>> 5 + 5
10
>>> _
10

Bonus feature: when doing internationalization in Python code with Django, it’s a convention to import the gettext function as _ to save typing:

from django.utils.translation import gettext as _

def django_view(request):
    translated_text = _("The zen of Python")
    ...

https://docs.djangoproject.com/en/2.2/topics/i18n/translation/#internationalization-in-python-code

Keyword Collision

Sooner or later you’ll want to use a Python keyword like class, or a built-in name like type or list, as a variable name because it fits well in your context. Reusing a built-in name is bad practice because it shadows the built-in, and using a keyword always ends in a SyntaxError:

>>> def foo(class):
  File "<stdin>", line 1
    def foo(class):
                ^
SyntaxError: invalid syntax

To avoid naming conflicts append a single underscore to the variable name:

>>> def foo(class_):
...     return 42
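If you’re not sure whether a name is reserved, the standard library can tell you. The sketch below uses the keyword and builtins modules; the `safe_name` helper is hypothetical and exists only to illustrate the trailing-underscore convention:

```python
import builtins
import keyword

# Keywords cannot be used as names at all; built-ins can, but get shadowed.
print(keyword.iskeyword('class'))  # True
print(keyword.iskeyword('list'))   # False
print(hasattr(builtins, 'list'))   # True


def safe_name(name):
    """Hypothetical helper: append an underscore if `name` would clash."""
    if keyword.iskeyword(name) or hasattr(builtins, name):
        return name + '_'
    return name


print(safe_name('class'))  # class_
print(safe_name('list'))   # list_
print(safe_name('spam'))   # spam
```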

Private Variables

The leading underscore prefix is used as a hint to the programmer that a variable or method is intended for internal use. However, this convention isn’t enforced by the Python interpreter and it doesn’t affect the behavior of your programs, because Python doesn’t have a strong distinction between private and public variables like Java or C++:

>>> class Foo:
...     def __init__(self):
...         self.spam = 'spam'
...         self._ham = '_ham'
... 
>>> foo = Foo()
>>> foo.spam
'spam'
>>> foo._ham
'_ham'
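A common way to put this convention to work is to keep the data in an underscore-prefixed attribute and expose it through a read-only property. This is a minimal sketch; the `Account` class and its `balance` attribute are invented for illustration:

```python
class Account:
    def __init__(self, balance):
        self._balance = balance  # internal by convention only

    @property
    def balance(self):
        """Public, read-only view of the internal attribute."""
        return self._balance


account = Account(100)
print(account.balance)   # 100
print(account._balance)  # 100, nothing stops direct access

try:
    account.balance = 5  # the property defines no setter
except AttributeError:
    print('balance is read-only')
```

The property documents which part of the class is public, while the underscore still only asks callers politely to leave `_balance` alone.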

The leading underscore does have an impact on how names are imported from modules. Consider the following module (a module is just a file that contains Python definitions and statements):

# example.py

def foo():
    return 42


def _bar():
    return 42

A wildcard import (from example import *) brings in all names from the module except those beginning with an underscore.

Note: Avoid wildcard imports in real code, as they make it unclear which names are present in the namespace and make the code less readable.

>>> from example import *
>>> foo()
42
>>> _bar()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name '_bar' is not defined

You can override this behavior by explicitly defining the value of __all__ in the module:

# example.py

__all__ = ['foo', '_bar']


def foo():
    return 42


def _bar():
    return 42

>>> from example import *
>>> foo()
42
>>> _bar()
42

Another way to import a name with a leading underscore is by not using the import * syntax, but a regular import instead:

>>> from example import foo, _bar
>>> foo()
42
>>> _bar()
42
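To try the wildcard rules without creating a file, the sketch below builds a throwaway module in memory. The module name `underscore_demo` is an arbitrary choice for this example, and registering a module in sys.modules by hand is only a demonstration trick, not something to do in production code:

```python
import sys
import types

source = '''
def foo():
    return 42

def _bar():
    return 42
'''

# Build the module in memory instead of writing example.py to disk.
module = types.ModuleType('underscore_demo')
exec(source, module.__dict__)
sys.modules['underscore_demo'] = module

# Without __all__, names with a leading underscore are skipped.
namespace = {}
exec('from underscore_demo import *', namespace)
print('foo' in namespace, '_bar' in namespace)  # True False

# With __all__, exactly the listed names are imported.
module.__all__ = ['foo', '_bar']
namespace_with_all = {}
exec('from underscore_demo import *', namespace_with_all)
print('foo' in namespace_with_all, '_bar' in namespace_with_all)  # True True
```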

Name mangling

All of the naming patterns so far have been conventions agreed upon by the Python community. However, attributes that start with double underscores inside a class are rewritten by the Python interpreter. This helps to avoid naming collisions in subclasses. Let’s see how name mangling works:

>>> class Foo:
...     def __init__(self):
...         self.__spam = 'spam'
... 
>>> foo = Foo()
>>> foo.__spam
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Foo' object has no attribute '__spam'

The __spam attribute is not accessible on the Foo instance under that name, because it has been renamed to _Foo__spam. This is the so-called name mangling:

>>> foo._Foo__spam
'spam'
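You can see the rewritten name directly by inspecting the instance’s attribute dictionary. A short sketch, reusing the Foo class from above:

```python
class Foo:
    def __init__(self):
        self.__spam = 'spam'  # stored as _Foo__spam


foo = Foo()
print(vars(foo))                   # {'_Foo__spam': 'spam'}
print(hasattr(foo, '__spam'))      # False
print(getattr(foo, '_Foo__spam'))  # spam
```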

Name mangling is done under the hood. Inside the class body, self.__spam is rewritten in the same way, so if you create a getter method for your class you won’t notice it:

>>> class Foo:
...     def __init__(self):
...         self.__spam = 'spam'
...
...     def get_spam(self):
...         return self.__spam
... 
>>> foo = Foo()
>>> foo.get_spam()
'spam'

If you decide to extend Foo and override the __spam attribute by assigning a different value, the new attribute will again be rewritten by the interpreter, because name mangling is applied to both classes. Unless you override the get_spam method, calling it returns Foo’s original attribute value. To get the new attribute’s value, create a new method. All of this is possible because both mangled attributes exist on the instance of the extended class:

>>> class ExtendsFoo(Foo):
...     def __init__(self):
...         super().__init__()
...         self.__spam = 'extended spam'
...
...     def get_extended_spam(self):
...         return self.__spam
... 
>>> extended_foo = ExtendsFoo()
>>> extended_foo.get_spam()
'spam'
>>> extended_foo.get_extended_spam()
'extended spam'
>>> extended_foo._Foo__spam
'spam'
>>> extended_foo._ExtendsFoo__spam
'extended spam'

Encapsulation in Python lacks strict access control such as private and protected attributes. Name mangling will stop you from accidentally accessing or overriding an attribute, but you can intentionally do pretty much everything as long as you’re aware of how the language works.

In the examples above we’ve used instance attributes, but the same rules apply to method names too. In short, name mangling affects all names that start with two underscores (and don’t end with two) in a class context. With this in mind, let’s take a look at the following example:

>>> _Foo__mangled = 42
>>> class Foo:
...     def bar(self):
...         return __mangled
... 
>>> foo = Foo()
>>> foo.bar()
42

Cool, right? But please don’t do this. No one deserves to be abused.
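A more legitimate use of mangled method names is protecting a method that the class itself relies on. The sketch below is adapted from the example in the Python tutorial’s section on private variables: the base class keeps a private copy of update, so a subclass can change the method’s signature without breaking __init__:

```python
class Mapping:
    def __init__(self, iterable):
        self.items_list = []
        self.__update(iterable)  # always calls Mapping's own update()

    def update(self, iterable):
        for item in iterable:
            self.items_list.append(item)

    __update = update  # private copy, stored as _Mapping__update


class MappingSubclass(Mapping):
    def update(self, keys, values):
        # New signature; Mapping.__init__() keeps working regardless.
        for item in zip(keys, values):
            self.items_list.append(item)


mapping = MappingSubclass(['a', 'b'])
mapping.update(['x'], [1])
print(mapping.items_list)  # ['a', 'b', ('x', 1)]
```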

If you feel confused please do check out Raymond Hettinger’s great tutorial on Python’s built-in toolset for creating classes.

Magic Methods

One very important fact about the name mangling is that it isn’t applied if a name starts and ends with double underscores:

>>> class Foo:
...     def __init__(self):
...         self.__spam__ = 'spam'
... 
>>> foo = Foo()
>>> foo.__spam__
'spam'

Names that have both leading and trailing double underscores are reserved for a special use in the language. Such methods are often referred to as magic methods even though they have nothing to do with wizardry. Magic methods are called behind the scenes when certain circumstances occur. For example, when you create an instance of a class the necessary calls to __new__ and __init__ are made.

However, as far as naming conventions go, it’s best to stay away from using names that start and end with double underscores to avoid collision with future methods in the Python language.
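Implementing the special methods that Python already defines is a different matter, and it is how classes hook into the language. Here is a minimal sketch (the `Basket` class and the `calls` list are invented for illustration) that records the __new__ and __init__ calls and supports len(), + and a readable repr:

```python
calls = []


class Basket:
    def __new__(cls, *items):
        calls.append('__new__')  # runs first and creates the instance
        return super().__new__(cls)

    def __init__(self, *items):
        calls.append('__init__')  # runs second and initializes it
        self.items = list(items)

    def __len__(self):
        return len(self.items)

    def __add__(self, other):
        return Basket(*self.items, *other.items)

    def __repr__(self):
        return f'Basket({", ".join(map(repr, self.items))})'


spam = Basket('spam')
print(calls)  # ['__new__', '__init__']

basket = spam + Basket('eggs', 'ham')  # calls spam.__add__(...)
print(len(basket))                     # 3
print(basket)                          # Basket('spam', 'eggs', 'ham')
```

None of these methods is called by name; the interpreter invokes them when you create an instance, call len(), use the + operator, or print the object.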

Conclusion

Use:

* Single underscore _ for temporary or insignificant variables
* Single trailing underscore foo_ to avoid naming conflicts with Python keywords
* Single leading underscore _foo to indicate a name is meant for internal use
* Double leading underscore __foo to avoid naming conflicts and overriding in subclasses

Avoid:

* Double leading and trailing underscores __foo__ as they are used to indicate Python special methods

This is the first article in a series of posts about Python. I’m going to blog about more cool Python features and gotchas, built-in data structures, generators, coroutines, async, await, and more. If you’ve liked what you’ve read, share, tweet, and check out the rest of the articles on our blog.
