We're going to talk about underscores, dunders, encapsulation and magic methods in Python
Python is an easy-to-learn language that provides a stepping stone into the world of programming, but some of its features confuse beginners and advanced developers alike. By the end of this article you'll know when and how to use underscores, dunders, magic methods and encapsulation in Python.
Single and double leading (or trailing) underscores have different meanings in Python. Most of the time it's just a convention (hint to the programmer), but there are cases where they're enforced by the Python interpreter. We're going to talk about:
- Single underscore: _
- Single trailing underscore: foo_
- Single leading underscore: _spam
- Double leading underscores: __ham
- Double leading and trailing underscores: __eggs__
Double underscores are referred to as dunders because they appear quite often in Python code and it's easier to say the shortened "dunder" than "double underscore".
A single stand-alone underscore is used to indicate that a variable is temporary or insignificant. This meaning is a convention only and doesn't trigger any special behavior in the Python interpreter. A single underscore is simply a valid variable name that's used for this purpose.
If you're iterating and don't need access to the running index, you can use _ to indicate that it's just a temporary value:
    for _ in range(42):
        print('Zen of Python')
If you're unpacking person information from a tuple (or any expression), but don't care about the eye color, you can use _ to mark it as insignificant:
    name, age, _ = ('Pavlin', 25, 'brown')
Bonus feature: In most Python REPLs the single underscore is a special variable that represents the result of the last evaluated expression:
    >>> 5 + 5
    10
    >>> _
    10
Sooner or later you end up wanting to use a Python keyword (class, assert, etc.) or a built-in name (type, list, etc.) as a variable name because it fits the context well. Shadowing a built-in is bad practice, and using a keyword ends in a SyntaxError:
    >>> def foo(class, assert):
      File "<stdin>", line 1
        def foo(class, assert):
                ^
    SyntaxError: invalid syntax
To avoid naming conflicts append a single underscore to the variable name:
    def foo(class_, assert_):
        print('Zen of Python')
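If you're not sure whether a name collides with the language, the standard library can tell you. The sketch below uses the `keyword` and `builtins` modules; `safe_name` is a hypothetical helper written for this example, not something Python provides.

```python
import builtins
import keyword


def safe_name(name):
    """Append an underscore if `name` is a keyword or shadows a built-in."""
    if keyword.iskeyword(name) or hasattr(builtins, name):
        return name + '_'
    return name


print(safe_name('class'))  # class_  (a keyword)
print(safe_name('list'))   # list_   (a built-in, not a keyword)
print(safe_name('spam'))   # spam    (no conflict)
```

Note that only real keywords such as class raise a SyntaxError; reusing a built-in name like list is legal but hides the original for the rest of that scope.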
The underscore prefix is used to hint to the programmer that a variable or method is intended for internal use. However, this convention isn't enforced by the Python interpreter. Python does not have a strong distinction between private and public variables like Java or C++:
    class Foo:
        def __init__(self):
            self.spam = 42
            self._ham = 42

    >>> foo = Foo()
    >>> foo.spam
    42
    >>> foo._ham
    42
When it comes to variable and method names a single leading underscore won't prevent access to them, but the leading underscore does impact how names get imported from modules:
    # example.py
    def foo():
        return 42

    def _bar():
        return 42
If one uses a wildcard import (import *) to import all names from the module, Python won't import names with a leading underscore unless the module defines an __all__ list that overrides this behavior:
    >>> from example import *
    >>> foo()
    42
    >>> _bar()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NameError: name '_bar' is not defined
Unlike wildcard imports, regular imports are not affected by the leading single underscore naming convention:
    >>> from example import foo, _bar
    >>> foo()
    42
    >>> _bar()
    42
NOTE: Wildcard imports should be avoided at all costs as they make it unclear which names are present in the namespace!
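To see how an __all__ list overrides the default wildcard behavior, here is a small self-contained sketch. It builds a throwaway module in memory instead of a file on disk; the module name `example_all` and the extra function `baz` are made up for this illustration.

```python
import sys
import types

# Source of a hypothetical module that defines __all__.
source = '''
__all__ = ['foo', '_bar']   # explicitly export a name with a leading underscore

def foo():
    return 42

def _bar():
    return 42

def baz():                  # public name, but not listed in __all__
    return 42
'''

# Create the module in memory and register it so it can be imported.
module = types.ModuleType('example_all')
exec(source, module.__dict__)
sys.modules['example_all'] = module

# Run the wildcard import inside a fresh namespace and inspect the result.
namespace = {}
exec('from example_all import *', namespace)

print('foo' in namespace)   # True
print('_bar' in namespace)  # True, because __all__ lists it
print('baz' in namespace)   # False, because __all__ omits it
```

Once __all__ is defined it is the single source of truth for wildcard imports: it can expose underscore-prefixed names and it can hide names that would otherwise be public.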
All of the naming patterns so far were agreed-upon conventions only, but things are a little different with Python attributes that start with double underscores. A dunder prefix causes the Python interpreter to rewrite the attribute name in order to avoid naming conflicts in subclasses. This rewriting is called name mangling:
    class Foo:
        def __init__(self):
            self.spam = 42
            self._ham = 42
            self.__eggs = 42

    >>> foo = Foo()
    >>> foo.spam
    42
    >>> foo._ham
    42
    >>> foo.__eggs
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'Foo' object has no attribute '__eggs'
What happened? Let's take a look at the attributes of the foo object using the built-in dir() function:
    >>> dir(foo)
    ['_Foo__eggs', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_ham', 'spam']
    >>> foo._Foo__eggs
    42
If you look closely you'll find an attribute called _Foo__eggs. This is the name mangling the Python interpreter applies to protect the variable from being overridden in subclasses:
    class ExtendsFoo(Foo):
        def __init__(self):
            super().__init__()
            self.spam = 1.618
            self._ham = 1.618
            self.__eggs = 1.618

    >>> golden_foo = ExtendsFoo()
    >>> dir(golden_foo)
    ['_ExtendsFoo__eggs', '_Foo__eggs', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_ham', 'spam']
_Foo__eggs still exists as an attribute of the new object, whose class extends Foo, and the new __eggs attribute set in the subclass is stored separately as _ExtendsFoo__eggs:
    >>> golden_foo.spam
    1.618
    >>> golden_foo._ham
    1.618
    >>> golden_foo._ExtendsFoo__eggs
    1.618
    >>> golden_foo._Foo__eggs
    42
Encapsulation in Python lacks strict access control such as private and protected attributes. Name mangling will stop you from accidentally accessing an attribute, but you can still reach pretty much anything intentionally as long as you're aware of how the language works.
In the examples above we used attributes, but the same rules apply to method names. In short, name mangling affects all names that start with two underscore characters in a class context. With that in mind, let's take a look at another example:
    # example.py
    _Foo__mangled = 42

    class Foo:
        def bar(self):
            return __mangled

    >>> foo = Foo()
    >>> foo.bar()
    42
Why did it work? As we just said, the Python interpreter expanded the name __mangled inside the class body to _Foo__mangled because it begins with a dunder, and that expanded name matches the module-level variable.
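Since the same mangling applies to methods, a base class can keep a helper method safe from accidental overriding. The following is a minimal sketch; the class names `Base` and `Child` and the `source` attribute are made up for illustration.

```python
class Base:
    def __init__(self):
        self.__setup()          # compiled as self._Base__setup()

    def __setup(self):          # stored on the class as _Base__setup
        self.source = 'Base'


class Child(Base):
    def __setup(self):          # stored as _Child__setup, so it doesn't clash
        self.source = 'Child'


child = Child()
print(child.source)                     # Base
print(hasattr(child, '_Base__setup'))   # True
print(hasattr(child, '_Child__setup'))  # True
```

Base.__init__ always calls its own helper, even though Child defines a method with the same source-level name. With a single leading underscore (_setup) the subclass version would have been called instead.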
Something I didn't mention above is that name mangling is not applied if a name both starts and ends with double underscores:
    class Foo:
        def __init__(self):
            self.__spam__ = 42

    >>> foo = Foo()
    >>> foo.__spam__
    42
Methods that have both leading and trailing double underscores are reserved for special use in the language. These dunders are often referred to as magic methods even though they have nothing to do with wizardry. Magic methods are called behind the scenes under certain circumstances. For example, when you create an instance of a class, the necessary call to __init__ is made for you.
However, as far as naming conventions go, it's best to stay away from using names that start and end with double underscores to avoid collision with future changes in the Python language.
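Implementing the special methods that Python already defines is a different matter from inventing new dunder names, and it is how your own classes hook into built-in syntax. Here is a short sketch; the `Basket` class is made up for illustration.

```python
class Basket:
    def __init__(self, *items):     # called when Basket(...) is created
        self.items = list(items)

    def __len__(self):              # called by len(basket)
        return len(self.items)

    def __contains__(self, item):   # called by `item in basket`
        return item in self.items

    def __add__(self, other):       # called by basket + other
        return Basket(*self.items, *other.items)

    def __repr__(self):             # called by repr(), print() and the REPL
        return f'Basket({", ".join(map(repr, self.items))})'


basket = Basket('spam', 'ham') + Basket('eggs')
print(len(basket))        # 3
print('eggs' in basket)   # True
print(basket)             # Basket('spam', 'ham', 'eggs')
```

None of these methods is called by name in the usage lines; the interpreter dispatches to them when it meets len(), in, + and print().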
To summarize, use:
- Single underscore to discard variables
- Single trailing underscore to avoid naming conflicts with Python keywords
- Single leading underscore to indicate a name is meant for internal use
- Double leading underscore to avoid naming conflicts and overriding in subclasses
- Double leading and trailing underscore only for special methods defined by Python
This is the first article in a series of posts about Python. I'm going to blog about more cool Python features and gotchas, built-in data structures, generators, coroutines, async, await and more. If you like what you've read, subscribe to our newsletter, share, tweet and visit our blog for more useful and intriguing posts <3