Anyone who’s worked with Python knows that modules can be a godsend, saving you time, effort, and many lines of code.
They even have namespacing built-in 💪 😍
To expand on this a bit:
- Saving time, effort, and lines of code: Python modules allow you to organize code into separate files, which can be used and reused across different programs or parts of the same program. This organization helps to save time and effort since you can reuse code, rather than writing the same functions and classes over and over again. This is a common reason why developers create and use modules: to save time and to make their code more readable and maintainable.
- Namespacing built-in: Python modules also provide what is known as ‘namespacing’. A namespace is essentially a container that holds a set of identifiers (such as variable names, function names, and class names) and allows these identifiers to be disambiguated from identifiers in other namespaces. When you import a module in Python, the module name acts as a namespace, which means you can have functions or classes with the same name in different modules without conflict (see the short sketch after this list).
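For example, here is a minimal sketch using the standard library’s math module: a locally defined sqrt and math.sqrt can happily coexist, because the imported one stays behind the math. prefix:
import math

def sqrt(x):
    # Our own, deliberately different sqrt that truncates to an int
    return int(x ** 0.5)

print(math.sqrt(10))  # 3.1622776601683795 -> stdlib version, reached via the math namespace
print(sqrt(10))       # 3 -> our local version, no clash with the one above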
However, not all ways of using modules are equally beneficial. In this article, we will discuss why using import * can be more problematic than it’s worth, and what you should do instead.
Why is this a problem?
When you use import *, Python brings every single function and variable from the imported module into your own namespace.
This can cause something called “namespace pollution” 😱
It’s certainly quick and easy, but it can lead to confusion and bugs!
For example, if the module you’re importing from has a function or variable with the same name as one already defined in your module, your version gets silently overridden 🤯
This can lead to unexpected behavior in your code and make debugging extra hard.
A practical example
Consider the following example:
1. You have a colors.py:
def print_colors():
return ['red', 'green', 'blue']
2. And you have a shapes.py:
def print_colors():
return ['square', 'circle', 'triangle']
3. And lastly you have a script.py that imports from both:
from colors import *
from shapes import * # Overrides print_colors from colors module
print(print_colors()) # Output: ['square', 'circle', 'triangle']
An extreme example, sure, but it illustrates the danger of using import *.
In this case, we might reasonably expect print_colors() to give us ['red', 'green', 'blue'] from the colors module.
However, because we imported from the shapes module after the colors module, the print_colors() function from the shapes module overrides the one from the colors module.
As a result, the output of this script will actually be ['square', 'circle', 'triangle'] 😵
Again, this is a trivial demo example, but in large code bases this can be way more confusing and insidious!
The better way
Instead of import *, it’s better practice to import only the specific functions (objects) you need, e.g. from os import path.
Or import the module under an alias, as is often done for numpy and pandas: import numpy as np and import pandas as pd.
This way, there’s less ambiguity, and you’re less likely to experience bugs due to namespace pollution 💡
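Applied to the earlier colors / shapes example, a sketch of both approaches could look like this (the renamed print_color_names and print_shape_names are just illustrative choices):
# Option 1: import exactly what you need, renaming to keep both versions around
from colors import print_colors as print_color_names
from shapes import print_colors as print_shape_names

print(print_color_names())  # ['red', 'green', 'blue']
print(print_shape_names())  # ['square', 'circle', 'triangle']

# Option 2: keep the module itself as the namespace, optionally under an alias
import colors
import shapes as sh

print(colors.print_colors())  # ['red', 'green', 'blue']
print(sh.print_colors())      # ['square', 'circle', 'triangle']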
Remember, “explicit is better than implicit” 🔍🐍
What PEP8 has to say
This is also stated in the PEP 8 style guide (and a good reminder to read through it and stick to its recommendations).
Wildcard imports (from <module> import *) should be avoided, as they make it unclear which names are present in the namespace, confusing both readers and many automated tools.
(Imports section of PEP 8)
The stylised version of PEP 8 is great. We also did a 5 min summary a long time ago.
Conclusion
In conclusion, while import * may seem like a quick and convenient way to bring functions and variables into your script, it can lead to significant issues down the line, making your code harder to read and debug.
By being explicit in your imports, as recommended by the PEP 8 style guide, you can write cleaner, more maintainable Python code.
Bonus tip: __all__ dunder
Although you cannot prevent somebody else from doing an import *, as a package author you can be proactive about this and use __all__ to list just the names you allow to be imported that way.
See this example (Pybites tip 77 from our book):
$ more mod.py
__all__ = ['a', 'b']
def a():
pass
def b():
pass
def c():
pass
$ python
>>> from mod import *
>>>
>>> a()
>>> b()
>>> c()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'c' is not defined
Here __all__ defines the public interface of the module, so that when from module import * is used, only the names in __all__ are imported 💡
This allows you, as a package author, to have finer control over the public interface of your module.
It ensures that only the intended components are exposed and reduces the risk of unexpected behaviors when others use your code 😍
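One thing worth noting: __all__ only restricts wildcard imports. If somebody explicitly asks for a name you left out of __all__, that still works, as continuing the same session shows:
>>> from mod import c
>>> c()
>>>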