In Python 3, bytes contains sequences of 8-bit values, str contains
sequences of Unicode characters. bytes and str instances can't be
used together with operators (like > or +).
In Python 2, str contains sequences of 8-bit values, unicode contains
sequences of Unicode characters. str and unicode can be used together
with operators if the str only contains 7-bit ASCII characters.
Use helper functions to ensure that the inputs you operate on are the
type of character sequence you expect (8-bit values, UTF-8 encoded
characters, Unicode characters, etc.)
If you want to read or write binary data to/from a file, always open the
file using a binary mode (like 'rb' or 'wb').
Avoid being verbose: Don't supply 0 for the start index or the length of
the sequence for the end index.
Slicing is forgiving of start or end indexes that are out of bounds,
making it easy to express slices on the front or back boundaries of a
sequence (like a[:20] or a[-20:]).
Assigning to a list slice will replace that range in the original
sequence with what's referenced even if their lengths are different.
Specifying start, end, and stride in a slice can be extremely confusing.
Prefer using positive stride values in slices without start or end
indexes. Avoid negative stride values if possible.
Avoid using start, end and stride together in a single slice. If you need
all three parameters, consider doing two assignments (one to slice,
another to stride) or using islice form itertools built-in module.
The zip built-in function can be used to iterate over multiple iterators
in parallel.
In Python 3, zip is a lazy generator that produces tuples. In Python 2,
zip returns the full result as a list of tuples.
zip truncates its outputs silently if you supply it with iterators of
different lengths.
The zip_longest function from the itertools built-in module lets you
iterate over multiple iterators in parallel regardless of their
lengths (see Item 46: Use built-in algorithms and data structures).
Functions that return None to indicate special meaning are error prone
because None and other values (e.g., zero, the empty string) all
evaluate to False in conditional expressions.
Raise exceptions to indicate special situations instead of returning
None. Expect the calling code to handle exceptions properly when they
are documented.
Beware of functions that iterate over input arguments multiple times. If
these arguments are iterators, you may see strange behavior and missing
values.
Python's iterator protocol defines how containers and iterators interact
with the iter and next built-in functions, for loops, and related
expression.
You can easily define your own iterable container type by implementing
the iter method as a generator.
You can detect that a value is an iterator (instead of a container) if
calling iter on it twice produces the same result, which can then be
progressed with the next built-in function.
Keyword arguments make the intention of a function call more clear.
Use keyword-only arguments to force callers to supply keyword arguments
for potentially confusing functions, especially those that accept
multiple Boolean flags.
Python 3 supports explicit syntax for keyword-only arguments in
functions.
Python 2 can emulate keyword-only arguments for functions by using
**kwargs and manually raising TypeError exceptions.
Instead of defining and instantiating classes, functions are often all
you need for simple interfaces between components in Python.
References to functions and methods in Python are first class, meaning
they can be used in expressions like any other type.
The call special method enables instances of a class to be called
like plain Python functions.
When you need a function to maintain state, consider defining a class
that provides the call method instead of defining a stateful closure
(see Item 15: "Know how closures interact with variable scope").
Inherit directly from Python's container types (like list or dict) for
simple use cases.
Beware of the large number of methods required to implement custom
container types correctly.
Have your custom container types inherit from the interface defined in
collections.abc to ensure that your classes match required interfaces
and behaviors.
Use getattr and setattr to lazily load and save attributes for an
object.
Understand that getattr only gets called once when accessing a
missing attribute, whereas getattribute gets called every time an
attribute is accessed.
Avoid infinite recursion in getattribute and setattr by using
methods from super() (i.e., the object class) to access instance
attributes directly.
Moving CPU bottlenecks to C-extension modules can be an effective way to
improve performance while maximizing your investment in Python code.
However, the cost of doing so is high and may introduce bugs.
The multiprocessing module provides powerful tools that can parallelize
certain types of Python computation with minimal effort.
The power of multiprocessing is best accessed through the
concurrent.futures built-in module and its simple ProcessPoolExecutor
class.
The advanced parts of the multiprocessing module should be avoided
because they are so complex.
The with statement allows you to reuse logic from try/finally blocks and
reduce visual noise.
The contextlib built-in module provides a contextmanager decorator that
makes it easy to use your own functions in with statements.
The value yielded by context managers is supplied to the as part of the
with statement. It's useful for letting your code directly access the
cause of the special context.
Write documentation for every module, class and function using
docstrings. Keep them up to date as your code changes.
For modules: introduce the contents of the module and any important
classes or functions all users should know about.
For classes: document behavior, important attributes, and subclass
behavior in the docstring following the class statement.
For functions and methods: document every argument, returned value,
raised exception, and other behaviors in the docstring following the
def statement.
Packages in Python are modules that contain other modules. Packages allow
you to organize your code into separate, non-conflicting namespaces with
unique absolute module names.
Simple package are defined by adding an init.py file to a directory
that contains other source files. These files become that child modules
of the directory's package. Package directories may also contain other
packages.
You can provide an explict API for a module by listing its publicly
visible name in its all special attribute.
You can hide a package's internal implementation by only importing public
names in the package's init.py file or by naming internal-only
members with a leading underscore.
When collaborating within a single team or on a single codebase, using
all for explicit APIs is probably unnecessary.
Virtual environment allow you to use pip to install many different
versions of the same package on the same machine without conflicts.
Virtual environments are created with pyvenv, enabled with source
bin/activate, and disabled with deactivate.
You can dump all of the requirements of an environment with pip freeze.
You can reproduce the environment by supplying the requirements.txt file
to pip install -r.
In versions of Python before 3.4, the pyvenv tool must be downloaded and
installed separately. The command-line tool is called virtualenv instead
of pyvenv.
Calling print on built-in Python types will produce the human-readable
string version of a value, which hides type information.
Calling repr on built-in Python types will produce the printable string
version of a value. These repr strings could be passed to the eval
built-in function to get back the original value.
%s in format strings will produce human-readable strings like str.%r will
produce printable strings like repr.
You can define the repr method to customize the printable
representation of a class and provide more detailed debugging
information.
You can reach into any object's dict attribute to view its internals.
The only way to have confidence in a Python program is to write tests.
The unittest built-in module provides most of the facilities you'll need
to write good tests.
You can define tests by subclassing TestCase and defining one method per
behavior you'd like to test. Test methods on TestCase classes must start
with the word test.
It's important to write both unit tests (for isolated functionality) and
integration tests (for modules that interact).
You can initiate the Python interactive debugger at a point of interest
directly in your program with the import pdb; pdb.set_trace() statements.
The Python debugger prompt is a full Python shell that lets you inspect
and modify the state of a running program.
pdb shell commands let you precisely control program execution, allowing
you to alternate between inspecting program state and progressing program
execution.