This project focuses on understanding the language ecosystem, not getting into programming details.
:sunrise_over_mountains: Python's Habitat
This topic describes how to set up the environment for Python development.
:snake: Python's Taxonomy
This topic describes the common patterns of Python projects.
:anger: Python's Behavior
This topic describes how the language is designed and how it works.
:bug: Python's Feeding
This topic describes static code analysis, formatting patterns and style guides.
:mag: Python's Other Features
Extra topics.
Python needs a set of tools that are system requirements. If necessary, install these requirements with this command:
sudo apt update
sudo apt install \
  software-properties-common \
  build-essential \
  libffi-dev \
  python3-pip \
  python3-dev \
  python3-venv \
  python3-setuptools \
  python3-pkg-resources
Now the environment is ready to install Python:
sudo apt install python
On Windows, I recommend using the Chocolatey package manager and running PowerShell as administrator. See this tutorial.
Now, install Python
choco install python
Test
python --version
python --version
which python
sudo update-alternatives --list python
Add repository
This PPA contains more recent Python versions packaged for Ubuntu.
sudo add-apt-repository ppa:deadsnakes/ppa -y
Update packages
sudo apt update -y
Check which Python version is installed
python --version
Install Python
sudo apt install python3.<VERSION>
Add dependencies
sudo apt install curl -y
Update packages
sudo apt update -y
Install pyenv
curl https://pyenv.run | bash
Add these three lines to .bashrc or .zshrc
export PATH="$HOME/.pyenv/bin:$PATH"
eval "$(pyenv init --path)"
eval "$(pyenv virtualenv-init -)"
Open a new terminal and execute
exec $SHELL
pyenv --version
Before installing other versions of Python, it is necessary to choose which system Python will be used.
Use update-alternatives
The update-alternatives command can set priorities among different versions of the same software installed on Ubuntu systems. Define the priority of each version:
sudo update-alternatives --install /usr/bin/python python /usr/bin/python3.11 1
sudo update-alternatives --install /usr/bin/python python /usr/bin/python3.10 2
sudo update-alternatives --install /usr/bin/python python /usr/bin/python3.8 3
sudo update-alternatives --install /usr/bin/python python /usr/bin/python3.6 4
A symbolic link will be created in the /usr/bin directory: /usr/bin/python -> /etc/alternatives/python*
Choose version
sudo update-alternatives --config python
Test
python --version
If this returns Python 2, try setting an alias in /home/$USER/.bashrc, as in this example:
alias python=python3
NOTE: The important thing to realize is that Python 3 is not backwards compatible with Python 2. This means that if you try to run Python 2 code as Python 3, it will probably break.
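A quick way to see this incompatibility is to ask Python 3 to compile a Python 2 style print statement. The sketch below relies only on the built-in compile(); the helper name is_valid_python3 is made up for this illustration.

```python
def is_valid_python3(source):
    """Return True if `source` compiles under the running Python 3 interpreter."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Python 2 print statement: rejected by Python 3.
print(is_valid_python3('print "hello"'))
# Python 3 print function: accepted.
print(is_valid_python3('print("hello")'))
```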
PYTHONPATH is an environment variable that you can set to add extra directories where Python will look for modules and packages. Example: Apache Airflow reads the dags/ folder and automatically adds any file in that directory.
PYTHONHOME indicates the location of the standard packages.
Open profile
vim ~/.bashrc
Insert Python PATH
export PYTHONHOME=/usr/bin/python<NUMBER_VERSION>
Update profile/bashrc
source ~/.bashrc
Test
>>> import sys
>>> from pprint import pprint
>>> pprint(sys.path)
['',
'/usr/lib/python311.zip',
'/usr/lib/python3.11',
'/usr/lib/python3.11/lib-dynload',
'/usr/local/lib/python3.11/dist-packages',
'/usr/lib/python3/dist-packages']
Example with Apache Airflow
>>> import sys
>>> from pprint import pprint
>>> pprint(sys.path)
['',
'/home/project_name/dags',
'/home/project_name/config',
'/home/project_name/utilities',
...
]
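To check the effect of PYTHONPATH without touching your profile, you can start a child interpreter with the variable set and read back its sys.path. This is a minimal sketch; the helper name child_sys_path and the temporary directory are only for illustration.

```python
import json
import os
import subprocess
import sys
import tempfile


def child_sys_path(extra_dir):
    """Start a fresh interpreter with PYTHONPATH set and return its sys.path."""
    env = dict(os.environ, PYTHONPATH=extra_dir)
    result = subprocess.run(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


with tempfile.TemporaryDirectory() as extra_dir:
    paths = child_sys_path(extra_dir)
    # The directory passed through PYTHONPATH shows up in the child's search path.
    extra_dir_visible = extra_dir in paths
    print(extra_dir_visible)
```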
Python can run in a virtual environment with isolation from the system.
Virtualenv enables us to create multiple Python environments which are isolated from the global Python environment as well as from each other.
When Python starts, it analyzes the path of its binary. In a virtual environment, that binary is just a copy of, or a symbolic link to, your system's Python binary. Python then sets the sys.prefix location, which is used to locate site-packages (third-party packages/libraries).
sys.prefix points to the virtual environment directory.
sys.base_prefix points to the non-virtual environment.
ll
# random.py -> /usr/lib/python3.6/random.py
# reprlib.py -> /usr/lib/python3.6/reprlib.py
# re.py -> /usr/lib/python3.6/re.py
# ...
tree
├── bin
│   ├── activate
│   ├── activate.csh
│   ├── activate.fish
│   ├── easy_install
│   ├── easy_install-3.8
│   ├── pip
│   ├── pip3
│   ├── pip3.8
│   ├── python -> python3.8
│   ├── python3 -> python3.8
│   └── python3.8 -> /Library/Frameworks/Python.framework/Versions/3.8/bin/python3.8
├── include
├── lib
│   └── python3.8
│       └── site-packages
└── pyvenv.cfg
Create virtual environment
virtualenv -p python3 <NAME_ENVIRONMENT>
Activate
source <NAME_ENVIRONMENT>/bin/activate
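Once an environment is active, you can confirm it from inside Python by comparing the two prefixes. This is a small sketch that assumes the environment was created with venv or a recent virtualenv release; the function name in_virtual_environment is hypothetical.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a virtual environment."""
    # Outside a virtual environment both attributes point to the same place.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base_prefix


print("sys.prefix      :", sys.prefix)
print("sys.base_prefix :", getattr(sys, "base_prefix", sys.prefix))
print("inside a venv   :", in_virtual_environment())
```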
Pipenv automatically creates and manages a virtualenv for your projects, and adds/removes packages from your Pipfile as you install/uninstall packages. It also generates the ever-important Pipfile.lock, which is used to produce deterministic builds.
Pipfile
# Pipfile
[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true
[dev-packages]
[packages]
requests = "*"
numpy = "==1.18.1"
pandas = "==1.0.1"
wget = "==3.2"
[requires]
python_version = "3.8"
platform_system = 'Linux'
# requirements.txt
requests
matplotlib==3.1.3
numpy==1.18.1
pandas==1.0.1
wget==3.2
pip3 install --user pipenv
Create environment
pipenv --python 3
See where virtual environment is installed
pipenv --venv
Activate environment
pipenv shell
Install packages with Pipfile
pipenv install flask
# or
pipenv install --dev flask
Create lock file
pipenv lock
requirements.txt is a file containing a list of items to be installed using pip install.
pip3 freeze
requirements.txt
pip3 freeze > requirements.txt
cat requirements.txt
pip3 install -r requirements.txt
The real issue with using pip and a requirements.txt file is that the build isn't deterministic. That is, given the same input (the requirements.txt file), pip does not always produce the same environment.
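One visible source of this is a requirement without an exact version, such as requests in the requirements.txt example earlier. The sketch below flags such lines; it is a simplified check written for illustration (the function name unpinned_requirements is hypothetical), not a full requirements parser.

```python
def unpinned_requirements(lines):
    """Return requirement lines that carry no exact '==' version pin."""
    unpinned = []
    for raw_line in lines:
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


requirements = [
    "requests",
    "matplotlib==3.1.3",
    "numpy==1.18.1",
    "pandas==1.0.1",
    "wget==3.2",
]
print(unpinned_requirements(requirements))  # → ['requests']
```

Even a fully pinned list can still resolve different transitive dependencies over time, which is the gap lock files are meant to close.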
pip-tools is a set of command line tools that helps you keep your pip-based packages fresh and ensures deterministic builds.
pip install pip-tools
pip3 freeze > requirements.in
pip-compile --generate-hashes requirements.in
output: requirements.txt
CPython can be defined as both an interpreter and a compiler: it compiles a .py source file into .pyc bytecode for the Python virtual machine, which then interprets it.
The principal feature of CPython is that it uses a global interpreter lock (GIL). This is a mechanism used in computer-language interpreters to synchronize the execution of threads so that only one native thread can execute at a time.
Therefore, for a CPU-bound task, a single-process, multi-threaded Python program will not improve performance. However, this does not mean multi-threading is useless in Python. For an I/O-bound task, multiple threads can improve program performance.
This can happen when multiple threads are servicing separate clients. One thread may be waiting for a client to reply, another may be waiting for a database query to execute, while a third is actually processing Python code. Another example is reading multiple images from disk.
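The I/O-bound case can be sketched with time.sleep standing in for a network or disk wait (a blocking sleep releases the GIL, much like real I/O). The function names and delays below are illustrative assumptions, not measurements from a real workload.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(delay):
    """Stand-in for a network or disk wait."""
    time.sleep(delay)
    return delay


def run_sequential(delays):
    start = time.perf_counter()
    results = [fake_io_task(d) for d in delays]
    return results, time.perf_counter() - start


def run_threaded(delays):
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        results = list(pool.map(fake_io_task, delays))
    return results, time.perf_counter() - start


delays = [0.2, 0.2, 0.2, 0.2]
sequential_results, sequential_seconds = run_sequential(delays)
threaded_results, threaded_seconds = run_threaded(delays)
print(f"sequential: {sequential_seconds:.2f}s, threaded: {threaded_seconds:.2f}s")
```

The waits overlap in the threaded run, so it should finish in roughly the time of the longest single wait rather than the sum of all of them.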
NOTE: Be careful and use locks when necessary. Locking and unlocking ensure that only one thread can write to shared memory at a time, but they also introduce some overhead.
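A minimal sketch of that locking pattern, using threading.Lock from the standard library. The SafeCounter class and run_workers helper are hypothetical names chosen for this example.

```python
import threading


class SafeCounter:
    """Counter whose updates are serialized with a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:  # only one thread updates the value at a time
            self.value += 1


def run_workers(counter, worker_count, times):
    """Run `worker_count` threads that each increment the counter `times` times."""

    def work():
        for _ in range(times):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


counter = SafeCounter()
run_workers(counter, worker_count=4, times=10_000)
print(counter.value)  # → 40000
```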
Removing the GIL would have made Python 3 slower than Python 2 in single-threaded performance. Another problem is that removing the GIL would break existing C extensions, which depend heavily on the solution that the GIL provides.
Although many proposals have been made to eliminate the GIL, the general consensus has been that in most cases, the advantages of the GIL outweigh the disadvantages; in the few cases where the GIL is a bottleneck, the application should be built around the multiprocessing structure.
Parser/tokenizer.c
Parser/parser.c
Python/compile.c
Python/compile.c
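The files listed above implement CPython's tokenizer, parser, and compiler. You can look at the result of that pipeline from Python itself with the built-in compile() and the standard dis module. This is only a sketch; the exact opcode names printed differ between Python versions.

```python
import dis

source = "total = 1 + 2\n"

# compile() runs the tokenizer, parser, and compiler and returns a code object.
code = compile(source, "<example>", "exec")
print(type(code.co_code))  # the raw bytecode is a bytes object

# dis prints a human-readable listing of the bytecode instructions.
dis.dis(code)
instruction_names = [instruction.opname for instruction in dis.get_instructions(code)]

# The virtual machine executes the code object.
namespace = {}
exec(code, namespace)
print(namespace["total"])  # → 3
```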
When Python executes this statement:
import my_lib
The interpreter searches for my_lib.py in a list of directories assembled from several sources, including the PYTHONPATH environment variable.
The resulting search path can be accessed using the sys module:
import sys
sys.path
# ['', '/usr/lib/python38.zip',
# '/usr/lib/python3.8',
# '/usr/lib/python3.8/lib-dynload',
# '/home/campos/.local/lib/python3.8/site-packages',
# '/usr/local/lib/python3.8/dist-packages',
# '/usr/lib/python3/dist-packages']
To see where a package was imported from, you can use the __file__ attribute:
import zipp
zipp.__file__
# '/usr/lib/python3/dist-packages/zipp.py'
NOTE: you can see that the __file__ directory is in the list of directories searched by the interpreter.
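The same relationship can be reproduced with a throwaway module. In this sketch a temporary directory plays the role of a project folder; the function name locate_imported_module is hypothetical, and my_lib is the placeholder module name used above.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def locate_imported_module():
    """Import a temporary my_lib module and check where it was found."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "my_lib.py").write_text("ANSWER = 42\n")
        sys.path.insert(0, tmp)  # make the directory part of the search path
        importlib.invalidate_caches()
        try:
            my_lib = importlib.import_module("my_lib")
            module_dir = str(Path(my_lib.__file__).parent)
            return my_lib.ANSWER, module_dir in sys.path
        finally:
            # Clean up so the example leaves the interpreter state unchanged.
            sys.path.remove(tmp)
            sys.modules.pop("my_lib", None)


answer, found_on_search_path = locate_imported_module()
print(answer, found_on_search_path)
```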
TODO
TODO
TODO
Static code analysis serves to evaluate the code. This analysis should be done before submitting the code for review. Static code analysis can check:
The characteristics of a static analysis are:
A linter is a static code analysis tool.
Pylint is a linter that checks for errors in Python code, tries to enforce a coding standard, and looks for code smells. Its principal features are:
A .pylintrc file where you can choose which errors or conventions are relevant to you.
# Get Errors & Warnings
pylint -rn <file/dir> --rcfile=<.pylintrc>
# Get Full Report
pylint <file/dir> --rcfile=<.pylintrc>
isort is a Python tool/library for sorting imports alphabetically, automatically divided into sections. It is very useful in projects where we deal with a lot of imports [6].
# sort the whole project
isort --recursive ./src/
# just check for errors
isort script.py --check-only
Some people like to write strings in single quotes, others in double quotes. To unify the whole project, there is a tool that automatically aligns them with your style guide: unify [6].
unify --in-place -r ./src/
Works recursively on the files in the folder.
docformatter is a utility that helps bring your docstrings in line with PEP 257 [6]. This convention specifies how documentation should be written.
docformatter --in-place example.py
There are also automatic code formatters; here are the popular ones [6]:
Style guides help keep the code consistent and readable.
CapWords()
cat_words
MAX_OVERFLOW
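The three naming styles above can be seen together in a short sketch. The mapping of each style to classes, variables/functions, and constants follows PEP 8; the CatShelter class and its members are invented for this example.

```python
MAX_OVERFLOW = 3  # constant: upper case with underscores


class CatShelter:  # class: CapWords
    def __init__(self):
        self.cat_words = []  # variable/attribute: lower case with underscores

    def add_word(self, word):  # function/method: lower case with underscores
        if len(self.cat_words) >= MAX_OVERFLOW:
            raise OverflowError("too many words")
        self.cat_words.append(word)


shelter = CatShelter()
for word in ("meow", "purr", "hiss"):
    shelter.add_word(word)
print(shelter.cat_words)  # → ['meow', 'purr', 'hiss']
```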
Limit the try clause to the minimal code necessary.
Yes:
try:
    value = collection[key]
except KeyError:
    return key_not_found(key)
else:
    return handle_value(value)
No:
try:
    # Too broad!
    return handle_value(collection[key])
except KeyError:
    # Will also catch KeyError raised by handle_value()
    return key_not_found(key)
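The difference between the two versions becomes visible when handle_value() itself raises KeyError. The runnable sketch below wraps both snippets in functions; the bodies of handle_value and key_not_found are assumptions made up for the demonstration.

```python
def handle_value(value):
    # Hypothetical handler: raises KeyError itself for unknown values.
    labels = {1: "one", 2: "two"}
    return labels[value]


def key_not_found(key):
    return f"missing:{key}"


def lookup_narrow(collection, key):
    try:
        value = collection[key]
    except KeyError:
        return key_not_found(key)
    else:
        return handle_value(value)


def lookup_broad(collection, key):
    try:
        return handle_value(collection[key])
    except KeyError:
        # Also swallows the KeyError raised inside handle_value()
        return key_not_found(key)


collection = {"a": 1, "b": 99}
print(lookup_narrow(collection, "a"))  # → one
print(lookup_broad(collection, "b"))   # → missing:b (the real error is hidden)
```

With the narrow version, lookup_narrow(collection, "b") lets the KeyError from handle_value() propagate, so the bug is not mistaken for a missing key.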
"Should explicitly state this as return None"
Yes:
def foo(x):
    if x >= 0:
        return math.sqrt(x)
    else:
        return None
No:
def foo(x):
    if x >= 0:
        return math.sqrt(x)
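Both versions behave the same at runtime, which is why the rule is about readability rather than correctness. A small runnable check, with the two variants renamed sqrt_explicit and sqrt_implicit for this sketch:

```python
import math


def sqrt_explicit(x):
    if x >= 0:
        return math.sqrt(x)
    else:
        return None  # the "no result" case is visible to the reader


def sqrt_implicit(x):
    if x >= 0:
        return math.sqrt(x)
    # falls off the end: Python returns None silently


print(sqrt_explicit(-1), sqrt_implicit(-1))  # → None None
```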
Docstrings must have:
def fetch_bigtable_rows(big_table, keys, other_silly_variable=None):
    """Fetches rows from a Bigtable.

    Retrieves rows pertaining to the given keys from the Table instance
    represented by big_table. Silly things may happen if
    other_silly_variable is not None.

    Args:
        big_table: An open Bigtable Table instance.
        keys: A sequence of strings representing the key of each table row
            to fetch.
        other_silly_variable: Another optional variable, that has a much
            longer name than the other args, and which does nothing.

    Returns:
        A dict mapping keys to the corresponding table row data
        fetched. Each row is represented as a tuple of strings. For
        example:

        {'Serak': ('Rigel VII', 'Preparer'),
         'Zim': ('Irk', 'Invader'),
         'Lrrr': ('Omicron Persei 8', 'Emperor')}

        If a key from the keys argument is missing from the dictionary,
        then that row was not found in the table.

    Raises:
        IOError: An error occurred accessing the bigtable.Table object.
    """
    return None
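A docstring written this way is easy to check mechanically. The sketch below uses inspect.getdoc from the standard library to report missing sections; the section list is an assumption taken from the example above, and missing_sections and fetch_rows are hypothetical names.

```python
import inspect

# Assumption: the sections used in the example docstring above.
REQUIRED_SECTIONS = ("Args:", "Returns:", "Raises:")


def missing_sections(func):
    """Return the required section headers absent from func's docstring."""
    doc = inspect.getdoc(func) or ""
    present = {line.strip() for line in doc.splitlines()}
    return [section for section in REQUIRED_SECTIONS if section not in present]


def fetch_rows(table, keys):
    """Fetch rows from a table.

    Args:
        table: A mapping from key to row.
        keys: A sequence of keys to look up.

    Returns:
        A dict mapping each found key to its row.

    Raises:
        TypeError: If table does not look like a mapping.
    """
    if not hasattr(table, "get"):
        raise TypeError("table must be a mapping")
    return {key: table[key] for key in keys if key in table}


def undocumented(x):
    return x


print(missing_sections(fetch_rows))    # → []
print(missing_sections(undocumented))  # → ['Args:', 'Returns:', 'Raises:']
```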