This tutorial covers releasing a Python distribution locally using GateWay as an example. It focuses on how to package a simple project in Python. It will show you how to add the necessary files and structure to create a package, how to build a package, and how to install it on any platform.
1. Preparatory activities
These steps are not mandatory, but they are considered good practice.
1.1. PIP upgrade
Before any Python packaging activity, it is a good idea to upgrade pip, the main Python package manager. This can be done from any location, but sometimes file system permissions prevent the user from doing so. In that case, pip can be upgraded for the local user only, using the --user option on the command line.
This tutorial is based on using GateWay, therefore a Python module shall be loaded first.
1.1.1. With necessary permissions
$ module load itm-python
$ python3 -m pip install --upgrade pip
1.1.2. Without necessary permissions
This solution is not recommended, but may be needed under certain circumstances, such as when the user does not have write permissions for the system site.
$ module load itm-python
$ python3 -m pip install --upgrade pip --user
In that case it may be necessary to add the per-user scripts directory ($HOME/.local/bin) to $PATH:
$ PATH=$HOME/.local/bin:$PATH
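To check where a --user installation puts its files, the standard site module can be queried. The following is a minimal sketch; the helper name user_install_locations is illustrative, and the bin subdirectory assumes a POSIX layout such as the one on GateWay.

```python
import os
import site


def user_install_locations():
    """Return the per-user base, site-packages and scripts directories."""
    user_base = site.getuserbase()                # e.g. ~/.local on Linux
    user_site = site.getusersitepackages()        # where --user packages are installed
    scripts_dir = os.path.join(user_base, "bin")  # assumption: POSIX layout
    return user_base, user_site, scripts_dir


if __name__ == "__main__":
    base, packages, scripts = user_install_locations()
    print("user base:    ", base)
    print("site-packages:", packages)
    print("scripts:      ", scripts)
    # Commands installed with --user are only found if this directory is on PATH.
    on_path = scripts in os.environ.get("PATH", "").split(os.pathsep)
    print("scripts directory on PATH:", on_path)
```

If the last line reports False, the PATH assignment shown above is needed.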
1.2. Virtual Environment
At its core, the main purpose of Python virtual environments is to create an isolated environment for Python projects. This means that each project can have its own dependencies, regardless of what dependencies every other project has. The great advantage is that there is no limit to the number of environments that can be created as it is just a directory with a few scripts that handle system package isolation.
As shown in the previous section, there are at least two different locations where Python packages can be installed on the system: the system site and the user site. This can cause problems such as name shadowing, where a package installed in one location hides a package of the same name in the other, possibly one that the system itself relies on. Most of the time nothing like this happens, but using a Python virtual environment is still considered good practice. This tutorial assumes at least Python 3.6, which already ships the venv module in its standard library.
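When shadowing is suspected, it helps to see which file a module name actually resolves to. The sketch below uses importlib from the standard library; the helper name module_origin and the example module names are illustrative only.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # "pip" is only an example of a package that may exist in more than
    # one location (system site and user site).
    for name in ("json", "pip"):
        print(name, "->", module_origin(name))
```

A path under $HOME/.local indicates that the user site copy is the one being imported.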
1.2.1. Creating a Virtual Python Environment
# Python 3
$ python3 -m venv env_name
That is all it takes. Use the venv module rather than the older pyvenv script, which has been deprecated since Python 3.6. The environment name can be anything, since it is just a directory name, but it is good practice to use env or venv, or a name that contains one of them.
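The same environment can also be created from Python code through the venv module's own API, which can be handy in setup scripts. This is a sketch under stated assumptions: the helper name create_environment is illustrative, and the demonstration writes to a temporary directory with pip bootstrapping switched off so that it runs quickly.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(env_dir, with_pip=True):
    """Create a virtual environment in env_dir and return its path."""
    env_path = Path(env_dir)
    venv.create(env_path, with_pip=with_pip)
    return env_path


if __name__ == "__main__":
    # A temporary directory keeps the demonstration away from real projects.
    with tempfile.TemporaryDirectory() as tmp:
        env = create_environment(Path(tmp) / "env_name", with_pip=False)
        # pyvenv.cfg is the marker file that identifies a virtual environment.
        print((env / "pyvenv.cfg").read_text())
```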
1.2.1.1. Create a Virtual Environment With System Site Package Inheritance
For packages developed on a clustered system such as GateWay, it may be necessary to inherit packages from the system site environment typically set with modules. To achieve this, add the --system-site-packages option when creating a new virtual environment.
$ python3 -m venv env_name --system-site-packages
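Whether an existing environment inherits the system site packages is recorded in its pyvenv.cfg file. The sketch below reads that setting; the helper name inherits_system_site_packages is illustrative, and the environment is created in a temporary directory without pip purely for demonstration.

```python
import tempfile
import venv
from pathlib import Path


def inherits_system_site_packages(env_dir):
    """Read pyvenv.cfg and report whether system site packages are visible."""
    for line in (Path(env_dir) / "pyvenv.cfg").read_text().splitlines():
        key, _, value = line.partition("=")
        if key.strip() == "include-system-site-packages":
            return value.strip().lower() == "true"
    return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env = Path(tmp) / "env_name"
        # Programmatic counterpart of the --system-site-packages option.
        venv.create(env, system_site_packages=True, with_pip=False)
        print(inherits_system_site_packages(env))
```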
1.2.2. Activate The Virtual Environment
To activate the environment just created, navigate to the location where it was created, if not already there, and execute the following command:
$ source env_name/bin/activate
(env_name): $
After activation, the shell prompt shows the name of the active environment.
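Besides the prompt, the interpreter itself can tell whether it runs inside an environment created with venv. A minimal sketch; the function name in_virtual_environment is illustrative.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs from a venv-created environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the installation it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("prefix:     ", sys.prefix)
    print("base prefix:", sys.base_prefix)
    print("virtual environment active:", in_virtual_environment())
```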
1.2.3. Deactivate The Virtual Environment
After finishing work with the environment, it can be deactivated in the following way:
(env_name): $ deactivate
1.2.3.1. Removing The Virtual Environment
Normally it is not necessary to remove the environment, but this can be achieved by simply deleting the env_name directory.
1.2.4. Upgrade PIP Inside The Virtual Environment
It is good practice to upgrade pip inside the environment, even if it is up to date on the system site. The environment lives in the project directory, where the user has write access, so there is no need for the --user flag with the install command. On GateWay, it should look like this:
# Python 3
$ python3 -m venv env_name --system-site-packages
$ source env_name/bin/activate
(env_name): $ python3 -m pip install --upgrade pip
2. Package directory starter
At the beginning, the minimal starter project structure should look like this:
project_name
├── LICENSE
├── project_name
│   ├── __init__.py
│   └── project_name.py
├── README.md
├── requirements.txt
├── pyproject.toml
├── setup.cfg
├── setup.py
└── tests
Good practices for creating a new package directory structure:
- All names should be lowercase
- Names should be underscore-separated (no hyphens)
- The project name should be as unique as possible
- The root directory should mainly store metadata files
- Source files should be stored in a subdirectory with the same name as the project directory
- Always pay attention to the inclusion of the LICENSE file
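The structure and the naming rules above can be sketched as a small scaffolding script. Everything here is illustrative: the helper names are hypothetical, the files are created empty, and the rule that a name must start with a letter is an added assumption.

```python
import re
import tempfile
from pathlib import Path

# Metadata files kept in the root directory of the starter structure.
ROOT_FILES = ("LICENSE", "README.md", "requirements.txt",
              "pyproject.toml", "setup.cfg", "setup.py")


def is_valid_project_name(name):
    """Accept lowercase letters, digits and underscores, starting with a letter."""
    return re.fullmatch(r"[a-z][a-z0-9_]*", name) is not None


def create_starter(parent_dir, project_name):
    """Create the empty starter layout and return the project root."""
    if not is_valid_project_name(project_name):
        raise ValueError(f"invalid project name: {project_name!r}")
    root = Path(parent_dir) / project_name
    package = root / project_name  # source directory shares the project name
    package.mkdir(parents=True)
    (root / "tests").mkdir()
    for filename in ROOT_FILES:
        (root / filename).touch()
    (package / "__init__.py").touch()
    (package / f"{project_name}.py").touch()
    return root


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = create_starter(tmp, "project_name")
        for path in sorted(root.rglob("*")):
            print(path.relative_to(root))
```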
2.1.1. Package directory contents
2.1.1.1. LICENSE file
├── LICENSE
- It is important for every package to include a license. It tells users who install the package the terms under which they can use it.
2.1.1.2. README.md file
├── README.md
- It can be customised in a way that best describes the project.
- Normally, the package configuration loads README.md to provide the long_description, so README.md must be included along with the code when generating a source distribution.
2.1.1.3. requirements.txt file
├── requirements.txt
- Stores the concrete dependency versions that are considered essential for the project.
- All installed dependencies can be collected into requirements.txt with pip as follows:
$ python -m pip freeze > requirements.txt
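To illustrate what ends up in the file, the sketch below builds similar name==version lines with importlib.metadata. It is only a rough approximation of pip freeze, not a replacement: it assumes Python 3.8 or newer, does not treat editable or VCS installs specially, and the function name frozen_requirements is illustrative.

```python
from importlib import metadata  # standard library since Python 3.8


def frozen_requirements():
    """Return sorted 'name==version' lines for the installed distributions."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip entries with incomplete metadata
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    for line in frozen_requirements():
        print(line)
```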
2.1.1.4. pyproject.toml file
├── pyproject.toml
- A build-system-independent way to declare what is needed to build the project (introduced by PEP 518). It is a way to step away from a hard dependency on distutils / setuptools.
- Tells build tools (like pip and build) what is required to build the project.
- Minimal content:
[build-system]
requires = [
    "setuptools>=42",
    "wheel"
]
build-backend = "setuptools.build_meta"
- build-system.requires lists the packages that are needed to build the project package.
Anything listed here is only available during the build, not after the package is installed.
- build-system.build-backend is the name of the Python object that will be used to perform the build.
- The long-term goal is to make setup.py and setup.cfg unnecessary. setuptools 61.0 and later can read project metadata from a [project] table in pyproject.toml, but this tutorial keeps setup.cfg, which also works with older setuptools versions.
2.1.1.5. Metadata configuration
There are two types of metadata: static and dynamic.
Static metadata (setup.cfg): guaranteed to be the same every time. This is simpler, easier to read, and avoids many common errors, like encoding errors.
Dynamic metadata (setup.py): possibly non-deterministic. Any items that are dynamic or determined at install-time, as well as extension modules or extensions to setuptools, need to go into setup.py.
Static metadata (setup.cfg) should be preferred. Dynamic metadata (setup.py) should be used only as an escape hatch when absolutely necessary. setup.py used to be required, but can be omitted with newer versions of setuptools and pip.
2.1.1.5.1. setup.cfg file - static metadata
├── setup.cfg
- setup.cfg is the configuration file for setuptools. It tells setuptools about the project package (such as its name and version) as well as which code files to include. Eventually much of this configuration should be able to move to pyproject.toml.
- Minimal content:
[metadata]
name = project_name
version = 0.0.1
author = Example Author
author_email = author@example.com
description = Sample short description of the package
long_description = file: README.md
long_description_content_type = text/markdown
url = https://github.com/author/project_name
project_urls =
    Bug Tracker = https://github.com/author/project_name/issues
classifiers =
    Programming Language :: Python :: 3
    License :: OSI Approved :: MIT License
    Operating System :: OS Independent

[options]
# The package directory project_name/ sits directly in the project root,
# so no package_dir mapping is needed for the layout shown above.
packages = find:
python_requires = >=3.6

[options.packages.find]
exclude =
    tests
    tests.*
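Because setup.cfg is static, its metadata can be read as plain data without running any project code. The sketch below parses a shortened, illustrative setup.cfg with configparser from the standard library; the function name read_static_metadata is hypothetical.

```python
import configparser

SETUP_CFG = """
[metadata]
name = project_name
version = 0.0.1
classifiers =
    Programming Language :: Python :: 3
    License :: OSI Approved :: MIT License

[options]
python_requires = >=3.6
"""


def read_static_metadata(text):
    """Parse setup.cfg content without executing any project code."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    meta = parser["metadata"]
    # Indented continuation lines form one multi-line value; split it up.
    classifiers = [line.strip()
                   for line in meta["classifiers"].splitlines()
                   if line.strip()]
    return {
        "name": meta["name"],
        "version": meta["version"],
        "classifiers": classifiers,
        "python_requires": parser["options"]["python_requires"],
    }


if __name__ == "__main__":
    for key, value in read_static_metadata(SETUP_CFG).items():
        print(key, "=", value)
```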
2.1.1.5.2. setup.py file - dynamic metadata
├── setup.py
- setup.py is the build script for setuptools. Unlike setup.cfg, it is executed as Python code, so it provides setuptools with information about the package (such as its name and version) and the code files to include dynamically.
- Minimal content:
import setuptools

# Reuse README.md as the long description shown on the package index page.
with open("README.md", "r", encoding="utf-8") as fh:
    long_description = fh.read()

setuptools.setup(
    name="project_name",
    version="0.0.1",
    author="Example Author",
    author_email="author@example.com",
    description="Sample short description of the package",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/author/project_name",
    project_urls={
        "Bug Tracker": "https://github.com/author/project_name/issues",
    },
    classifiers=[
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
    ],
    # The package directory project_name/ sits directly in the project root,
    # so the default search location is used and no package_dir is needed.
    packages=setuptools.find_packages(exclude=["tests", "tests.*"]),
    python_requires=">=3.6",
)
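A typical reason to need dynamic metadata is a version number that is defined in the source code rather than in a configuration file. The sketch below shows one possible approach, assuming the package defines a __version__ string in its __init__.py (the starter structure above does not require this); the helper name read_version is hypothetical.

```python
import re
import tempfile
from pathlib import Path


def read_version(init_file):
    """Extract the value of __version__ from a Python source file."""
    text = Path(init_file).read_text(encoding="utf-8")
    match = re.search(r"""^__version__\s*=\s*["']([^"']+)["']""",
                      text, re.MULTILINE)
    if match is None:
        raise RuntimeError(f"no __version__ found in {init_file}")
    return match.group(1)


if __name__ == "__main__":
    # A throwaway file stands in for project_name/__init__.py here.
    with tempfile.TemporaryDirectory() as tmp:
        init_file = Path(tmp) / "__init__.py"
        init_file.write_text('__version__ = "0.0.1"\n', encoding="utf-8")
        print(read_version(init_file))
```

In setup.py, such a helper could replace the hard-coded value, for example version=read_version("project_name/__init__.py").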