This tutorial covers releasing a Python distribution locally using GateWay as an example. It focuses on how to package a simple project in Python. It will show you how to add the necessary files and structure to create a package, how to build a package, and how to install it on any platform.


1. Preparatory activities

Please note that these steps are not mandatory to proceed successfully, however they are considered good practice.

1.1.  PIP upgrade

Before any Python packaging work, it is good practice to upgrade PIP, the main Python package manager. This can be done from any location, but it may fail if the user lacks write permission to the system-wide installation. In that case, PIP can be upgraded for the current user only by adding the --user option on the command line.
This tutorial is based on GateWay, so a Python module must be loaded first.

1.1.1. With necessary permissions

$ module load itm-python
$ python3 -m pip install --upgrade pip

1.1.2. Without necessary permissions

This solution is not recommended, but it may be needed under certain circumstances, such as when the user lacks write permission to the system-wide Python installation.

$ module load itm-python
$ python3 -m pip install --upgrade pip --user

In that case it may be necessary to add the per-user scripts directory (not site-packages itself) to $PATH, so that the upgraded pip command is found first:

$ PATH=$HOME/.local/bin:$PATH
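To see which directories a --user installation actually uses, the standard library site module can be queried. The following is a minimal sketch; the helper name user_install_locations is made up for illustration, and the bin subdirectory is an assumption that holds for POSIX systems such as GateWay.

```python
import site
from pathlib import Path


def user_install_locations():
    """Return the per-user base directory and the locations derived from it."""
    user_base = Path(site.getuserbase())
    return {
        "user_base": user_base,
        "site_packages": Path(site.getusersitepackages()),
        # Assumption: POSIX layout, where console scripts land in <user_base>/bin.
        "scripts": user_base / "bin",
    }


locations = user_install_locations()
for label, path in locations.items():
    print(f"{label}: {path}")
```

The scripts entry is the directory that belongs on $PATH; the site_packages entry is where the packages themselves are stored.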

1.2.  Virtual Environment

At its core, the main purpose of Python virtual environments is to create an isolated environment for Python projects. This means that each project can have its own dependencies, regardless of what dependencies every other project has. The great advantage is that there is no limit to the number of environments that can be created as it is just a directory with a few scripts that handle system package isolation.

As shown in the previous section, there are at least two different locations where Python packages can be installed on the system. This can pose risks such as name shadowing, where a package in one location hides a package of the same name in the other. Most of the time nothing like this happens, but using a Python virtual environment is still considered good practice. This tutorial assumes at least Python 3.6, so the venv module from the standard library should already be available.

1.2.1. Creating a Virtual Python Environment

# Python 3
$ python3 -m venv env_name

That is all there is to it. Prefer the venv module over older tools such as the pyvenv script, which has been deprecated since Python 3.6. The name of the environment can be anything, as it is just a directory name, but it is good practice to use env or venv, or a name that contains one of them.

1.2.1.1. Create a Virtual Environment With System Site Package Inheritance

For packages developed on a clustered system such as GateWay, it may be necessary to inherit packages from the system site environment typically set with modules. To achieve this, add the --system-site-packages option when creating a new virtual environment.

$ python3 -m venv env_name --system-site-packages
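The same environment can also be created from Python through the standard library venv module, which makes the effect of --system-site-packages easy to inspect. This is only a sketch: the helper names create_env and inherits_system_packages are hypothetical, pip is deliberately not installed to keep the example fast, and the environment is created in a temporary directory.

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir, inherit_system_packages=False):
    """Create a virtual environment and return the path of its pyvenv.cfg."""
    builder = venv.EnvBuilder(
        system_site_packages=inherit_system_packages,
        with_pip=False,  # skip bootstrapping pip; the command line installs it by default
    )
    builder.create(env_dir)
    return Path(env_dir) / "pyvenv.cfg"


def inherits_system_packages(cfg_path):
    """Read the include-system-site-packages setting from pyvenv.cfg."""
    for line in Path(cfg_path).read_text().splitlines():
        key, _, value = line.partition("=")
        if key.strip() == "include-system-site-packages":
            return value.strip() == "true"
    return False


with tempfile.TemporaryDirectory() as tmp:
    cfg = create_env(Path(tmp) / "env_name", inherit_system_packages=True)
    print("inherits system site packages:", inherits_system_packages(cfg))
```

The pyvenv.cfg file written into the environment directory records the choice, so an existing environment can be checked without recreating it.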

1.2.2. Activate The Virtual Environment

To activate the environment just created, navigate to the location where it was created, if not already there, and execute the following command:

$ source env_name/bin/activate
(env_name): $

As a result of activation, the currently used shell should explicitly show the name of the environment being activated on the command line.
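Besides the shell prompt, activation can be verified from inside Python. A minimal sketch, with a hypothetical helper name: inside an environment created by venv, sys.prefix points at the environment directory while sys.base_prefix still points at the interpreter it was created from.

```python
import os
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtual_environment())
# The activate script also exports VIRTUAL_ENV; it is unset outside an activated shell.
print("VIRTUAL_ENV:", os.environ.get("VIRTUAL_ENV", "<not set>"))
```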

1.2.3. Deactivate The Virtual Environment

After finishing work with the environment, it can be deactivated in the following way:

(env_name): $ deactivate

Removing The Virtual Environment

Normally it is not necessary to remove the environment, but this can be achieved by simply deleting the env_name directory.

1.2.4. Upgrade PIP Inside The Virtual Environment

It is always good practice to upgrade PIP inside the environment, even if it is up to date on the system site. This time the user owns all files in the environment directory, so there is no need to pass the --user flag to the install command. On GateWay, it should look like this:

# Python 3
$ python3 -m venv env_name --system-site-packages
$ source env_name/bin/activate
(env_name): $ python3 -m pip install --upgrade pip

2.  Package directory starter

At the beginning, the minimal starter project structure should look like this:

project_name
    ├── LICENSE
    ├── project_name
    │   ├── __init__.py
    │   └── project_name.py
    ├── README.md
    ├── requirements.txt
    ├── pyproject.toml 
    ├── setup.cfg
    ├── setup.py
    └── tests

Good practices for creating a new package directory structure:

  • All names should be lowercase
  • Names should be underscore-separated (no hyphens)
  • The project name should be as unique as possible
  • The root directory should mainly store metadata files
  • Source files should be stored in a subdirectory with the same name as the project directory
  • Always pay attention to the inclusion of the LICENSE file
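The starter layout above can be generated with a few lines of pathlib. This is a sketch under stated assumptions: the scaffold helper is hypothetical, all files are created empty, and the tree is written to a temporary directory so that nothing in the working directory is touched.

```python
import tempfile
from pathlib import Path

ROOT_FILES = ("LICENSE", "README.md", "requirements.txt",
              "pyproject.toml", "setup.cfg", "setup.py")


def scaffold(parent_dir, project_name):
    """Create the minimal starter structure and return the project root."""
    project = Path(parent_dir) / project_name
    package = project / project_name  # source directory named after the project
    package.mkdir(parents=True)
    (project / "tests").mkdir()
    for file_name in ROOT_FILES:  # metadata files live in the root directory
        (project / file_name).touch()
    for file_name in ("__init__.py", f"{project_name}.py"):
        (package / file_name).touch()
    return project


with tempfile.TemporaryDirectory() as tmp:
    root = scaffold(tmp, "project_name")
    for path in sorted(root.rglob("*")):
        print(path.relative_to(root.parent))
```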

2.1.1. Package directory contents

2.1.1.1. LICENSE file

├── LICENSE
  • It is important for every package to include a license. It tells users who install the package the terms under which they can use it.

2.1.1.2. README.md file

├── README.md
  • It can be customised in a way that best describes the project.
  • Normally, the package configuration loads README.md to provide the long_description, so README.md must be included along with the code when generating a source distribution.

2.1.1.3. requirements.txt file

├── requirements.txt
  • Stores the concrete dependency versions that are considered essential for the project.
  • All necessary dependencies can be collected into requirements.txt with PIP as follows:
$ python -m pip freeze > requirements.txt
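The resulting file is a plain list of name==version lines, which makes it easy to inspect. The sketch below reads such pins into a dictionary; parse_pins is a hypothetical helper, the package names and versions in the sample are purely illustrative, and only exact == pins are handled (the full requirements format also allows URLs, options and other specifiers).

```python
SAMPLE_REQUIREMENTS = """\
# illustrative content, not taken from a real project
numpy==1.19.5
requests==2.25.1
"""


def parse_pins(text):
    """Map package names to pinned versions for lines of the form name==version."""
    pins = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, separator, version = line.partition("==")
        if separator and version:
            pins[name.strip()] = version.strip()
    return pins


print(parse_pins(SAMPLE_REQUIREMENTS))
```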

2.1.1.4. pyproject.toml file

├── pyproject.toml
  • A build-system-independent way to declare what is needed to build the project. It is a step away from depending on distutils / setuptools alone.
  • Tells build tools (like pip and build) what is required to build the project.
  • Minimal content:
[build-system]
requires = [
    "setuptools>=42",
    "wheel"
]
build-backend = "setuptools.build_meta"
  • build-system.requires gives a list of packages that are needed to build the project package.

Listing something here will only make it available during the build, not after it is installed.

  • build-system.build-backend is the name of the Python object that will be used to perform the build.
  • The intent is that pyproject.toml will eventually make setup.py and setup.cfg unnecessary, but at the time of writing this is not fully supported.

2.1.1.5. Metadata configuration

There are two types of metadata: static and dynamic.

  • Static metadata (setup.cfg): guaranteed to be the same every time. This is simpler, easier to read, and avoids many common errors, like encoding errors.

  • Dynamic metadata (setup.py): possibly non-deterministic. Any items that are dynamic or determined at install-time, as well as extension modules or extensions to setuptools, need to go into setup.py.

Static metadata (setup.cfg) should be preferred. Dynamic metadata (setup.py) should be used only as an escape hatch when absolutely necessary. setup.py used to be required, but can be omitted with newer versions of setuptools and pip.

2.1.1.5.1. setup.cfg file - static metadata
├── setup.cfg
  • setup.cfg is the configuration file for setuptools. It tells setuptools about the project package (such as its name and version) as well as which code files to include. Eventually much of this configuration should be able to move to pyproject.toml.
  • Minimal content:
[metadata]
name = project_name
version = 0.0.1
author = Example Author
author_email = author@example.com
description = Sample short description of the package
long_description = file: README.md
long_description_content_type = text/markdown
url = https://github.com/author/project_name
project_urls =
    Bug Tracker = https://github.com/author/project_name/issues
classifiers =
    Programming Language :: Python :: 3
    License :: OSI Approved :: MIT License
    Operating System :: OS Independent

[options]
package_dir =
    = project_name
packages = find:
python_requires = >=3.6

[options.packages.find]
where = project_name
setup.cfg - options description
  • name is the distribution name of the project package. This can be any name as long as it only contains letters, numbers, _ , and -.

  • version is the package version.

  • author and author_email are used to identify the author of the package.

  • description is a short, one-sentence summary of the package.

  • long_description is a detailed description of the package. In this case, the long description is loaded from README.md (which is a common pattern) using the file: directive.

  • long_description_content_type tells what type of markup is used for the long description. In this case, it’s Markdown.

  • url is the URL for the repository homepage of the project.

  • project_urls lets you list any number of extra links related to the project. Typically these point to documentation, issue trackers, etc.

  • classifiers gives PIP some additional metadata about the package. In this case, the package is only compatible with Python 3, is licensed under the MIT license, and is OS-independent.
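Because setup.cfg uses an INI-style syntax, the metadata described above can be read back with the standard library configparser module, which is a convenient way to check what has been declared. This is a sketch on a shortened copy of the example file; read_metadata is a hypothetical helper, and setuptools itself performs additional processing (for instance, resolving the file: directive) that is not reproduced here.

```python
import configparser

SETUP_CFG_TEXT = """
[metadata]
name = project_name
version = 0.0.1
long_description = file: README.md
classifiers =
    Programming Language :: Python :: 3
    Operating System :: OS Independent
"""


def read_metadata(text):
    """Return the [metadata] section as a dict, with classifiers split into a list."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    metadata = dict(parser["metadata"])
    # Indented continuation lines are stored as one multi-line string.
    metadata["classifiers"] = [
        line.strip() for line in metadata["classifiers"].splitlines() if line.strip()
    ]
    return metadata


metadata = read_metadata(SETUP_CFG_TEXT)
print(metadata["name"], metadata["version"])
print(metadata["classifiers"])
```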

In the options category, there are controls for setuptools itself:

  • package_dir is a mapping of package names and directories. An empty package name represents the “root package” — the directory in the project that contains all Python source files for the package — so in this case the project_name directory is designated the root package.

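  • packages is a list of all Python import packages that should be included in the distribution package. Instead of listing each package manually, the find: directive can be used to discover all packages and subpackages automatically, with options.packages.find naming the directory to search (where). In the present example the discovered list will be empty, because the project_name source directory contains only modules (__init__.py and project_name.py) and no subdirectories that are packages. Note that an empty list means no import package ends up in the built distribution; if project_name itself should be shipped, the search has to start from the directory that contains it.

  • python_requires gives the versions of Python supported by the project. Installers like pip will look back through older versions of a package until they find one that supports the running Python version.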
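Why the discovered package list depends on the where setting can be shown with a small experiment. The function below is a simplified stand-in for the discovery performed by find:, not the actual setuptools implementation: it only reports directories that contain an __init__.py, descending into a directory only if it is itself a package. The helper names and the temporary layout are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def discover_packages(where, prefix=""):
    """Return dotted names of package directories found below `where`."""
    found = []
    for child in sorted(Path(where).iterdir()):
        if child.is_dir() and (child / "__init__.py").is_file():
            name = prefix + child.name
            found.append(name)
            found.extend(discover_packages(child, name + "."))
    return found


def make_layout(parent_dir):
    """Recreate the relevant part of the starter layout from section 2."""
    project = Path(parent_dir) / "project_name"
    package = project / "project_name"
    package.mkdir(parents=True)
    (project / "tests").mkdir()
    (package / "__init__.py").touch()
    (package / "project_name.py").touch()
    return project


with tempfile.TemporaryDirectory() as tmp:
    project = make_layout(tmp)
    # Searching inside the source directory, as with where = project_name:
    print(discover_packages(project / "project_name"))
    # Searching from the project root instead:
    print(discover_packages(project))
```

Searching inside the source directory finds nothing, while searching from the project root finds project_name; the tests directory is skipped in both cases because it has no __init__.py.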
