Virtualise Everything

Python environments: from virtual to contained environments

The joy of package management!

I have always liked to be in control and I have always enjoyed reading to understand how things work. So about a decade ago I decided to leave the Windows ecosystem and I hopped on the Linux bandwagon.

I experimented a fair bit with Damn Small Linux, Linux From Scratch, Gentoo, etc. Although they gave me near-total control, they were way too technical for my younger self. I then enjoyed Debian for a few months and settled on Ubuntu for a couple of years. Having tried a few flavours, I started to appreciate how important package management is and how painful it can sometimes be.

About 6 years ago, I discovered Arch Linux and the concept of rolling release. I’ve never looked back since moving to that distribution as the documentation is absolutely brilliant, the community is very active and the package manager is near perfect. No more package management issues, latest versions of any packages at any time (there is a bit of a lag for some packages), exactly what I was looking for!

You can imagine my shock when I started to program with Python. The Python ecosystem goes against everything I was used to. Even the first Python programming books I bought advised learning Python 2, as most libraries/frameworks were written in Python 2 and Python 3 was too immature… Shocking! (for me at least)

Anyway, embracing Python as a programming language meant I had to get my hands dirty and go back to actively managing packages.

I quickly realised that pip installs packages in different places, depending on how and what you install. (Have you ever run pip with sudo?) By default, pip install installs system packages, available to all users. The --user flag installs the packages into a site-packages directory in your home directory instead, available only to you and not requiring any superuser privilege.
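To see the two locations on your own machine, a minimal sketch using only the standard library could look like this. It only reports paths and installs nothing; the variable names are mine, and the exact directories depend on your distribution and Python build.

```python
import site
import sysconfig

# Default package directory for the running interpreter.
# Outside a virtual environment this is the system-wide location;
# inside one, it points into the environment instead.
system_site = sysconfig.get_paths()["purelib"]

# Per-user directory, the one targeted by the --user flag.
user_site = site.getusersitepackages()

print("interpreter site-packages:", system_site)
print("user site-packages:       ", user_site)
```

On a typical Linux system the first path lives under /usr and the second one under your home directory.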

Virtual Environments

A virtual environment is a directory tree that contains a specific version of Python and a set of packages. As said before, Python can run from different locations; a virtual environment ensures Python runs in a fully controlled manner, from a chosen directory.
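A quick way to check which directory Python is actually running from is to compare sys.prefix with sys.base_prefix. The helper name below is hypothetical, and the check assumes an environment created by venv or a recent virtualenv release.

```python
import sys


def in_virtual_environment() -> bool:
    # Inside an environment, sys.prefix points at the environment directory
    # while sys.base_prefix still points at the interpreter it was built from.
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("prefix:     ", sys.prefix)
print("in a virtual environment:", in_virtual_environment())
```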

Venv

Although it ships with Python, I won't be talking about the venv package, as virtualenv is a much more powerful alternative (the virtualenv documentation compares the two). Just know you can create an environment with python -m venv followed by the directory name.
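For the curious, the same module can also be driven from Python itself. This is only a sketch: the demo-env name and the temporary directory are placeholders of mine, and with_pip=False skips bootstrapping pip to keep the example quick (the command-line form installs pip by default).

```python
import os
import tempfile
import venv
from pathlib import Path

# Throwaway parent directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "demo-env"  # hypothetical environment name

    # Symlink the interpreter on POSIX systems, copy it on Windows,
    # which mirrors what the command-line tool does by default.
    venv.create(env_dir, with_pip=False, symlinks=(os.name != "nt"))

    # Every environment gets a pyvenv.cfg describing its base interpreter.
    has_config = (env_dir / "pyvenv.cfg").is_file()
    created = sorted(p.name for p in env_dir.iterdir())
    print(created)
```

The listing shows the directory tree mentioned earlier: a place for the interpreter, a place for the packages and the pyvenv.cfg file.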

Virtualenv

virtualenv is one of the lowest-level tools for managing Python virtual environments. It is not the most elegant way but it offers a lot of control. Each project gets its own environment, so package management is rather easy: you can just pip install the packages in the versions that work for your project.

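To get started, install virtualenv:

  • Run pip install virtualenv or, even better, install it via your favourite package manager.
[Screenshot: installing virtualenv with pacman]
  • Run mkdir followed by a folder name to create a folder for all your virtual environments.
  • Run cd followed by that folder name to get into the directory.
  • Run virtualenv followed by an environment name to create your environment.
[Screenshot: creating the environment]
  • To activate the environment, run source bin/activate from the environment directory.
[Screenshot: activating the environment]
  • To deactivate the environment, run deactivate.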

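As you can see in the image above, the virtual environment's and the system's Python interpreters are different: we are switching from the virtual environment to the system environment. This means that if your system is up to date and running Python 3, you can still develop a project using Python 2.7, for example. Just use the -p switch to name the interpreter when creating the environment (virtualenv -p python2.7 followed by the environment name) to create a Python 2.7 environment.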

Once the environment is set up and active (the tag at the beginning of the command prompt tells you which environment is running), go ahead and install the packages you need for your project; they'll be installed inside the environment's directory. You can then run pip freeze > requirements.txt to save the packages and their versions into a text file.

Freezing packages is particularly useful if you need to transfer your project. Simply create a new virtual environment on the machine you need to import the project/environment to (make sure to use the correct version of Python), then run pip install -r requirements.txt to upgrade/downgrade the packages to the correct versions.
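To make the upgrade/downgrade idea concrete, here is a rough sketch of what comparing a frozen file against a machine amounts to. It is not how pip works internally: both function names are hypothetical, the package pins are made-up sample data, and only simple name==version lines are understood.

```python
from importlib import metadata


def installed_versions() -> dict:
    """Map each installed distribution name (lower-cased) to its version."""
    versions = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that lack a name
            versions[name.lower()] = dist.version
    return versions


def mismatches(requirement_lines, installed):
    """Return (name, wanted, found) for every pin that is not satisfied."""
    problems = []
    for line in requirement_lines:
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # this sketch only handles "name==version" pins
        name, wanted = (part.strip() for part in line.split("==", 1))
        found = installed.get(name.lower())
        if found != wanted:
            problems.append((name, wanted, found))
    return problems


# Made-up data standing in for a requirements.txt and a target machine.
wanted = ["requests==2.31.0", "numpy==1.26.4", "# a comment", ""]
target = {"requests": "2.31.0", "numpy": "1.24.0"}
print(mismatches(wanted, target))
```

On a real machine you would pass installed_versions() as the second argument; every tuple returned is a package that pip install -r requirements.txt would have to upgrade, downgrade or install.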

Virtualenvwrapper

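Another, higher-level way to organise your virtual environments is to use virtualenvwrapper. Once installed, you can run which virtualenvwrapper.sh to check its path, then add lines to your shell configuration file that set WORKON_HOME (the directory that will hold your environments) and source virtualenvwrapper.sh from that path.

Now source the configuration file (or open a new terminal) to load the new configuration.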

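  • Create a new environment with mkvirtualenv followed by the environment name. If you have a requirements.txt file, the -r requirements.txt option can be added to install the packages in the required versions.
  • List the packages installed with lssitepackages, but if you want to create the requirements.txt file, use pip freeze > requirements.txt.
  • List your environments with lsvirtualenv.
  • Activate a specific environment with workon followed by its name.
  • Change to the environment directory with cdvirtualenv (you can check the path with echo $VIRTUAL_ENV once the environment is activated).
  • Deactivate the environment with deactivate.
  • Remove an environment with rmvirtualenv followed by its name.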

The commands are slightly different but the result is the same: virtualenvwrapper is a set of extensions to virtualenv.

Other useful packages for managing virtual environments are:

  • pipenv, which combines pip, virtualenv and Pipfile (another way to address requirements.txt).
  • pyenv, which aims to isolate Python versions; the virtual environments are managed with pyenv-virtualenv or pyenv-virtualenvwrapper.

Conda

Conda is very popular amongst data scientists for a few reasons:

  • It bundles Intel MKL which makes some libraries (like numpy) faster.
  • It manages packages locally so there is no need for superuser privilege.
  • It comes with a lot of industry standard packages.
  • Anaconda Inc. is a company which offers support contracts.
  • It makes using Python on Windows a lot easier.

Not only does it manage packages, it also allows for environment management.

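  • conda create --name followed by an environment name to create an environment; append python= and a version number if you need a specific version of Python, or use conda env create -f environment.yml to create an environment from an environment.yml file (similar to the requirements.txt file).
  • conda env export > environment.yml will create the file.
  • conda install followed by a package name to add packages; the version number (appended with =) is optional.
  • conda deactivate to deactivate an environment.
  • conda activate followed by an environment name to activate an environment.
  • anaconda-navigator will start the Anaconda Navigator with all the applications installed in the environment.
[Screenshot: Anaconda Navigator]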

As you can see, no matter what tool you are using, virtual environments work in the same way: creation, activation, installation of packages, creation of text file (so the environment can be replicated), deactivation.

Contained environments

Docker

Another powerful way to address the environment issue is to use container solutions such as Docker. For most situations, it is a little bit over the top, but I will be addressing it because:

  • it is rather easy to implement
  • once you understand how Docker works, you get access to a lot of really useful ready-made images.

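First, install Docker.

[Screenshot: installing Docker]

Add the user to the docker group.

[Screenshot: adding the user to the docker group]

And make sure you activate the docker service.

[Screenshot: enabling the docker service]

You can then execute docker run hello-world.

[Screenshot: output of the hello-world container]

There you go, you’ve just run your first container!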

A few useful commands:

  • docker pull followed by an image name to download the container image.
  • docker run followed by an image name to execute the container.
  • docker images to list the images locally available.
  • docker rmi followed by an image name to remove an image from your machine.

Make sure to visit Docker Hub, which lists all the (shared) images available. I would also strongly advise reading the image descriptions, as they explain how to run each image. For example, the PostgreSQL image description shows the docker run command used to start its container.

If you are interested in working with Jupyter (note that you'll have to create an account on Docker Hub, then use docker login), run docker pull followed by the image name to download the image, then docker run followed by the same name to start a container with a Jupyter notebook.

If you are interested in building your own image, have a look at the Docker documentation, which will give you the basics to achieve your first build.

Conclusion

This article was a rather quick overview of the various solutions available to sort out the environment issue. Conda is interesting for all the apps it can manage (one of them is quite a nice IDE, for example) but virtualenv should be sufficient for most scenarios. We went a little bit off the beaten track by talking about Docker, but mastering this tool is really worthwhile as it opens a whole new world!

