Article header image.

pyenv: a free and simple alternative to Conda

Programming

19 August 2024 | Reading time: 9 minutes

Header image: Christina Morillo on Pexels

An article has recently been going around about how a popular company for Python tools — Anaconda — recently decided that not-for-profit organizations (such as universities) need to pay to use their tools. This is a major issue for probably hundreds of universities around the world that use Anaconda products (like Conda) to manage their Python distribution. They are even seeking back-dated license fees for use of their software — an extremely shady business practice for a major staple of the Python ecosystem.

I think that the Conda change is emblematic of the woeful state of Python packaging in 2024. Unlike for languages like Rust (which has cargo) or JavaScript (which has npm & nvm), there is no community consensus about the right tool to use to manage Python versions and environments. Instead, we are stuck with at least five different popular tools, one of which is Conda.

It’s no surprise that the (for profit) Anaconda company are taking advantage of this.

Conda has long been popular in the (data) science community because of its simplicity. Installing Conda gives you a large suite of pre-installed packages. Anaconda actually ‘build’ Python packages themselves to then distribute to users, which can make setup faster on different machines.

However, Conda are now going to start charging $50 per month per user (!!!) for use of Conda in organizations with over 200 employees, and have sent legal threats to research institutes not in compliance with their… updated interpretation of their terms and conditions.

In this blog, I’d like to talk about:

  • The alternatives to Conda
  • pyenv: a simple solution
  • An example of how I use it to develop a package locally.

The alternatives

As mentioned before, Python distribution and environment management is currently in a sorry state, and this is probably part of why a for-profit company somehow still has relevance in the open-source Python ecosystem in 2024. In the wider Python software development world, there are many tools to manage your Python install, like:

each with varying levels of features. An open problem in Python is that there is no standard to define a development environment, meaning that many of them are also not compatible with each other.

So, even if I recommend my favourite from this list right now (which is Rye), it is relatively useless to scientists, as getting collaborators to all install and use Rye just to install the correct packages and run a notebook you want to send them is not ideal. Especially if Collaborator B prefers Poetry, Collaborator C preferes Hatch, and Collaborator D is new to Python and is confused by the whole thing.

And let me be clear: I don’t think that Conda (or any of the forks of Conda that are ‘free’ for now) are good at solving this issue either. The Conda ecosystem locks you in to using Conda. You cannot install packages with pip if you’re using Conda without risking a broken Conda environment, meaning that you can’t access most of the packages on the official Python repositiory, PyPI.

Also, to make it even more complicated, Conda is a very bloated tool that requires many GBs of storage space per enviroment, making it impractical on machines with less disk space (like a small user account on a HPC cluster). The ‘solution’ in the Conda world is to use yet another alternative tool (😭) like miniconda, or mamba, or Conda forge… the list goes on.

I would love to develop in Python with a ✨single standard tool✨ (like how JavaScript has npm). However, until the community coalesces around one tool, I’d like to recommend a simple solution. My approach to Python version and environment management is compatible with the Python Packaging Authority standards for installing packages, as it relies on pip; it can be installed on any machine; and it has far less disk space overhead than Conda. Oh, also it doesn’t cost $600 per year - it’s free, like any Python packaging tool should be.

There are cases where Conda (or a fork of it) is necessary to use, as some tools rely on its ability to build complicated scientific dependencies. Nevertheless, in astronomy, I think that this is rarely/never the case — and so I’d recommend moving away from the Conda ecosystem.

pyenv: a simple solution

pyenv is a simple tool for installing multiple Python versions. pyenv is lightweight, and just uses a clever shell script and shim approach to manage Python for you — in plainer English, that means your system supports pyenv out of the box. You also don’t need to have Python installed already to install pyenv with (no chicken/egg problems here!)

Then, once you set up pyenv, you can install packages in the environment of your choice with pip. And you can use plugins (like pyenv-virtualenv) to extend it further, such as allowing for pyenv to also manage virtual environments.

Thanks to being lightweight, you can use pyenv wherever you need Python. I’ve had no issues using it in different places, from on my own computers to on shared compute nodes — and I get to take the same Python version with me wherever I go!

Installing pyenv

pyenv is relatively straightforward to install, but does require a bit of experience using a terminal. It doesn’t require admin (super user/sudo) rights, although the last step (installing system packages) might be necessary and may require admin rights.

The instructions below work for Linux or Mac; Windows users should follow instructions in the Windows-compatible fork instead.

Step 1: basic pyenv install

The installation instructions differ depending on your system, but can be found in the pyenv README on GitHub. The easiest automatic way that works on Linux and Mac is probably with the command:

curl https://pyenv.run | bash

but you should check the README for the latest instructions and make sure that you understand what you’re installing.

Step 2: set up your shell

Next, you need to set up your shell to know where pyenv is and to start it whenever you open a terminal. This can differ for different systems; I recommend reading (and understanding) the current version of the shell install instructions. This is the most intimidating bit, but I promise it’s worth it 😇

Step 3: Ensure that you have the correct packages in your OS

The one sharp edge is that unlike Conda, pyenv will build your Python version and packages locally, meaning that you need to make sure your system has the correct packages installed. This is the only step that requires admin access to a machine.

However, many systems are already bundled with them, as these are mostly basic libraries for things like compression — so you may not need to do anything. It’s worth checking, though.

Step 4: Install the pyenv-virtualenv plugin

pyenv has a number of compatible plugins, one of which I think is mandatory for typical use. The pyenv-virtualenv plugin adds the ability to manage virtual environments for your Python projects with pyenv, making Python virtual environments straightforward to use.

It can be installed by following the installation instructions on the GitHub README after installing pyenv.

For the rest of this guide, I’ll assume that you have it installed.

How to use pyenv

Once you have pyenv installed, there are a few basic commands that you can use to manipulate it.

1. Install desired Python version(s)

To start, you can install any version of Python you want with pyenv. For instance, to install the latest version of Python 3.11, you can do:

pyenv install 3.11.7

2. Install a virtual environment

Because version conflicts suck and can be avoided by using a separate virtual environment for each of your projects, you should get into the habit of using a different virtual environment for every project you work on. Thanks to the pyenv-virtualenv plugin from above, you can set up virtual environments with

pyenv virtualenv <desired_python_version> <virtualenv_name>

so for instance, to make a virtualenv with Python 3.11.7 (which we just installed) called ‘website’, we would do

pyenv virtualenv 3.11.7 website

3. Activate it

Like with Conda, you can now activate a pyenv virtual environment with the activate command. We can activate our website environment with

pyenv activate website

4. Set which virtual environment should be used per-folder

The real power of pyenv is that you can set folder-specific Python virtual environments. For instance, let’s say that when in /home/emily/website, I always want to use my website environment; assuming I’m already in that directory (cd /home/emily/website), I can set the pyenv virtual environment I want to use in that folder and all subfolders with

pyenv local <virtualenv_name>

in this case that’s

pyenv local website

This creates a .python-version file in my folder (/home/emily/website) that tells pyenv to use that environment. Now, all Python commands, including python, python3, and pip will use the ‘website’ virtual environment we just made.

5. Other useful commands

There are some other useful commands worth mentioning, too. For instance, you can optionally set a default global Python virtual environment to use with

pyenv global <virtualenv_name>

You can see which Python version you’re currently using with

pyenv version

and understand what set it with

pyenv version-origin

You can list all installed Python versions and virtual environments with

pyenv versions

or just the virtual environments with:

pyenv virtualenvs

And unsurprisingly, you can get a list of these commands (and more) with

pyenv help

Learning by example: start developing a module with a pyenv-powered development environment

Using this approach with pyenv is a bit more ‘manual’ than having a tool like Poetry do everything, but it works with basically any package (as long as it has a pyproject.toml). I really like this approach because it just works.

For instance, let’s say we wanted to work on a local version of astropy and write some code for it. We’d start by downloading the package:

git clone https://github.com/astropy/astropy.git

and navigating into the new directory:

cd astropy

Next, install our desired Python version for development - let’s pick the latest Python 3.12 version at the time of writing:

pyenv install 3.12.1

and let’s make a new virtual environment called astropy with it:

pyenv virtualenv 3.12.1 astropy

and pin it to this folder with

pyenv local astropy

Then, we can install astropy in edit mode with the pip flag -e, which means that local changes we make to the code are reflected in the astropy version in our ‘astropy’ virtual environment. We’ll also install all optional dev dependencies with the [dev] specifier, and we’re installing with . (the full stop) because that’s a shorthand for the current directory. All together, it looks like this:

pip install -e .[dev]

And now we can start editing astropy’s source code in a ring-fenced virtual development environment. This method will work for basically any package.

Final thoughts

Using pyenv and pyenv-virtualenv in this way is a bit manual, but I like it because it just works. On any machine, and for working on basically any library.

For all the papers that I write, I keep all code for them in a given folder and set up a pyenv virtual environment for each paper individually. It keeps each paper ringfenced and helps to prevent dependency hell that occurs when trying to use one Python install across your entire machine for years at a time, and hoping that nothing breaks. (This is a bad idea, don’t do this - it WILL fail eventually 🫠)

Someday, I really hope that the Python community coalesces around one package and Python version management system that just works. But until that day, this simple approach with pyenv should work basically anywhere.