6 steps to migrate your scientific scripts to Python 3

Python 3 has been around for some time (the most recent stable version is Python 3.2), but until now it has not been widely adopted by the scientific community. One of the reasons was that the basic scientific Python libraries, such as NumPy and SciPy, had not been ported to Python 3. Since this is no longer the case, there is no reason to put off the migration to Python 3 any longer (you can find the pros and cons on the Python website).

In this guide I am going to describe some tips that I learnt while trying to make my scripts compatible with Python 3. There is nothing to be afraid of – the procedures are actually quite easy and very rewarding (it is like a glimpse into the future of Python!).

This guide is not about porting frameworks – that is a completely different story, covered by plenty of other guides and even books. I will focus here on porting scientific scripts (analysis scripts, plotting, simulations etc.). I wanted the transition to be smooth and gradual rather than rapid and chaotic, so I decided to make my scripts compatible with both Python 2.6 and 3.2. It may not be the easiest of all solutions, but it works and allows for in-depth testing before moving completely to Python 3 (I also have a lot of legacy 2.6 code that I have to keep to reproduce my old experiments).

  1. Learn about the differences between Python 2 and Python 3
    If you haven’t yet done so, read about the changes in Python 3 before you start modifying your code. Python 3 is an evolutionary rather than revolutionary release. Compared to Python 2 it does not offer many new features for casual programmers (as most of us are); rather, it straightens out some quirks left by past versions and makes the syntax more consistent. Unfortunately, to do that it has to break backwards compatibility – something that rarely happens in programming languages. However, the changes in the syntax are mostly cosmetic and many of them can be handled by automatic conversion scripts such as 2to3.
  2. Use compatible syntax
    Many of the features in Python 3 were backported to the Python 2.x series, which means that you can already start using them in your old scripts. This will make the later transition a lot easier. For example:

    • in Python 3 print becomes a function, which means that you have to wrap its arguments in parentheses like this:
      print('Hello world')
      

      The same line also works in Python 2.6. However, the 2.6 and 3.x versions still differ in how they handle arguments, so for better compatibility it is recommended to use from __future__ import print_function (see below).

    • instead of string formatting with % (as in Python 2.x) you should use the string method .format(...), so that:
      "Hello %s" % ("world", )
      

      becomes

      "Hello {0}".format("world")
      

      The % operator is still valid in Python 3, but the format method gives much more flexibility.

    • if you want closer compatibility you can use __future__ imports. For example, the slash / applied to two integers performs integer division in Python 2.x, but true (floating-point) division in Python 3.x. You can switch Python 2.6 to the 3.x behaviour with the following import:
      from __future__ import division
      

      You can find more __future__ imports in the Python docs.
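
The three points above can be combined into a single script header. What follows is only a minimal sketch (the variable names and numbers are made up for illustration) that should behave the same under Python 2.6 and Python 3:

```python
# These imports must come before any other statement in the module.
# Under Python 3 they are accepted and simply have no effect.
from __future__ import division, print_function

n_spikes = 7      # hypothetical data, for illustration only
duration = 2      # recording length in seconds

# True division on both versions: 7 / 2 gives 3.5, not 3.
rate = n_spikes / duration

# Floor division is still available through the // operator.
full_pairs = n_spikes // duration

# str.format works on Python 2.6 and later.
message = "Firing rate: {0} Hz".format(rate)

# With print_function imported, keyword arguments such as sep
# are available on Python 2.6 as well.
print(message, full_pairs, sep=" | ")
```

If you really need the old integer behaviour somewhere, use // explicitly; it means the same thing on both versions.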

  3. Write unit tests
    Although this step is not required, it is highly recommended. It is hard to over-emphasise the significance of unit testing. This is especially true in science, where you want your code to be well tested and reproducible. While porting to Python 3 you will probably make lots of small changes – most of them look innocent, which makes them all the more dangerous. Therefore, before modifying your code you should write small snippets that test individual functions in your scripts (so-called unit tests). There is even an automatic unit testing framework – nose – that will make the process a breeze. This gives you a quick and easy way to check that you have not introduced errors. Do not worry that you are wasting your time – you will also profit from thorough testing in the future.
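
As a rough sketch of what such a test might look like, suppose one of your scripts contains a small helper called normalise (a hypothetical function, invented here purely for illustration). A runner such as nose collects functions whose names start with test_, so plain functions with assert statements are enough:

```python
def normalise(values):
    """Scale a sequence of numbers so that it sums to 1 (hypothetical helper)."""
    # float() makes the division below behave the same on Python 2 and 3.
    total = float(sum(values))
    return [v / total for v in values]


def test_normalise_sums_to_one():
    result = normalise([1, 2, 5])
    assert abs(sum(result) - 1.0) < 1e-12


def test_normalise_keeps_proportions():
    # Without the float() conversion, Python 2 would return [0, 0] here,
    # which is exactly the kind of silent change a test should catch.
    assert normalise([1, 3]) == [0.25, 0.75]


if __name__ == "__main__":
    # Allow running the checks without a test runner.
    test_normalise_sums_to_one()
    test_normalise_keeps_proportions()
    print("all tests passed")
```

Running the same test file under both interpreters after every change is the quickest way to see whether the two versions still agree.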
  4. Use alternative imports for standard library modules
    An important change is that many standard modules were renamed in Python 3. For example, cPickle (the C-accelerated pickle module) became simply pickle (which includes the C-accelerated version and a pure Python version as a fall-back). Now, when you try to import cPickle in Python 3.x you will get an error. As a work-around you can use this snippet to define an alternative import:

    try:
        import cPickle as pickle
    except ImportError:
        import pickle
    

    You can find a list of standard modules in Python 2.x and their new names in Python 3 in this guide.
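
The same pattern carries over to other renamed modules. The sketch below applies it to configparser and queue (chosen here only as examples of modules whose names were lower-cased in Python 3) and shows that the rest of the script can then use a single name:

```python
# Try the Python 3 name first and fall back to the Python 2 name.
try:
    import configparser
except ImportError:
    import ConfigParser as configparser

try:
    import queue
except ImportError:
    import Queue as queue

try:
    import cPickle as pickle   # Python 2: C-accelerated version
except ImportError:
    import pickle              # Python 3: uses the fast version if available

# From here on the code does not care which interpreter is running.
data = {"trial": 1, "samples": [0.1, 0.2, 0.3]}   # made-up example data
restored = pickle.loads(pickle.dumps(data))

jobs = queue.Queue()
jobs.put("analysis")
```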

  5. Watch out for differences between built-in functions
    Unfortunately, not all Python 2.6 built-in functions behave the same as their Python 3 counterparts. For example, the filter(...) function returns a list in Python 2.6, whereas in Python 3 it returns an iterator. The change may seem small (a list is also iterable after all), but it depends very much on how you use it in the code. In one of my scripts, I reused a list returned by the filter function in two list comprehensions. With an iterator this does not work, because it is exhausted after the first pass and the second comprehension gets nothing. As a work-around you can wrap the filter call in the list function, i.e., list(filter(...)), which works in both Python 2.6 and 3. The same applies to the .keys() method of dictionaries, which in Python 3 returns a view object rather than a list. Whenever in doubt – google it!
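
To make the problem concrete, here is a small sketch with made-up sample data. It is not the original script, just an illustration of the same pitfall and the list(...) work-around:

```python
measurements = [0.5, -1.2, 3.4, -0.1, 2.0]   # made-up sample data

# On Python 3, filter returns a one-shot iterator.
positive = filter(lambda x: x > 0, measurements)
first_pass = [x * 2 for x in positive]
second_pass = [x + 1 for x in positive]   # empty on Python 3: already consumed

# Wrapping the call in list() gives a real list on both versions,
# so it can be traversed as often as needed.
positive = list(filter(lambda x: x > 0, measurements))
doubled = [x * 2 for x in positive]
shifted = [x + 1 for x in positive]

# Same idea for dictionary keys when a real list is needed, e.g. for indexing.
params = {"gain": 2.0}
names = list(params.keys())
first_name = names[0]
```

Note that the first variant raises no error on Python 3; second_pass is just silently empty, which is why this kind of change is easy to miss without tests.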
  6. Use libraries ported to Python 3
    Although pure Python 2.6 libraries can be quickly and semi-automatically converted to Python 3, libraries that depend on C extensions are much harder to port. That is one reason why it can take some time for libraries to be ported to Python 3. However, many libraries have already been ported, especially scientific ones such as NumPy and SciPy; matplotlib is compatible with Python 3 in the development version (you can get it from GitHub). Others still have to be ported (most notably Django, PyTables and PIL; see the list of ported libraries). Some of these can be easily replaced with Python 3-compatible alternatives (such as h5py for PyTables or scikits-image for PIL). Although this may require more effort, as a side effect you may move to a more recent library with a more dynamic community. If this is not possible, you can ask the developers or maybe join the development efforts yourself. You will learn loads on the way!

Many of the above tips can be considered harmful when working on larger projects (such as new frameworks or libraries), but they are absolutely fine when you work on simple scripts on your own or within a small group of colleagues.

This list is in no way exhaustive. You will find plenty of material on the web (a good collection of resources is at getpython3.com). If you want to share your tips or describe your experiences, please leave a comment.
