0. Configuring your computer to use Python for scientific computing


Why Python?

As will become readily apparent even at the beginning of our journey into biological circuit design, you will need to use your computer to analyze circuits and understand the principles governing their function. There are plenty of approaches we could take, and many languages we could use for computing as well. Indeed, in addition to Python, Matlab/Octave, Mathematica, R, Julia, Java, JavaScript, C++, and others are widely used. We have chosen to use Python. Though we view this as an unessential choice (we believe language wars are counterproductive and welcome anyone to port the code we use to any language of their choice), we nonetheless feel we should explain our choice.

Python is a flexible programming language that is widely used in many applications. This is in contrast to more domain-specific languages like R and Julia. It is easily extensible, which is in many ways responsible for its breadth of use. We find that there is a decent Python-based tool for many applications we can dream up, certainly in systems biology. The Python-based tool is seldom the very best for the particular task at hand, but it is almost always pretty good. Thus, knowing Python is like having a Swiss Army knife; you can wield it to effectively accomplish myriad tasks. Finally, we also find that most students pick it up quickly.

Why not use systems biology packages?

There are packages available to streamline systems biology calculations, such as PySB or Matlab’s SimBiology. While these packages are useful, we find that many applications in systems biology, and in genetic circuits in particular, need, or at least benefit from, bespoke computational analyses. We therefore will build all of our code from scratch, using only packages like NumPy, SciPy, and Bokeh, which contain core numerical and plotting data structures and routines. Of course, code we use in one chapter may be reused in another, but our approach is that we build all of the code we need as we go along. This will provide a greater level of mastery and less reliance on black boxes (though there will inevitably be some).

The biocircuits package

For some of those black boxes, we will use the biocircuits package, which is written to be used with this course. The functions contained therein are presented in the course materials before being abstracted away into the package. Thus, anything that is in the package is introduced in the course, and you should have a full understanding of how it works.

The documentation for this package appears as an appendix to the course materials.

What to do if you are new to Python

As you proceed through the chapters, we assume that you have a basic introduction to computer programming and the Python programming language. We assume further that you have a working knowledge of NumPy. If this is new to you, there are plenty of great resources to learn Python and to learn the basics quickly. A weeklong intensive course offered by one of the authors and the resources linked to therein provide a good starting point.

Installing a Python distribution

Prior to embarking on your journey into biological circuits, you need to have a functioning Python distribution installed on your computer. There are two main ways people set up Python for scientific computing.

  1. By downloading and installing package by package with tools like pip.

  2. By downloading and installing a Python distribution that contains binaries of many of the scientific packages needed. The major distributions of these are Anaconda and Enthought Canopy. Both contain IDEs.

We will use Anaconda, with its associated package manager, conda. It is pretty much the de facto package manager/distribution for scientific use.

A special note to Mac users

If your machine is a Mac, you will need to install Xcode, which you can get through the App Store, before installing Anaconda. Once you install Xcode, you need to launch it so that everything is set up properly; installing and launching Xcode sets up important components under the hood. It will take a while to launch, and it may ask you to install extras, which you should do. After it has launched, you can close it, and you won’t need it again for this course in biological circuits.

Windows users: Chrome or Firefox

To run Jupyter notebooks, you use JupyterLab. It is browser-based and supports Chrome, Firefox, and Safari (except that Safari is not supported for JupyterLab versions 3.0.0 through 3.0.11). Microsoft Edge is not supported. Therefore, if you are a Windows user, be sure you have either Chrome or Firefox installed.

Downloading and installing Anaconda

Mac users: Before installing Anaconda, be sure you have Xcode installed and have launched it once.

Downloading and installing Anaconda is simple.

  1. Go to the Anaconda homepage and download the graphical installer.

  2. Install Anaconda with Python 3.8.

  3. You may be prompted for your email address, which you should provide. If you are at a university, you may want to use your university email address because educational users can get some of the non-free goodies in Anaconda.

  4. Follow the on-screen instructions for installation. While doing so, be sure that Anaconda is installed in your home directory, not in root.

That’s it! After you do that, you will have a functioning Python distribution.
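
If you want to confirm where the distribution ended up, a minimal sketch like the one below can be run in a Python interpreter. It reports the root directory of the running Python installation and whether that directory lies inside your home directory. The helper name is_under_home is hypothetical, introduced here only for illustration; it is not part of Anaconda or of the course code.

```python
import sys
from pathlib import Path


def is_under_home(path, home=None):
    """Return True if `path` is `home` itself or lies somewhere inside it.

    `home` defaults to the current user's home directory.
    (Hypothetical helper, for illustration only.)
    """
    home = Path(home) if home is not None else Path.home()
    path = Path(path).resolve()
    home = home.resolve()
    return path == home or home in path.parents


# sys.prefix is the root directory of the running Python installation
print("Python prefix:", sys.prefix)
print("Inside home directory:", is_under_home(sys.prefix))
```

If the second line prints False, the interpreter you are running is probably not the Anaconda installation in your home directory, and it may be worth checking which Python your command line is picking up.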

Package installations and the conda package manager

conda is a package manager for keeping all of your packages up-to-date. It has plenty of functionality beyond our basic usage in class, which you can learn more about by reading the docs. We will primarily be using conda to install and update packages.

conda works from the command line. On Windows, the command line is usually accessed through PowerShell and on macOS through Terminal. After you have installed Anaconda, at a command line prompt, run the following commands to update conda (yes, do it twice).

conda update conda
conda update conda

If conda is out of date and needs to be updated, you will be prompted to perform the update. Just type y, and the update will proceed.

Now that conda is updated, we’ll update all packages, so type the following on the command line.

conda update --all

You will be prompted to perform all of the updates. There may even be some downgrades. This happens when there are package conflicts where one package requires an earlier version of another. conda figures all of this out for you, so you can almost always say “yes” (or “y”) to conda when it prompts you.

As you work through this course, you will sometimes use packages that are not included in the default Anaconda distribution. We will also reuse code that we develop throughout the course; for convenience, that code is collected in the biocircuits package.

You can install these packages with conda or pip. We will take care of the installations now and will discuss the packages in much more detail as we use them. To do the installations, run the following on the command line.

conda install -c bokeh jupyter_bokeh
conda install colorcet datashader fastparquet holoviews hvplot panel param line_profiler
pip install multiprocess eqtk biocircuits iqplot watermark blackcellmagic

Finally, if you want to install a spell checker for Jupyter notebooks, you can do so by running the following on the command line. Note that you must have node.js installed prior to installing the spell checker.

jupyter labextension install @ijmbarr/jupyterlab_spellchecker
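
Once the installations above have finished, a short check like the following sketch can tell you whether Python can find each package. It assumes that each package's import name is the same as the name used in the install commands, which may not hold for every package. The function name find_missing is hypothetical.

```python
import importlib.util

# Assumption: import names match the names used in the install commands
PACKAGES = [
    "jupyter_bokeh", "colorcet", "datashader", "fastparquet", "holoviews",
    "hvplot", "panel", "param", "line_profiler", "multiprocess", "eqtk",
    "biocircuits", "iqplot", "watermark", "blackcellmagic",
]


def find_missing(names):
    """Return the names for which no importable top-level module is found.

    (Hypothetical helper, for illustration only.)
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = find_missing(PACKAGES)
if missing:
    print("Could not find:", ", ".join(missing))
else:
    print("All packages were found.")
```

This only checks that the modules can be located; it does not import them, so it will not catch a package that is installed but broken.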

Launching JupyterLab

The easiest way to launch JupyterLab is from the command line. On Windows, this is usually accessed through PowerShell and on macOS through Terminal. To launch JupyterLab, type the following on the command line.

jupyter lab

If you are using an unsupported browser (e.g., Microsoft Edge or Safari for 3.0.0 ≤ JupyterLab version ≤ 3.0.11), you can specify the browser you want using, for example,

jupyter lab --browser=firefox

Alternatively, you can use the Anaconda Navigator that was installed when you installed Anaconda. On macOS, it is available in your Applications menu; on Windows, it is in the Start menu. After launching the Navigator, you should see an option to launch JupyterLab. When you do that, a new browser window or tab will open with JupyterLab running. Within the JupyterLab window, you will have the option to launch a notebook, a console, a terminal, or a text editor. We will use notebooks heavily.
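
The Safari caveat for JupyterLab versions 3.0.0 through 3.0.11 amounts to a simple version comparison, which the sketch below illustrates. The function names are hypothetical, and the parsing assumes a plain numeric version string such as 3.0.14; pre-release strings such as 3.0.0rc1 are not handled.

```python
def parse_version(version):
    """Convert a version string like '3.0.14' to a tuple of ints, (3, 0, 14).

    Assumes purely numeric components separated by dots.
    """
    return tuple(int(part) for part in version.split(".")[:3])


def safari_supported(jupyterlab_version):
    """Return False for JupyterLab 3.0.0 through 3.0.11 and True otherwise.

    (Hypothetical helper encoding the version range stated in the text.)
    """
    return not ((3, 0, 0) <= parse_version(jupyterlab_version) <= (3, 0, 11))


# Example with an illustrative version string
print(safari_supported("3.0.14"))
```

Tuples of integers compare element by element, so (3, 0, 14) is correctly treated as later than (3, 0, 11), which a plain string comparison would not guarantee.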

Checking your distribution

We’ll now run a quick test to make sure things are working properly. We will make a quick plot that requires some of the scientific libraries we will use in the course.

Use the JupyterLab launcher (you can get a new launcher by clicking on the + icon on the left pane of your JupyterLab window) to launch a notebook. In the first cell (the box next to the [ ]: prompt), paste the code below. To run the code, press Shift+Enter while the cursor is active inside the cell. You should see a plot that looks like the one below. If you do, you have a functioning Python environment for scientific computing!

import numpy as np
import bokeh.io
import bokeh.models
import bokeh.plotting

bokeh.io.output_notebook()

# Generate plotting values
t = np.linspace(0, 2 * np.pi, 200)
x = 16 * np.sin(t) ** 3
y = 13 * np.cos(t) - 5 * np.cos(2 * t) - 2 * np.cos(3 * t) - np.cos(4 * t)

# Make the plot
p = bokeh.plotting.figure(frame_width=300, frame_height=300)
p.line(x, y, line_width=3, color="red")
source = bokeh.models.ColumnDataSource(
    dict(x=[0], y=[0], text=["Biocircuits"])
)
p.text(
    x="x",
    y="y",
    text="text",
    source=source,
    text_align="center",
    text_font_size="18pt",
)

# Display
bokeh.io.show(p)

Computing environment

%load_ext watermark
%watermark -v -p numpy,bokeh,jupyterlab
Python implementation: CPython
Python version       : 3.8.8
IPython version      : 7.22.0

numpy     : 1.20.1
bokeh     : 2.3.1
jupyterlab: 3.0.14
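
If the watermark extension is not available, a similar report can be sketched with the standard library alone, assuming Python 3.8 or later (where importlib.metadata is available). The function name installed_versions is hypothetical.

```python
import sys
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent.

    (Hypothetical helper, for illustration only.)
    """
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


print("Python version:", sys.version.split()[0])
for name, version in installed_versions(["numpy", "bokeh", "jupyterlab"]).items():
    print(f"{name}: {version if version is not None else 'not installed'}")
```

The version numbers you see will depend on when you installed and updated your distribution, so they need not match the ones listed above.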