IPython Notebook Tips
IPython notebook is awesome, but it's still pretty immature in many ways. These are a few functions that I've found help make it a more pleasant experience.
Use retina displays
It's great to be able to make use of a retina display, if you have one. The difference in figure quality is very noticeable. To use retina, just set this flag at the top of your notebook (you probably want inline plots too).
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
More prominent section breaks
IPython notebooks can look messy and disorienting. I like to include large section breaks using IPython.display.HTML.
from IPython.display import HTML

def new_section(title):
    style = "text-align:center;background:#66aa33;padding:120px;color:#ffffff;font-size:3em;"
    return HTML('<div style="{}">{}</div>'.format(style, title))

new_section("New Section")
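A title containing characters such as `<` or `&` would be read as markup by the helper above. Here is a minimal sketch of a variant that escapes the title and takes the background colour as a parameter. The name `section_html` and its `color` argument are hypothetical, not part of the original helper, and it returns a plain string so it can be checked outside a notebook.

```python
from html import escape  # Python 3 standard library


def section_html(title, color="#66aa33"):
    """Return the HTML for a large section banner (hypothetical helper)."""
    style = (
        "text-align:center;background:{};padding:120px;"
        "color:#ffffff;font-size:3em;"
    ).format(color)
    # Escape the title so characters like < and & show up literally.
    return '<div style="{}">{}</div>'.format(style, escape(title))


banner = section_html("Results & <discussion>")
```

In a notebook, the string would be wrapped as `HTML(section_html("New Section"))` to render it.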
Nicer image display
IPython.display.Image is very useful, but sometimes it's nice to use more of the horizontal space available. I wrote an Images function, modeled on Image, that lays several images out side by side in a table.
from IPython.display import display, HTML, Image

def Images(images, header=None, width="100%"):
    # Accept an integer pixel width, to match Image syntax
    if isinstance(width, int):
        width = "{}px".format(width)
    html = ["<table style='width:{}'><tr>".format(width)]
    if header is not None:
        html += ["<th>{}</th>".format(h) for h in header] + ["</tr><tr>"]
    for image in images:
        html.append("<td><img src='{}' /></td>".format(image))
    html.append("</tr></table>")
    display(HTML(''.join(html)))

Images(["images/hanahaus.jpg","images/drug_approval.png"], header=["Hanahaus", "Drug approvals"], width="60%")
[Output: a two-column table with the headers "Hanahaus" and "Drug approvals" above the two images]
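To see exactly what markup the table layout produces, the string-building part can be separated from the display call. This is a sketch under that assumption: `images_table_html` is a hypothetical name, the file names are placeholders, and the escaping of headers and paths is an addition the original function does not do.

```python
from html import escape


def images_table_html(images, header=None, width="100%"):
    """Build the <table> markup for a row of images (hypothetical helper)."""
    # Accept a bare integer as a pixel width, as Image(width=...) does.
    if isinstance(width, int):
        width = "{}px".format(width)
    parts = ["<table style='width:{}'><tr>".format(width)]
    if header is not None:
        parts += ["<th>{}</th>".format(escape(h)) for h in header]
        parts.append("</tr><tr>")
    for src in images:
        # quote=True also escapes ' and ", which matters inside src='...'.
        parts.append("<td><img src='{}' /></td>".format(escape(src, quote=True)))
    parts.append("</tr></table>")
    return "".join(parts)


markup = images_table_html(["a.png", "b.png"], header=["A", "B"], width=400)
```

Keeping the builder free of IPython imports makes it easy to inspect or unit-test; `display(HTML(markup))` renders it in a notebook.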
Running command-line tools
What do you do if you want to run a command-line tool from within IPython notebook? It can get messy: you may need to see stdout, but stdout often contains a lot of unformatted text, which can make your notebook hard to read.
To address this, I created my own wrapper around subprocess. I first have to set some global CSS for the notebook. There is no really good way to do this yet, so I just call an HTML function at the top of the notebook. I then create a do_call function that runs the command-line tool (via subprocess) and captures the output. The output is hidden by default but can be expanded using a CSS trick. The function also reports how long each subprocess took to run, and it passes along your environment variables from os.environ, which is often important.
I also include a @contextmanager-based function that allows me to run command-line tools from within specific directories. This comes up a lot.
This system was very useful for me when I was making a "pipeline" notebook, chaining together different command-line tools.
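Stripped of the HTML formatting, the core of such a wrapper is: merge the environment, run the tool, capture its output, and time it. The sketch below shows only that core. `run_timed` and `DEMO_VAR` are hypothetical names, and it uses `subprocess.run` (Python 3.5+) rather than `Popen`, so it is not the function described above.

```python
import os
import subprocess
import sys
import time


def run_timed(cmd, env=None):
    """Run cmd (a list of strings); return (returncode, stdout, stderr, seconds)."""
    # Start from the notebook's environment; entries in env override it.
    process_env = dict(os.environ, **env) if env is not None else None
    start = time.time()
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output to text instead of bytes
        env=process_env,          # None means "inherit the current environment"
    )
    return proc.returncode, proc.stdout, proc.stderr, time.time() - start


# Use the current Python interpreter as a stand-in for a command-line tool.
code = "import os; print(os.environ['DEMO_VAR'])"
rc, out, err, seconds = run_timed([sys.executable, "-c", code], env={"DEMO_VAR": "hello"})
```

The child process sees `DEMO_VAR` alongside everything already in `os.environ`, which is the behaviour that matters when a tool depends on `PATH` or similar variables.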
from IPython.display import HTML

HTML("""
<style>
.showhide_label { display:block; cursor:pointer; }
.showhide { position: absolute; left: -999em; }
.showhide + div { display: none; }
.showhide:checked + div { display: block; }
.shown_or_hidden { font-size:85%; }
</style>
""")
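The CSS above hides any div that directly follows a `.showhide` checkbox and shows it once the checkbox is checked. The checkbox itself is moved off-screen, so the only visible control is a label whose `for` attribute points at the checkbox id. A minimal sketch of markup that fits those rules follows. `collapsible_html` is a hypothetical name, and `uuid` is used for the id here as one simple way to keep ids unique across cells.

```python
import uuid
from html import escape


def collapsible_html(label, text):
    """Return label + hidden checkbox + content div, tied together by one id."""
    box_id = "showhide_{}".format(uuid.uuid4().hex)
    # The div must come right after the input for the "+" selector to match.
    return (
        '<label class="showhide_label" for="{id}">{label}</label>'
        '<input type="checkbox" id="{id}" class="showhide"/>'
        '<div class="shown_or_hidden"><pre>{text}</pre></div>'
    ).format(id=box_id, label=escape(label), text=escape(text))


snippet = collapsible_html("stdout (2)", "line one\nline <two>")
```

Clicking the label toggles the checkbox, and the `:checked` rule does the rest, so no JavaScript is involved.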
import time, os, random
from IPython.display import display, HTML
from subprocess import Popen, PIPE
from contextlib import contextmanager

def do_call(cmd, stdin=None, stdout=None, stderr=None, env=None, base_dir=None):
    """Help call subprocess with some niceties. Output to html."""
    assert isinstance(cmd, list)
    MAX_OUT = 30000 # characters to output to >stdout and >stderr

    def with_div(text, style="none", label=None):
        # A random id ties each label to its own hidden checkbox
        random_id = ''.join(random.choice("0123456789ABCDEF") for _ in range(16))
        div = {
            "none":"<div>{}</div>".format(text),
            "time":"<div style='color:#953'>{}</div>".format(text),
            "title":"<div style='font-size:125%;padding:5px;color:#6a3;'>{}</div>".format(text),
            "main":"<div style='border:2px solid #6a3;padding:5px;color:#555;'>{}</div>".format(text),
            "mono":"<div style='font-family:monospace;padding:5px;'><pre>{}</pre></div>".format(text),
            "hide":"""<label class="showhide_label" for="showhide_{}">▸{}</label>
                      <input type="checkbox" id="showhide_{}" class="showhide"/>
                      <div class="shown_or_hidden"><pre>{}</pre></div>""".format(random_id, label, random_id, text),
        }
        return div[style]

    # Keep track of the amount of time spent in the process
    start_time = time.time()

    # Treat Nones as empty
    cmd = [c for c in cmd if c is not None] # ignore Nones, which otherwise would be ""

    # Optionally, make the command more readable by pretending base_dir is an env
    # To actually make an env requires shell=True and seems worse since this is only cosmetic
    cmdstr = ' '.join(cmd).replace(base_dir, "$BASE") if base_dir is not None else ' '.join(cmd)

    # Use custom environment variables. os.environ variables are overwritten if duplicated
    process_env = dict(os.environ, **env) if env is not None else os.environ

    # Use Popen instead of subprocess.call to get stdout, stderr
    # universal_newlines=True returns text rather than bytes on Python 3
    p = Popen(cmd, stdin=PIPE, stdout=PIPE, stderr=PIPE, env=process_env, universal_newlines=True)
    p_out, p_err = p.communicate(stdin)
    p_rc = p.returncode

    p_out = p_out if p_out != "" else "[No stdout]"
    p_err = p_err if p_err != "" else "[No stderr]"

    # Optionally copy the full output to file-like objects
    if stdout is not None: stdout.write(p_out)
    if stderr is not None: stderr.write(p_err)

    # Truncate very long output before displaying it
    p_out_fmt = p_out if len(p_out) <= MAX_OUT else "{}\n{}".format(p_out[:MAX_OUT], "[TRUNCATED]")
    p_err_fmt = p_err if len(p_err) <= MAX_OUT else "{}\n{}".format(p_err[:MAX_OUT], "[TRUNCATED]")

    # Output a nicely formatted HTML block
    html = [with_div("Running subprocess", "title")]
    html += [with_div(cmdstr, "mono")]
    html += [with_div(p_out_fmt, "hide", "stdout ({})".format(p_out_fmt.count("\n")))]
    html += [with_div(p_err_fmt, "hide", "stderr ({})".format(p_err_fmt.count("\n")))]
    html += [with_div("subprocess time : {:.2f}s".format(time.time() - start_time), "time")]
    display(HTML(with_div(''.join(html), "main")))

@contextmanager
def using_dir(path):
    old_dir = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(old_dir)

# The same thing, two ways
with using_dir("/Users/briann/anaconda"):
    do_call(["ls", "."])
do_call(["ls", "/Users/briann/anaconda"])
ls .
Examples Launcher.app bin conda-meta docs envs imports include lib mkspecs node-webkit phrasebooks pkgs plugins python.app q3porting.xml share ssl tests translations
[No stderr]
ls /Users/briann/anaconda
Examples Launcher.app bin conda-meta docs envs imports include lib mkspecs node-webkit phrasebooks pkgs plugins python.app q3porting.xml share ssl tests translations
[No stderr]
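Both calls list the same directory because using_dir only changes the working directory for the duration of the with block. The sketch below repeats the helper on its own and checks that the original directory comes back even when the body raises. The temporary directory is just a convenient target that exists on any machine.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def using_dir(path):
    """Temporarily change the working directory, restoring it afterwards."""
    old_dir = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(old_dir)


start_dir = os.getcwd()
# realpath resolves symlinks (e.g. /tmp on macOS) so paths compare cleanly.
target = os.path.realpath(tempfile.gettempdir())

with using_dir(target):
    inside = os.path.realpath(os.getcwd())

# The original directory is restored even when the body raises.
try:
    with using_dir(target):
        raise RuntimeError("simulated tool failure")
except RuntimeError:
    pass
restored = os.getcwd()
```

The try/finally inside the generator is what guarantees the restore. Without it, a failing tool would leave the notebook in the wrong directory for every later cell.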
Nicer printing
Sometimes the standard print function doesn't cut it. I have a few helper functions that make printing text a bit more flexible.
I also generally use "from __future__ import print_function" everywhere, since print should be a function anyway. I use "from __future__ import division" since we'll all be on Python 3 soon enough.
from __future__ import print_function, division
import random
import re
from IPython.display import display, HTML
from itertools import count
import yaml

def uprint(text):
    # Print text with an underline of matching length
    print("{}\n".format(text) + "-"*len(text))

def hprint(text):
    # Render a string as HTML
    display(HTML(text))

def tprint(rows, header=True):
    # Render a list of rows (lists of strings) as an HTML table
    html = ["<table>"]
    html_row = "</td><td>".join(k for k in rows[0])
    html.append("<tr style='font-weight:{}'><td>{}</td></tr>".format('bold' if header is True else 'normal', html_row))
    for row in rows[1:]:
        html_row = "</td><td>".join(r for r in row)
        html.append("<tr style='font-family:monospace;'><td>{:}</td></tr>".format(html_row))
    html.append("</table>")
    display(HTML(''.join(html)))

def jprint(dict_or_json, do_print=True, numbered=False):
    # Pretty-print a dict (or parsed JSON) as YAML
    text = yaml.safe_dump(dict_or_json, indent=2, default_flow_style=False)
    if numbered:
        # Replace each leading "-" with a running number
        cnt = count(1)
        text = re.sub(r"^-", lambda x: str(next(cnt)), text, flags=re.MULTILINE)
    if do_print:
        print(text)
    else:
        return text

uprint("Some text")
print("Normal text")
hprint("HTML table")
tprint([[random.choice("ACGT")*3 for _ in range(4)] for _ in range(4)])
uprint("Some JSON or a dict")
jprint({"a":{"b":"c"}, "d":"e", "f":{"g":{"i":"k"}}})
Some text
---------
Normal text
HTML table
CCC | GGG | TTT | AAA |
TTT | TTT | CCC | AAA |
AAA | AAA | CCC | AAA |
GGG | TTT | CCC | TTT |
Some JSON or a dict
-------------------
a:
  b: c
d: e
f:
  g:
    i: k
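The numbered option of jprint is not exercised in the example above. It works by replacing the dash at the start of each line with a running counter. A small sketch of that one step, applied to a hand-written YAML-style string so that it runs without PyYAML (`number_items` is a hypothetical name):

```python
import re
from itertools import count


def number_items(text):
    """Replace the '-' at the start of each line with 1, 2, 3, ..."""
    counter = count(1)
    # re.MULTILINE makes ^ match at the start of every line, not only the first.
    return re.sub(r"^-", lambda match: str(next(counter)), text, flags=re.MULTILINE)


numbered = number_items("- alpha\n- beta\n  - nested\n- gamma\n")
print(numbered)
```

Only dashes in the first column are replaced, so the indented nested item keeps its dash while the top-level items become 1, 2 and 3.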
Use SVG for plotting
I really like using SVG for custom plots, since it is extremely flexible. For example, you can combine images/photos and data easily. Doing this does require learning some SVG, but it's not so complicated. In many ways matplotlib and similar libraries are much more complex, since you often can't get them to do exactly what you want (at least I can't).
from IPython.display import display, SVG

def show_svg(svgs, width=1000, height=1000):
    # Wrap SVG fragments in a complete document and display it
    SVG_HEAD = '''<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">'''
    SVG_START = '''<svg width="{w:}px" height="{h:}px" version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">'''
    SVG_END = '</svg>'
    return display(SVG(SVG_HEAD + SVG_START.format(w=width, h=height) + svgs + SVG_END))
w, h = 500, 390

def box(xy, wh, rgba=(50,50,50,1)):
    # A rectangle at position xy with size wh and a translucent fill
    return '''<rect x="{}" y="{}" width="{}" height="{}" fill="rgba({:d},{:d},{:d},{:f})" stroke="rgb(0,0,0)" />
    '''.format(xy[0],xy[1], wh[0],wh[1], rgba[0],rgba[1],rgba[2],rgba[3])

# Background image, then a highlighted box, then a label above the box
svgs = ['<image xlink:href="static/biospace-news-biogen-idec-2.png" x="0" y="0" width="{:d}px" height="{:d}px"/>'.format(w,h)]
svgs += [box((100,140),(200,180), rgba=(0,255,255,.35))]
svgs += ['''<text x="{x:d}" y="{y:d}" text-anchor="middle" font-size="28" fill="rgba(0,255,255,.95)">
IMPORTANT!</text>'''.format(x=200, y=120)]
show_svg(''.join(svgs), width=w, height=h)
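The same string-building approach works for data as well as annotations. As a rough sketch, the function below draws one rectangle per value to make a tiny bar chart. `bar_chart_svg`, its parameters and the fill colour are all hypothetical choices, and it assumes a non-empty list of positive numbers. Parsing the result with the standard library's XML parser is a cheap way to catch malformed markup before displaying it.

```python
import xml.etree.ElementTree as ET


def bar_chart_svg(values, width=300, height=120, gap=4):
    """Return a standalone SVG string with one <rect> per value."""
    bar_width = (width - gap * (len(values) + 1)) / len(values)
    top = max(values)
    rects = []
    for i, value in enumerate(values):
        bar_height = height * value / top
        x = gap + i * (bar_width + gap)
        y = height - bar_height  # SVG's y axis points down
        rects.append(
            '<rect x="{:.1f}" y="{:.1f}" width="{:.1f}" height="{:.1f}" '
            'fill="rgb(102,170,51)" />'.format(x, y, bar_width, bar_height)
        )
    return (
        '<svg width="{}px" height="{}px" version="1.1" '
        'xmlns="http://www.w3.org/2000/svg">{}</svg>'
    ).format(width, height, "".join(rects))


chart = bar_chart_svg([3, 7, 5])
root = ET.fromstring(chart)  # raises ParseError if the markup is malformed
```

In a notebook, `display(SVG(chart))` renders the result directly, since the string is already a complete SVG document.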