In a previous blogpost
I described a pipeline for synthesizing arbitrary proteins on the
Transcriptic robotic lab platform
using only Python code.
The ultimate goal of that project was to be able to run a program that takes a protein sequence as input,
and "returns" a tube of bacteria expressing that protein.
Here I'll describe some progress towards that goal.
Pipelining
The usual way to chain together different programs in bioinformatics is with a
pipeline management system, for example,
snakemake,
nextflow,
toil,
WDL, and many many more.
I've recently become a big fan of nextflow for computational pipelines,
but its major advantages
(e.g., containerization)
don't help much here because so much of the work happens outside of the computer.
For this project I've been using the slightly simpler snakemake,
mainly for tracking which steps have been completed,
and deciding which steps can be run in parallel based on their dependencies.
Each protocol has four associated steps in the pipeline:
generate protocol: create an autoprotocol file describing the protocol
submit protocol: submit the autoprotocol file to Transcriptic
get results: download images, data, etc. from Transcriptic
create report: create an HTML report from the downloaded data
snakemake pipeline for protein synthesis
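The four steps form a simple dependency chain for each protocol. As a rough illustration of what the pipeline manager is tracking, here is a plain-Python sketch (not the actual snakemake rules; the output file names are hypothetical) in which a step counts as done once its output file exists:

```python
import os
import tempfile

# Hypothetical output file for each step, listed in dependency order.
STEPS = [
    ("generate_protocol", "protocol.json"),
    ("submit_protocol", "run_id.txt"),
    ("get_results", "results.zip"),
    ("create_report", "report.html"),
]

def next_step(workdir):
    """Return the name of the first step whose output file is missing, or None."""
    for name, output in STEPS:
        if not os.path.exists(os.path.join(workdir, output)):
            return name
    return None

workdir = tempfile.mkdtemp()
first = next_step(workdir)
# Pretend the first step ran by creating its (empty) output file.
open(os.path.join(workdir, "protocol.json"), "w").close()
second = next_step(workdir)
print(first, second)
```

A real pipeline manager does the same bookkeeping across many protocols at once, which is what allows independent steps to run in parallel.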
Metaprotocol
In my terminology, a "metaprotocol" defines the complete process,
which is turned into a series of protocols.
Ideally, the output of a single protocol will be a decision point:
for example, whether or not a gel image includes the expected bands.
The metaprotocol is defined in yaml, which has its issues,
but is more readable than json, and well supported.
This code depends heavily on Pydna,
a Python package for cloning and assembly.
Given an insert and a vector, Pydna will design primers and a PCR program.
The following is my metaprotocol yaml for expressing GFP:
Of course, before you can run this pipeline,
you need to have the appropriate insert DNA in your Transcriptic inventory.
As far as I know, none of the major synthetic DNA suppliers has an API.
However, you can order DNA from IDT by filling in an Excel file.
I have automated filling in and emailing this file,
so DNA synthesis can be included in the pipeline too!
It should take about a week from ordering for DNA to appear at Transcriptic.
Reporting
After each protocol finishes, an HTML report is generated.
This allows the user to evaluate protocol results manually before initiating the next step.
There are ways to automate this more,
like using automated band mapping of gel images,
but I think that kind of thing will work better once the Transcriptic API settles down a bit.
The HTML report also serves as a log of the experiment.
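As a sketch of what an automated decision point could look like once band sizes have been extracted from a gel image: the extraction itself is not shown, and the function name, tolerance and band sizes below are all hypothetical.

```python
def has_expected_bands(observed_bp, expected_bp, tolerance=0.1):
    """True if every expected band has an observed band within +/- tolerance (as a fraction)."""
    return all(
        any(abs(obs - exp) <= tolerance * exp for obs in observed_bp)
        for exp in expected_bp
    )

# Hypothetical band sizes (in base pairs) read off a gel image.
good_gel = has_expected_bands([760, 2650], [740])
bad_gel = has_expected_bands([150], [740])
print(good_gel, bad_gel)
```

A check like this would only gate the next protocol; the HTML report would still be the place to look when it fails.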
Conclusions
There is still plenty to do before the pipeline is completely automatic.
For example, attentive readers will notice that the HTML report above shows
an unsuccessful transformation, one of many!
The first complete transformation took several months to get right.
The biggest challenge is making the process robust to changes in the protein sequence —
even basic PCR can go wrong in many ways.
Currently, debugging is a major undertaking;
unlike regular programming, iterations are slow and expensive.
However, if the protocols can be made robust enough, which I think they can,
then synthesizing a new protein could become as simple as running BLAST.
What if you had an idea for a cool, useful protein, and you wanted to
turn it into a reality? For example, what if you wanted to create a
vaccine against H. pylori (like the 2008 Slovenian iGEM
team) by creating a hybrid
protein that combines the parts of E. coli flagellin that stimulate an
immune response with regular H. pylori flagellin?
A design for a hybrid flagellin H. pylori vaccine, from the 2008
Slovenian iGEM team
Amazingly, we're pretty close to being able to create any protein we
want from the comfort of our jupyter notebooks, thanks to developments
in genomics, synthetic biology, and most recently, cloud labs.
In this article I'll develop Python code that will take me from an idea
for a protein all the way to expression of the protein in a bacterial
cell, all without touching a pipette or talking to a human. The total
cost will only be a few hundred dollars! In the terminology of A16Z's Vijay Pande, this is
Bio 2.0.
To make this a bit more concrete, this article describes a cloud lab
protocol in Python to do the following:
Synthesize a DNA sequence that encodes any protein I want.
Clone that synthetic DNA into a vector that can express it.
Transform bacteria with that vector and confirm that it is
expressed.
Python Setup
First, some general Python setup that I need for any jupyter notebook. I
import some useful Python modules and make some utility functions,
mainly for plotting and data presentation.
import re
import json
import logging
import requests
import itertools
import numpy as np
import seaborn as sns
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt

from io import StringIO
from pprint import pprint
from Bio.Seq import Seq
from Bio.Alphabet import generic_dna
from IPython.display import display, Image, HTML, SVG

def uprint(astr): print(astr + "\n" + "-"*len(astr))
def show_html(astr): return display(HTML('{}'.format(astr)))
def show_svg(astr, w=1000, h=1000):
    SVG_HEAD = '''<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">'''
    SVG_START = '''<svg viewBox="0 0 {w:} {h:}" version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">'''
    return display(SVG(SVG_HEAD + SVG_START.format(w=w, h=h) + astr + '</svg>'))

def table_print(rows, header=True):
    html = ["<table>"]
    html_row = "</td><td>".join(k for k in rows[0])
    html.append("<tr style='font-weight:{}'><td>{}</td></tr>".format('bold' if header is True else 'normal', html_row))
    for row in rows[1:]:
        html_row = "</td><td>".join(row)
        html.append("<tr style='font-family:monospace;'><td>{:}</td></tr>".format(html_row))
    html.append("</table>")
    show_html(''.join(html))

def clean_seq(dna):
    dna = re.sub(r"\s", "", dna)
    assert all(nt in "ACGTN" for nt in dna)
    return Seq(dna, generic_dna)

def clean_aas(aas):
    aas = re.sub(r"\s", "", aas)
    assert all(aa in "ACDEFGHIKLMNPQRSTVWY*" for aa in aas)
    return aas

def Images(images, header=None, width="100%"):  # to match Image syntax
    if type(width) == type(1): width = "{}px".format(width)
    html = ["<table style='width:{}'><tr>".format(width)]
    if header is not None:
        html += ["<th>{}</th>".format(h) for h in header] + ["</tr><tr>"]
    for image in images:
        html.append("<td><img src='{}' /></td>".format(image))
    html.append("</tr></table>")
    show_html(''.join(html))

def new_section(title, color="#66aa33", padding="120px"):
    style = "text-align:center;background:{};padding:{} 10px {} 10px;".format(color, padding, padding)
    style += "color:#ffffff;font-size:2.55em;line-height:1.2em;"
    return HTML('<div style="{}">{}</div>'.format(style, title))

# Show or hide text
HTML("""
<style>
.section { display:flex;align-items:center;justify-content:center;width:100%; height:400px; background:#6a3;color:#eee;font-size:275%; }
.showhide_label { display:block; cursor:pointer; }
.showhide { position: absolute; left: -999em; }
.showhide + div { display: none; }
.showhide:checked + div { display: block; }
.shown_or_hidden { font-size:85%; }
</style>
""")

# Plotting style
plt.rc("axes", titlesize=20, labelsize=15, linewidth=.25, edgecolor='#444444')
sns.set_context("notebook", font_scale=1.2, rc={})
%matplotlib inline
%config InlineBackend.figure_format = 'retina' # or 'svg'
Cloud Labs
Just like AWS or any compute cloud, a cloud lab owns molecular biology
equipment and robots, and will rent them out to you in small increments.
You can issue instructions to their robots by clicking some buttons on
their front-end, or write code to instruct the robots yourself. It's not
necessarily better to write your own protocols as I'll do here — in
general a lot of molecular biology is the same few routine tasks, so
you generally want to rely on a robust protocol that someone else has
shown performs well with robots.
There are a number of nascent cloud lab companies out there:
Transcriptic, Autodesk Wet Lab
Accelerator (beta, and built on
top of Transcriptic), Arcturus BioCloud
(beta), Emerald Cloud Lab (beta),
Synthego (not yet live). There are even
companies built on top of cloud labs, like Desktop
Genetics, which specializes in CRISPR.
Scientific papers,
like this one from the Siegel lab,
are just starting to appear that use cloud labs to do real science.
At the time of writing, only Transcriptic is available for general use
so that's what I'll be using. As I understand it, most of Transcriptic's
business comes from automating common protocols, and writing your own
protocols in Python (as I'll be doing in this article) is less common.
A "work cell" at Transcriptic, with freezers visible at the bottom and
various bits of lab equipment on the bench
I issue instructions to Transcriptic's robots using
autoprotocol. Autoprotocol is a JSON-based
language for writing protocols for lab robots (and humans, sort of).
Autoprotocol is mainly written using this Python
library. The
language was originally developed by, and is still supported by
Transcriptic, but as I understand it it's completely open. There is some
good and improving
documentation.
One fascinating thing to think about here is that you could also
generate an autoprotocol protocol and submit it to a lab staffed by
humans — say in China or India — and potentially get some of the
advantages of using humans (their judgement) and robots (their lack of
judgement). I should also mention
protocols.io here, which is also an
effort to standardize protocols for increased reproducibility, but aimed
at humans instead of robots.
As well as standard Python imports, I'll need some
molecular-biology–specific utilities. This code is primarily
autoprotocol- and Transcriptic-centric.
One thing that comes up a lot in this code is the concept of dead
volume. This means the last bit of liquid that Transcriptic's robots
cannot consistently pipette out of tubes (because they can't see!). I
have to spend quite a bit of time ensuring that there is enough volume
left in my tubes.
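To make the dead-volume bookkeeping concrete, here is a minimal sketch in plain Python. The dead volumes are the ones used for Transcriptic containers in this article; the helper names `usable_volume` and `can_transfer` are my own illustrations and not part of autoprotocol.

```python
# Dead volumes in microliters for a few Transcriptic container types.
DEAD_VOLUME_UL = {"96-pcr": 3, "96-flat": 25, "micro-1.5": 15, "micro-2.0": 15}

def usable_volume(container_type, current_ul):
    """Volume (in µl) that can still be pipetted out, never below zero."""
    return max(0, current_ul - DEAD_VOLUME_UL[container_type])

def can_transfer(container_type, current_ul, requested_ul):
    """True if the requested volume fits in the usable volume."""
    return requested_ul <= usable_volume(container_type, current_ul)

usable = usable_volume("micro-1.5", 100)
ok = can_transfer("micro-1.5", 100, 90)
print(usable, ok)
```

So a 1.5 ml tube holding 100 µl can only reliably give up 85 µl, and a request for 90 µl should be rejected before it ever reaches the robot.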
import autoprotocol
from autoprotocol import Unit
from autoprotocol.container import Container
from autoprotocol.protocol import Protocol
from autoprotocol.protocol import Ref # "Link a ref name (string) to a Container instance."
import requests
import logging

# Transcriptic authorization
org_name = 'hgbrian'
tsc_headers = {k: v for k, v in json.load(open("auth.json")).items() if k in ["X_User_Email", "X_User_Token"]}

# Transcriptic-specific dead volumes
_dead_volume = [("96-pcr", 3), ("96-flat", 25), ("96-flat-uv", 25), ("96-deep", 15),
                ("384-pcr", 2), ("384-flat", 5), ("384-echo", 15),
                ("micro-1.5", 15), ("micro-2.0", 15)]
dead_volume = {k: Unit(v, "microliter") for k, v in _dead_volume}

def init_inventory_well(well, headers=tsc_headers, org_name=org_name):
    """Initialize well (set volume etc) for Transcriptic"""
    def _container_url(container_id):
        return 'https://secure.transcriptic.com/{}/samples/{}.json'.format(org_name, container_id)

    response = requests.get(_container_url(well.container.id), headers=headers)
    response.raise_for_status()

    container = response.json()
    well_data = container['aliquots'][well.index]
    well.name = "{}/{}".format(container["label"], well_data['name']) if well_data['name'] is not None else container["label"]
    well.properties = well_data['properties']
    well.volume = Unit(well_data['volume_ul'], 'microliter')

    if 'ERROR' in well.properties:
        raise ValueError("Well {} has ERROR property: {}".format(well, well.properties["ERROR"]))
    if well.volume < Unit(20, "microliter"):
        logging.warn("Low volume for well {} : {}".format(well.name, well.volume))

    return True

def touchdown(fromC, toC, durations, stepsize=2, meltC=98, extC=72):
    """Touchdown PCR protocol generator"""
    assert 0 < stepsize < toC < fromC
    def td(temp, dur):
        return {"temperature": "{:2g}:celsius".format(temp), "duration": "{:d}:second".format(dur)}

    return [{"cycles": 1, "steps": [td(meltC, durations[0]), td(C, durations[1]), td(extC, durations[2])]}
            for C in np.arange(fromC, toC - stepsize, -stepsize)]

def convert_ug_to_pmol(ug_dsDNA, num_nts):
    """Convert ug dsDNA to pmol"""
    return float(ug_dsDNA) / num_nts * (1e6 / 660.0)

def expid(val):
    """Generate a unique ID per experiment"""
    return "{}_{}".format(experiment_name, val)

def µl(microliters):
    """Unicode function name for creating microliter volumes"""
    return Unit(microliters, "microliter")
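To show what the touchdown helper above produces, here is a dependency-free sketch of the same idea. It is my own rewrite for illustration (the name `touchdown_cycles` is hypothetical), and it assumes the temperature range divides evenly by the step size. It uses the same 70°C to 61°C range in 0.5°C steps, and the same extension-time rule (11 seconds per kb, minimum 2 seconds), as the PCR protocol later in this article.

```python
import math

def touchdown_cycles(from_c, to_c, durations, step=2.0, melt_c=98, ext_c=72):
    """One single-cycle group per annealing temperature, from from_c down to to_c inclusive."""
    assert 0 < step < to_c < from_c

    def td(temp, dur):
        return {"temperature": "{:g}:celsius".format(temp),
                "duration": "{:d}:second".format(dur)}

    n_groups = int(round((from_c - to_c) / step)) + 1
    return [{"cycles": 1,
             "steps": [td(melt_c, durations[0]),            # melt
                       td(from_c - i * step, durations[1]),  # anneal, cooler each cycle
                       td(ext_c, durations[2])]}             # extend
            for i in range(n_groups)]

# 740bp template at 11 s per kb, with a floor of 2 s.
extension_time = int(max(2, math.ceil(740 * (11.0 / 1000))))
cycles = touchdown_cycles(70, 61, [8, 25, extension_time], step=0.5)
print(len(cycles), cycles[0]["steps"][1], cycles[-1]["steps"][1])
```

Each group is a single melt/anneal/extend cycle, with the annealing temperature dropping by one step per cycle until it reaches the target.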
DNA synthesis & synthetic biology
Despite its connection to modern synthetic biology, DNA synthesis is a
fairly old technology. We've been able to make "oligos" (meaning DNA
sequences of <~200bp) for
decades.
It's always been expensive though, and the chemistry has never allowed
for long sequences of DNA. Recently, it's become feasible to synthesize
entire genes (up to thousands of bases) at a reasonable price. This
advance is really what is enabling the era of "synthetic biology".
Craig Venter's Synthetic
Genomics has taken synthetic
biology the furthest by synthesizing an entire
organism —
over a million bases in length. As the length of the DNA grows, the
problem becomes more about assembly (i.e., stitching together
synthesized DNA sequences) rather than synthesis. Each time you assemble
you can double the length of your DNA (or more), so after a dozen or so
iterations you can get pretty
long! The
distinction between synthesis and assembly should become transparent to
the end-user fairly soon.
Moore's Law?
The price of DNA synthesis has been falling pretty quickly, from over
30c per base a couple of years ago to around 10c per base today, but
it's developing more like batteries than CPUs. In contrast, DNA
sequencing costs have been falling faster than Moore's law. A target of
2c/base
has been mooted as an inflection point where you can replace a lot of
money-saving-but-laborious DNA manipulation with simple synthesis. For
example, at 2c/base you could synthesize an entire 3kb plasmid for
$60,
and skip over a ton of molecular biology chores. Hopefully that's what
we'll all be doing in a couple of years.
DNA synthesis vs DNA sequencing costs (Carlson,
2014)
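The per-base arithmetic is simple enough to check in a couple of lines. This is just a sketch with a flat per-base price, which ignores minimum order sizes and length-dependent pricing; the function name is my own.

```python
def synthesis_cost_usd(num_bases, cents_per_base):
    """Cost in dollars of synthesizing num_bases at a flat per-base price."""
    return num_bases * cents_per_base / 100.0

plasmid_at_target = synthesis_cost_usd(3000, 2)    # 3kb plasmid at the 2c/base target
plasmid_today = synthesis_cost_usd(3000, 10)       # the same plasmid at around 10c/base
print(plasmid_at_target, plasmid_today)
```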
DNA Synthesis Companies
There are a few big companies in the DNA synthesis space:
IDT is the largest manufacturer of oligos,
and can produce longer (up to 2kb) "gene fragments"
(gBlocks)
too; Gen9,
Twist, DNA
2.0 generally focus on longer DNA sequences
— these are the gene synthesis companies. There are also some exciting
new companies like Cambrian Genomics
and Genesis DNA that are working on
next-generation synthesis methods.
Other companies, like Amyris,
Zymergen and Ginkgo
Bioworks, use the DNA synthesized by
these companies to do organism-scale work. Synthetic
Genomics also does that but
synthesizes its own DNA.
Recently, Ginkgo did a deal with
Twist
for 100 million bases, a leap over anything public I've seen before. In
a move that proves we're living in the future, Twist has even advertised
a promo code on Twitter for a deal where if you buy 10 million bases of
DNA (almost an entire yeast genome!), you get 10 million bases free.
For this experiment I want to synthesize the DNA sequence for a simple
protein, Green Fluorescent
Protein
(GFP). GFP is a protein, first found in a
jellyfish, that
fluoresces under UV light. It's an extremely useful protein since it's
easy to tell where it is being expressed simply by measuring
fluorescence. There are variants of GFP that produce yellow, red, orange
and other colors.
It's interesting to see how various mutations affect the color of the
protein, and potentially an interesting machine learning problem. Not
long ago, this would have been a significant investment of time in the
lab, but now, as I'll show, it is (almost) as simple as editing a text
file!
Technically, my GFP is a "superfolder" variant
(sfGFP),
with some enhancing mutations.
Superfolder GFP (sfGFP) has mutations that give it some useful properties
I was fortunate to be included in Twist's alpha program so I used them
to synthesize my DNA (they graciously accommodated my tiny order —
thanks Twist!) Twist is a new company in the space, with a novel
miniaturized process for synthesis. Although Twist's pricing is probably
the best around at 10c a base or
lower,
they are still in beta only, and
the alpha program I took part in has closed. Twist has raised about
$150M
so there is a lot of enthusiasm for their technology.
I sent my DNA sequence to Twist as an Excel spreadsheet (there is no API
yet but I assume it'll come soon), and they sent the synthesized DNA
directly to my inventory in Transcriptic's labs. (I also used IDT for
synthesis, but since they did not ship the DNA straight to Transcriptic,
it kind of ruins the fun.)
This process is clearly not a common use-case yet and required some
hand-holding, but it worked, and it keeps the entire pipeline virtual.
Without this, I would have likely needed access to a laboratory — many
companies will not ship DNA or reagents to a home address.
To express my protein in bacteria, I need somewhere for the gene to
live, otherwise the synthetic DNA encoding the gene will just be
instantly degraded. Generally, in molecular biology we use a plasmid, a
bit of circular DNA that lives outside the bacterial genome and
expresses proteins. Plasmids are a convenient way for bacteria to share
useful self-contained functional modules like antibiotic resistance.
There can be hundreds of plasmids per cell.
The commonly used terminology is that the plasmid is the vector and
the synthetic DNA is the insert. So here I'm trying to clone the
insert into the vector, then transform bacteria with the
vector.
A bacterial genome and plasmid (not to scale!)
(Wikipedia)
pUC19
I chose a fairly standard plasmid in
pUC19. This plasmid is very
commonly used and since it's available as part of the standard
Transcriptic inventory, I do not need to ship anything to Transcriptic.
The structure of pUC19: the major components are an ampicillin
resistance gene, lacZα, an MCS/polylinker, and an origin of replication
(Wikipedia)
pUC19 has a nice feature where because it contains a lacZα gene, you can
use it in a blue–white
screen and see
which colonies have had successful insertion events. You need two
chemicals,
IPTG
and X-gal, and it works as
follows:
lacZα expression is induced by IPTG
If lacZα is inactivated by DNA inserted at the multiple cloning site
(MCS/polylinker) within lacZα, then the cells cannot make functional
β-galactosidase to hydrolyze X-gal, and these
colonies will be white instead of blue
Therefore a successful insertion produces white colonies and an
unsuccessful insertion produces blue colonies
A blue–white screen shows where lacZα expression has been inactivated
(Wikipedia)
DH5α E. coli does not require IPTG to induce expression from the lac promoter even though the strain expresses the Lac repressor. The copy number of most plasmids exceeds the repressor number in the cells. If you are concerned about obtaining maximal levels of expression, add IPTG to a final concentration of 1 mM.
Synthetic DNA Sequences
sfGFP DNA Sequence
It's straightforward to get a DNA sequence for sfGFP by taking the
protein sequence and
encoding it with codons suitable
for the host organism (here, E. coli). It's a medium-sized protein at
238 amino acids, which means at 10c/base it costs me about
$70
to synthesize the DNA.
Wolfram Alpha, calculating the cost of synthesis
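As a minimal sketch of what "encoding with codons" means, the following back-translates a peptide using one fixed codon per amino acid and then checks the round trip. The codon choices are valid under the standard genetic code, but the table is only an illustration. It is not an optimized E. coli codon table, and it is not how the sequence in this article was designed.

```python
# One codon per amino acid (standard genetic code); an illustrative choice only.
CODON_FOR = {
    "A": "GCG", "C": "TGC", "D": "GAT", "E": "GAA", "F": "TTT",
    "G": "GGC", "H": "CAT", "I": "ATT", "K": "AAA", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCG", "Q": "CAG", "R": "CGT",
    "S": "AGC", "T": "ACC", "V": "GTG", "W": "TGG", "Y": "TAT",
    "*": "TAA",
}
AA_FOR = {codon: aa for aa, codon in CODON_FOR.items()}

def back_translate(protein):
    """Protein string -> DNA string, using the fixed table above."""
    return "".join(CODON_FOR[aa] for aa in protein)

def translate(dna):
    """Inverse of back_translate (only knows the codons in the table above)."""
    return "".join(AA_FOR[dna[i:i + 3]] for i in range(0, len(dna), 3))

peptide = "MSKGEELFTG"          # the first ten residues of sfGFP
dna = back_translate(peptide)
print(dna, len(dna))
assert translate(dna) == peptide
```

Real codon optimization also weighs codon usage frequencies, GC content and sequences to avoid (such as restriction sites), which is why dedicated tools are normally used for this step.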
The first 12 bases of my sfGFP are a Shine-Dalgarno
sequence that
I added myself, which should in theory increase expression
(AGGAGGACAGCT, then an ATG (start
codon) starts the
protein). According to a computational tool developed by the Salis
Lab (lecture
slides),
I should expect medium to high expression of my protein (a Translation
Initiation Rate of 10,000 "arbitrary units").
sfGFP_plus_SD=clean_seq("""
AGGAGGACAGCTATGTCGAAAGGAGAAGAACTGTTTACCGGTGTGGTTCCGATTCTGGTAGAACTGGA
TGGGGACGTGAACGGCCATAAATTTAGCGTCCGTGGTGAGGGTGAAGGGGATGCCACAAATGGCAAAC
TTACCCTTAAATTCATTTGCACTACCGGCAAGCTGCCGGTCCCTTGGCCGACCTTGGTCACCACACTG
ACGTACGGGGTTCAGTGTTTTTCGCGTTATCCAGATCACATGAAACGCCATGACTTCTTCAAAAGCGC
CATGCCCGAGGGCTATGTGCAGGAACGTACGATTAGCTTTAAAGATGACGGGACCTACAAAACCCGGG
CAGAAGTGAAATTCGAGGGTGATACCCTGGTTAATCGCATTGAACTGAAGGGTATTGATTTCAAGGAA
GATGGTAACATTCTCGGTCACAAATTAGAATACAACTTTAACAGTCATAACGTTTATATCACCGCCGA
CAAACAGAAAAACGGTATCAAGGCGAATTTCAAAATCCGGCACAACGTGGAGGACGGGAGTGTACAAC
TGGCCGACCATTACCAGCAGAACACACCGATCGGCGACGGCCCGGTGCTGCTCCCGGATAATCACTAT
TTAAGCACCCAGTCAGTGCTGAGCAAAGATCCGAACGAAAAACGTGACCATATGGTGCTGCTGGAGTT
CGTGACCGCCGCGGGCATTACCCATGGAATGGATGAACTGTATAAA""")
print("Read in sfGFP plus Shine-Dalgarno: {} bases long".format(len(sfGFP_plus_SD)))

sfGFP_aas = clean_aas("""MSKGEELFTGVVPILVELDGDVNGHKFSVRGEGEGDATNGKLTLKFICTTGKLPVPWPTLVTTLTYG
VQCFSRYPDHMKRHDFFKSAMPEGYVQERTISFKDDGTYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNFNSHNVYITADKQKN
GIKANFKIRHNVEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSVLSKDPNEKRDHMVLLEFVTAAGITHGMDELYK""")
assert sfGFP_plus_SD[12:].translate() == sfGFP_aas
print("Translation matches protein with accession 532528641")
Read in sfGFP plus Shine-Dalgarno: 726 bases long
Translation matches protein with accession 532528641
pUC19_fasta = !cat puc19fsa.txt
pUC19_fwd = clean_seq(''.join(pUC19_fasta[1:]))
pUC19_rev = pUC19_fwd.reverse_complement()
assert all(nt in "ACGT" for nt in pUC19_fwd)
assert len(pUC19_fwd) == 2686

pUC19_MCS = clean_seq("GAATTCGAGCTCGGTACCCGGGGATCCTCTAGAGTCGACCTGCAGGCATGCAAGCTT")
print("Read in pUC19: {} bases long".format(len(pUC19_fwd)))
assert pUC19_MCS in pUC19_fwd
print("Found MCS/polylinker")
Read in pUC19: 2686 bases long
Found MCS/polylinker
As some basic QC, I make sure that EcoRI and BamHI are each present only
once in pUC19. (The following restriction enzymes are available in
Transcriptic's default inventory: PstI, PvuII, EcoRI, BamHI, BbsI,
BsmBI.)
REs = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}  # BamHI recognizes GGATCC
for rename, res in REs.items():
    # Each site appears at most once on each strand
    assert (pUC19_fwd.find(res) == pUC19_fwd.rfind(res) and
            pUC19_rev.find(res) == pUC19_rev.rfind(res))
    # If it appears on both strands, it is the same (palindromic) site
    assert (pUC19_fwd.find(res) == -1 or pUC19_rev.find(res) == -1 or
            pUC19_fwd.find(res) == len(pUC19_fwd) - pUC19_rev.find(res) - len(res))
print("Asserted restriction enzyme sites present only once: {}".format(REs.keys()))
Now I look at the lacZα sequence and make sure there is nothing
unexpected. For example, it should start with a Met and end with a stop
codon. It's also easy to confirm that this is the full 324bp lacZα ORF
by loading the pUC19 sequence into the free SnapGene
Viewer tool.
lacZ = pUC19_rev[2217:2541]
print("lacZα sequence:\t{}".format(lacZ))
print("r_MCS sequence:\t{}".format(pUC19_MCS.reverse_complement()))

lacZ_p = lacZ.translate()
assert lacZ_p[0] == "M" and "*" not in lacZ_p[:-1] and lacZ_p[-1] == "*"
assert pUC19_MCS.reverse_complement() in lacZ
assert pUC19_MCS.reverse_complement() == pUC19_rev[2234:2291]
print("Found MCS once in lacZ sequence")
lacZ sequence: ATGACCATGATTACGCCAAGCTTGCATGCCTGCAGGTCGACTCTAGAGGATCCCCGGGTACCGAGCTCGAATTCACTGGCCGTCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTCGCCAGCTGGCGTAATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCACACCGCATATGGTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAG
r_MCS sequence: AAGCTTGCATGCCTGCAGGTCGACTCTAGAGGATCCCCGGGTACCGAGCTCGAATTC
Found MCS once in lacZ sequence
Gibson Assembly
DNA assembly simply means stitching DNA together. Usually, you
assemble several pieces of DNA into a longer segment, and then
clone that into a plasmid or genome. In this experiment I just want
to clone one DNA segment into the pUC19 plasmid downstream of the lac
promoter, so that it will be expressed in E. coli.
There are many different ways to do cloning (e.g., see
NEB,
openwetware,
addgene).
Here I will use Gibson assembly (developed by Daniel
Gibson at Synthetic
Genomics in 2009), which is not necessarily the cheapest method, but is
straightforward and flexible. All you have to do is put all the DNA you
want to assemble (with the appropriate overlaps) in a tube with the
Gibson Assembly Master Mix, and it assembles itself!
I am starting with 100ng of synthetic DNA in 10µl of liquid. That
equates to 0.21 picomoles of DNA or a concentration of 10ng/µl.
pmol_sfgfp = convert_ug_to_pmol(0.1, len(sfGFP_plus_SD))
print("Insert: 100ng of DNA of length {:4d} equals {:.2f} pmol".format(len(sfGFP_plus_SD), pmol_sfgfp))
Insert: 100ng of DNA of length 726 equals 0.21 pmol
NEB recommends a total of 0.02–0.5 pmols of DNA fragments when 1 or 2 fragments are being assembled into a vector and 0.2–1.0 pmoles of DNA fragments when 4–6 fragments are being assembled
0.02–0.5 pmols* X µl
* Optimized cloning efficiency is 50–100 ng of vectors with 2–3 fold of excess inserts. Use 5 times more of inserts if size is less than 200 bps. Total volume of unpurified PCR fragments in Gibson Assembly reaction should not exceed 20%.
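To relate these recommendations to actual masses, here is a small worked sketch. It uses the same conversion as above (roughly 660 g/mol per base pair of double-stranded DNA); the helper `insert_ng_for_ratio` and the choice of 100ng of vector at a 2:1 ratio are my own illustrative assumptions within the quoted ranges.

```python
def ug_to_pmol(ug_dsdna, num_bp):
    """Convert micrograms of double-stranded DNA to picomoles (~660 g/mol per base pair)."""
    return ug_dsdna / num_bp * (1e6 / 660.0)

def insert_ng_for_ratio(vector_ng, vector_bp, insert_bp, molar_ratio):
    """Mass of insert (ng) that gives the requested insert:vector molar ratio."""
    return vector_ng * (insert_bp / vector_bp) * molar_ratio

insert_pmol = ug_to_pmol(0.1, 726)       # 100ng of the 726bp insert
vector_pmol = ug_to_pmol(0.1, 2686)      # 100ng of the 2686bp pUC19
insert_ng = insert_ng_for_ratio(100, 2686, 726, 2)
print("{:.2f} pmol insert, {:.3f} pmol vector, {:.0f} ng insert for a 2:1 ratio".format(
    insert_pmol, vector_pmol, insert_ng))
```

Because the insert is much shorter than the vector, a 2-fold molar excess needs only about half the vector's mass in insert.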
NEBuilder for Gibson Assembly
New England Biolabs' NEBuilder is a
really excellent tool to help you design your Gibson assembly protocol.
It even generates a comprehensive four-page PDF protocol for you. Using
this tool, I design a protocol to cut pUC19 with EcoRI and then use PCR
to add appropriately sized flanks to the insert.
Part Two: Running The Experiment
There are four steps in the experiment:
PCR of the insert to add complementary flanks;
Cutting the plasmid to accommodate the insert;
Gibson assembly of insert and plasmid;
Transforming the bacteria with the assembled plasmid.
Step 1. PCR of the Insert
Gibson assembly relies on the DNA sequences you are assembling having
some overlapping sequence (see the NEB protocol above for detailed
instructions). As well as simple amplification, PCR also enables you to
add flanking DNA to a sequence by simply including the additional
sequence in the primers. (You can also clone using only
OE-PCR).
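The way primers add flanks can be sketched in a few lines. This is only an illustration of the principle, with hypothetical toy sequences and arbitrary overlap and annealing lengths. It is not the NEBuilder algorithm, and these are not the primers I actually ordered. Following the convention used for my primers, the gene-specific part is upper-case and the added vector overlap is lower-case.

```python
def revcomp(seq):
    """Reverse complement of a DNA string, preserving case."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]

def gibson_primers(left_flank, insert, right_flank, overlap=20, anneal=20):
    """Primers that add `overlap` bases of vector homology to each end of the insert.

    left_flank / right_flank: vector sequence immediately upstream / downstream
    of the insertion point, on the same strand as the insert.
    """
    fwd = left_flank[-overlap:].lower() + insert[:anneal].upper()
    # The reverse primer is the reverse complement of (insert end + right flank).
    rev = revcomp(right_flank[:overlap]).lower() + revcomp(insert[-anneal:]).upper()
    return fwd, rev

# Toy sequences, far shorter than anything you would use in practice.
fwd, rev = gibson_primers("TTTTGGGGCC", "ATGAAACCCGGGTAA", "AATTCCGGAA",
                          overlap=4, anneal=6)
print(fwd, rev)
```

After PCR with such primers, the product is the insert with a copy of the vector sequence on each end, which is the overlap that Gibson assembly needs.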
I synthesize primers according to the NEB protocol above. I used a
Quickstart
protocol
on the Transcriptic website to try it out, but there's also an
autoprotocol
command.
Transcriptic does not do oligo synthesis in-house, so after a day or two
of waiting, these primers magically appear in my inventory. (Note, the
gene-specific part of the primers below is upper-case but it's just
cosmetic.)
I can analyze the properties of these primers using IDT
OligoAnalyzer. It is useful to
know the melting temperatures and propensity for
dimer-forming when
debugging a PCR experiment, though the NEB protocol almost certainly
chooses primers with good properties.
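For a quick sanity check without leaving the notebook, a rough melting temperature can be estimated with the Wallace rule, Tm ≈ 2(A+T) + 4(G+C) °C. This is a crude approximation intended for short oligos and ignores salt and primer concentrations, so treat the sketch below as a ballpark only. Dedicated tools use more detailed thermodynamic models. The example oligo is simply the first 20 bases of the sfGFP coding sequence, not one of my actual primers.

```python
def gc_content(oligo):
    """Fraction of G and C bases in the oligo."""
    oligo = oligo.upper()
    return (oligo.count("G") + oligo.count("C")) / len(oligo)

def wallace_tm(oligo):
    """Wallace-rule melting temperature estimate in degrees Celsius."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc

oligo = "ATGTCGAAAGGAGAAGAACT"
tm = wallace_tm(oligo)
gc = gc_content(oligo)
print(tm, gc)
```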
I went through many iterations of this PCR protocol before getting
results I was satisfied with, including experimenting with several
different brands of PCR mixes. Since each of these iterations can take
several days (depending on the length of the queue at the lab), it is
worth spending time debugging upfront, which saves a lot of time in
the long run. As cloud lab capacity increases this issue should
diminish. Still, I would not assume that your first protocol will
succeed — there are many variables at work here.
""" PCR overlap extension of sfGFP according to NEB protocol.
v5: Use 3/10ths as much primer as the v4 protocol.
v6: more complex touchdown pcr procedure. The Q5 temperature was probably too hot
v7: more time at low temperature to allow gene-specific part to anneal
v8: correct dNTP concentration, real touchdown
"""
p = Protocol()

# ---------------------------------------------------
# Set up experiment
#
experiment_name = "sfgfp_pcroe_v8"
template_length = 740

_options = {'dilute_primers' : False, # if working stock has not been made
            'dilute_template': False, # if working stock has not been made
            'dilute_dNTP'    : False, # if working stock has not been made
            'run_gel'        : True,  # run a gel to see the plasmid size
            'run_absorbance' : False, # check absorbance at 260/280/320
            'run_sanger'     : False} # sanger sequence the new sequence
options = {k for k, v in _options.items() if v is True}

# ---------------------------------------------------
# Inventory and provisioning
# https://developers.transcriptic.com/v1.0/docs/containers
#
# 'sfgfp2':              'ct17yx8h77tkme', # inventory; sfGFP tube #2, micro-1.5, cold_20
# 'sfgfp_puc19_primer1': 'ct17z9542mrcfv', # inventory; micro-2.0, cold_4
# 'sfgfp_puc19_primer2': 'ct17z9542m5ntb', # inventory; micro-2.0, cold_4
# 'sfgfp_idt_1ngul':     'ct184nnd3rbxfr', # inventory; micro-1.5, cold_4, (ERROR: no template)
#
inv = {
    'Q5 Polymerase': 'rs16pcce8rdytv', # catalog; Q5 High-Fidelity DNA Polymerase
    'Q5 Buffer':     'rs16pcce8rmke3', # catalog; Q5 Reaction Buffer
    'dNTP Mixture':  'rs16pcb542c5rd', # catalog; dNTP Mixture (25mM?)
    'water':         'rs17gmh5wafm5p', # catalog; Autoclaved MilliQ H2O
    'sfgfp_pcroe_v5_puc19_primer1_10uM': 'ct186cj5cqzjmr', # inventory; micro-1.5, cold_4
    'sfgfp_pcroe_v5_puc19_primer2_10uM': 'ct186cj5cq536x', # inventory; micro-1.5, cold_4
    'sfgfp1':        'ct17yx8h759dk4', # inventory; sfGFP tube #1, micro-1.5, cold_20
}

# Existing inventory
template_tube = p.ref("sfgfp1", id=inv['sfgfp1'], cont_type="micro-1.5", storage="cold_4").well(0)
dilute_primer_tubes = [p.ref('sfgfp_pcroe_v5_puc19_primer1_10uM', id=inv['sfgfp_pcroe_v5_puc19_primer1_10uM'], cont_type="micro-1.5", storage="cold_4").well(0),
                       p.ref('sfgfp_pcroe_v5_puc19_primer2_10uM', id=inv['sfgfp_pcroe_v5_puc19_primer2_10uM'], cont_type="micro-1.5", storage="cold_4").well(0)]

# New inventory resulting from this experiment
dilute_template_tube = p.ref("sfgfp1_0.25ngul", cont_type="micro-1.5", storage="cold_4").well(0)
dNTP_10uM_tube = p.ref("dNTP_10uM", cont_type="micro-1.5", storage="cold_4").well(0)
sfgfp_pcroe_out_tube = p.ref(expid("amplified"), cont_type="micro-1.5", storage="cold_4").well(0)

# Temporary tubes for use, then discarded
mastermix_tube = p.ref("mastermix", cont_type="micro-1.5", storage="cold_4", discard=True).well(0)
water_tube = p.ref("water", cont_type="micro-1.5", storage="ambient", discard=True).well(0)
pcr_plate = p.ref("pcr_plate", cont_type="96-pcr", storage="cold_4", discard=True)
if 'run_absorbance' in options:
    abs_plate = p.ref("abs_plate", cont_type="96-flat", storage="cold_4", discard=True)

# Initialize all existing inventory
all_inventory_wells = [template_tube] + dilute_primer_tubes
for well in all_inventory_wells:
    init_inventory_well(well)
    print(well.name, well.volume, well.properties)

# -----------------------------------------------------
# Provision water once, for general use
#
p.provision(inv["water"], water_tube, µl(500))

# -----------------------------------------------------
# Dilute primers 1/10 (100uM->10uM) and keep at 4C
#
if 'dilute_primers' in options:
    # NOTE: primer_tubes (the 100uM primer stocks) is not defined in this version
    # of the protocol, so this branch needs those refs added before it is enabled.
    for primer_num in (0, 1):
        p.transfer(water_tube, dilute_primer_tubes[primer_num], µl(90))
        p.transfer(primer_tubes[primer_num], dilute_primer_tubes[primer_num], µl(10), mix_before=True, mix_vol=µl(50))
        p.mix(dilute_primer_tubes[primer_num], volume=µl(50), repetitions=10)

# -----------------------------------------------------
# Dilute template 1/10 (10ng/ul->1ng/ul) and keep at 4C
# OR
# Dilute template 1/40 (10ng/ul->0.25ng/ul) and keep at 4C
#
if 'dilute_template' in options:
    p.transfer(water_tube, dilute_template_tube, µl(195))
    p.mix(dilute_template_tube, volume=µl(100), repetitions=10)

# Dilute dNTP to exactly 10uM
# (key must match 'dilute_dNTP' in _options; it was misspelled 'dilute_DNTP')
if 'dilute_dNTP' in options:
    p.transfer(water_tube, dNTP_10uM_tube, µl(6))
    p.provision(inv["dNTP Mixture"], dNTP_10uM_tube, µl(4))

# -----------------------------------------------------
# Q5 PCR protocol
# www.neb.com/protocols/2013/12/13/pcr-using-q5-high-fidelity-dna-polymerase-m0491
#
# 25ul reaction
# -------------
# Q5 reaction buffer      5    µl
# Q5 polymerase           0.25 µl
# 10mM dNTP               0.5  µl -- 1µl = 4x12.5mM
# 10uM primer 1           1.25 µl
# 10uM primer 2           1.25 µl
# 1pg-1ng Template        1    µl -- 0.5 or 1ng/ul concentration
# -------------------------------
# Sum                     9.25 µl
#
#
# Mastermix tube will have 96ul of stuff, leaving space for 4x1ul aliquots of template
p.transfer(water_tube, mastermix_tube, µl(64))
p.provision(inv["Q5 Buffer"], mastermix_tube, µl(20))
p.provision(inv['Q5 Polymerase'], mastermix_tube, µl(1))
p.transfer(dNTP_10uM_tube, mastermix_tube, µl(1), mix_before=True, mix_vol=µl(2))
p.transfer(dilute_primer_tubes[0], mastermix_tube, µl(5), mix_before=True, mix_vol=µl(10))
p.transfer(dilute_primer_tubes[1], mastermix_tube, µl(5), mix_before=True, mix_vol=µl(10))
p.mix(mastermix_tube, volume="48:microliter", repetitions=10)

# Transfer mastermix to pcr_plate without template
p.transfer(mastermix_tube, pcr_plate.wells(["A1","B1","C1"]), µl(24))
p.transfer(mastermix_tube, pcr_plate.wells(["A2"]), µl(24)) # acknowledged dead volume problems
p.mix(pcr_plate.wells(["A1","B1","C1","A2"]), volume=µl(12), repetitions=10)

# Finally add template
p.transfer(template_tube, pcr_plate.wells(["A1","B1","C1"]), µl(1))
p.mix(pcr_plate.wells(["A1","B1","C1"]), volume=µl(12.5), repetitions=10)

# ---------------------------------------------------------
# Thermocycle with Q5 and hot start
# 61.1 annealing temperature is recommended by NEB protocol
# p.seal is enforced by transcriptic
#
extension_time = int(max(2, np.ceil(template_length * (11.0/1000))))
assert 0 < extension_time < 60, "extension time should be reasonable for PCR"

cycles = [{"cycles": 1, "steps": [{"temperature": "98:celsius", "duration": "30:second"}]}] + \
         touchdown(70, 61, [8, 25, extension_time], stepsize=0.5) + \
         [{"cycles": 16, "steps": [{"temperature": "98:celsius", "duration": "8:second"},
                                   {"temperature": "61.1:celsius", "duration": "25:second"},
                                   {"temperature": "72:celsius", "duration": "{:d}:second".format(extension_time)}]},
          {"cycles": 1, "steps": [{"temperature": "72:celsius", "duration": "2:minute"}]}]
p.seal(pcr_plate)
p.thermocycle(pcr_plate, cycles, volume=µl(25))

# --------------------------------------------------------
# Run a gel to hopefully see a 740bp fragment
#
if 'run_gel' in options:
    p.unseal(pcr_plate)
    p.mix(pcr_plate.wells(["A1","B1","C1","A2"]), volume=µl(12.5), repetitions=10)
    p.transfer(pcr_plate.wells(["A1","B1","C1","A2"]), pcr_plate.wells(["D1","E1","F1","D2"]),
               [µl(2), µl(4), µl(8), µl(8)])
    p.transfer(water_tube, pcr_plate.wells(["D1","E1","F1","D2"]),
               [µl(18), µl(16), µl(12), µl(12)], mix_after=True, mix_vol=µl(10))
    p.gel_separate(pcr_plate.wells(["D1","E1","F1","D2"]),
                   µl(20), "agarose(10,2%)", "ladder1", "10:minute", expid("gel"))

#---------------------------------------------------------
# Absorbance dilution series. Take 1ul out of the 25ul pcr plate wells
#
if 'run_absorbance' in options:
    p.unseal(pcr_plate)
    abs_wells = ["A1","B1","C1","A2","B2","C2","A3","B3","C3"]

    p.transfer(water_tube, abs_plate.wells(abs_wells[0:6]), µl(10))
    p.transfer(water_tube, abs_plate.wells(abs_wells[6:9]), µl(9))

    p.transfer(pcr_plate.wells(["A1","B1","C1"]), abs_plate.wells(["A1","B1","C1"]), µl(1), mix_after=True, mix_vol=µl(5))
    p.transfer(abs_plate.wells(["A1","B1","C1"]), abs_plate.wells(["A2","B2","C2"]), µl(1), mix_after=True, mix_vol=µl(5))
    p.transfer(abs_plate.wells(["A2","B2","C2"]), abs_plate.wells(["A3","B3","C3"]), µl(1), mix_after=True, mix_vol=µl(5))
    for wavelength in [260, 280, 320]:
        # expid (defined with the other utilities) was misspelled exp_id here
        p.absorbance(abs_plate, abs_plate.wells(abs_wells),
                     "{}:nanometer".format(wavelength), expid("abs_{}".format(wavelength)), flashes=25)

# -----------------------------------------------------------------------------
# Sanger sequencing: https://developers.transcriptic.com/docs/sanger-sequencing
# "Each reaction should have a total volume of 15 µl and we recommend the 
following composition of DNA and primer:# PCR product (40 ng), primer (1 µl of a 10 µM stock)"## By comparing to the gel ladder concentration (175ng/lane), it looks like 5ul of PCR product has approximately 30ng of DNA#if'run_sanger'inoptions:p.unseal(pcr_plate)seq_wells=["G1","G2"]forprimer_num,seq_wellin[(0,seq_wells[0]),(1,seq_wells[1])]:p.transfer(dilute_primer_tubes[primer_num],pcr_plate.wells([seq_well]),µl(1),mix_before=True,mix_vol=µl(50))p.transfer(pcr_plate.wells(["A1"]),pcr_plate.wells([seq_well]),µl(5),mix_before=True,mix_vol=µl(10))p.transfer(water_tube,pcr_plate.wells([seq_well]),µl(9))p.mix(pcr_plate.wells(seq_wells),volume=µl(7.5),repetitions=10)p.sangerseq(pcr_plate,pcr_plate.wells(seq_wells[0]).indices(),expid("seq1"))p.sangerseq(pcr_plate,pcr_plate.wells(seq_wells[1]).indices(),expid("seq2"))# -------------------------------------------------------------------------# Then consolidate to one tube. Leave at least 3ul dead volume in each tube#remaining_volumes=[well.volume-dead_volume['96-pcr']forwellinpcr_plate.wells(["A1","B1","C1"])]print("Consolidated volume",sum(remaining_volumes,µl(0)))p.consolidate(pcr_plate.wells(["A1","B1","C1"]),sfgfp_pcroe_out_tube,remaining_volumes,allow_carryover=True)uprint("\nProtocol 1. Amplify the insert (oligos previously synthesized)")jprotocol=json.dumps(p.as_dict(),indent=2)!echo'{jprotocol}'|transcripticanalyzeopen("protocol_{}.json".format(experiment_name),'w').write(jprotocol)
WARNING:root:Low volume for well sfGFP 1 /sfGFP 1 : 2.0:microliter
By running a gel I can see if the amplified product is the right
size (position of the band in the gel), and the right quantity
(darkness of the band). The gel has a ladder corresponding to different
lengths and quantities of DNA that can be used for comparison.
In the gel photograph below, lanes D1, E1, F1 contain 2µl, 4µl, and 8µl
of amplified product, respectively. I can estimate the amount of DNA in
each lane by comparison to the DNA in the ladder (50ng of DNA per band
in the ladder). I think the results look very clean.
I tried using GelEval to analyze the image and estimate concentrations,
and it worked pretty well, though I'm not sure it would be much more
accurate than a more naive method: small changes to the location and
size of the bands led to large changes in the estimate of the amount of
DNA. My best estimate for the concentration of DNA in my amplified
product using GelEval is 40ng/µl.
If I assume that I am limited by the amount of primer in the mixture,
as opposed to the amount of dNTP or enzyme, then since I have 12.5pmol
of each primer, that implies a theoretical maximum of 6µg of 740bp DNA
in 25µl. Since my estimate for the total amount of DNA using GelEval is
40ng/µl x 25µl (1µg or 2pmol), these results are very reasonable and
close to what I should expect under ideal conditions.
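As a sanity check on the arithmetic above, here is a minimal sketch of the primer-limited yield calculation. It assumes an average mass of about 650 g/mol per base pair of double-stranded DNA, which is a common approximation and not a figure from this post; the function names are made up for illustration.

```python
# Back-of-the-envelope check of the primer-limited PCR yield.
# Assumption (not from the post): ~650 g/mol per base pair of dsDNA.
AVG_BP_MASS = 650.0  # g/mol per bp


def dsdna_mass_ng(pmol, length_bp):
    """Mass in ng of `pmol` picomoles of double-stranded DNA."""
    # pmol * (g/mol) gives picograms; divide by 1000 for nanograms
    return pmol * length_bp * AVG_BP_MASS / 1000.0


def dsdna_pmol(mass_ng, length_bp):
    """Picomoles of double-stranded DNA in `mass_ng` nanograms."""
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS)


# 1.25 µl of 10 µM primer in the 25 µl reaction = 12.5 pmol
primer_pmol = 1.25 * 10
max_yield_ng = dsdna_mass_ng(primer_pmol, 740)

# GelEval estimate: 40 ng/µl in a 25 µl reaction
observed_ng = 40 * 25
observed_pmol = dsdna_pmol(observed_ng, 740)

print("primer-limited maximum: {:.1f} µg".format(max_yield_ng / 1000))
print("observed: {:.1f} µg = {:.1f} pmol".format(observed_ng / 1000, observed_pmol))
```

Under these assumptions the maximum comes out at roughly 6µg and the observed 1µg at roughly 2pmol, in line with the numbers quoted above.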
Gel electrophoresis of an EcoRI-cut pUC19, various concentrations (D1,
E1, F1), plus a control (D2)
PCR results diagnostics
Recently, Transcriptic has started providing some interesting and useful
diagnostic data, outputted by its robots. At the time of writing, the
data were not available for download, so for now I just have an image of
temperatures during thermocycling.
The data looks good, with no unexpected peaks or troughs. The PCR cycles
35 times in total, but some of these cycles are spent at very high
temperature as part of the touchdown PCR process. In my previous
attempts to amplify this segment (of which there were a few!) I had
issues with self-primer hybridization, so here I made the PCR spend
quite a bit of time at high temperatures, which should increase the
fidelity.
Thermocycling diagnostics for a touchdown PCR: temperatures of block,
sample and lid over 35 cycles and 42 minutes
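The protocol code calls a `touchdown(...)` helper whose definition is not shown in this section. The following is only a sketch of what such a helper might look like: the name `touchdown_cycles`, its signature, and the inclusive stepping are my assumptions. It emits one autoprotocol-style cycle group per annealing temperature, stepping down from the starting temperature to the final one.

```python
import math


def extension_seconds(template_length_bp, seconds_per_kb=11.0, minimum=2):
    """Extension time in whole seconds, rounded up (11 s/kb as in the protocol)."""
    return int(max(minimum, math.ceil(template_length_bp * seconds_per_kb / 1000.0)))


def touchdown_cycles(from_c, to_c, durations, stepsize=0.5, melt_c=98, ext_c=72):
    """Hypothetical touchdown helper (not the post's implementation).

    Returns one single-cycle group per annealing temperature, from from_c
    down to to_c inclusive. durations = [melt_s, anneal_s, extend_s].
    """
    assert stepsize > 0 and to_c < from_c
    n_cycles = int(round((from_c - to_c) / stepsize)) + 1

    def step(temp_c, seconds):
        return {"temperature": "{:g}:celsius".format(temp_c),
                "duration": "{:d}:second".format(seconds)}

    cycles = []
    for i in range(n_cycles):
        anneal_c = from_c - i * stepsize
        cycles.append({"cycles": 1,
                       "steps": [step(melt_c, durations[0]),
                                 step(anneal_c, durations[1]),
                                 step(ext_c, durations[2])]})
    return cycles


ext = extension_seconds(740)          # 740bp template
td = touchdown_cycles(70, 61, [8, 25, ext])
print(len(td), td[0]["steps"][1], td[-1]["steps"][1])
```

With inclusive stepping from 70°C to 61°C in 0.5°C steps this gives 19 touchdown cycles, which together with the 16 fixed cycles in the protocol is consistent with the 35 cycles mentioned above.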
Step 2. Cutting the Plasmid
To insert my sfGFP DNA into pUC19, I first need to cut the plasmid open.
Following the NEB protocol, I do this with the restriction enzyme
EcoRI. Transcriptic has the reagents I need in its standard inventory:
NEB EcoRI, 10x CutSmart buffer, and NEB pUC19 plasmid.
Here are the prices from their inventory for reference. I only actually
pay a fraction of the price below since Transcriptic sells by the
aliquot:
Item          ID      Amount        Concentration    Price
------------  ------  ------------  ---------------  -------
CutSmart 10x  B7204S  5 ml          10 X             $19.00
EcoRI         R3101L  50,000 units  20,000 units/ml  $225.00
pUC19         N3041L  250 µg        1,000 µg/ml      $268.00
The buffer must be completely thawed before use. Dilute the 10X stock with dH2O to a final concentration of 1X. Add the water first, buffer next, the DNA solution and finally the enzyme. A typical 50 µl reaction should contain 5 µl of 10X NEBuffer with the rest of the volume coming from the DNA solution, enzyme and dH2O.
One unit is defined as the amount of enzyme required to digest 1 µg of λ DNA in 1 hour at 37°C in a total reaction volume of 50 µl. In general, we recommend 5–10 units of enzyme per µg DNA, and 10–20 units for genomic DNA in a 1 hour digest.
A 50 µl reaction volume is recommended for digestion of 1 µg of substrate
"""Protocol for cutting pUC19 with EcoRI."""
p = Protocol()

experiment_name = "puc19_ecori_v3"

options = {}

inv = {
    'water': "rs17gmh5wafm5p",       # catalog; Autoclaved MilliQ H2O; ambient
    "pUC19": "rs17tcqmncjfsh",       # catalog; pUC19; cold_20
    "EcoRI": "rs17ta8xftpdk6",       # catalog; EcoRI-HF; cold_20
    "CutSmart": "rs17ta93g3y85t",    # catalog; CutSmart Buffer 10x; cold_20
    "ecori_p10x": "ct187v4ea85k2h",  # inventory; EcoRI diluted 10x
}

# Tubes and plates I use then discard
re_tube = p.ref("re_tube", cont_type="micro-1.5", storage="cold_4", discard=True).well(0)
water_tube = p.ref("water_tube", cont_type="micro-1.5", storage="cold_4", discard=True).well(0)
pcr_plate = p.ref("pcr_plate", cont_type="96-pcr", storage="cold_4", discard=True)

# The result of the experiment, a pUC19 cut by EcoRI, goes in this tube for storage
puc19_cut_tube = p.ref(expid("puc19_cut"), cont_type="micro-1.5", storage="cold_20").well(0)

# -------------------------------------------------------------
# Provisioning and diluting.
# Diluted EcoRI can be used more than once
#
p.provision(inv["water"], water_tube, µl(500))

if 'dilute_ecori' in options:
    ecori_p10x_tube = p.ref("ecori_p10x", cont_type="micro-1.5", storage="cold_20").well(0)

    p.transfer(water_tube, ecori_p10x_tube, µl(45))
    p.provision(inv["EcoRI"], ecori_p10x_tube, µl(5))
else:
    # All "inventory" (stuff I own at transcriptic) must be initialized
    ecori_p10x_tube = p.ref("ecori_p10x", id=inv["ecori_p10x"], cont_type="micro-1.5", storage="cold_20").well(0)
    init_inventory_well(ecori_p10x_tube)

# -------------------------------------------------------------
# Restriction enzyme cutting pUC19
#
# 50ul total reaction volume for cutting 1ug of DNA:
# 5ul CutSmart 10x
# 1ul pUC19 (1ug of DNA)
# 1ul EcoRI (or 10ul diluted EcoRI, 20 units, >10 units per ug DNA)
#
p.transfer(water_tube, re_tube, µl(117))
p.provision(inv["CutSmart"], re_tube, µl(15))
p.provision(inv["pUC19"], re_tube, µl(3))
p.mix(re_tube, volume=µl(60), repetitions=10)
assert re_tube.volume == µl(120) + dead_volume["micro-1.5"]

print("Volumes: re_tube:{} water_tube:{} EcoRI:{}".format(re_tube.volume, water_tube.volume, ecori_p10x_tube.volume))

p.distribute(re_tube, pcr_plate.wells(["A1", "B1", "A2"]), µl(40))
p.distribute(water_tube, pcr_plate.wells(["A2"]), µl(10))
p.distribute(ecori_p10x_tube, pcr_plate.wells(["A1", "B1"]), µl(10))
assert all(well.volume == µl(50) for well in pcr_plate.wells(["A1", "B1", "A2"]))

p.mix(pcr_plate.wells(["A1", "B1", "A2"]), volume=µl(25), repetitions=10)

# Incubation to induce cut, then heat inactivation of EcoRI
p.seal(pcr_plate)
p.incubate(pcr_plate, "warm_37", "60:minute", shaking=False)
p.thermocycle(pcr_plate,
              [{"cycles": 1, "steps": [{"temperature": "65:celsius", "duration": "21:minute"}]}],
              volume=µl(50))

# --------------------------------------------------------------
# Gel electrophoresis, to ensure the cutting worked
#
p.unseal(pcr_plate)
p.mix(pcr_plate.wells(["A1", "B1", "A2"]), volume=µl(25), repetitions=5)
p.transfer(pcr_plate.wells(["A1", "B1", "A2"]), pcr_plate.wells(["D1", "E1", "D2"]), µl(8))
p.transfer(water_tube, pcr_plate.wells(["D1", "E1", "D2"]), µl(15), mix_after=True, mix_vol=µl(10))
assert all(well.volume == µl(20) + dead_volume["96-pcr"] for well in pcr_plate.wells(["D1", "E1", "D2"]))

p.gel_separate(pcr_plate.wells(["D1", "E1", "D2"]), µl(20), "agarose(10,2%)", "ladder2", "15:minute", expid("gel"))

# ----------------------------------------------------------------------------
# Then consolidate all cut plasmid to one tube (puc19_cut_tube).
#
remaining_volumes = [well.volume - dead_volume['96-pcr'] for well in pcr_plate.wells(["A1", "B1"])]
print("Consolidated volume: {}".format(sum(remaining_volumes, µl(0))))
p.consolidate(pcr_plate.wells(["A1", "B1"]), puc19_cut_tube, remaining_volumes, allow_carryover=True)

assert all(tube.volume >= dead_volume['micro-1.5'] for tube in [water_tube, re_tube, puc19_cut_tube, ecori_p10x_tube])

# ---------------------------------------------------------------
# Test protocol
#
jprotocol = json.dumps(p.as_dict(), indent=2)
!echo '{jprotocol}' | transcriptic analyze
#print("Protocol {}\n\n{}".format(experiment_name, jprotocol))
open("protocol_{}.json".format(experiment_name), 'w').write(jprotocol)
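To see how the volumes in this protocol relate to the NEB guidance quoted above, here is a small worked calculation. The volumes and stock concentrations are taken from the protocol and the price table; the variable names are my own, and the sketch assumes the master mix is perfectly homogeneous when aliquoted.

```python
# Stock concentrations from the inventory table
ECORI_STOCK_UNITS_PER_UL = 20.0   # 20,000 units/ml
PUC19_UG_PER_UL = 1.0             # 1,000 µg/ml
ECORI_DILUTION = 10.0             # the "ecori_p10x" working stock

# Master mix assembled in re_tube (µl); total includes 15 µl of dead volume
master_mix_ul = {"water": 117.0, "CutSmart 10x": 15.0, "pUC19": 3.0}
total_mix_ul = sum(master_mix_ul.values())

aliquot_ul = 40.0   # master mix distributed to each well
enzyme_ul = 10.0    # diluted EcoRI added to each experimental well

fraction = aliquot_ul / total_mix_ul
dna_ug = master_mix_ul["pUC19"] * PUC19_UG_PER_UL * fraction
buffer_ul = master_mix_ul["CutSmart 10x"] * fraction

reaction_ul = aliquot_ul + enzyme_ul
units = enzyme_ul * ECORI_STOCK_UNITS_PER_UL / ECORI_DILUTION
units_per_ug = units / dna_ug
buffer_x = 10.0 * buffer_ul / reaction_ul

print("reaction volume: {:.0f} µl".format(reaction_ul))
print("DNA per well: {:.2f} µg".format(dna_ug))
print("EcoRI: {:.0f} units ({:.1f} units/µg)".format(units, units_per_ug))
print("buffer: {:.2f}X".format(buffer_x))
```

By this arithmetic each 50µl well gets a little under 1µg of pUC19 and about 20 units of EcoRI, comfortably above the recommended 5–10 units per µg, with the buffer slightly below 1X because the dead volume is included in the master mix.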
I ended up doing this experiment twice under slightly different
conditions and with different-sized gels, but the results are almost
identical. Both gels look good to me.
Originally, I did not allocate enough space for dead volume (1.5ml
tubes have 15µl of dead volume!), which I believe explains the
difference between D1 and E1 (these two lanes should be
identical). This dead volume problem would be easily solved by making a
proper working stock of diluted EcoRI at the start of the protocol.
Despite that error, in both gels, lanes D1 and E1 contain strong
bands at the correct position of 2.6kb. Lane D2 contains uncut
plasmid, so as expected, it is not visible in one gel and barely visible
as a smear in the other.
The two gel photographs look pretty different, partially just because
this is a step that Transcriptic has yet to automate.
Two gels showing a cut pUC19 (2.6kb) in lanes D1 and E1, and uncut
pUC19 in D2
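Dead volume is easy to forget when budgeting a working stock, so a small helper can make the check explicit. This is only a sketch: the 15µl figure for 1.5ml tubes and 3µl for 96-well PCR plates are the values used in this post, while the function names and the dictionary are hypothetical and not part of any Transcriptic library.

```python
# Dead volumes (µl) as used in this post; treat them as assumptions elsewhere
DEAD_VOLUME_UL = {"micro-1.5": 15.0, "96-pcr": 3.0}


def required_volume_ul(aliquots_ul, container_type):
    """Volume to prepare so that every aliquot can actually be drawn."""
    return sum(aliquots_ul) + DEAD_VOLUME_UL[container_type]


def usable_volume_ul(current_ul, container_type):
    """Volume that can be pipetted out of a container, never negative."""
    return max(0.0, current_ul - DEAD_VOLUME_UL[container_type])


# Two wells each need 10 µl of diluted EcoRI from a 1.5ml tube
needed = required_volume_ul([10.0, 10.0], "micro-1.5")
# A 50 µl working stock (45 µl water + 5 µl enzyme) leaves this much usable
available = usable_volume_ul(50.0, "micro-1.5")

print("need to prepare at least {:.0f} µl".format(needed))
print("usable from a 50 µl stock: {:.0f} µl".format(available))
```

Running a check like this before submitting a protocol would flag a working stock that is too small to cover both the aliquots and the dead volume.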
Step 3. Gibson Assembly
The simplest way to check if my Gibson assembly works is to assemble the
insert and plasmid, then use standard M13 primers (which flank the
insert) to amplify part of the plasmid and the inserted DNA, and run
qPCR and a gel to see that the amplification worked. You could also run
a sequencing reaction to confirm that everything inserted as expected,
but I decided to leave this for later.
If the Gibson assembly fails, then the M13 amplification will fail,
because the plasmid has been cut between the two M13 sequences.
"""Debugging transformation protocol: Gibson assembly followed by qPCR and a gel
v2: include v3 Gibson assembly"""
p = Protocol()
options = {}

experiment_name = "debug_sfgfp_puc19_gibson_seq_v2"

inv = {
    "water": "rs17gmh5wafm5p",                        # catalog; Autoclaved MilliQ H2O; ambient
    "M13_F": "rs17tcpqwqcaxe",                        # catalog; M13 Forward (-41); cold_20 (1ul = 100pmol)
    "M13_R": "rs17tcph6e2qzh",                        # catalog; M13 Reverse (-48); cold_20 (1ul = 100pmol)
    "SensiFAST_SYBR_No-ROX": "rs17knkh7526ha",        # catalog; SensiFAST SYBR for qPCR
    "sfgfp_puc19_gibson_v1_clone": "ct187rzdq9kd7q",  # inventory; assembled sfGFP; cold_4
    "sfgfp_puc19_gibson_v3_clone": "ct188ejywa8jcv",  # inventory; assembled sfGFP; cold_4
}

# ---------------------------------------------------------------
# First get my sfGFP pUC19 clones, assembled with Gibson assembly
#
clone_plate1 = p.ref("sfgfp_puc19_gibson_v1_clone", id=inv["sfgfp_puc19_gibson_v1_clone"],
                     cont_type="96-pcr", storage="cold_4", discard=False)
clone_plate2 = p.ref("sfgfp_puc19_gibson_v3_clone", id=inv["sfgfp_puc19_gibson_v3_clone"],
                     cont_type="96-pcr", storage="cold_4", discard=False)

water_tube = p.ref("water", cont_type="micro-1.5", storage="cold_4", discard=True).well(0)
master_tube = p.ref("master", cont_type="micro-1.5", storage="cold_4", discard=True).well(0)
primer_tube = p.ref("primer", cont_type="micro-1.5", storage="cold_4", discard=True).well(0)

pcr_plate = p.ref(expid("pcr_plate"), cont_type="96-pcr", storage="cold_4", discard=False)

init_inventory_well(clone_plate1.well("A1"))
init_inventory_well(clone_plate2.well("A1"))

seq_wells = ["B2", "B4", "B6",  # clone_plate1
             "D2", "D4", "D6",  # clone_plate2
             "F2", "F4"]        # control

# clone_plate2 was diluted 4X (20ul->80ul), according to NEB instructions
assert clone_plate1.well("A1").volume == µl(18), clone_plate1.well("A1").volume
assert clone_plate2.well("A1").volume == µl(78), clone_plate2.well("A1").volume

# --------------------------------------------------------------
# Provisioning
#
p.provision(inv["water"], water_tube, µl(500))

# primers, diluted 2X, discarded at the end
p.provision(inv["M13_F"], primer_tube, µl(13))
p.provision(inv["M13_R"], primer_tube, µl(13))
p.transfer(water_tube, primer_tube, µl(26), mix_after=True, mix_vol=µl(20), repetitions=10)

# -------------------------------------------------------------------
# PCR Master mix -- 10ul SYBR mix, plus 1ul each undiluted primer DNA (100pmol)
# Also add 15ul of dead volume
#
p.provision(inv['SensiFAST_SYBR_No-ROX'], master_tube, µl(11 + len(seq_wells)*10))
p.transfer(primer_tube, master_tube, µl(4 + len(seq_wells)*4))
p.mix(master_tube, volume=µl(63), repetitions=10)
assert master_tube.volume == µl(127)  # 15ul dead volume

p.distribute(master_tube, pcr_plate.wells(seq_wells), µl(14), allow_carryover=True)
p.distribute(water_tube, pcr_plate.wells(seq_wells),
             [µl(ul) for ul in [5, 4, 2, 4, 2, 0, 6, 6]],
             allow_carryover=True)

# Template -- starting with some small, unknown amount of DNA produced by Gibson
p.transfer(clone_plate1.well("A1"), pcr_plate.wells(seq_wells[0:3]), [µl(1), µl(2), µl(4)], one_tip=True)
p.transfer(clone_plate2.well("A1"), pcr_plate.wells(seq_wells[3:6]), [µl(2), µl(4), µl(6)], one_tip=True)

assert all(pcr_plate.well(w).volume == µl(20) for w in seq_wells)
assert clone_plate1.well("A1").volume == µl(11)
assert clone_plate2.well("A1").volume == µl(66)

# --------------------------------------------------------------
# qPCR
# standard melting curve parameters
#
p.seal(pcr_plate)
p.thermocycle(pcr_plate,
              [{"cycles": 1, "steps": [{"temperature": "95:celsius", "duration": "2:minute"}]},
               {"cycles": 40, "steps": [{"temperature": "95:celsius", "duration": "5:second"},
                                        {"temperature": "60:celsius", "duration": "20:second"},
                                        {"temperature": "72:celsius", "duration": "15:second", "read": True}]}],
              volume=µl(20),  # volume is optional
              dataref=expid("qpcr"),
              dyes={"SYBR": seq_wells},  # dye must be specified (tells transcriptic what absorbance to use?)
              melting_start="65:celsius", melting_end="95:celsius",
              melting_increment="0.5:celsius", melting_rate="5:second")

# --------------------------------------------------------------
# Gel -- 20ul required
# Dilute such that I have 11ul for sequencing
#
p.unseal(pcr_plate)
p.distribute(water_tube, pcr_plate.wells(seq_wells), µl(11))
p.gel_separate(pcr_plate.wells(seq_wells), µl(20), "agarose(8,0.8%)", "ladder1", "10:minute", expid("gel"))

# This appears to be a bug in Transcriptic. The actual volume should be 11ul
# but it is not updating after running a gel with 20ul.
# Primer tube should be equal to dead volume, or it's a waste
assert all(pcr_plate.well(w).volume == µl(31) for w in seq_wells)
assert primer_tube.volume == µl(16) == dead_volume['micro-1.5'] + µl(1)
assert water_tube.volume > µl(25)

# ---------------------------------------------------------------
# Test and run protocol
#
jprotocol = json.dumps(p.as_dict(), indent=2)
!echo '{jprotocol}' | transcriptic analyze
open("protocol_{}.json".format(experiment_name), 'w').write(jprotocol)
WARNING:root:Low volume for well sfgfp_puc19_gibson_v1_clone/sfgfp_puc19_gibson_v1_clone : 11.0:microliter
I can use Transcriptic's data API to access the raw qPCR data as json.
This feature is not very well documented, but it can be extremely
useful. It even gives you access to some
diagnostic data from the robots, which could help with debugging.
Here are the Ct (cycle threshold) values for each well. The Ct is
simply the point at which the fluorescence exceeds a certain value. It
tells us approximately how much DNA is currently present (and hence
approximately how much we started with).
# Simple util to convert wellnum to wellname
n_w = {str(wellnum): 'ABCDEFGH'[wellnum//12] + str(1 + wellnum%12) for wellnum in range(96)}
w_n = {v: k for k, v in n_w.items()}

ct_vals = {n_w[k]: v for k, v in pp_data["amp0"]["SYBR"]["cts"].items()}
ct_df = pd.DataFrame(ct_vals, index=["Ct"]).T
ct_df["well"] = ct_df.index
f, ax = plt.subplots(figsize=(16, 6))
_ = sns.barplot(y="well", x="Ct", data=ct_df)
We can see that amplification happens earliest in wells D2/4/6 (which
use DNA from my "v3" Gibson assembly), then B2/4/6 (my "v1" Gibson
assembly). The main difference between v1 and v3 is that the v3 DNA
was diluted 4X according to the NEB protocol, but both should work.
There is some amplification after cycle 30 in the control wells (F2, F4)
despite having no template DNA, but that's not unusual since they
include lots of primer DNA.
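To make the Ct definition concrete, here is a minimal sketch of how a cycle threshold could be computed from a raw fluorescence trace. I don't know how Transcriptic's instrument software calculates it; the linear interpolation between cycles is my assumption, the function name is hypothetical, and the traces are synthetic numbers made up for illustration.

```python
def cycle_threshold(fluorescence, threshold):
    """Fractional cycle (1-indexed) at which fluorescence first reaches threshold.

    fluorescence[i] is the reading after cycle i + 1. Returns None if the
    threshold is never reached. Linear interpolation is an assumption.
    """
    for i, value in enumerate(fluorescence):
        if value >= threshold:
            if i == 0:
                return 1.0
            prev = fluorescence[i - 1]
            # prev < threshold <= value, so the denominator is positive
            return i + (threshold - prev) / (value - prev)
    return None


# Synthetic traces: one doubling every cycle, and a flat no-template control
amplifying = [0.1 * 2 ** c for c in range(10)]
flat_control = [0.1] * 10

print(cycle_threshold(amplifying, 1.0))
print(cycle_threshold(flat_control, 1.0))
```

A well that starts with more template crosses the threshold at an earlier cycle, which is why a lower Ct indicates more starting DNA.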
I can also plot the qPCR amplification curve to see the dynamics of the
amplification.
Overall, the qPCR results look great, with good amplification for both
versions of my Gibson assembly, and no real amplification in the
control. Since the v3 assembly worked a bit better than v1 I will use
that from here on.
Results: Gibson assembly gel
The gel is also very clean, showing strong bands at just below 1kb in
lanes B2, B4, B6, D2, D4, D6, which is the size I expect (the insert is
about 740bp, and the M13 primers are about 40bp upstream and
downstream). The second band corresponds to primers. We can be pretty
sure of this since lanes F2 and F4 have only primer DNA and no template
DNA.
Gel electrophoresis: the "v3" Gibson assembly has stronger bands (D2,
D4, D6), in line with the qPCR data above.
Step 4. Transformation
Transformation is the process of altering an organism by adding DNA. So
in this experiment I am transforming E. coli with the sfGFP-expressing
plasmid pUC19.
I am using an easy-to-work-with Zymo DH5α Mix&Go strain and the
recommended Zymo protocol.
This strain is part of the standard Transcriptic inventory. In general,
transformations can be tricky since competent cells are quite fragile,
so the simpler and more robust the protocol the better. In regular
molecular biology labs, these competent cells would likely be too
expensive for general use.
Zymo Mix & Go cells have a simple protocol
The trouble with robots
This protocol is a good example of how adapting human protocols for use
with robots can be difficult, and can fail unexpectedly. Protocols can
be surprisingly vague ("shake the tube from side to side"), relying on
the shared context of molecular biologists, or they may ask for advanced
image processing ("check that the pellet was resuspended"). Humans don't
mind these tasks, but robots need more explicit instructions.
There are some interesting timing issues with this transformation. The
transformation protocol advises that the cells not stay at room
temperature for more than a few seconds, and that the plate should be
pre-warmed to 37C. In theory, you would want to start the pre-warming so
it ends at the same time as the transformation, but it's not clear how
the Transcriptic robots would handle this situation — to my knowledge,
there is no way to sync up the steps of the protocol exactly. A lack of
fine control over timing seems like it will be a common issue with
robotic protocols, due to the comparative inflexibility of the robotic
arm, scheduling conflicts, etc. We will have to adjust our protocols
accordingly.
There are usually reasonable solutions: sometimes you just have to use
different reagents (e.g., hardier cells, like the Mix&Go cells above);
sometimes you just try overkill (e.g., shake the thing ten times instead
of three); sometimes you have to come up with tricks to make the process
work better with robots (e.g., use a PCR machine for heat-shocking).
Of course, the big advantage is that once the protocol works, you
can mostly rely on it to work again and again. You may even be able to
quantify how robust the protocol is, and improve it over time!
Test Transformation
Before I start transforming with my fully assembled plasmid, I run a
simple experiment to make sure that a transformation using regular pUC19
(i.e., no Gibson assembly, and no sfGFP insert DNA) works. pUC19
contains an ampicillin-resistance gene, so a successful transformation
should allow the bacteria to grow on plates that contain this
antibiotic.
I transfer the bacteria straight onto plates ("6-flat" in Transcriptic's
terminology) that either have ampicillin or no ampicillin. I expect that
transformed bacteria contain an ampicillin-resistance gene, and hence
will grow. Untransformed bacteria should not grow.
"""Simple transformation protocol: transformation with unaltered pUC19"""
p = Protocol()
experiment_name = "debug_sfgfp_puc19_gibson_v1"

inv = {
    "water": "rs17gmh5wafm5p",        # catalog; Autoclaved MilliQ H2O; ambient
    "DH5a": "rs16pbj944fnny",         # catalog; Zymo DH5α; cold_80
    "LB Miller": "rs17bafcbmyrmh",    # catalog; LB Broth Miller; cold_4
    "Amp 100mgml": "rs17msfk8ujkca",  # catalog; Ampicillin 100mg/ml; cold_20
    "pUC19": "rs17tcqmncjfsh",        # catalog; pUC19; cold_20
}

# Catalog
transform_plate = p.ref("transform_plate", cont_type="96-pcr", storage="ambient", discard=True)
transform_tube = transform_plate.well(0)

# ------------------------------------------------------------------------------------
# Plating transformed bacteria according to Tali's protocol (requires different code!)
# http://learn.transcriptic.com/blog/2015/9/9/provisioning-commercial-reagents
# Add 1-5ul plasmid and pre-warm culture plates to 37C before starting.
#

#
# Extra inventory for plating
#
inv["lb-broth-100ug-ml-amp_6-flat"] = "ki17sbb845ssx9"  # (kit, not normal ref) from blogpost
inv["noAB-amp_6-flat"] = "ki17reefwqq3sq"               # kit id
inv["LB Miller"] = "rs17bafcbmyrmh"

#
# Ampicillin and no ampicillin plates
#
amp_6_flat = Container(None, p.container_type('6-flat'))
p.refs["amp_6_flat"] = Ref('amp_6_flat',
                           {"reserve": inv['lb-broth-100ug-ml-amp_6-flat'], "store": {"where": 'cold_4'}},
                           amp_6_flat)
noAB_6_flat = Container(None, p.container_type('6-flat'))
p.refs["noAB_6_flat"] = Ref('noAB_6_flat',
                            {"reserve": inv['noAB-amp_6-flat'], "store": {"where": 'cold_4'}},
                            noAB_6_flat)

#
# Provision competent bacteria
#
p.provision(inv["DH5a"], transform_tube, µl(50))
p.provision(inv["pUC19"], transform_tube, µl(2))

#
# Heatshock the bacteria to transform using a PCR machine
#
p.seal(transform_plate)
p.thermocycle(transform_plate,
              [{"cycles": 1, "steps": [{"temperature": "4:celsius", "duration": "5:minute"}]},
               {"cycles": 1, "steps": [{"temperature": "37:celsius", "duration": "30:minute"}]}],
              volume=µl(50))
p.unseal(transform_plate)

#
# Then dilute bacteria and spread onto 6-flat plates
# Put more on ampicillin plates for more opportunities to get a colony
#
p.provision(inv["LB Miller"], transform_tube, µl(355))
p.mix(transform_tube, µl(150), repetitions=5)
for i in range(6):
    p.spread(transform_tube, amp_6_flat.well(i), µl(55))
    p.spread(transform_tube, noAB_6_flat.well(i), µl(10))

assert transform_tube.volume >= µl(15), transform_tube.volume

#
# Incubate and image 6-flat plates over 18 hours
#
for flat_name, flat in [("amp_6_flat", amp_6_flat), ("noAB_6_flat", noAB_6_flat)]:
    for timepoint in [6, 12, 18]:
        p.cover(flat)
        p.incubate(flat, "warm_37", "6:hour")
        p.uncover(flat)
        p.image_plate(flat, mode="top", dataref=expid("{}_t{}".format(flat_name, timepoint)))

# ---------------------------------------------------------------
# Analyze protocol
#
jprotocol = json.dumps(p.as_dict(), indent=2)
!echo '{jprotocol}' | transcriptic analyze
#print("Protocol {}\n\n{}".format(experiment_name, protocol))
open("protocol_{}.json".format(experiment_name), 'w').write(jprotocol)
In the following plate photographs, we can see that with no antibiotic
(left-hand side plates), there is growth on all six plates, though the
amount of growth is quite variable, which is worrying. Transcriptic's
robots do not seem to do a great job with spreading, a task that does
require some dexterity.
In the presence of antibiotic (right-hand side plates), I also see
growth, though again it's inconsistent. The first two antibiotic plates
look odd, with lots of growth, which is likely the result of adding 55µl
to these plates compared to the 10µl I added to the no-antibiotic
plates. The third plate has some colonies and is essentially what I
expected to see for all the plates. The last three plates should have
some growth but do not. My only explanation for these odd results is
that I did insufficient mixing of cells and media, so almost all the
cells were dispensed into the first two plates.
(I really should have also done a negative control here with
untransformed bacteria on ampicillin plates, but I had already done this
in a previous experiment, so I know that the stocked ampicillin plates
kill this strain of E. coli. Growth was much weaker in the ampicillin
plates despite dispensing a greater volume, as expected.)
Overall, the transformation worked well enough to proceed, but there are
some kinks to work out.
Plates of cells transformed with pUC19 after 18 hours: no antibiotic
(left) and antibiotic (right)
Transformation with assembled product
Since the Gibson assembly and a simple pUC19 transformation seem to
work, I can now attempt a transformation with a fully-assembled
sfGFP-expressing plasmid.
Apart from the assembled insert, I will also add some IPTG and X-gal to
the plates, so that I can see the successful transformation with a
blue–white screen.
This additional information is useful because a transformation with
regular pUC19, which does not contain sfGFP, would still confer
antibiotic resistance.
Absorbance and Fluorescence
sfGFP fluoresces best with 485nm excitation / 510nm emission wavelengths
(according to this chart). I found that 485/535 worked better at
Transcriptic, I assume because 485 and 510 are too similar. I measure
the growth of the bacteria at 600nm (OD600).
My IPTG is at a concentration of 1M and should be used at a 1:1000
dilution (1mM). My X-gal is at a concentration of 20mg/ml and should be
used at a 1:1000 dilution (20µg/ml). Hence to 2000µl of LB broth, I add
2µl of each.
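The dilution arithmetic is simple enough to check in a few lines. This is just a sketch with made-up helper names, and it ignores the small change in total volume from adding the stock itself.

```python
def stock_volume_ul(final_volume_ul, dilution_factor):
    """Volume of stock to add for a 1:dilution_factor dilution."""
    return final_volume_ul / dilution_factor


def working_concentration(stock_concentration, dilution_factor):
    """Concentration after dilution, in the same units as the stock."""
    return stock_concentration / dilution_factor


broth_ul = 2000.0
iptg_ul = stock_volume_ul(broth_ul, 1000)
xgal_ul = stock_volume_ul(broth_ul, 1000)

iptg_mM = working_concentration(1.0, 1000) * 1000          # 1 M stock, reported in mM
xgal_ug_per_ml = working_concentration(20.0, 1000) * 1000  # 20 mg/ml stock, reported in µg/ml

print("add {:.0f} µl IPTG and {:.0f} µl X-gal".format(iptg_ul, xgal_ul))
print("IPTG: {:.0f} mM, X-gal: {:.0f} µg/ml".format(iptg_mM, xgal_ug_per_ml))
```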
According to one protocol you should first spread 40µl of X-gal at
20mg/ml and 40µl of IPTG at 0.1M (or 4µl of IPTG at 1M) and then dry it
for 30 minutes. That procedure did not work for me, so instead I mix
IPTG, X-gal and competent cells, and spread that mixture directly.
"""Full Gibson assembly and transformation protocol for sfGFP and pUC19
v1: Spread IPTG and X-gal onto plates, then spread cells
v2: Mix IPTG, X-gal and cells; spread the mixture
v3: exclude X-gal so I can do colony picking better
v4: repeat v3 to try other excitation/emission wavelengths"""

p = Protocol()

options = {
    "gibson": False,        # do a new gibson assembly
    "sanger": False,        # sanger sequence product
    "control_pUC19": True,  # unassembled pUC19
    "XGal": False           # excluding X-gal should make the colony picking easier
}
for k, v in list(options.items()):
    if v is False: del options[k]

experiment_name = "sfgfp_puc19_gibson_plates_v4"

# -----------------------------------------------------------------------
# Inventory
#
inv = {
    # catalog
    "water": "rs17gmh5wafm5p",        # catalog; Autoclaved MilliQ H2O; ambient
    "DH5a": "rs16pbj944fnny",         # catalog; Zymo DH5α; cold_80
    "Gibson Mix": "rs16pfatkggmk5",   # catalog; Gibson Mix (2X); cold_20
    "LB Miller": "rs17bafcbmyrmh",    # catalog; LB Broth Miller; cold_4
    "Amp 100mgml": "rs17msfk8ujkca",  # catalog; Ampicillin 100mg/ml; cold_20
    "pUC19": "rs17tcqmncjfsh",        # catalog; pUC19; cold_20
    # my inventory
    "puc19_cut_v2": "ct187v4ea7vvca",  # inventory; pUC19 cut with EcoRI; cold_20
    "IPTG": "ct18a2r5wn6tqz",          # inventory; IPTG at 1M (conc semi-documented); cold_20
    "XGal": "ct18a2r5wp5hcv",          # inventory; XGal at 0.1M (conc not documented); cold_20
    "sfgfp_pcroe_v8_amplified": "ct1874zqh22pab",     # inventory; sfGFP amplified to 40ng/ul; cold_4
    "sfgfp_puc19_gibson_v3_clone": "ct188ejywa8jcv",  # inventory; assembled sfGFP; cold_4
    # kits (must be used differently)
    "lb-broth-100ug-ml-amp_6-flat": "ki17sbb845ssx9",  # catalog; ampicillin plates
    "noAB-amp_6-flat": "ki17reefwqq3sq"                # catalog; no antibiotic plates
}

#
# Catalog (all to be discarded afterward)
#
water_tube = p.ref("water", cont_type="micro-1.5", storage="ambient", discard=True).well(0)
transform_plate = p.ref("trn_plate", cont_type="96-pcr", storage="ambient", discard=True)
transform_tube = transform_plate.well(39)  # experiment
transform_tube_L = p.ref("trn_tubeL", cont_type="micro-1.5", storage="ambient", discard=True).well(0)
transctrl_tube = transform_plate.well(56)  # control
transctrl_tube_L = p.ref("trc_tubeL", cont_type="micro-1.5", storage="ambient", discard=True).well(0)

#
# Plating according to Tali's protocol
# http://learn.transcriptic.com/blog/2015/9/9/provisioning-commercial-reagents
#
amp_6_flat = Container(None, p.container_type('6-flat'))
p.refs[expid("amp_6_flat")] = Ref(expid("amp_6_flat"),
                                  {"reserve": inv['lb-broth-100ug-ml-amp_6-flat'], "store": {"where": 'cold_4'}},
                                  amp_6_flat)
noAB_6_flat = Container(None, p.container_type('6-flat'))
p.refs[expid("noAB_6_flat")] = Ref(expid("noAB_6_flat"),
                                   {"reserve": inv['noAB-amp_6-flat'], "store": {"where": 'cold_4'}},
                                   noAB_6_flat)

#
# My inventory: EcoRI-cut pUC19, oePCR'd sfGFP, Gibson-assembled pUC19, IPTG and X-Gal
#
if "gibson" in options:
    puc19_cut_tube = p.ref("puc19_ecori_v2_puc19_cut", id=inv["puc19_cut_v2"],
                           cont_type="micro-1.5", storage="cold_20").well(0)
    sfgfp_pcroe_amp_tube = p.ref("sfgfp_pcroe_v8_amplified", id=inv["sfgfp_pcroe_v8_amplified"],
                                 cont_type="micro-1.5", storage="cold_4").well(0)
    clone_plate = p.ref(expid("clone"), cont_type="96-pcr", storage="cold_4", discard=False)
else:
    clone_plate = p.ref("sfgfp_puc19_gibson_v3_clone", id=inv["sfgfp_puc19_gibson_v3_clone"],
                        cont_type="96-pcr", storage="cold_4", discard=False)

IPTG_tube = p.ref("IPTG", id=inv["IPTG"], cont_type="micro-1.5", storage="cold_20").well(0)
if "XGal" in options:
    XGal_tube = p.ref("XGal", id=inv["XGal"], cont_type="micro-1.5", storage="cold_20").well(0)

#
# Initialize inventory
#
if "gibson" in options:
    all_inventory_wells = [puc19_cut_tube, sfgfp_pcroe_amp_tube, IPTG_tube]
    assert puc19_cut_tube.volume == µl(66), puc19_cut_tube.volume
    assert sfgfp_pcroe_amp_tube.volume == µl(36), sfgfp_pcroe_amp_tube.volume
else:
    all_inventory_wells = [IPTG_tube, clone_plate.well(0)]
if "XGal" in options: all_inventory_wells.append(XGal_tube)

for well in all_inventory_wells:
    init_inventory_well(well)
    print("Inventory: {}{}{}".format(well.name, well.volume, well.properties))

#
# Provisioning. Water is used all over the protocol. Provision an excess since it's cheap
#
p.provision(inv["water"], water_tube, µl(500))

# -----------------------------------------------------------------------------
# Cloning/assembly (see NEBuilder protocol above)
#
# "Optimized efficiency is 50–100 ng of vectors with 2 fold excess of inserts."
# pUC19 is 20ng/ul (78ul total).
# sfGFP is ~40ng/ul (48ul total)
# Therefore 4ul of each gives 80ng and 160ng of vector and insert respectively
#
def do_gibson_assembly():
    #
    # Combine all the Gibson reagents in one tube and thermocycle
    #
    p.provision(inv["Gibson Mix"], clone_plate.well(0), µl(10))
    p.transfer(water_tube, clone_plate.well(0), µl(2))
    p.transfer(puc19_cut_tube, clone_plate.well(0), µl(4))
    p.transfer(sfgfp_pcroe_amp_tube, clone_plate.well(0), µl(4),
               mix_after=True, mix_vol=µl(10), repetitions=10)

    p.seal(clone_plate)
    p.thermocycle(clone_plate,
                  [{"cycles": 1, "steps": [{"temperature": "50:celsius", "duration": "16:minute"}]}],
                  volume=µl(50))

    #
    # Dilute assembled plasmid 4X according to the NEB Gibson assembly protocol (20ul->80ul)
    #
    p.unseal(clone_plate)
    p.transfer(water_tube, clone_plate.well(0), µl(60), mix_after=True, mix_vol=µl(40), repetitions=5)
    return

# --------------------------------------------------------------------------------------------------
# Transformation
# "Transform NEB 5-alpha Competent E. coli cells with 2 μl of the
# assembled product, following the appropriate transformation protocol."
#
# Mix & Go http://www.zymoresearch.com/downloads/dl/file/id/173/t3015i.pdf
# "[After mixing] Immediately place on ice and incubate for 2-5 minutes"
# "The highest transformation efficiencies can be obtained by incubating Mix & Go cells with DNA on
# ice for 2-5 minutes (60 minutes maximum) prior to plating."
# "It is recommended that culture plates be pre-warmed to >20°C (preferably 37°C) prior to plating."
# "Avoid exposing the cells to room temperature for more than a few seconds at a time."
#
# "If competent cells are purchased from other manufacture, dilute assembled products 4-fold
# with H2O prior transformation. This can be achieved by mixing 5 μl of assembled products with
# 15 μl of H2O. Add 2 μl of the diluted assembled product to competent cells."
#
def _do_transformation():
    #
    # Combine plasmid and competent bacteria in a pcr_plate and shock
    #
    p.provision(inv["DH5a"], transform_tube, µl(50))
    p.transfer(clone_plate.well(0), transform_tube, µl(3), dispense_speed="10:microliter/second")
    assert clone_plate.well(0).volume == µl(54), clone_plate.well(0).volume

    if 'control_pUC19' in options:
        p.provision(inv["DH5a"], transctrl_tube, µl(50))
        p.provision(inv["pUC19"], transctrl_tube, µl(1))

    #
    # Heatshock the bacteria to transform using a PCR machine
    #
    p.seal(transform_plate)
    p.thermocycle(transform_plate,
                  [{"cycles": 1, "steps": [{"temperature": "4:celsius", "duration": "5:minute"}]},
                   {"cycles": 1, "steps": [{"temperature": "37:celsius", "duration": "30:minute"}]}],
                  volume=µl(50))
    return

def _transfer_transformed_to_plates():
    assert transform_tube.volume == µl(53), transform_tube.volume
    p.unseal(transform_plate)

    num_ab_plates = 4  # antibiotic plates

    #
    # Transfer bacteria to a bigger tube for diluting
    # Then spread onto 6-flat plates
    # Generally you would spread 50-100ul of diluted bacteria
    # Put more on ampicillin plates for more opportunities to get a colony
    # I use a dilution series since it's unclear how much to plate
    #
    p.provision(inv["LB Miller"], transform_tube_L, µl(429))

    #
    # Add all IPTG and XGal to the master tube
    # 4ul (1M) IPTG on each plate; 40ul XGal on each plate
    #
    p.transfer(IPTG_tube, transform_tube_L, µl(4*num_ab_plates))
    if 'XGal' in options:
        p.transfer(XGal_tube, transform_tube_L, µl(40*num_ab_plates))

    #
    # Add the transformed cells and mix (use new mix op in case of different pipette)
    #
    p.transfer(transform_tube, transform_tube_L, µl(50))
    p.mix(transform_tube_L, volume=transform_tube_L.volume/2, repetitions=10)

    assert transform_tube.volume == dead_volume['96-pcr'] == µl(3), transform_tube.volume
    assert transform_tube_L.volume == µl(495), transform_tube_L.volume

    #
    # Spread an average of 60ul on each plate == 480ul total
    #
    for i in range(num_ab_plates):
        p.spread(transform_tube_L, amp_6_flat.well(i), µl(51+i*6))
        p.spread(transform_tube_L, noAB_6_flat.well(i), µl(51+i*6))

    assert transform_tube_L.volume == dead_volume["micro-1.5"], transform_tube_L.volume

    #
    # Controls: include 2 ordinary pUC19-transformed plates as a control
    #
    if 'control_pUC19' in options:
        num_ctrl = 2
        assert num_ab_plates + num_ctrl <= 6

        p.provision(inv["LB Miller"], transctrl_tube_L, µl(184)+dead_volume["micro-1.5"])
        p.transfer(IPTG_tube, transctrl_tube_L, µl(4*num_ctrl))
        if "XGal" in options: p.transfer(XGal_tube, transctrl_tube_L, µl(40*num_ctrl))
        p.transfer(transctrl_tube, transctrl_tube_L, µl(48))
        p.mix(transctrl_tube_L, volume=transctrl_tube_L.volume/2, repetitions=10)

        for i in range(num_ctrl):
            p.spread(transctrl_tube_L, amp_6_flat.well(num_ab_plates+i), µl(55+i*10))
            p.spread(transctrl_tube_L, noAB_6_flat.well(num_ab_plates+i), µl(55+i*10))

        assert transctrl_tube_L.volume == dead_volume["micro-1.5"], transctrl_tube_L.volume
        assert IPTG_tube.volume == µl(808), IPTG_tube.volume
        if "XGal" in options: assert XGal_tube.volume == µl(516), XGal_tube.volume
    return

def do_transformation():
    _do_transformation()
    _transfer_transformed_to_plates()

# ------------------------------------------------------
# Measure growth in plates (photograph)
#
def measure_growth():
    #
    # Incubate and photograph 6-flat plates over 18 hours
    # to see blue or white colonies
    #
    for flat_name, flat in [(expid("amp_6_flat"), amp_6_flat), (expid("noAB_6_flat"), noAB_6_flat)]:
        for timepoint in [9, 18]:
            p.cover(flat)
            p.incubate(flat, "warm_37", "9:hour")
            p.uncover(flat)
            p.image_plate(flat, mode="top", dataref=expid("{}_t{}".format(flat_name, timepoint)))
    return

# ---------------------------------------------------------------
# Sanger sequencing, TURNED OFF
# Sequence to make sure assembly worked
# 500ng plasmid, 1 µl of a 10 µM stock primer
# "M13_F" : "rs17tcpqwqcaxe", # catalog; M13 Forward (-41); cold_20 (1ul = 100pmol)
# "M13_R" : "rs17tcph6e2qzh", # catalog; M13 Reverse (-48); cold_20 (1ul = 100pmol)
#
def do_sanger_seq():
    seq_primers = [inv["M13_F"], inv["M13_R"]]
    seq_wells = ["G1", "G2"]
    p.unseal(pcr_plate)
    for primer_num, seq_well in [(0, seq_wells[0]), (1, seq_wells[1])]:
        p.provision(seq_primers[primer_num], pcr_plate.wells([seq_well]), µl(1))

    p.transfer(pcr_plate.wells(["A1"]), pcr_plate.wells(seq_wells), µl(5), mix_before=True, mix_vol=µl(10))
    p.transfer(water_tube, pcr_plate.wells(seq_wells), µl(9))
    p.mix(pcr_plate.wells(seq_wells), volume=µl(7.5), repetitions=10)
    p.sangerseq(pcr_plate, pcr_plate.wells(seq_wells[0]).indices(), expid("seq1"))
    p.sangerseq(pcr_plate, pcr_plate.wells(seq_wells[1]).indices(), expid("seq2"))
    return

# ---------------------------------------------------------------
# Generate protocol
#

# Skip Gibson since I already did it
if 'gibson' in options: do_gibson_assembly()
do_transformation()
measure_growth()
if 'sanger' in options: do_sanger_seq()

# ---------------------------------------------------------------
# Output protocol
#
jprotocol = json.dumps(p.as_dict(), indent=2)
!echo '{jprotocol}' | transcriptic analyze
#print("\nProtocol {}\n\n{}".format(experiment_name, jprotocol))
open("protocol_{}.json".format(experiment_name), 'w').write(jprotocol)
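The volume asserts in the plating protocol above encode some simple bookkeeping. As a quick sanity check, here is a minimal standalone sketch of that arithmetic in plain Python. It makes no autoprotocol calls, the variable names are mine, and it assumes the v4 settings (X-gal switched off).

```python
# Volume bookkeeping for the experimental plating step, in microliters.
# Plain arithmetic only; no autoprotocol objects are involved.
num_ab_plates = 4

# Master tube: LB + IPTG (4ul per plate) + transformed cells
master_volume = 429 + 4 * num_ab_plates + 50

# The same dilution series is spread on the amp plates and the no-antibiotic plates
spread_volumes = [51 + i * 6 for i in range(num_ab_plates)]
total_spread = 2 * sum(spread_volumes)
leftover = master_volume - total_spread

print(spread_volumes)                         # [51, 57, 63, 69]
print(master_volume, total_spread, leftover)  # 495 480 15
```

The 15µl left over is what the protocol asserts to be the dead volume of the micro-1.5 tube, and 480µl over eight wells gives the 60µl average mentioned in the comments.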
Once the colonies are growing on an ampicillin plate, I can "pick"
individual colonies and inoculate wells in a 96-well plate with those
colonies. There is an autoprotocol colony-picking command
(autopick)
for this purpose.
"""Pick colonies from plates and grow in amp media and check for fluorescence.
v2: try again with a new plate (no blue colonies)
v3: repeat with different emission and excitation wavelengths"""

p = Protocol()

options = {}
for k, v in list(options.items()):
    if v is False: del options[k]

experiment_name = "sfgfp_puc19_gibson_pick_v3"

def plate_expid(val):
    """refer to the previous plating experiment's outputs"""
    plate_exp = "sfgfp_puc19_gibson_plates_v4"
    return "{}_{}".format(plate_exp, val)

# -----------------------------------------------------------------------
# Inventory
#
inv = {
    # catalog
    "water": "rs17gmh5wafm5p",        # catalog; Autoclaved MilliQ H2O; ambient
    "LB Miller": "rs17bafcbmyrmh",    # catalog; LB Broth Miller; cold_4
    "Amp 100mgml": "rs17msfk8ujkca",  # catalog; Ampicillin 100mg/ml; cold_20
    "IPTG": "ct18a2r5wn6tqz",         # inventory; IPTG at 1M (conc semi-documented); cold_20
    # plates from previous experiment, must be changed every new experiment
    plate_expid("amp_6_flat"): "ct18snmr9avvg9",   # inventory; Ampicillin plates with blue-white screening of pUC19
    plate_expid("noAB_6_flat"): "ct18snmr9dxfw2",  # inventory; no AB plates with blue-white screening of pUC19
}

# Tubes and plates
lb_amp_tubes = [p.ref("lb_amp_{}".format(i+1), cont_type="micro-2.0", storage="ambient", discard=True).well(0)
                for i in range(4)]
lb_xab_tube = p.ref("lb_xab", cont_type="micro-2.0", storage="ambient", discard=True).well(0)
growth_plate = p.ref(expid("growth"), cont_type="96-flat", storage="cold_4", discard=False)

# My inventory
IPTG_tube = p.ref("IPTG", id=inv["IPTG"], cont_type="micro-1.5", storage="cold_20").well(0)

# ampicillin plate
amp_6_flat = Container(None, p.container_type('6-flat'))
p.refs[plate_expid("amp_6_flat")] = Ref(plate_expid("amp_6_flat"),
                                        {"id": inv[plate_expid("amp_6_flat")], "store": {"where": 'cold_4'}},
                                        amp_6_flat)

# Use a total of 50 wells
abs_wells = ["{}{}".format(row, col) for row in "BCDEF" for col in range(1, 11)]
abs_wells_T = ["{}{}".format(row, col) for col in range(1, 11) for row in "BCDEF"]
assert abs_wells[:3] == ["B1", "B2", "B3"] and abs_wells_T[:3] == ["B1", "C1", "D1"]

def prepare_growth_wells():
    #
    # To LB, add ampicillin at ~1/1000 concentration
    # Mix slowly in case of overflow
    #
    p.provision(inv["LB Miller"], lb_xab_tube, µl(1913))
    for lb_amp_tube in lb_amp_tubes:
        p.provision(inv["Amp 100mgml"], lb_amp_tube, µl(2))
        p.provision(inv["LB Miller"], lb_amp_tube, µl(1911))
        p.mix(lb_amp_tube, volume=µl(800), repetitions=10)

    #
    # Add IPTG but save on X-Gal
    # http://openwetware.org/images/f/f1/Dh5a_sub.pdf
    # "If you are concerned about obtaining maximal levels of expression, add IPTG to a final concentration of 1 mM."
    # 2ul of IPTG in 2000ul equals 1mM
    #
    p.transfer(IPTG_tube, [lb_xab_tube] + lb_amp_tubes, µl(2), one_tip=True)

    #
    # Distribute LB among wells, row D is control (no ampicillin)
    #
    cols = range(1, 11)
    row = "D"  # control, no AB
    cwells = ["{}{}".format(row, col) for col in cols]
    assert set(cwells).issubset(set(abs_wells))
    p.distribute(lb_xab_tube, growth_plate.wells(cwells), µl(190), allow_carryover=True)

    rows = "BCEF"
    for row, lb_amp_tube in zip(rows, lb_amp_tubes):
        cwells = ["{}{}".format(row, col) for col in cols]
        assert set(cwells).issubset(set(abs_wells))
        p.distribute(lb_amp_tube, growth_plate.wells(cwells), µl(190), allow_carryover=True)

    assert all(lb_amp_tube.volume == lb_xab_tube.volume == dead_volume['micro-2.0']
               for lb_amp_tube in lb_amp_tubes)
    return

def measure_growth_wells():
    #
    # Growth: absorbance and fluorescence over 24 hours
    # Absorbance at 600nm: cell growth
    # Absorbance at 615nm: X-gal, in theory
    # Fluorescence at 485nm/510nm: sfGFP
    # or 450nm/508nm (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2695656/)
    #
    hr = 4
    for t in range(0, 24, hr):
        if t > 0:
            p.cover(growth_plate)
            p.incubate(growth_plate, "warm_37", "{}:hour".format(hr), shaking=True)
            p.uncover(growth_plate)

        p.fluorescence(growth_plate, growth_plate.wells(abs_wells).indices(),
                       excitation="485:nanometer", emission="535:nanometer",
                       dataref=expid("fl2_{}".format(t)), flashes=25)
        p.fluorescence(growth_plate, growth_plate.wells(abs_wells).indices(),
                       excitation="450:nanometer", emission="508:nanometer",
                       dataref=expid("fl1_{}".format(t)), flashes=25)
        p.fluorescence(growth_plate, growth_plate.wells(abs_wells).indices(),
                       excitation="395:nanometer", emission="508:nanometer",
                       dataref=expid("fl0_{}".format(t)), flashes=25)
        p.absorbance(growth_plate, growth_plate.wells(abs_wells).indices(),
                     wavelength="600:nanometer",
                     dataref=expid("abs_{}".format(t)), flashes=25)
    return

# ---------------------------------------------------------------
# Protocol steps
#
prepare_growth_wells()
batch = 10
for i in range(5):
    p.autopick(amp_6_flat.well(i),
               growth_plate.wells(abs_wells_T[i*batch:i*batch+batch]),
               dataref=expid("autopick_{}".format(i)))
    p.image_plate(amp_6_flat, mode="top", dataref=expid("autopicked_{}".format(i)))

measure_growth_wells()

# ---------------------------------------------------------------
# Output protocol
#
jprotocol = json.dumps(p.as_dict(), indent=2)
!echo '{jprotocol}' | transcriptic analyze
open("protocol_{}.json".format(experiment_name), 'w').write(jprotocol)
The blue–white screening worked beautifully, with mostly white colonies
on the antibiotic plates (1-4) and blue only on the non-antibiotic plate
(5-6). This is exactly what I expect, and I was relieved to see it,
especially since I was using my own IPTG and X-gal that I shipped to
Transcriptic.
Blue–white screening plates with ampicillin (1-4) and no antibiotic
(5-6)
However, the colony-picking robot did not work well with these blue and
white colonies. The image below was generated by subtracting successive
plate photographs after each round of plate picking and increasing the
contrast of the differences (using
GraphicsMagick). This way, I can
visualize which colonies were picked (albeit imperfectly since picked
colonies are not completely removed).
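The subtract-and-stretch idea can be sketched in a few lines. I did the real thing with GraphicsMagick on the plate photographs; the version below is only an illustration in pure Python, using tiny hypothetical grayscale images stored as lists of lists, and the function names are made up for this example.

```python
def difference_image(before, after):
    """Absolute per-pixel difference of two equally sized grayscale images."""
    return [[abs(a - b) for a, b in zip(row_after, row_before)]
            for row_after, row_before in zip(after, before)]

def stretch_contrast(image, max_value=255):
    """Linearly rescale pixel values so the largest difference becomes max_value."""
    peak = max(max(row) for row in image)
    if peak == 0:
        return [row[:] for row in image]  # identical images: nothing to stretch
    return [[pixel * max_value // peak for pixel in row] for row in image]

# Hypothetical 3x3 plate photos: a colony (dark pixel, value 40) at the centre
# is mostly, but not completely, removed by picking (value 150 afterwards).
before = [[200, 200, 200], [200, 40, 200], [200, 200, 200]]
after = [[200, 201, 200], [200, 150, 200], [200, 200, 200]]

diff = stretch_contrast(difference_image(before, after))
print(diff)  # [[0, 2, 0], [0, 255, 0], [0, 0, 0]]
```

The picked colony dominates the stretched difference, while small lighting changes elsewhere stay close to zero.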
I also annotate the image with the number of colonies reported picked by
Transcriptic. The robot is supposed to pick a maximum of 10 colonies
from the first five plates. However, few colonies were picked overall,
and those that were picked appear to be mostly blue. The only plate where
the robot found ten colonies was a control plate with only blue colonies.
My working theory is that the colony-picking robot preferentially
selected blue colonies since those have the highest contrast.
Blue–white screening plates with ampicillin (1-4) and no antibiotic
(5-6), annotated with number of colonies picked
Blue–white screening did serve a purpose in that it showed me that most
of the colonies were being correctly transformed, or at least that an
insertion was happening. However, to get better colony picking, I repeat
the experiment without X-gal.
Given only white colonies to pick, the colony-picking robot successfully
picked 10 colonies from each of the first five plates. I have to assume
most of the picked colonies have successful insertions.
Colonies growing on ampicillin plates (1-4) and no antibiotic plates
(5-6)
Results: Transformation with assembled product
After growing 50 picked colonies in a 96-well plate for 20 hours, I
measure fluorescence to see if sfGFP is being expressed. Transcriptic
uses a Tecan
Infinite
plate-reader to measure fluorescence and absorbance (and luminescence if
you want that).
In theory, any well that has growth has an assembled plasmid, since it
needs antibiotic resistance to grow, and every assembled plasmid is
expressing sfGFP. In reality, there are many reasons why that might not
happen, not least of which is that you can lose the sfGFP gene from the
plasmid without losing ampicillin resistance. A bacterium that loses the
sfGFP gene has a selection advantage over its competitors because it is
not wasting energy expressing it, so given enough generations of growth this
is certain to happen.
I collect absorbance (OD600) and fluorescence data every four hours for
20 hours (~60 generations).
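The "~60 generations" figure is just time divided by doubling time. A minimal sketch, assuming a 20 minute doubling time (a textbook value for E. coli in rich medium, and my assumption here) and uninterrupted exponential growth, which makes this an upper bound since real cultures saturate:

```python
def generations(hours, doubling_time_min=20.0):
    """Number of doublings in `hours` of uninterrupted exponential growth."""
    return hours * 60.0 / doubling_time_min

print(generations(20))        # 60.0
print(generations(20, 30.0))  # 40.0 with a slower, 30 minute doubling time
```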
I plot the data at hour 20, and a contrail of previous timepoints. I
only really care about the data at hour 20 since that's approximately
when fluorescence should peak.
Fluorescence vs OD600: wells with ampicillin are black, control wells
with no ampicillin are grey. A green glow is applied to wells with
plasmids where I have validated the sfGFP protein sequence is correct.
I run a
miniprep
to extract the plasmid DNA, then Sanger sequence using M13 primers.
Unfortunately, for some reason, minipreps are currently only available
via Transcriptic's web-based protocol launcher and not through
autoprotocol. I sequence the three wells with the highest fluorescence
readings (C1, D1, D3), and three others (B1, B3, E1) and align the
(forward and reverse) sequences against sfGFP with
muscle.
In wells C1, D1, and D3 there is a perfect match to my original sfGFP
sequence, while in wells B1, B3, and E1, there are gross mutations or
the alignment just fails.
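For a "perfect match" call, a much cruder check than a muscle alignment is often enough: does the expected sequence appear exactly in the read, on either strand? The sketch below is an illustration only, with short hypothetical sequences rather than the real sfGFP or Sanger reads, and helper names of my own. Unlike an alignment, it tolerates no sequencing errors at all.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))

def read_contains(read, expected):
    """True if `expected` appears exactly in the read, on either strand."""
    read = read.upper()
    expected = expected.upper()
    return expected in read or reverse_complement(expected) in read

# Hypothetical toy sequences, not the real insert
expected = "ATGAGCAAAGGAGAAGAA"
forward_read = "GGTACC" + expected + "TAAGCT"
reverse_read = reverse_complement(forward_read)
mutated_read = forward_read.replace("AAAGG", "AATGG")  # single-base change

print(read_contains(forward_read, expected))  # True
print(read_contains(reverse_read, expected))  # True
print(read_contains(mutated_read, expected))  # False
```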
Three glowing colonies
The results are good, though some aspects are surprising. For example,
the fluorescence reader starts out at a very high reading at timepoint 0
(40,000 units), for no apparent reason. By hour 20, it has settled down
to a more reasonable pattern, with a clear basal correlation between
OD600 and fluorescence (I assume because of a minor overlap in spectra),
plus some outliers with high fluorescence. Eyeballing, it looks like it
could be one, three or perhaps 11-15 outliers.
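One way to make the eyeballing less subjective is to fit the basal OD600 vs fluorescence trend robustly and flag wells that sit far above it. The sketch below is an illustration, not the analysis behind the figure: it uses a Theil-Sen fit (median of pairwise slopes), an arbitrary cutoff of 5 median absolute deviations, and entirely hypothetical readings.

```python
from itertools import combinations
from statistics import median

def theil_sen(points):
    """Robust line fit: median of pairwise slopes, then median intercept."""
    slopes = [(y2 - y1) / (x2 - x1)
              for (x1, y1), (x2, y2) in combinations(points, 2)
              if x2 != x1]
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in points)
    return slope, intercept

def high_outliers(readings, k=5.0):
    """Wells whose fluorescence is more than k MADs above the fitted line."""
    slope, intercept = theil_sen(list(readings.values()))
    residuals = {well: y - (slope * x + intercept)
                 for well, (x, y) in readings.items()}
    mad = median(abs(r) for r in residuals.values())
    return [well for well, r in residuals.items() if r > k * mad]

# Hypothetical (OD600, fluorescence) readings: six basal wells and two bright ones
readings = {
    "B1": (0.2, 700), "B2": (0.4, 910), "B3": (0.6, 1090),
    "B4": (0.8, 1310), "B5": (1.0, 1490), "B6": (1.2, 1710),
    "C1": (0.9, 6400), "D3": (1.1, 7600),
}
print(high_outliers(readings))  # ['C1', 'D3']
```

A robust fit matters here because an ordinary least-squares line would be pulled up towards the bright wells it is supposed to expose.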
Some of the wells showing high fluorescence readings are in control
wells (i.e., no ampicillin, colored grey), which is surprising since in
these wells there is no selection pressure so I expect the plasmid to be
lost.
Based on the fluorescence data and sequencing results, it appears that
only three out of 50 colonies produce sfGFP and fluoresce. That's not
nearly as many as I expected. However, because there were three separate
growth stages (on the plate, in the growth well, for miniprep), the
cells have undergone about 200 generations of growth by this stage, so
there were quite a lot of opportunities for mutations to occur.
There must be ways to make this process more efficient, especially since
I am far from an expert on these protocols. Nevertheless, we have
successfully produced transformed cells expressing an engineered GFP
using only Python code!
Part Three: Conclusions
Cost
Depending on how you measure it, the cost of this experiment was around
$360, not including the money I spent on debugging:
$70 to synthesize the DNA
$32 to PCR and add flanks to the insert
$31 to cut the plasmid
$32 for Gibson assembly
$53 for transformation
$67 for colony picking
$75 for 3 minipreps and sequencing
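Adding these up in Python confirms the total and shows where the money goes (the dictionary keys are just my labels for the line items above):

```python
costs = {
    "DNA synthesis": 70,
    "PCR and adding flanks": 32,
    "cutting the plasmid": 31,
    "Gibson assembly": 32,
    "transformation": 53,
    "colony picking": 67,
    "3 minipreps and sequencing": 75,
}
total = sum(costs.values())
picking_share = round(100 * costs["colony picking"] / total)

print(total)          # 360
print(picking_share)  # 19 (percent of the total spent on colony picking)
```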
I think the cost could probably be brought down to $250-300 with some
tweaks. For example, getting a robot to pick 50 colonies is surprisingly
expensive, and probably overkill.
In my experience, this price seems expensive to some (molecular
biologists) and cheap to others (computational people). Since
Transcriptic basically just charges for reagents at list price, the main
cost difference is in labor. A robot is already pretty cheap per hour,
and doesn't mind getting up in the middle of the night to take a
photograph of a plate. Once the protocols are nailed down, it's hard to
imagine that even a grad student will be cheaper, especially if you
factor in opportunity costs.
To be clear, I am only talking about replacing routine protocols —
cutting-edge protocol development will still be done by skilled
molecular biologists — but a lot of exciting science uses only boring
protocols. Until recently, many labs manufactured their own oligos, but
now few would bother — it's just not worth anyone's time, even grad
students, when IDT will ship them to you within a couple of days.
Robot labs: pros and cons
Obviously, I'm a big believer in robotic labs. There are some really fun
and useful things about doing experiments with robots, especially if
you're primarily a computational scientist and are allergic to latex
gloves and manual labor:
Reproducibility! This is probably the biggest advantage. It
includes the consistency of robots and the ability to publish your
protocol in autoprotocol format, instead of awkward English prose
(and the passive voice is not even minded by me...)
Scalability: You can repeat my experiment 100 times with different
parameters, without too much marginal work.
Arbitrarily complicated protocols, for example PCR touchdown.
This might seem minor or even counterproductive, but if a protocol is
going to be run hundreds or thousands of times by different labs, why
not optimize the protocol to a fraction of a degree? Or even use
statistics / machine learning to improve the protocol over time? It
drives me crazy to see a protocol that might be used tens of
thousands of times recommend performing an operation for 2-3 minutes.
Which is it?
Fine-tuning: You can repeat experiments after changing just one
minor detail. It's really hard to ceteris paribus as a human.
Virtuality: Run experiments or monitor results while away from the
lab, like in
Vienna.
Expressiveness: You can use programming syntax to encode
repetitive steps or branching logic. For example, if you wanted to
dispense x μl of reagent (x from 1 to 96) and (96-x) μl of water into
each well of a 96-well plate, this can be written concisely.
Machine-readable data: Results data is almost always returned as
csv, or something else you can compute on.
Abstraction: Ideally, you could run the entire protocol above
while remaining agnostic to the reagents or style of cloning used,
and drop in a replacement protocol if it worked better.
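As an illustration of the expressiveness point in the list above, here is a minimal sketch of the 96-well gradient in plain Python. It only builds the per-well volume plan and makes no autoprotocol calls; the function names are hypothetical, and each pair would still need to be turned into transfer instructions like the ones in my protocols (skipping the zero-volume water transfer in the last well).

```python
def gradient_volumes(n_wells=96, total_ul=96):
    """Return (reagent_ul, water_ul) per well for a 1..n_wells reagent gradient."""
    return [(x, total_ul - x) for x in range(1, n_wells + 1)]

def well_name(index, n_cols=12):
    """Convert a 0-based, row-major well index into a name such as 'A1' or 'H12'."""
    return "{}{}".format("ABCDEFGH"[index // n_cols], index % n_cols + 1)

plan = {well_name(i): vols for i, vols in enumerate(gradient_volumes())}
print(plan["A1"], plan["D6"], plan["H12"])  # (1, 95) (42, 54) (96, 0)
```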
There are some catches too of course, especially since it's very early
in the evolution of these tools. If it were the internet it would be
around 1994:
Transporting samples back and forth to Transcriptic is a chore. I'm
not sure how to solve this, though the more you can do at the cloud
lab the less you need to transport. That is partially why synthetic
biology is a good fit for cloud labs over, say, diagnostics with
human samples.
Debugging protocols remotely is difficult and can be expensive —
especially differentiating between your bugs and Transcriptic's bugs.
There are lots of experiments you just can't do yet. At the time of
writing, Transcriptic only supports bacterial experiments (no yeast,
no mammalian cells, though these are coming).
For many labs it may be more expensive to use a cloud lab than just
getting a grad student (marginal cost per hour: ~$0) to do the work.
This depends on how much the lab needs the grad student's hands
compared to their brain.
Transcriptic doesn't run experiments on the weekend yet.
Understandable, but it can be inconvenient, even when your project is
not so time-sensitive.
Software is eating protein
Even though there's quite a lot of code here and quite a lot of
debugging, I think it's feasible to produce some software that takes as
input a protein sequence and as output creates bacteria that express
that protein.
To make that work, a few things need to happen:
True integration of Twist/IDT/Gen9 with Transcriptic (this will
probably be slow because of low demand currently).
Very robust versions of the protocols I have outlined above, to
account for differences in protein sequence composition, length,
secondary structure, etc.
Replacing various custom tools (NEB's Gibson protocol generator,
IDT's codon optimizer) with open-source equivalents (e.g.,
primer3).
For many applications, you also want to purify your protein (using a
tag and a column),
or perhaps just get the bacteria to
secrete it. Let's assume that
we can soon do this in a cloud lab too, or that we can do experiments
in vivo (i.e., within the bacterial cell).
There are also lots of opportunities to make the protocol actually work
better than a human-run version, for example: design of promoters and
RBSs to optimize expression specific to your sequence; statistics on the
probability of success of the experiment based on comparable
experiments; automated analysis of gels.
Why bother with all this?
After all that, it might not be totally clear why you would want to
engineer a protein like this. Here are some ideas:
Make a protein sensor to detect
something dangerous/unhealthy/delicious like gluten.
Make a
BiTE
to treat a specific cancer you just sequenced. (This could be
trickier than it sounds).
Make a topical vaccine that can enter the body via hair
follicles (I
don't recommend trying this at home).
Mutagenize your protein 100 different ways and characterize the
changes.
Then scale it up to 1,000, or 10,000? Maybe characterize the
mutations of GFP?
For more ideas on what is possible you only have to look at the hundreds
of iGEM projects
that are already out there.
Finally, thanks to Ben Miles at Transcriptic for helping me finish this project.
In this experiment I use standard M13 primers to amplify part of the
plasmid, pUC19. In theory, these primers will amplify a ~110bp fragment.
I evaluate the experiment using a standard qPCR curve (with SYBR), and
then running a gel to check the size of the amplified fragment(s).
One nice aspect of this experiment is that all of the reagents used are
available in Transcriptic's standard catalog, which means the experiment
can be performed remotely, without shipping anything to Transcriptic. It
could be a nice test-case for a new Taq mastermix, or a new PCR
protocol.
# (Python 3 setup cell omitted)
# https://developers.transcriptic.com/docs/how-to-write-a-new-protocol
# https://secure.transcriptic.com/_commercial/resources?q=water

import json
import autoprotocol
from autoprotocol.protocol import Protocol

p = Protocol()

# 3 cols: 0=template+primers+mastermix, 1=primers+mastermix, 2=water
# 3 rows: A, B, C repeated
experiment_name = "puc19_m13_v1"

inv = {}
inv['SensiFAST SYBR No-ROX'] = "rs17knkh7526ha"
inv['water'] = "rs17gmh5wafm5p"
inv['M13 Forward (-20)'] = "rs17tcpupe7fdh"
inv['M13 Reverse (-48)'] = "rs17tcph6e2qzh"
inv['pUC19'] = "rs17tcqmncjfsh"

#--------------------------------------------------------
# Provisioning things for my PCR
#

# Provision a 96 well PCR plate (https://developers.transcriptic.com/v1.0/docs/containers)
# Type     Max      Dead   Safe   Capabilities                                                    Price
# 96-pcr   160 µL   3 µL   5 µL   pipette, sangerseq, spin, thermocycle, incubate, gel_separate   $2.49
#
pcr_plate = p.ref("pcr_plate", cont_type="96-pcr", storage="cold_4")

#--------------------------------------------------------
# SYBR-including mastermix
# http://www.bioline.com/us/downloads/dl/file/id/2754/sensifast_sybr_no_rox_kit_manual.pdf
# Instructions: 10ul mastermix + 0.8ul primer (400nM) + 0.8ul primer (400nM) + <=8.4ul template (~100ng) + 20-vol H20
#
#mastermix_tube = p.ref("mastermix_tube", cont_type="micro-2.0", storage="cold_20")
for well in ["A1", "B1", "C1", "A2", "B2", "C2"]:
    p.provision(inv['SensiFAST SYBR No-ROX'], pcr_plate.wells(well), "10:microliter")

#--------------------------------------------------------
# M13 primers
# I choose m13 (-20) and (-48) because of similar Tm. This amplifies ~110bp including primers.
#
# 100pmol == 1ul, since Transcriptic dilutes the 1300-1900pmol into 13-19ul (depending on the primer)
# I want 400nM in the final 20ul according to the SensiFAST documentation (==8pmol in 20ul)
# 1ul primer in 12ul total equals 8pmol/ul
#
# http://www.idtdna.com/pages/products/dna-rna/readymade-products/readymade-primers
# Name                sequence                           Tm     Anhyd.   pmoles in 10ug
# M13 Forward (-20)   GTA AAA CGA CGG CCA GT             53.0   5228.5   1912.6
# M13 Forward (-41)   CGC CAG GGT TTT CCC AGT CAC GAC    65.5   7289.8   1371.7
# M13 Reverse (-27)   CAG GAA ACA GCT ATG AC             47.3   5212.5   1918.3
# M13 Reverse (-48)   AGC GGA TAA CAA TTT CAC ACA GG     57.2   7065.7   1415.2
primer_tube = p.ref("primer_tube", cont_type="micro-2.0", storage="cold_20")
p.provision(inv['M13 Forward (-20)'], primer_tube.wells(0), "1:microliter")  # fwd -20
p.provision(inv['M13 Reverse (-48)'], primer_tube.wells(0), "1:microliter")  # rev -48
p.provision(inv['water'], primer_tube.wells(0), "10:microliter")             # water

#--------------------------------------------------------
# pUC19
# 1000ug/ml -> 1ul = 1ug == 1000ng. Add 1ul to 49ul to get ~20ng/ul
# Then I can transfer 5ul to get 100ng total
#
template_tube = p.ref("template_tube", cont_type="micro-2.0", storage="cold_20")
p.provision(inv['pUC19'], template_tube.wells(0), "1:microliter")
p.provision(inv['water'], template_tube.wells(0), "49:microliter")  # water

#--------------------------------------------------------
# Move all the reagents into the pcr plate
# The "dispense" command does not work because it needs >=10ul per dispense
# in increments of 5ul
#
for wells, ul in (["A1", "B1", "C1"], 4), (["A2", "B2", "C2"], 9), (["A3", "B3", "C3"], 20):
    for well in wells:
        p.provision(inv['water'], pcr_plate.wells(well), "{}:microliter".format(ul))

p.transfer(template_tube.wells(0), pcr_plate.wells(["A1", "B1", "C1"]), "5:microliter")
p.transfer(primer_tube.wells(0), pcr_plate.wells(["A1", "B1", "C1", "A2", "B2", "C2"]), "1:microliter")

#--------------------------------------------------------
# Thermocycle, with a hot start (95C for 2m)
# Based on http://www.bioline.com/us/downloads/dl/file/id/2754/sensifast_sybr_no_rox_kit_manual.pdf
# I also found http://www.environmental-microbiology.de/pdf_files/M13PCR_13jan2014.pdf
# p.seal before thermocycling is enforced by transcriptic
#
p.seal(pcr_plate)
p.thermocycle(pcr_plate,
              [{"cycles": 1, "steps": [{"temperature": "95:celsius", "duration": "2:minute"}]},
               {"cycles": 40, "steps": [{"temperature": "95:celsius", "duration": "5:second"},
                                        {"temperature": "60:celsius", "duration": "20:second"},
                                        {"temperature": "72:celsius", "duration": "15:second", "read": True}]}],
              volume="20:microliter",  # volume is optional
              dataref="qpcr_{}".format(experiment_name),
              # Dyes to use for qPCR must be specified (tells transcriptic what absorbance to use?)
              dyes={"SYBR": ["A1", "B1", "C1", "A2", "B2", "C2", "A3", "B3", "C3"]},
              # standard melting curve parameters
              melting_start="65:celsius", melting_end="95:celsius",
              melting_increment="0.5:celsius", melting_rate="5:second")

#--------------------------------------------------------
# Run a gel
# agarose(8,0.8%): 8 lanes, 0.8% agarose 10 minutes recommended
# 10 microliters is used in the example documentation
# ladder1: References at 100bp, 250bp, 500bp, 1000bp, and 2000bp.
# The gel already includes SYBR green
#
p.gel_separate(pcr_plate.wells(["A1", "B1", "C1", "A2", "B2", "C2", "A3", "B3"]),
               "10:microliter", "agarose(8,0.8%)", "ladder1", "10:minute",
               "gel_{}".format(experiment_name))

#--------------------------------------------------------
# Analyze and output the protocol
#
jprotocol = json.dumps(p.as_dict(), indent=2)
print(jprotocol)
open("protocol.json", 'w').write(jprotocol)

uprint("Analyze protocol")
!echo '{jprotocol}' | transcriptic analyze
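The dilution arithmetic in the comments of the protocol above is easy to get wrong, so here is a small standalone check in plain Python. The variable names are mine; the stock concentrations (100pmol/ul primers, 1000ng/ul pUC19) are the ones quoted in those comments.

```python
PRIMER_STOCK_PMOL_PER_UL = 100.0  # catalog primer stock, per the protocol comments

# 1ul of each primer plus 10ul of water gives a 12ul primer mix
primer_mix_pmol_per_ul = 1.0 * PRIMER_STOCK_PMOL_PER_UL / 12.0

# 1ul of primer mix goes into each 20ul reaction
reaction_ul = 20.0
primer_nM = primer_mix_pmol_per_ul / reaction_ul * 1000.0  # pmol/ul is uM; x1000 gives nM

# pUC19 at 1000ng/ul: 1ul into 49ul of water, then 5ul per reaction
template_ng_per_ul = 1000.0 * 1.0 / (1.0 + 49.0)
template_ng = 5.0 * template_ng_per_ul

# Per-well volumes: mastermix, water, template, primer mix
column_volumes = {
    "template": 10 + 4 + 5 + 1,
    "no template control": 10 + 9 + 0 + 1,
    "water": 0 + 20 + 0 + 0,
}

print(round(primer_mix_pmol_per_ul, 2))  # 8.33
print(round(primer_nM))                  # 417
print(template_ng)                       # 100.0
print(column_volumes)                    # every column comes to 20ul
```

So each primer ends up at roughly 417nM, close to the 400nM that SensiFAST recommends, and every well contains exactly 20ul.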
After waiting several days in the Transcriptic queue for the experiment
to start, the actual run took just a couple of hours.
We use 9 wells in total:
Wells [ABC]1, "template": containing template, primers and mastermix
Wells [ABC]2, "no template control": containing primers and mastermix
Wells [ABC]3, "water": containing only water
Column 1: Template
The template appears to be amplified correctly. There is an issue with
the background subtraction in well C1, which has been normalized to
start at about -2000 RFU. Re-graphing the raw data without background
subtracted shows that this is probably a technical artifact.
Unfortunately, because of this issue I do not get a Ct for well C1. The
Cts are reasonably close at 4 and 9, though a <0.5 Ct difference
between technical replicates is apparently
desirable.
Column 2: No template control
We can see from the melting curve and amplification plots below that
there is some signal in the non-template wells, indicating the presence
of some double-stranded DNA. Ideally, since there is no template in
these wells, we expect only single-stranded primer DNA, and no signal.
I checked the M13 primers for possible primer–dimer problems using IDT's
OligoAnalyzer. If the problem
were primer–dimers, we would expect the melting curve to peak at a lower
temperature than the template (generally <80C), and be at least 8
Ct higher than the
template. We also
generally expect a deltaG below -9 kcal/mole. The lowest deltaG for
these primers was far higher at -4, as you'd expect for commonly used
primers.
The Ct for this amplification event is very high at 33-35, which
shows how many cycles it took to produce a detectable signal, and also
that I could have avoided this problem by reducing the number of cycles to 30.
Column 3: Water
All of the water-only wells were blank, as expected. Although this is
more of a sanity check than anything, one possible problem that could
have been revealed is cross-contamination (e.g., due to reused pipette
tips).
show_html("<h1>Transcriptic plots from qPCR</h1>")
show_html("<h3>No template control (A2, B2, C2) highlighted in blue</h3>")
Images(["puc19_melt.png", "puc19_amp0.png"], header=["Melting curve", "Amplification"])
Transcriptic plots from qPCR
No template control (A2, B2, C2) highlighted in blue
After the PCR finishes, we run a gel to see if the amplified product is
the right size (approximately 100bp).
I am guessing the gel was done manually, since there was a significant
delay between the qPCR ending and the gel starting. The image also looks
manually taken and has an unusual resolution. Unfortunately, it was not
really possible to tell from the gel if my amplification was clean. The
smallest gel ladder available on Transcriptic is 100bp-2kb, so my 100bp
fragment is difficult to separate out. I can only discriminate four of
the five rungs in the ladder.
The template itself can be seen at the top of the ladder in wells A1,
B1, C1 (pUC19 is about 2.6kb), and the area around 100bp-250bp is darker
in the wells that contain the template, so the results are reasonable.
I used SVG to rotate the image a bit, and draw horizontal lines
corresponding to the rungs of the ladder.
w, h = 887, 278  # the dimensions of the image

def box(xy, wh, rgba, text):
    box = '''<rect x="{}" y="{}" width="{}" height="{}" fill="rgba({:d},{:d},{:d},{:f})" stroke="none" />
'''.format(xy[0], xy[1], wh[0], wh[1], rgba[0], rgba[1], rgba[2], rgba[3])
    text = '''<text x="{}" y="{}" text-anchor="left" font-size="12" fill="rgba({:d},{:d},{:d},{:f})">
{}</text>'''.format(xy[0]+wh[0]+10, xy[1], rgba[0], rgba[1], rgba[2], rgba[3], text)
    return box + text

svgs = ['''<image xlink:href="static/pUC19_PCR_files/gel_puc19_m13_v1.jpg" x="0" y="0" width="{:d}px" height="{:d}px"
transform="rotate(-.8)"/>'''.format(w, h)]
svgs += [box((0, 108), (w, 1), (255, 0, 0, 1), "start")]
svgs += [box((0, 156), (w, 1), (0, 0, 255, 1), "2kb")]
svgs += [box((0, 193), (w, 1), (255, 0, 255, 1), "1kb")]
svgs += [box((0, 220), (w, 1), (0, 255, 255, 1), "500bp")]
svgs += [box((0, 250), (w, 1), (0, 255, 0, 1), "250bp / 100bp")]
show_svg(''.join(svgs), w=w+200, h=h)
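The rung positions drawn above could also be used to estimate the size of a band from its pixel row. The sketch below assumes the usual rule of thumb that migration distance is roughly linear in the log of fragment size, and interpolates between the three rungs that are clearly separated. The function name is hypothetical, and because the 250bp and 100bp rungs are merged in this image, it refuses to estimate anything below 500bp.

```python
import math

# (pixel row, fragment size in bp) for the resolved rungs, taken from the SVG lines above
LADDER = [(156, 2000), (193, 1000), (220, 500)]


def estimate_bp(y, ladder=LADDER):
    """Estimate fragment size at pixel row y by log-linear interpolation."""
    pts = sorted(ladder)
    if not pts[0][0] <= y <= pts[-1][0]:
        raise ValueError("row is outside the resolved part of the ladder")
    for (y0, bp0), (y1, bp1) in zip(pts, pts[1:]):
        if y0 <= y <= y1:
            frac = (y - y0) / (y1 - y0)
            log_bp = math.log10(bp0) + frac * (math.log10(bp1) - math.log10(bp0))
            return 10 ** log_bp


# A band halfway between the 2kb and 1kb rungs
print(round(estimate_bp(174.5)))
```

This does not help with the 100bp product itself, which sits in the unresolved part of the ladder, but it is a quick way to sanity check any larger bands.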
new_section("Conclusions")
Conclusions
I was pretty happy with the results of this experiment.
Although I could not totally explain the signal in the "no template
control" wells, nor completely confirm from the gel image that the
expected 100bp fragment was amplified, the data was very consistent
across all wells, and I believe the amplification worked fine.
This experiment cost less than $26, which is pretty good.
If I could just eliminate the three day wait for the experiment to start,
then this really would be cloud lab computing:
the time to do the experiment is not even so different from running a typical MCMC chain!