Brian Naughton // Sat 27 July 2019 // Filed under biotech // Tags sequencing flongle nanopore homelab biotech

It's been a dream of mine for a long time to be able to do sequencing at home — just take whatever stuff I want: microbiome, viral/bacterial infections, insects, fungi, foods, sourdough, sauerkraut, and sequence it. Now at last, with the debut of Oxford Nanopore's flongle, it's possible!

So, a few months back, I bought some flongles (basically on launch day) and set up a home sequencing lab. In this post I'll describe what's in the lab and my first sequencing experiments.

What is a flongle?

As a refresher, Oxford Nanopore's MinION sequencer is a hand-held, single molecule nanopore sequencer. As DNA passes through a pore, the obstruction modulates the current across the pore in a pattern that can be mapped onto a sequence of nucleotides.

DNA going through a pore, from an ONT video.

The MinION device itself is a fairly simple container for the nanopore-containing flow-cells. The standard MinION flow-cell contains 512 channels (each of which has one active pore at a time), plus the ASIC chip that reads the changes in current. There's a great explainer on the Oxford Nanopore website.

Oxford Nanopore's newest flow-cell, the flongle ("flow-cell dongle"), is basically a cheaper, more disposable version of the standard MinION flow-cell. The flongle snaps into an adapter that includes the ASIC you usually find in a regular flow-cell.

(a) MinION sequencer with loaded flow-cell. (b) Flongle and adapter.

Whereas a 512 channel flow-cell costs $475-900, 126 channel flongles cost about $90-150 each. At the time of writing, you have to buy at least a starter pack of 12 for $1860, which includes the adapter.

Each pore can output several megabases of sequence, so the total amount of sequence can be in the hundreds of megabases (the biggest flongle run I know of is ~2Gb, in the hands of an experienced lab).
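Back-of-the-envelope, the per-megabase economics look like this (a sketch: the yield numbers are my assumptions for typical runs, not guaranteed outputs):

```python
# Rough cost per megabase; prices from above, yields are assumptions
# for typical runs rather than guaranteed outputs.
flowcell_cost, flowcell_yield_mb = 700, 10_000  # ~$475-900, ~5-15 Gb typical
flongle_cost, flongle_yield_mb = 120, 200       # ~$90-150, hundreds of Mb

print(f"MinION flow-cell: ${flowcell_cost / flowcell_yield_mb:.2f}/Mb")
print(f"Flongle:          ${flongle_cost / flongle_yield_mb:.2f}/Mb")
```

So per megabase the flongle is actually the pricier option; its advantage is the much lower cost per run.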

For now, you have to spend quite a lot to get access to flongles, and supply is severely limited. I have received only four so far, out of the 48 I bought! (That was the minimum order size at launch.) So, there are definitely beta program issues here. Still, $100 NGS runs!

Home Lab Equipment

Surprisingly, you don't actually need that much expensive equipment to do nanopore sequencing. For my home lab, I bought the following:

The lab during a sequencing run. MinION running, with a used flongle on the desk in front.

(a) Eppendorf 5415C centrifuge. (b) Anova sous vide in a Costco coffee can

DNA Extraction

The first step in sequencing is DNA extraction. I bought a Zymo Quick-DNA Microprep Plus Kit for $132 (50 preps, so a little under $3 per prep).

This kit takes "20 minutes" (it takes me about 40 minutes including setting up). It can work with "cell culture, solid tissue, saliva, and any biological fluid sample". This prep is very easy to do, and almost all the reagents are just stored at room temperature. They claim it can recover >50kb fragments, which is very respectable. This is far from the megabase-long "whale" reads some labs work to achieve, but those preps are much more complex and time-consuming. Generally speaking, 10kb reads are more than long enough for most use-cases, and even 100bp-1kbp reads can still be used for species ID.

I am lucky to have access to a nanodrop spectrophotometer at work, so I can check my DNA quality. (Nanodrops cost thousands of dollars, even second-hand.) I think this wouldn't matter if I were sequencing saliva repeatedly: that seems to work the same every time. However, it matters quite a bit when experimenting with different sample types.

Library prep

Library prep is the process of preparing the DNA for sequencing, for example by attaching the "motor protein" that ratchets the DNA through the pore one base at a time. Not being an experimentalist, I like to stick to the simplest possible protocols. That means rapid DNA extraction and ONT's rapid library prep (RAD-004), which costs $600 for 12 preps ($50 per prep).

Library prep is a little harder than DNA extraction, but still only takes around 40 minutes. There are some very low volumes involved (0.5µl, which is as low as my pipettes go!), and you need two water bath temperatures, but overall it's pretty straightforward.

The total time from acquiring a sample to beginning sequencing is maybe 1.5 hours. You definitely pay for this convenience in read length and throughput, but the tradeoff is not too bad.

Admittedly, the cost is more like $150 than $100 per run, but with the nuclease wash protocol now available to rejuvenate flow-cells, I think it's ok to round down...

Experiment one: saliva

This is probably the simplest possible experiment: extract human and bacterial DNA from saliva, and sequence it. Saliva has lots of human DNA — surprisingly, most of it is from white blood cells — and plenty of oral microbiome bacteria, and it's easy to get as much as you want. However, since bacterial genomes are about 0.001X the size of a human genome, you'd need 1000 bacterial genomes for every human genome if you want equal coverage of both.
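The coverage arithmetic, using round numbers (assuming ~3.2 Gb for the human genome and ~3.2 Mb for a typical oral bacterium):

```python
human_genome_bp = 3.2e9
bacterial_genome_bp = 3.2e6  # ~0.001X human, as above

# For equal coverage, each human genome equivalent of DNA needs
# ~1000 bacterial genome equivalents alongside it.
genomes_needed = human_genome_bp / bacterial_genome_bp
print(f"{genomes_needed:.0f} bacterial genomes per human genome")
```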

Experiment one: (a) DNA quantification from saliva. (b) A decent read length distribution, topping out at 60kb.

This experiment generated a pretty respectable 100 megabases of sequence in 24 hours, which is basically what I was hoping for.

As soon as the DNA is loaded, reads start to get written to disk. After a minute, you have reads you can feed into BLAST to see if everything is working as expected. The instant access to data is a great reward for doing the boring prep work.
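That first sanity check can be sketched in a few lines of Python: pull reads out of a fastq file as it's written, and dump the longest ones to fasta for pasting into BLAST. (A sketch with made-up file names and a synthetic fastq; real MinKNOW output may be split across many, possibly gzipped, files.)

```python
from pathlib import Path

def fastq_reads(path):
    """Yield (read_id, sequence) pairs from a 4-line-per-record fastq file."""
    lines = Path(path).read_text().splitlines()
    for i in range(0, len(lines), 4):
        yield lines[i][1:].split()[0], lines[i + 1]

def longest_reads_to_fasta(fastq_path, fasta_path, n=10):
    """Write the n longest reads to fasta, ready to paste into BLAST."""
    reads = sorted(fastq_reads(fastq_path), key=lambda r: len(r[1]), reverse=True)
    with open(fasta_path, "w") as out:
        for read_id, seq in reads[:n]:
            out.write(f">{read_id}\n{seq}\n")

# Synthetic stand-in for a file MinKNOW is writing during a run
Path("run.fastq").write_text("@read1\nACGTACGT\n+\nIIIIIIII\n@read2\nACGT\n+\nIIII\n")
longest_reads_to_fasta("run.fastq", "top_reads.fasta", n=1)
print(Path("top_reads.fasta").read_text())
```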

First sequencing run at home: pores sequencing, 34 minutes and 10 megabases in!

There are a few ways to analyze the data. There are several metagenome analysis tools, like Centrifuge and Kraken. I spent a couple of days(!) downloading the Centrifuge databases — which are massive since they need reference sequence data from bacteria, fungi, viruses etc. — only to have the build fail right afterward.

Luckily, Oxford Nanopore has some convenient online analysis tools. It turns out that one of these, What's In My Pot (WIMP), is based on Centrifuge, so it was convenient to just run that.

Experiment one: WIMP results for unfiltered saliva

As we can see, >99% of the reads are human or Escherichia. Upon closer inspection, the reads labeled "Escherichia coli" and "Escherichia virus Lambda" are all lambda DNA. As a QC step, I spiked lambda DNA (provided by ONT for QC purposes) into my DNA library at approximately 13% by volume. About 12% of my reads are lambda, so I know the molarity of my input sample is not too far off the reference lambda DNA.

After you get past the human and lambda DNA, the vast majority of reads map to known oral microbiome bacteria. Without anything to compare to, I can't point to any specific trends here yet.

Human DNA

What can you do with 80 megabases of human DNA? I know from just BLASTing reads that the accuracy is consistently 89-91%. Since a hundred megabases is only a 0.03X genome, it's not very useful for any human genetics tasks except maybe ancestry assignment, Gencove-style.

One thing I can do is intersect these reads with my 23andme data and see how often they're concordant (the 23andme data is diploid and these are single molecule reads, so it's not quite simple matching). Doing this intersection with bcftools, keeping only high quality reads, left just a few hundred SNPs. I did not find any variants that disagreed, which was surprising but nice to see.
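The concordance logic is roughly this (a simplified sketch with made-up positions and alleles; since 23andme calls are diploid, a single-molecule base call counts as concordant if it matches either allele):

```python
# Simplified concordance check between single-molecule base calls and
# diploid 23andme genotypes (all positions/alleles here are made up).
def concordant(call, genotype):
    """A haploid read call agrees if it matches either 23andme allele."""
    return call in genotype

my23 = {("1", 1000): "AG", ("1", 2000): "CC"}          # 23andme-style genotypes
nanopore_calls = {("1", 1000): "A", ("1", 2000): "T"}  # calls from reads

results = {pos: concordant(call, my23[pos])
           for pos, call in nanopore_calls.items() if pos in my23}
print(results)
```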

Experiments two and three: failing to filter saliva

Obviously, it's a waste to generate so many human reads. Since I don't need my genome sequenced again (ok, I only have exome data), especially 0.03X at a time, I wanted to try to enrich for oral bacteria. There are host depletion kits that apparently work well, but that's kind of expensive, so I wanted to see what would happen if I just tried to physically filter saliva.

We know that human cells are usually >10µm and bacterial cells are usually <10µm, so that's a pretty simple threshold to filter by. I bought a "10µm" paper filter on Amazon, and just filtered saliva through it.

Experiments two and three produced almost identical results. The only differences were that after experiment two failed, I tried to eliminate contamination from the paper filter by pre-washing it, and I quantified the DNA with a nanodrop, a step I skipped in experiment two. After multiple rounds of filtering, centrifuging, and pouring off, I only managed to get 10ng/µl of DNA, which is very low. However, I knew that my first 32ng/µl run worked fine, so I convinced myself it must reach the recommended minimum of 5 femtomoles of DNA (that's only 3 billion molecules!), especially since the 260/280 ratio was not that bad.
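The molecule count is just moles times Avogadro's number:

```python
AVOGADRO = 6.022e23   # molecules per mole
femtomoles = 5e-15    # the recommended minimum input, in moles
molecules = femtomoles * AVOGADRO
print(f"{molecules:.1e} molecules")  # ~3e9, i.e. about 3 billion
```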

The experiment worked as planned, in the sense that instead of 99+% human DNA, I got 50% human DNA and 50% bacterial. However, instead of 100 megabases, I only got 2, and most were low quality!

Experiment three: (a) DNA quantification from filtered saliva. (b) WIMP on filtered saliva

My best guess here is that somehow the paper contaminated the DNA, since the pores apparently got clogged after just a couple of hours. I should at least have made sure I had a lot of DNA, though I don't have great ideas on how to do that beyond just spitting for an hour... It's likely I'll just need to use a proper microbiome prep kit next time.

Experiment four: wasp sequencing

Amazingly, despite having pretty small genomes (100s of megabases), most insects have never been sequenced! It's not clear to me that you can create a high quality genome assembly from only flongle reads, but if you can get 100 megabases of DNA, that's definitely a good start.

We have a wasp trap in our back yard. It caught a wasp but we were not sure what kind. It could be the most common type of wasp in the area, the western yellowjacket. It looks exactly like one, which is a bit of a clue.

Distribution of western yellowjacket vs common aerial yellowjacket, according to iNaturalist

But eyes can deceive. The only real way to figure out for sure if this is even a wasp is by sequencing its genome, or at least it would be if there were a reference genome. Surprisingly there is no genome for the western yellowjacket or the other likely species, the common aerial yellowjacket.

(a) Before mushing, with an aphid and other tiny insects in the second tube. (b) After mushing, which was pretty gross.

I took the wasp, plus an aphid that looked freshly caught in a spiderweb, and a few other tiny insects scurrying around nearby. Then I mushed them up and used the Zymo solid tissue protocol to extract DNA.

Experiment four: (a) DNA quantification and (b) read-length distribution

This time the DNA extraction gave great quantity and quality. The total amount of sequence generated was 100 megabases again. However, the average read length is extremely short. A general rule for nanopore sequencing is that you get out what you put in. In retrospect this problem should have been pretty obvious: although it looked ok, the wasp was not fresh enough, so its DNA was very degraded.

Interestingly, there are quite a few long fragments (>5kb) in here too, and these map imperfectly to various aphid genomes (indicating that this particular aphid has also not been sequenced) and bacteria including possible wasp endosymbionts like Pantoea agglomerans. This is expected if the aphid and bacteria are fresh.

Experiment four: (a) WIMP of wasp and aphid reads. (b) BLASTing reads produces better results

I also ran WIMP but it turned out not to be useful, since this is not a "real" metagenomics run (i.e., it's not mainly a mixture of microbes and viruses). The closest matches are just misleading.

It would have been nice to be the first to assemble the western yellowjacket genome, or even a commensal bacterial genome, but I would have needed a lot more reads. Wasp genomes are around 200 megabases, so to get a decent quality genome I'd need at least 10 gigabases of sequence (50X coverage). That means a MinION run (or several), perhaps polished with some illumina data. The commensal bacteria are probably under 5 megabases, so it would be easier to create a reference genome, assuming any could be grown outside the wasp...
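The arithmetic behind that estimate:

```python
wasp_genome_mb = 200    # approximate wasp genome size
target_coverage = 50    # reasonable coverage for a draft assembly
needed_gb = wasp_genome_mb * target_coverage / 1000
print(f"{needed_gb:.0f} Gb of sequence needed")  # MinION territory, not a flongle
```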

Next steps

Four flongles in, I am still pretty amazed that I can generate a hundred megabases of sequence, at home, for so little money and equipment.

I can almost run small genome projects at home, and submit the assembled genomes to NCBI. (I still need more sequencing throughput to do this in earnest.) As the western yellowjacket shows, there are tons of genomes that should be sequenced but haven't been. In general, plants and more complex eukaryotes will be too difficult, but bacteria, fungi, and insects should all be possible at home.

Preserving the DNA sequences of species could become an extremely important step in conservation and even de-extinction. The only group I know of doing work in this area is Revive & Restore. One of their projects is to try to bring back the woolly mammoth by bootstrapping from elephant to elephant–mammoth hybrid, and eventually to full mammoth. Of course this would not be possible without the mammoth genome sequence. The list of endangered species is very long, so there's a lot to do.

Brian Naughton // Sun 11 November 2018 // Filed under sequencing // Tags biotech sequencing dna

I took a look at the data in Albert Vilella's very useful NGS specs spreadsheet using Google's slick colab notebook. (If you have yet to try colab it's worth a look.)

Doing this in colab was a bit trickier than normal, so I include the code here for reference.

First, I need the gspread lib to parse google sheets data, and the id of the sheet itself.

!pip install --upgrade -q gspread
sheet_id = "1GMMfhyLK0-q8XkIo3YxlWaZA5vVMuhU1kg41g4xLkXc"

Then I authorize myself with Google (a bit awkward but it works).

from google.colab import auth
auth.authenticate_user()

import gspread
from oauth2client.client import GoogleCredentials

gc = gspread.authorize(GoogleCredentials.get_application_default())

I parse the data into a pandas DataFrame.

sheet = gc.open_by_key(sheet_id)

import pandas as pd
rows = sheet.worksheet("T").get_all_values()
df = pd.DataFrame.from_records([r[:10] for r in rows if r[3] != ''])

I have to clean up the data a bit so that all the sequencing rates are Gb/day numbers.

import re
dfr = (df.rename(columns=df.iloc[0])
         .drop(index=0)
         .rename(columns={"Rate: (Gb/d) ": "Rate: (Gb/d)"})
         .set_index("Platform")["Rate: (Gb/d)"])
dfr = dfr[(dfr != "--") & (dfr != "TBC")]
for n, val in enumerate(dfr):
  if "-" in val:
    rg = re.search(r"(\d+).(\d+)", val).groups()
    val = (float(rg[0]) + float(rg[1])) / 2
    dfr[n] = val
dfr = pd.DataFrame(dfr.astype(float)).reset_index()

I tacked on some data I think is representative of Sanger throughput, if not 100% comparable to the NGS data.

A large ABI 3730XL can apparently output up to 1-2 Mb of data a day in total (across thousands of samples). A lower-throughput ABI SeqStudio can output 1-100kb (maybe more).

dfr_x = pd.concat([dfr, 
                   pd.DataFrame.from_records([{"Platform":"ABI 3730xl", "Rate: (Gb/d)":.001}, 
                                              {"Platform": "ABI SeqStudio", "Rate: (Gb/d)":.0001}])])

dfr_x["Rate: (Mb/d)"] = dfr_x["Rate: (Gb/d)"] * 1000

If I plot the data there's a pretty striking, three-orders-of-magnitude gap from 1Mb-1Gb. Maybe there's not enough demand for this range, but I think it's actually just an artifact of how these technologies evolved, and especially how quickly Illumina's technology scaled up.

import seaborn as sns
import matplotlib.pyplot as plt
f, ax = plt.subplots(figsize=(16,8))
fax = sns.stripplot(data=dfr_x, y="Platform", x="Rate: (Mb/d)", size=8, ax=ax);
fax.set(xlim=(.01, None));

sequencing gap plot

Getting a single 1kb sequencing reaction done by a service in a day for a couple of dollars is easy, so the very low throughput end is pretty well catered for.

However, if you are a small lab or biotech doing any of:

  • microbial genomics: low or high coverage WGS
  • synthetic biology: high coverage plasmid sequencing
  • disease surveillance: pathogen detection, assembly
  • human genetics: HLA sequencing, immune repertoire sequencing, PGx or other panels
  • CRISPR edits: validating your edit, checking for large deletions

you could probably use a few megabases of sequence now and then without having to multiplex 96X.

If it's cheap enough, I think this is an interesting market that Nanopore's new Flongle can take on, and for now there's no competition at all.

Brian Naughton // Thu 12 July 2018 // Filed under biotech // Tags biotech alzheimers antibody

There have been a lot of results coming out from Alzheimer's trials recently, and a lot of discussion about the "amyloid hypothesis" and its role in the disease. In this post I'll review some of the evidence, and see how it relates to data from recent AD trial results from Merck and Biogen/Eisai. I mainly reference three good reviews that cover most of the basic facts and arguments around the amyloid hypothesis. Much of my additional data is from AlzForum, a fantastic resource for Alzheimer's news.

The basics

A simplified model of the amyloid hypothesis is that the cell-surface protein APP (Amyloid Precursor Protein) gets cleaved by BACE1 and γ-secretase and released as a 42 amino acid peptide, Aβ42; Aβ42 forms oligomers, then extracellular plaques in the brain; these oligomers and/or plaques somehow lead to intracellular Tau tangles which cause neuronal death.

One big question here is whether it's the plaques or oligomers that are the main trigger:

Several similar studies suggest that Aβ — particularly soluble oligomers of Aβ42 (Shankar et al, 2008) — can trigger AD‐type tau alterations

This model is nicely summarized by a diagram from NRD:

amyloid primer

As the diagram shows, the obvious drug targets are γ-secretase and BACE1 (to stop Aβ42 production), Aβ42 monomers/oligomers/plaques (to reduce plaque formation), Tau (to prevent Tau tangles).

There have been drugs targeting all of these processes. None have been successful:

The only approved drugs for Alzheimer's are fairly ineffectual cholinesterase inhibitors (and an accompanying NMDA receptor inhibitor). These drugs are usually thought of more as symptom relief than treatment.

Aβ42 Antibodies

Why do drug companies keep making Aβ42 antibodies after so many failures? In fact, there is quite a bit of variability in what these antibodies actually do. Ryan Watts, now CEO of Denali Therapeutics, gave an interview with AlzForum back in 2012 where he explained the difference between Genentech's crenezumab and other Aβ42 antibodies.

Q: How is crenezumab different from the other Aβ antibodies that are currently in Phase 2 and 3 trials?

A: We have a manuscript under review that describes its properties. Basically, crenezumab binds to oligomeric and fibrillar forms of Aβ with high affinity, and to monomeric Aβ with lower affinity. By comparison, solanezumab binds monomeric Aβ, and gantenerumab binds aggregated Aβ, as does bapineuzumab. Crenezumab binds all forms of the peptide. Crenezumab is engineered on an IgG4 backbone, which allows it to activate microglia just enough to promote engulfment of Aβ, but not so strongly as to induce inflammatory signaling through the p38 pathway and release of cytokines such as tumor necrosis factor α. Crenezumab is the only IgG4 anti-Aβ antibody in clinical development that I am aware of. We have not seen vasogenic edema in our Phase 1 trials, which was the first main hurdle for us to overcome.

Biogen describes the MOA of their aducanumab antibody like this:

Aducanumab is thought to target aggregated forms of beta amyloid including soluble oligomers and insoluble fibrils which can form into amyloid plaque in the brain of Alzheimer’s disease patients.

Denali Therapeutics

As an aside, Denali is not working on an Aβ42 inhibitor (perhaps for IP reasons, since Ryan Watts was heavily involved in the development of crenezumab). Apart from their novel RIPK1 program, they are still pursuing BACE1 and Tau.

Our lead RIPK1 product candidate, DNL747, is a potent, selective and brain penetrant small molecule inhibitor of RIPK1 for Alzheimer’s disease and ALS. Microglia are the resident immune cells of the brain and play a significant role in neurodegeneration. RIPK1 activation in microglia results in production of a number of pro-inflammatory cytokines that can cause tissue damage.

Our three antibody programs are against known targets including aSyn, TREM2 and a bi-specific therapeutic agent against both BACE1 and Tau. Our BACE1 and Tau program is an example of combination therapy, which we believe holds significant promise in developing effective therapies in neurodegenerative diseases.

How does amyloid cause disease?

By one definition, the amyloid hypothesis "posits that the deposition of the amyloid-β peptide in the brain is a central event in Alzheimer's disease pathology". There are several ways that amyloid could cause AD. This diagram from a 2011 NRD review shows three options:

amyloid hypothesis

  • Aβ trigger: Aβ triggers the disease once it reaches a threshold, and once it starts, reducing Aβ levels does not help
  • Aβ threshold: Aβ triggers the disease once it reaches a threshold, but reducing Aβ levels back below the threshold does help
  • Aβ driver: Aβ causes Alzheimer's, and reducing Aβ levels at any time should ameliorate disease

Simplifying, if the Aβ trigger model is correct, then we don't expect anti-Aβ42 antibodies to work, except perhaps preventatively. If the Aβ driver model is correct, then these antibodies should work, at least partially.

From the same review:

A strong case can be made that the deposition of amyloid-β in the brain parenchyma is crucial for initiating the disease process, but there are no compelling data to support the view that, once initiated, the disease process is continuously driven by or requires amyloid-β deposition.

For this reason, after Aβ42 antibody trials fail, the stock answer from pharma is that they need to begin treatment earlier. Of course, the earlier you treat, the longer the trial takes, and the more you need new amyloid detection technologies like Florbetavir/PET to see what's going on. So it's probably natural that there is a gradual transition to ever earlier interventions and longer trials, even though this can also seem like excuse-making.

Evidence for the amyloid hypothesis

Despite all the failed drugs and holes in our understanding, the amyloid hypothesis remains durable due to the weight of evidence in its corner.


Mutations in APP both cause and prevent Alzheimer's. Half of people with trisomy 21 (or any APP duplication, it seems) develop AD by the time they reach their fifties.

A protective variant found in APP also points to a causal relationship, and therapeutic potential (see Robert Plenge on allelic series).

We found a coding mutation (A673T) in the APP gene that protects against Alzheimer's disease and cognitive decline in the elderly without Alzheimer's disease. This substitution is adjacent to the aspartyl protease β-site in APP, and results in an approximately 40% reduction in the formation of amyloidogenic peptides in vitro. Carriers are about 7.5 times more likely than non-carriers to reach the age of 85 without suffering major cognitive decline

A cryoEM structure of Aβ42 fibril from 2017 gives us structural evidence for why APP mutations should be protective or damaging, suggesting that APP's effect on AD is via amyloid/Aβ42.

amyloid


The APOE e4 allele strongly predisposes people to Alzheimer's. It's one of the strongest genetic associations known, besides Mendelian diseases. In 2018, Yadong Huang's team at the Gladstone Institutes used iPSCs to investigate the mechanism. Confusingly, they found that APOE is independently associated with both Aβ42 and Tau.

"ApoE4 in human neurons boosted production of Aβ40 and Aβ42"

"It does not do that in mouse neurons. Independent of its effect on Aβ, ApoE4 triggered phosphorylation and mislocalization of tau."

"Based on these data, we should lower ApoE4 to treat AD"

This research may also help explain why mouse models of Alzheimer's have often been misleading.

Other evidence

  • Mutations in PSEN1 and PSEN2 (components of gamma-secretase) cause Alzheimer's.
  • Other diseases are caused by mutations in amyloid-forming proteins. For example, a mutation in BRI2 (ITM2B) produces a novel amyloid-forming peptide, which causes Familial British Dementia. In ALS, the aggregated form of SOD1 may be protective and the soluble form disease-causing.

    The formation of large aggregates is in competition with trimer formation, suggesting that aggregation may be a protective mechanism against formation of toxic oligomeric intermediates.

Criticism of the amyloid hypothesis

The main criticism of the amyloid hypothesis is that we have been testing anti-amyloid drugs — especially antibodies against Aβ42 — for a long time now, and none of them have had any effect on disease progression.

Derek Lowe (and many of his commenters) has written especially skeptically on his blog:

Eli Lilly remains committed to plunging through this concrete wall headfirst. [...] our gamma-secretase inhibitor completely failed in 2010. Then we took our antibody, solanezumab into Phase III trials that failed in 2012. And found out in 2013 that our beta-secretase inhibitor failed.

Morgan Sheng, VP of Neuroscience at Genentech, is much more positive. In a recent interview in NRD he said:

Let me start by saying that I fully believe in the amyloid hypothesis, and I think it’s going to be vindicated completely within years. [...] phase III results from drugs like Eli Lilly’s solanezumab suggest these agents sort of work; they just don’t work very well

It seems like targeting Tau is an acceptable strategy to amyloid hypothesis skeptics because it's not targeting Aβ42, even though it's still part of the standard amyloid hypothesis model. Drugs that are based on the "amyloid hypothesis" and drugs that work by trying to reduce amyloid tend to get conflated in a confusing way.

Evidence against the amyloid hypothesis

Here I am mainly summarizing from a 2015 review. In this review, the author mainly disputes the "linear story" of the amyloid hypothesis and not the fact that Aβ plays some kind of role in AD.

  • Many people have plaque but no disease.

    The existence of this group of individuals (healthy, but amyloid positive) is a substantial challenge to the amyloid cascade hypothesis. It is clearly possible to have amyloid deposits without dementia; therefore amyloid is not sufficient to cause disease.

    Such individuals are not rare; rather, they account for a quarter to a third of all older individuals with normal or near-normal cognitive function.

  • Anti-Aβ42 antibodies can reduce plaque without alleviating the disease.

    The second test of the amyloid cascade hypothesis has also been done: amyloid has been removed from the brains of individuals with AD and from mice with engineered familial forms of the disease. Here the tests have been less definitive and the evidence is mixed.

  • Other drugs that should work (beta-secretase/BACE1 inhibitors, gamma-secretase inhibitors, Tau-targeting drugs) don't appear to work.

  • Mutations in the Tau gene can cause dementia without plaques forming, so amyloid is not a necessary step in the process.

  • We do not understand AD pathology well. For example, what are the toxic species of Aβ and Tau? What is the connection between Aβ and tangle pathology? Do Tau tangles spread between neurons like prions?

  • There are other possible causes of AD. For example, certain infections could be causative.


Recent work showing an association between herpes virus and Alzheimer's could be thought of as supporting or disputing the amyloid hypothesis. In this model, the virus "seeds" amyloid plaque formation, which then sequesters the virus. The idea that amyloid plaques are protective is not entirely new, beginning with the "bioflocculant hypothesis" for Aβ, published in 2002.

[Robinson and Bishop] posited that Aβ’s aggregative properties could make it ideal for surrounding and sequestering pathogenic agents in the brain

If herpes causes AD, then we'd expect to see evidence in epidemiological datasets. Both herpes infection and periodontitis appear to be associated with AD risk. Further, antiherpetic medications appear to reduce the risk of AD. A lot more could be done here with a large database of phenotypic information, like UK biobank...

Relatedly, a Bay Area company, Cortexyme, recently raised $76M to pursue an AD drug against a bacterial protease found in plaques.

Recent news

So what about the recent trial results? There were two major trials with new results this year: Merck's BACE1 inhibitor, verubecestat, and Biogen/Eisai's anti-Aβ42 antibody, BAN2401. Meanwhile the trial design for Biogen's aducanumab is being tweaked — not a good sign generally — and there should be new data on that later this year.


After failing a Phase III in 2017, verubecestat had more bad news last month:

Treatment with verubecestat reduced the concentration of Aβ-40 and Aβ-42 in cerebrospinal fluid by 63 to 81%, which confirms that the drug had the intended action of reducing Aβ production. In the PET amyloid substudy, treatment with verubecestat reduced total brain amyloid load by a modest amount; the mean standardized uptake value ratio was reduced from 0.87 at baseline to 0.83 at week 78 in the 40-mg group. These results suggest that lowering Aβ in the cerebrospinal fluid is associated with some reduction in brain amyloid.

Notably, despite the drug working as intended, the reduction in brain amyloid was minimal. Hence, some people claim that amyloid removal has not been tested: gc tweet

Biogen/Eisai's Aβ42 antibody, BAN2401

New Phase II results for Biogen/Eisai's soluble protofibril antibody, BAN2401, were just released in July 2018. The results were hotly disputed because while the Bayesian analysis failed to show an effect, an alternative p-value based analysis (ANCOVA) showed positive results. I don't know exactly what the differences between the analyses were, but generally you would hope for agreement between the two, unless the effect was pretty marginal or just not real. The data pulled out in the tweet below shows how strange this situation is. BAN2401 tweet

Given the ambivalent nature of the result, naturally some saw it as positive news, since there was at least something, while skeptics saw the opposite.


Aducanumab often seems like the most promising anti-Aβ antibody, and maybe the last chance for anti-Aβ antibodies to prove themselves. Back in 2015, Aducanumab showed some promising Phase Ib results. (I even wrote about it).

“They’re the most striking data we have seen with anything, period,” says [an AD trialist]

However, since then the many related trial failures, plus Biogen changing the trial design due to "variability", have left many people pessimistic. Perhaps BAN2401's recent results, however unsatisfying, show that an Aβ inhibitor is not just doomed to show no effect.


There doesn't actually seem to be much controversy about whether amyloid has a role in Alzheimer's; the genetic evidence is especially hard to dispute. I think the disagreement is more whether reducing Aβ plaques (or oligomers) can treat or prevent Alzheimer's. If the plaque is protective, then it's possible that reducing plaque may even worsen the disease.

There are also still plenty of unanswered disease mechanism questions, like whether it's oligomers or plaques that are causative, how Tau tangles cause neuronal death, and how tangles spread from neuron to neuron. Also, a 2018 paper suggests that Tau's function is the opposite of what we thought: instead of stabilizing microtubules, it keeps them dynamic.

One obvious question is why are there not more Tau-based drugs? Tau pathology is not a new idea and Tau's causal relationship with dementia is one of the least controversial parts of the AD story. In fact, there are now at least five Phase I trials underway, so these drugs might just be lagging behind Aβ42 antibodies by a few years. Certainly, Tau tangles being intracellular and in the brain makes drug development more complicated.

"Anti-tau antibodies don’t enter neurons and they don’t bind intracellular tau. We’ve invested a lot of careful rigorous work to try and understand this and I hope that the field will agree that we can put to rest that question"

(Crossing the blood-brain barrier is a problem for almost all AD drugs and especially antibodies — an interesting rule of thumb is that about 0.1% of antibody gets into the brain.)

Despite all the failures, I think the story is coming together and I'm pretty optimistic. We haven't actually tried that many ways of attacking the disease. I think that reducing plaque and/or oligomers very early could still work — mainly because we have seen the "drug" APP A673T working — meanwhile, reducing Tau tangles is arguably the most promising avenue of intervention, and it is yet to be properly tested.

Brian Naughton // Mon 11 September 2017 // Filed under biotech // Tags biotech vc

A brief look at Y Combinator's biotech investments in 2017.

Brian Naughton // Tue 27 June 2017 // Filed under biotech // Tags biotech drug development

Some notes on drug development: classes of drug targets and therapies.

Brian Naughton // Mon 06 February 2017 // Filed under biotech // Tags biotech transcriptic snakemake synthetic biology

How to automate protein synthesis pipeline using transcriptic and snakemake

Brian Naughton // Thu 26 January 2017 // Filed under biotech // Tags biotech iolt nodemcu arduino

An internet-connected lab thermometer

Brian Naughton // Sun 05 June 2016 // Filed under biotech // Tags data biotech genomics statistics

A review of interesting things in biotech, genomics, data analysis

Brian Naughton // Wed 24 September 2014 // Filed under data // Tags biotech wolfram alpha

Naming Boolean Biotech

Brian Naughton // Mon 01 September 2014 // Filed under outsourcing // Tags outsourcing experiments virtual biotech

Options for outsourcing


Boolean Biotech © Brian Naughton