Brian Naughton // Tue 27 June 2017 // Filed under biotech // Tags biotech drug development

I looked around for a broad review of this area, discussing classes of drug targets and therapies, but didn't really find anything. These are some notes I made in an effort to help understand the landscape.

Drug target types

Every drug has a "target", a biomolecule that the drug affects in humans. There are only a few options for targets.

  • Underexpressed protein
    Sometimes, a protein is lacking and needs to be upregulated or replaced. The solution, for extracellular proteins, can be as simple as injecting the missing protein. This may sound like an obscure element of drug development, but it's basically the foundation of biotech. Genzyme started out replacing the missing enzymes in Gaucher and Fabry disease; Genentech got their start with recombinant insulin, an underexpressed peptide; Amgen's first big drug was recombinant G-CSF (Neupogen).

  • Overexpressed protein
    Sometimes, you are making too much protein and you need to dial it down. Many cancer targets are like this (e.g., HER2 amplification), though some are mutated too. This is one of the easiest scenarios for drug development, because all you have to do is knock the protein down with high specificity. It's much easier to break something than fix it.

  • Malfunctioning protein
    If you have a genetic disease, there's a good chance it's due to an absent or malfunctioning protein. Cystic fibrosis has over a thousand known causative mutations and most of these mutations cause misfolding. How do you fix a misfolded protein? It's always difficult and sometimes impossible. Vertex's CF drugs represent one unusual success story. Their latest drug, Orkambi, is actually two drugs in one: one that increases CFTR's activity (Kalydeco, a drug originally for the G551D mutation only), and the other helps the mutated CFTR fold properly.

  • Non-proteins
    There are things in the body besides proteins, for example, hormones, lipids, DNA. These are much less common targets for therapies, though amazingly gene therapy is starting to become feasible. (There are now two human gene therapies approved in Europe, and several veterinary gene therapies.) According to a 2016 review in NRDD, FDA-approved drugs targeted non-proteins in only 28 of 695 cases (about 4%).

Choice of Therapy

There is a plethora of options these days.

  • Small molecules
    • Chemical
      This is what most people mean by a drug. You use chemistry to make them: perhaps you have a large library of randomly generated chemicals that you screen against a target, or perhaps you design your drug and simulate target binding computationally. Computer-aided drug design (CADD) works sometimes and is improving quickly, partially thanks to deep learning.
    • Biological
      Unlike chemically-derived small molecules, you don't create these, you find them in organisms in the soil, the ocean, etc. These molecules are generally more complex structurally than regular small molecules, since they've had millions of years to evolve that complexity. Many drugs come from natural sources, especially antibiotics; the main issue is how to find them. We have already found much of the low-hanging fruit, like penicillin, many times over.
    • DNA-encoded
      It's obvious that you can use DNA to encode and evolve proteins. Less obviously, you can encode small molecules too, by chemically fusing DNA barcodes to your chemicals. These libraries are interesting because just like proteins, you can evolve them, but you retain the flexible chemistry of small molecules. DiCE Molecules, which launched last year, is doing this.
  • Proteins
    • Recombinant
      You can make almost any protein recombinantly in a microbe, then use that to replace a missing (extracellular) enzyme. Proteins are big, and get degraded quickly in the gut by proteases, so you usually have to inject them. (If ingesting proteins had drug-like effects, then eating would be more hazardous!) They are also usually too large to enter cells efficiently, further limiting their use as drugs.
    • Antibodies
      Antibodies are proteins, but with a specific template for interfacing with the immune system. They are very large and cannot enter cells, so they only target cell-surface or extracellular targets. That works extremely well for the 10-20% of proteins that are accessible that way. Antibodies are our closest thing to a magic bullet, and are among the most successful drugs of all time. You can evolve antibodies in mice, use phage display, or identify them in humans. There are several varieties: antibody fragments, nanobodies, BiTEs, etc.
    • Peptides
      Peptides are basically just short proteins. They are interesting potential drugs because they exhibit some of the properties of small molecules (small size, sometimes able to enter cells) and some of the properties of proteins (safe, easy to synthesize). Peptides are still rapidly degraded in the gut, so most are injected. Some of the limitations of peptides being proteins can be overcome by using peptidomimetics like D-peptides.
    • Peptides (DNA-encoded)
      Peptides are generally encoded by DNA anyway, but there is a nice way to evolve a library of peptides, analogous to DNA-encoded small molecules: the decades-old-but-still-futuristic phage display. You can use phage display to select for antibody fragment binding too.
    • Protein + nucleotide
      CRISPR/Cas9 is a gene editing technology combining an enzyme that cuts DNA (Cas9) and a guide RNA that defines the target. Cas9 is a sophisticated enzyme, so it is quite large and cannot enter cells unaided. That means to make Cas9 into a real therapy you need a way to deliver it. Older gene editing technologies like TALENs and Zinc Fingers also work well if they can be delivered, though they are much harder to program than Cas9.
    • Protein + small molecule
      Antibody-drug conjugates promise to combine the advantages of antibodies (targeting cells with high specificity) with small molecules (ability to enter cells). The canonical example is an antibody designed to find a cancer cell, then deliver an attached "warhead" small molecule to kill it. Stemcentrx is one of many companies working on ADCs.
  • Nucleotides
    • mRNA
      Instead of delivering proteins to cells, why not just deliver the instructions to make the protein? Unfortunately, just like proteins get degraded by proteases, RNA gets degraded by nucleases, so again it comes down to delivery. Like proteins, the body tolerates nucleotides very well. Moderna is developing several mRNA-based therapies.
    • Antisense RNA
      Antisense RNA binds to mRNA and interferes with translation. There have been two successful antisense RNA therapies recently: one for Crohn's (mongersen, which can be orally administered, since Crohn's is a disease of the gut), and one for SMA (nusinersen, which is injected into the CSF). Despite not being the coolest technology around, antisense RNA has had some amazing successes.
    • Aptamers
      Like antibodies, aptamers are evolved to bind their targets with high specificity, but instead of amino acids, they are made from DNA or RNA. Aptamers can be evolved using a simple process called SELEX (unfortunately IP-encumbered). As nucleotides, aptamers are degraded quickly and require special delivery mechanisms. The only approved aptamer therapy, Macugen, is for AMD and it's injected directly into the eye.
    • RNAi
      RNAi encompasses a few related systems that are difficult for me to differentiate: siRNA, shRNA, miRNA, piRNA. RNA interference is an extremely effective tool for modifying cell lines, but so far has been less effective as a therapy. Alnylam has several RNAi therapies in development.
    • Raw DNA/RNA
      Raw DNA or RNA with the right sequence will do gene editing at low frequency, even without a proper vector. You can also inject complete plasmids that get transcribed and translated (assuming some accompanying stress like electroporation). This is called a DNA vaccine, which is conceptually quite similar to an mRNA therapy, except that DNA vaccines are injected with an adjuvant like Freund's adjuvant, to elicit an immune response to the resultant protein.
  • Cells
    Cell-based therapies are becoming increasingly important, especially for cancer treatment. In cancer, the idea is to train the immune system (usually T-cells) to attack cancer cells more efficiently. Since the vast majority of cancers are quashed by the immune system anyway, this makes a lot of sense. CAR-Ts — perhaps the best-known cell therapy for cancer — are T-cells that have been extracted and forced to express a cell-surface antibody fragment with specificity for a cancer, which forces the T-cell to engage the cancer. There's a lot more going on in cell-based therapy, like stem cell therapies, but I don't know too much about this.


The big advantage of small molecules over proteins and nucleotides — and a watchword for most of the novel therapies above — is delivery: small molecules can get across the gut and penetrate the cell membrane to bind with targets inside the cell. Proteins and nucleotides are degraded quickly, especially in the gut, and proteins are usually way too large to enter cells anyway.

There are exceptions to this rule, cases when cells are more accessible: for example, the eye is easier to access than most organs (e.g., the aptamer therapy, Macugen); if your target is in the gut then you may be able to deliver it orally (e.g., the antisense RNA therapy, mongersen); if you can apply your therapy ex vivo (especially retransplantable tissue like liver or bone marrow) then you no longer need a delivery vector (e.g., Provenge and other adoptive cell transfers, gene editing in embryos). Examining the pipelines of the new batch of gene editing and RNA therapy biotechs, we see that the diseases they tackle are very often in the bone marrow, liver or eye.

If your target is inside a cell, the cell cannot be extracted for treatment ex vivo, and you can't develop a good small molecule, then what can you do? Many of the modern protein or nucleotide-based therapeutics listed above need some additional help to protect them from degradation, localize them to target cells, and penetrate the cell membrane. It could be argued that delivery technology has lagged behind drug technology generally. There are still no great options for delivery.

  • Viruses
    Viruses, particularly AAV, are especially useful for nucleotide therapies like gene therapy. This makes sense when you consider that viruses are nanomachines, evolved over billions of years to target cells and inject nucleotides. However, since viruses are often immunogenic, there can be side-effects.
  • Nanoparticles / Liposomes
    In rare cases, liposomes have been used to deliver drugs, like cisplatin. We were recently surprised to learn that Moderna, the mRNA biotech, is using lipid-nanoparticles to deliver mRNA to cells. Virus-like particles (essentially noninfectious viruses) may combine the best of both worlds.

For a more detailed discussion on delivery methods, there's a good recent review in NRDD.


It feels apt to write about virtual companies from the beautiful new Hanahaus space in downtown Palo Alto. $3 an hour for a seat, and coffee by Blue Bottle. Rent in Palo Alto is actually not so bad if you share...

Hanahaus, Palo Alto

My impression of the typical "virtual biotech" is a company that is spun out to develop a compound originally discovered in an academic lab, or licensed from a larger biotech. There are only a few employees, usually pharma veterans, whose job is to shepherd the compound from CRO to CRO, and develop just enough evidence that the compound can be sold.

Recently, developments in biotech — analogous to the move to cloud computing in IT — may allow for a more complete virtual drug development company. Below I summarize how this might work, and the companies and technologies that enable it.

Choose your therapy

Generally, biologics are going to be a better fit for a virtual model than small molecules.

The chemistry of drug development requires very specialized expertise, and large pharma/biotech has institutional knowledge that is extremely difficult to compete with. Also, because small molecules can be made of anything, their off-target effects can be difficult to predict (even aspirin is not completely understood).

Nucleotide-based technologies like RNAi (Alnylam), mRNA (Moderna), and CRISPR/Cas9 (Caribou, Editas) would be ideal. Since they are nucleotide-based, binding relies on sequence identity, so it's much closer to a digital system. Theoretically, you can change targets simply by changing the nucleotide sequence, which makes the process much more predictable. Nucleotide binding is generally easier to predict because a 1D search space (the human genome, plus perhaps commensal bacterial genomes) is so much more constrained than a 3D search space (all structures/epitopes present in and on cells). Of course, these technologies have their own issues in that they are new and untested.
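
To make the "digital" point a little more concrete, here is a minimal Python sketch of why sequence-based targeting is easier to reason about: finding candidate binding sites reduces to a string search with a mismatch tolerance. The function name, the toy sequences, and the mismatch threshold are all illustrative assumptions. Real guide or siRNA design also has to consider both strands, PAM sites, thermodynamics and more.

```python
def find_sites(genome, guide, max_mismatches=1):
    """Return (position, mismatches) for every window of `genome` that
    differs from `guide` at no more than `max_mismatches` bases."""
    hits = []
    k = len(guide)
    for i in range(len(genome) - k + 1):
        window = genome[i:i + k]
        # Hamming distance between the window and the guide
        mismatches = sum(a != b for a, b in zip(window, guide))
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits


# Made-up toy sequences: one exact site and one near-match (a potential off-target)
genome = "TTGACCGTAAGGCTTGACGGTAAGG"
guide = "GACCGTAAGG"
print(find_sites(genome, guide))  # [(2, 0), (15, 1)]
```

Retargeting is just a matter of passing a different `guide` string, which is the sense in which the system is closer to digital. There is no equivalent one-line search over all the 3D structures a small molecule might bind.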

Protein-based biologics are arguably a good compromise. For example: enzymes (enzyme replacement therapy is worth billions of dollars a year), antibodies (seven of the eight top selling drugs in 2013 were antibodies), BiTEs and CAR-Ts (cancer immunotherapy companies like Juno are showing extremely promising results). These technologies provide a more consistent design template than a small molecule (i.e., DNA), but there is still a lot that remains unpredictable, such as off-target binding for antibodies, or even how the protein will fold.

Drug repositioning

Another reasonable option is to use an existing library of small molecules (e.g., from NCATS) with some additional data that can be mined (e.g., expression changes in model organisms). This process is usually called drug repositioning, and there are indeed many such companies springing up as the amount of available data increases, and methods for prediction using statistical models (machine learning) improve (twoXAR, NuMedii, AtomWise).

You can also combine these two concepts, by applying CRISPR/Cas9 to a model organism to create a disease model, and then testing that model against a library of compounds (Recursion Pharma, Perlstein lab). Creating these disease models straightforwardly may be one of the major initial uses of CRISPR/Cas9 (amazingly, now mainstream enough that you can order yours from Agilent).

Choose your advantage

Without the resources of a large biotech, how can a virtual company compete? After all, pharma/biotech has thousands of potential therapies sitting on the shelf. A therapy that works great in yeast, or even mouse, is not necessarily worth much because most of the risk in drug development happens after the preclinical research stage (an orphan disease with no treatments is an easier sell).

Since the eventual goal is a safe and effective therapy, that means there are three advantages your therapy could have:

  • More Safety The therapy has already been shown to be safe in a clinical trial, or is a generic/off-patent drug (twoXAR, Recursion, NuMedii)
  • More Efficacy The therapy works in multiple distinct organisms, so it should work in humans (Perlstein lab)
  • More Safety and More Efficacy The therapy comes directly from a human, therefore there is some indication that it's safe and effective in a human (X01, Neurimmune). Recent applications of human genetics in drug discovery (e.g., PCSK9 inhibitors) rely on a similar concept.

Create and test your therapy

  • Create
    • Design: A good example of where protein engineering is important is BiTEs. You can think of two things that should be colocated (like T-cells and cancer cells), and synthesize a molecule that binds both.
    • Find: A surprising fraction of drugs are still "natural products", many discovered through bioprospecting. Recently, with the incredible amount of sequencing capacity available, we can do this at scale from microbes (Warp Drive Bio) or maybe even from humans.
    • Repurpose: You can just try all the compounds in a commercial screening library. They may have already been picked over though!

Percentage of drug approvals that were natural products ("N"); Newman & Cragg 2013

  • Test
    • Model organism: Testing in simple model organisms is great, if you have a good model (apparently, yeast is a good model for Alzheimer's). It also helps you parallelize your experiments, since you can grow these little organisms in wells.
    • Human cells: This method becomes especially powerful when combined with CRISPR/Cas9, even with a relatively low yield for now. Rooster Bio and Extem Bio are two startups providing MSCs (mesenchymal stem cells — not iPSCs) at competitive prices (Extem claims to have the largest stem cell library by several orders of magnitude). Of course, every large biotech is using stem cells too (e.g., AstraZeneca).
    • Animal: Mammalian animal models (usually mice or rats) are expensive (probably $10k-100k per experiment), but currently necessary for any kind of serious drug development effort.

Choose your development methods

Since this company is virtual, there are severe limitations on what is possible, so the choice of development methods is extremely important. The experiments must be inherently amenable to virtualization.

Synthetic biology

If you are going to develop a biologic virtually and on the cheap, then you'll probably want to use synthetic biology. The iGEM synthetic biology competition gives some indication of how that might work (list of iGEM projects). iGEM is mostly focused on bacterial sensors and the like, but when the worlds of iGEM and drug development collide it is fascinating.

Synthetic biology allows you to iterate on and parallelize your experiments in ways that are very suited to virtualization. For example, if you want to do some mutagenesis on your protein, you can use a kit or write some code to edit the sequence directly. The use of synthetic DNA means you can worry less about the experimental process (purification of PCR products, general lab hygiene) and rely less on the hard-won expertise of lab science. You get increased reproducibility for free.
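
As a rough illustration of what "write some code to edit the sequence directly" can look like, here is a minimal Python sketch of an in-silico point mutation: swap one codon in a coding sequence, then send the new sequence off for synthesis. The function name, the toy sequence, and the tiny codon table are illustrative assumptions, not part of any particular tool.

```python
# One arbitrary codon per amino acid, just enough for this example.
# A real design would use a full, organism-specific codon usage table.
CODON_FOR = {"A": "GCT", "G": "GGT", "K": "AAA", "M": "ATG", "V": "GTT"}


def mutate_codon(cds, position, new_aa):
    """Replace the codon at 1-based amino acid `position` with a codon for `new_aa`."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    start = 3 * (position - 1)
    if not 0 <= start < len(cds):
        raise ValueError("position is outside the coding sequence")
    return cds[:start] + CODON_FOR[new_aa] + cds[start + 3:]


wild_type = "ATGGGTAAAGTT"  # toy sequence encoding M-G-K-V
mutant = mutate_codon(wild_type, 2, "A")  # glycine at position 2 becomes alanine
print(mutant)  # ATGGCTAAAGTT
```

Looping this over every position and amino acid gives a full mutagenesis library as a list of strings, which is the kind of experiment that is tedious at the bench but trivial to specify in code.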

Synthesizing DNA is still expensive at 10-20c per base (that's at least a million times more expensive than sequencing) but companies like Gen9, Twist Bioscience and Cambrian Genomics should be able to bring the price down an order of magnitude within a few years. That will mean $10-50 proteins and antibody fragments, which should enable a lot more kinds of parallel experimentation.
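
The arithmetic behind the "$10-50" figure can be sketched in a few lines of Python. The per-base prices are the ones quoted above, reduced tenfold. The construct lengths (1,000 and 2,500 bases) are rough assumptions chosen for illustration, not measured values.

```python
def synthesis_cost(length_bases, dollars_per_base):
    """Cost in dollars of synthesizing a DNA construct of the given length."""
    return length_bases * dollars_per_base


# Prices from the text: 10-20c per base today, an order of magnitude lower later.
today = (0.10, 0.20)
future = tuple(price / 10 for price in today)

# Assumed construct lengths in bases (illustrative only).
constructs = {"shorter construct": 1000, "longer construct": 2500}

for name, length in constructs.items():
    low, high = (synthesis_cost(length, price) for price in future)
    print(f"{name} ({length} bases): ${low:.0f}-${high:.0f}")
```

Under these assumptions the cheapest case is about $10 and the most expensive about $50, versus roughly $100-500 at today's prices.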

You can get a bit of help with designing your vector and protein using software like Genome Compiler or Benchling (as used by Gen9).

Rob Carlson's DNA synthesis cost curve

Cloud labs

The other crucial ingredient in the modern virtual biotech is the cloud lab (as I've discussed in several previous posts): Transcriptic, Autodesk Wet Lab Accelerator (in beta just this week, and built on top of Transcriptic), Arcturus BioCloud, Emerald Cloud Lab (in beta), Synthego (not yet live) and Riffyn (not yet live). None of these companies existed just a year or two ago.

Just like SnapChat can build a massive messaging app on top of Google App Engine to compete with Facebook et al., and many other lean internet companies build on top of AWS, the virtual biotech should take advantage of scalable cloud services for experiments too.

Sadly, you cannot do everything with synthetic biology and cloud labs just yet. For experiments that don't fit into these boxes, there is always Science Exchange and Assay Depot. There are also a couple of exciting stealth animal experimentation startups coming out soon! The CRO world, like IT vendors before the era of AWS, is set for disruption.


Boolean Biotech © Brian Naughton