VHH design competition results and easymosaic

A few months ago I launched a VHH binder design mini-competition. The itch I wanted to scratch was to see how well binder design tools do when run the way a typical user would run them, without hand-holding from the developers themselves.

There are more details in the original blogpost, but the gist was that the competitor submits a script to generate designs, and I run that script on a target.

If we had a "best script" for binder design, kind of like AlphaFold 3 is for folding, it would be hugely enabling for scientists.

I ended up allowing $100 of compute per design, which I thought was just on the edge of possibly producing a binder. It's also approximately the price of testing one design in the lab, which seems like a reasonable benchmark. The consensus from experts I talked to was that this would be insufficient to generate a binder. Turns out they were right! Nevertheless, here is the rundown.

Competitors

I convinced one person to enter this competition: Nick Boyd from Escalante Bio. Nick won the recent Adaptyv Nipah G competition using his own Mosaic protein design library (and it wasn't close!).

As you'd expect, Nick entered using a Mosaic script, similar to his Nipah G script, but adapted to generate a VHH instead of a mini-binder. While Mosaic is well validated for mini-binders, it has not really been tested for VHH designs, which are generally believed to be more difficult.

I entered using a BoltzGen script. My reasoning was that BoltzGen showed very strong results for VHH designs in their preprint, though they certainly used a lot more GPU hours than I did.

BoltzGen has arguably the strongest published VHH design results

Results

I tested the designs against MBP, part of Adaptyv's BenchBB benchmark, which is a set of seven standardized targets designed to be used for benchmarking. If you elect to make the results public, as I did, you get a discount.

I posted the scripts and full results from Adaptyv on the competition GitHub repo. The results should also appear on proteinbase.com in the near future. Of course, there is not much to see here, since none of the designs bound!

EasyMosaic

One complication of Mosaic compared to other tools like BindCraft, BoltzGen, or mBER is that Mosaic is a library, so the user is expected to define their own optimization parameters and loss function. For example, you could define a loss function as a weighted sum of ipTM, pLDDT, and distance to epitope. Different binder design problems might require a different balance of weights. This is a very powerful approach, and allows the user to tune Mosaic for different targets and use-cases, but it can be difficult to know where to start.
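To make the weighted-sum idea concrete, here is a minimal sketch in plain Python. It is not Mosaic's API: the Metrics class, the binder_loss function, the default weights, and the assumption that pLDDT is scaled to 0-1 are all hypothetical, chosen only to show how the balance of terms changes which design scores best.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    """Hypothetical per-design scores (not a Mosaic data structure)."""
    iptm: float              # interface pTM in [0, 1], higher is better
    plddt: float             # mean pLDDT, assumed rescaled to [0, 1], higher is better
    epitope_distance: float  # distance to the target epitope in angstroms, lower is better


def binder_loss(m: Metrics, w_iptm: float = 1.0, w_plddt: float = 0.5,
                w_dist: float = 0.1) -> float:
    """Weighted sum of three terms; lower is better.

    The confidence scores are flipped (1 - score) so that every term
    is something to minimize. The weights are illustrative only.
    """
    return (w_iptm * (1.0 - m.iptm)
            + w_plddt * (1.0 - m.plddt)
            + w_dist * m.epitope_distance)


if __name__ == "__main__":
    on_epitope = Metrics(iptm=0.70, plddt=0.85, epitope_distance=4.0)
    off_epitope = Metrics(iptm=0.85, plddt=0.90, epitope_distance=15.0)
    # With a non-zero distance weight the on-epitope design wins;
    # with w_dist=0 the higher-ipTM, off-epitope design wins instead.
    print(binder_loss(on_epitope), binder_loss(off_epitope))
    print(binder_loss(on_epitope, w_dist=0.0), binder_loss(off_epitope, w_dist=0.0))
```

Changing a single weight flips the ranking of the two example designs, which is exactly why the right starting weights are hard to guess for a new target.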

Part of the point of this competition was to see if Mosaic could be packaged into a user-friendly script. Since its success in the Nipah G competition, there has been quite a bit of interest in this.

With some advice from Nick on parameters, I made a web-based interface to Mosaic called easymosaic. As with most of my stuff, it runs on modal and lets you run Mosaic with some reasonable default parameters for mini-binders or VHHs. The mini-binder parameters should match the parameters used by Nick in the Nipah G competition.

Easymosaic is designed to do a decent job producing a binder without the need for parameter tuning. Your mileage will certainly vary a lot based on your target!

Like protein folding tools, easymosaic's interface has almost no options

Mosaic-TUI

Nick's own Mosaic-TUI is a similar idea, but is more suitable for power users. It runs in the terminal, exposes all the relevant parameters, and has some nice features like the ability to use multiple GPUs.

Both easymosaic and Mosaic-TUI use B200 GPUs by default, so it is very easy to spend hundreds of dollars for a few good designs. Each design, before filtering out the bad ones, can cost $1 or more.
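The arithmetic behind that warning is easy to sketch. The function below is a back-of-the-envelope estimate, not a measurement: the $1 per raw design figure comes from the text above, while the pass rate is a purely hypothetical input that you would have to estimate for your own target and filters.

```python
def expected_survivors(budget_usd: float, cost_per_design_usd: float,
                       pass_rate: float) -> tuple[int, float]:
    """Return (raw designs affordable, expected designs passing filters).

    pass_rate is the assumed fraction of raw designs that survive
    filtering; it is an illustrative input, not a known value.
    """
    n_raw = int(budget_usd // cost_per_design_usd)
    return n_raw, n_raw * pass_rate


if __name__ == "__main__":
    # Hypothetical scenario: $300 budget, $1 per raw design, 2% pass rate.
    n_raw, n_good = expected_survivors(300, 1.0, 0.02)
    print(f"{n_raw} raw designs, roughly {n_good:.0f} expected to pass filtering")
```

Under those assumed numbers, a few hundred dollars buys only a handful of designs worth testing.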

Mosaic-TUI has a sweet retro-futuristic UI

Sadly, it's a bit too late to use either of these tools to enter the Adaptyv RBX1 competition, but I'm sure there will be more competitions coming!

Hopefully, binder design tools will make some advances and I can try this again in a year or so, with a better chance of success. There are still plenty of things to try: combining the strengths of diffusion with hallucination; grounding designs in physics, etc.


Boolean Biotech VHH Design Competition 2025

This is quite a departure for this blog, but I thought it might be fun to follow Adaptyv Bio, Specifica, Ginkgo, et al. and run my own (tiny) protein design competition, the "Boolean Biotech VHH Design Competition 2025"!

Why do this when there are other, larger competitions? The twist is that instead of submitting a design tuned to the target, you submit a script that outputs designs for any target. The goal is to see how good we are at making VHHs with open models, limited compute, and no manual supervision. I am optimistic I'll get at least one submission!

The rules

  • For simplicity, entrants should use the standard hNbBCII10 VHH.
>hNbBCII10
QVQLVESGGGLVQPGGSLRLSCAASGGSEYSYSTFSLGWFRQAPGQGLEAVAAIASMGGLTYYADSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCAAVRGYFMRLPSSHNFRYWGQGTLVTVSS
>hNbBCII10_with_CDRs_Xd
QVQLVESGGGLVQPGGSLRLSCAASXXXXXXXXXXXLGWFRQAPGQGLEAVAAXXXXXXXXYYADSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCXXXXXXXXXXXXXXXXXXWGQGTLVTVSS
  • The target is Maltose-binding protein from BenchBB (PDB code: 1PEB). BenchBB, an Adaptyv Bio project, has seven targets to choose from. The alternatives are either too large (Cas9), arguably too small (BHRF1, BBF-14) or already well-trodden (EGFR, PD-L1 and IL-7Rα).
  • You submit one Python script that I can run using the uvx modal run command below (optionally using --with PyYAML or other libraries). The script should ideally use all the chains in the PDB file as the target. For simplicity, if your design tool uses only sequence and not structure, extract the sequence from the PDB file.
uvx modal==1.2.0 run {your_pipeline_name}.py --input-pdb {pdb_name}.pdb
  • I will use $50 of compute on any GPU available on modal to produce a binder. The modal script should output a file called {your_pipeline_name}.faa with a maximum of 10 designs that looks like this:
>{optional_info_1}
{binder_seq_1}
>{optional_info_2}
{binder_seq_2}
...
  • I can run non-open pipelines (e.g., pipelines that use PyRosetta), but the intention of the competition is to compare open pipelines, e.g., FreeBindCraft over BindCraft.
  • To rank designs, I will fold with AF2-Multimer with 3 recycles and MSA, and take the designs with the maximum ipTM. Of course, your script is free to do its own ranking and output a single result.
  • I will submit the 10 best submissions to BenchBB, with a max of one per entrant (though if there are fewer than 10 entries, I'll run more than one per entrant). I'll test more if I can! Ideally I would like to test multiple targets with the same pipeline.
  • You have until Friday, November 7th to submit. This is not that much time, but the hope is that submissions should mostly run existing open pipelines, adapted to run as a single modal script, so there should not be a lot of target-specific tuning going on.
  • Since I have no idea if I'll get any submissions within this timeframe, and it's a pretty casual competition, I reserve the right to change the rules above a bit. I think there is a good chance I'll end up just submitting some designs myself, but it's fun to let other people try if they like!
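As a rough illustration of the sequence-handling side of these rules, here is a minimal sketch in plain Python. It locates the three CDR regions in the X-masked hNbBCII10 framework, grafts designed loops into them, and writes at most 10 designs to a .faa file ranked by ipTM. The function names (cdr_spans, graft, write_faa), the header format, and the requirement that each loop keeps the masked length are assumptions made for this example, not part of the rules; the ipTM values would come from whatever folding step a pipeline runs.

```python
import re

# Masked framework copied from the rules; each run of X marks one CDR.
TEMPLATE = (
    "QVQLVESGGGLVQPGGSLRLSCAASXXXXXXXXXXXLGWFRQAPGQGLEAVAAXXXXXXXX"
    "YYADSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCXXXXXXXXXXXXXXXXXXWGQGTLVTVSS"
)


def cdr_spans(template: str) -> list[tuple[int, int]]:
    """Return (start, end) indices for each run of X in the template."""
    return [m.span() for m in re.finditer(r"X+", template)]


def graft(template: str, cdrs: list[str]) -> str:
    """Replace each X run with a designed loop.

    Assumption for this sketch: each loop has the same length as the
    masked region it replaces.
    """
    spans = cdr_spans(template)
    if len(cdrs) != len(spans):
        raise ValueError(f"expected {len(spans)} CDRs, got {len(cdrs)}")
    parts, prev = [], 0
    for (start, end), loop in zip(spans, cdrs):
        if len(loop) != end - start:
            raise ValueError("loop length does not match masked region")
        parts.append(template[prev:start])
        parts.append(loop)
        prev = end
    parts.append(template[prev:])
    return "".join(parts)


def write_faa(designs, path, max_designs=10):
    """Write the top designs by ipTM as FASTA.

    designs: iterable of (name, sequence, iptm) tuples.
    Returns the list of designs that were written.
    """
    ranked = sorted(designs, key=lambda d: d[2], reverse=True)[:max_designs]
    with open(path, "w") as fh:
        for name, seq, iptm in ranked:
            fh.write(f">{name} iptm={iptm:.2f}\n{seq}\n")
    return ranked
```

A pipeline would call graft once per candidate set of loops, score each full sequence, and then pass the (name, sequence, ipTM) tuples to write_faa with a path like {your_pipeline_name}.faa.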

The competition

Obviously, this will be a small competition, so I won't be too strict if there are issues, but I don't want to spend time on environments, JAX, CUDA, etc. This is a very appealing aspect of forcing the competition to run on uv and modal: one portable script should be able to do whatever you need.

All the code, designs and stats will be made public, and will appear on ProteinBase (Adaptyv Bio's public database), hopefully a few weeks after the competition ends. Adaptyv Bio has its own BenchBB stuff in the works too.

The prize is even better than lucre: it's glory, and maybe a t-shirt? A plausibly easy way to enter would be to use the IgGM modal app from the biomodals repo, which should be almost plug-and-play here.

This competition is difficult, maybe way too difficult! Even the best models today recommend testing tens of designs for every target. So it might be impossible, but I am struck that one submission from the BindCraft team to the 2024 Adaptyv competition bound, and at 100 nM too!

If the expected outcome of everything failing comes to pass, maybe I will try again when the technology has progressed a bit.

(Thanks to Nick Boyd for help with figuring out the rules.)


OneStart 2015

Boolean Biotech was fortunate to have been included as a semi-finalist in Oxford Biotech Roundtable's OneStart competition.

http://onestart.co/update/onestart-2015-semi-finalists-announced