Brian Naughton | Sun 09 February 2025 | health | health

This post is a kind of brain dump of health-related product information I have gathered over the past few years. The major themes are eliminating plastic, heavy metals, and other suspicious stuff, and using stainless steel, cast iron, silicone, and glass where possible. This post is really long, so I put a tl;dr list of all the products at the end of the article. I may update this page if I think of more!

There are a few resources that have useful information:

  • Consumer Reports actually does a decent amount of health stuff. It is behind a paywall, but accessible for free online through some libraries. Usually when they look into a subject, they do a pretty thorough job. For example, they tested 126 herbs and spices for heavy metals.
  • ConsumerLab's mission is "To help consumers and healthcare professionals find the best quality health and nutrition products through independent testing and evaluation." ConsumerLab tests a lot of products, but all the information is behind a paywall. I bought a subscription to test it out, and I think it's decent. They seem quite careful in their testing, but the probability they have the specific information you are looking for is low. For example, their article on protein powders is extremely rambling, yet for heavy metal testing it just links to a confusing Clean Label Project page.
  • Cochrane collaboration is "a charitable organisation formed to synthesize medical research findings to facilitate evidence-based choices about health interventions". This is much broader than product reviews, but if they cover your specific topic of interest it can be very informative. The reviews are rigorous to the point of being over-cautious. If there were no clinical trial for parachute use, they would have to say the data is inconclusive. This humorous BMJ article agrees:

    Parachute use did not reduce death or major traumatic injury when jumping from aircraft in the first randomized evaluation of this intervention. However, the trial was only able to enroll participants on small stationary aircraft on the ground, suggesting cautious extrapolation to high altitude jumps.

  • Elevate is a nice, short book that had some practical advice on the 80/20 of health interventions. Most memorably, the author recommends wearing natural fibers to limit skin exposure to plastics.
  • plasticlist.org is a project funded by Nat Friedman where they tested ~300 foods for plastic contamination and helpfully made the results publicly available. I believe they are supposed to continue to update the list over time.
  • implasticfree.com is "the easiest, fastest way to find plastic alternatives". They don't seem to do much independent testing but the information I have seen on there is quite well researched. For example, they have a good list of plastic-free kettles.

Protein powder

For various reasons — e.g., edicts from health influencer Peter Attia — many of us are eating more protein these days. Whey, collagen, and creatine are probably the three major protein supplements, and usually come in the form of concentrated powders. Anything concentrated is a concern for toxins, especially heavy metals.

When buying these, I am optimizing for these factors, in order of importance:

  1. Third-party lab tests, preferably with results (since you can test and still get bad results!);
  2. Price;
  3. Form factor (I prefer bags to big tubs);
  4. Quality, including purity (sometimes ConsumerLab has information on this).

My current favorites from each category are the following:

  • AGN Roots Whey Isolate: This whey is inexpensive and their literature has some reassuring information around heavy metal testing, including providing actual numbers. Whey isolate is generally considered better than concentrate, with easier absorption and less cholesterol.

    "AGN Roots Grassfed Whey heavy-metals test results will show a Lead Concentration of < .005 mg/kg (ppm). Our grassfed whey contains up to 100 times lower concentrations of heavy metals (Mercury, Arsenic, Lead, Cadmium) than the typical "grass-fed" whey supplier."

  • Natural Force Collagen: This is a reasonably priced collagen that provides some specific test results.

    "We test for Mercury (Hg), Lead (Pb), Arsenic (As), and Cadmium (Cd) on every batch. Here are some recent test results:
    Unflavored
    Mercury 0.0029 mcg/g (µg)
    Lead 0.0039 mcg/g (µg)
    Arsenic <0.001 mcg/g (µg)
    Cadmium 0.0021 mcg/g (µg)"

  • It's Just! Creatine: This is one of very few creatine brands that claim to test for heavy metals. (In fairness, heavy metal contamination is not believed to be a huge concern for creatine). Naked Nutrition Creatine also claims to test for heavy metals, though neither has specifics about their testing. It's Just! Creatine comes in a bag, which I happen to prefer, so I buy that. I used to buy from bulksupplements.com but there are some concerns about their testing. Creapure is more expensive, has extensive testing (though no specific mention of heavy metals) and has also been recommended to me.
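The two test reports quoted above use different units (mg/kg, labeled "ppm", for AGN Roots and mcg/g for Natural Force), but these are the same mass fraction: 1 mcg/g = 1 mg/kg = 1 ppm. As a rough sketch, the snippet below puts the two lead figures on the same scale and converts them to micrograms per serving. The 30 g serving size is an assumption chosen only for illustration; it is not taken from either vendor.

```python
# Mass-fraction units: 1 mcg/g == 1 mg/kg == 1 ppm (by mass).
UNIT_TO_PPM = {"mcg/g": 1.0, "mg/kg": 1.0, "ppm": 1.0, "ppb": 0.001}


def to_ppm(value: float, unit: str) -> float:
    """Convert a reported concentration to ppm by mass."""
    return value * UNIT_TO_PPM[unit]


def mcg_per_serving(ppm: float, serving_g: float) -> float:
    """Micrograms of contaminant in one serving (1 ppm == 1 mcg per gram)."""
    return ppm * serving_g


# Lead figures quoted above; the AGN Roots number is an upper bound ("< .005").
lead_agn_roots_max = to_ppm(0.005, "mg/kg")
lead_natural_force = to_ppm(0.0039, "mcg/g")

# ASSUMPTION: a 30 g serving, used purely for illustration.
SERVING_G = 30.0
print(f"AGN Roots: < {mcg_per_serving(lead_agn_roots_max, SERVING_G):.3f} mcg lead per serving")
print(f"Natural Force: {mcg_per_serving(lead_natural_force, SERVING_G):.3f} mcg lead per serving")
```

Swap in the serving size printed on the label to get a number you can compare against whatever daily limit you care about.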

Cinnamon

Don't die guy Bryan Johnson recently tweeted about the presence of heavy metals in cinnamon. There are actually two kinds of cinnamon: Ceylon cinnamon ("true cinnamon") and Cassia cinnamon. Johnson was referencing Ceylon cinnamon, which tastes better than Cassia and has less coumarin. Coumarin is a liver toxin, so in general best avoided.

Coumarin is a flavouring substance which is contained in relatively high concentrations in cinnamon varieties collectively known as "Cassia cinnamon". In especially sensitive persons, even comparatively small quantities of coumarin can cause liver damage, although the effect is usually reversible.

In 2024, Consumer Reports tested 36 cinnamon powders for lead. 365 Whole Foods Organic Ground Cinnamon came out on top at 0.02 ppm. However — like all cinnamon that does not specifically say Ceylon — this is the cheaper Cassia variety. The best Ceylon cinnamon tested was Penzey's at 0.78 ppm. Consumer Reports rated this "ok", but it is uncomfortably close to their "don't use" rating. So, unfortunately, I do not know a great source of cinnamon. According to Consumer Reports, basil, oregano and thyme are also almost universally contaminated!

Peanut butter

Like cinnamon, peanut butter is a seemingly healthy food with some downsides. For one thing, due to their longer growing cycle, nuts can concentrate heavy metals. (Brazil nuts contain so much selenium that you are supposed to eat fewer than 5 per day.) Also, do not eat a pound a week of peanut butter since that can cause liver damage.

The real problem peculiar to peanut butter is a mold-derived natural product called aflatoxin, a very potent toxin and carcinogen. In 2004, 125 Kenyans died from aflatoxin poisoning after eating contaminated maize, representing a fatality rate of 39%!

Compared to whole peanuts, I am told peanut butter is especially prone to aflatoxin contamination because the mold that produces it can grow on the lower quality peanut scraps and shavings used to make peanut butter. This may or may not be true — I could not find a citation.

There is also the unfortunate fact that the smaller, higher quality peanut butter brands may actually have a higher probability of contamination than the sugar and palm oil-laden Skippys and Jifs of the world, simply because the larger players have enough scale to do frequent testing. The FDA tests for aflatoxins, so in general there should be no brands with a huge problem here.

Because of these health risks, the FDA has published action levels for aflatoxin and regularly tests foods for the presence of aflatoxins. By using modern agricultural and processing techniques, companies can reduce the possibility of contamination in their products.

There is a specific variety of peanut, Valencia peanuts, that is less prone to aflatoxin contamination. Kirkland peanut butter is currently made with US Valencia peanuts. Since we had been buying Kirkland peanut butter for a while, I actually went as far as testing it for aflatoxin with the excellent Trilogy Lab, who graciously tolerated my tiny order. Kirkland had undetectable levels of aflatoxin, so that's what we exclusively buy now.

Aflatoxin levels in Kirkland Valencia peanut butter were undetectable (results from Trilogy Lab)

Turmeric

Turmeric is a popular supplement with pretty weak evidence for any positive effects, but also weak evidence for any negative effects. However, like cinnamon, and powders in general, there are concerns around heavy metal contamination.

Despite the paucity of evidence, I do sometimes take turmeric. I buy it from BioSchwartz because (a) it's turmeric extract, which apparently removes some impurities, and (b) BioSchwartz does a lot of testing and is credible enough to be used in some studies.

Interestingly, there is also evidence that curcumin may chelate heavy metals, and hence reduce the total heavy metal burden on the body. On balance I find it worthwhile to take some turmeric, but I don't think it's an obvious win.

Sunscreen

For regulatory reasons — basically, because sunscreen is regulated like a drug in the US — it is difficult to buy good sunscreen in the US. This used to be a niche concern, but is now so mainstream that even AOC talked about it.

Recently, I found a good source of sunscreen information, Lab Muffin Beauty (e.g., her Top Sunscreen Recommendations for 2024). Lab Muffin Beauty is a PhD cosmetic chemist, and seems to have a ton of expertise on the topic. One interesting point she makes is that "physical" (Zinc or Titanium-based) sunblocks are not really any safer than "chemical" sunscreens. The basic argument is that both enter your bloodstream at low concentrations, and both appear to have comparable, and negligible, health impacts. She convinced me, for what it's worth. (Note, usually "sunblock" refers to the physical kind and "sunscreen" the chemical kind.)

Lab Muffin Beauty also had an interesting video that references the cancer-causing thymine dimer formation that can happen when UV damages DNA.

Thymine dimers resulting from UV radiation. These mutations can cause cancer.


The sunscreens I currently buy are:

These two sunscreens are "watery" style, which I prefer. All of the "white" sunscreens I have ever used leave white deposits everywhere. I don't have recommendations for non-watery sunscreens but Lab Muffin Beauty also has plenty of recommendations in her video.

One thing to be aware of is that lots of brands have US versions of their sunscreens where they use the exact same brand names (e.g., La Roche-Posay Anthelios), so it's tricky to know what you are buying.

Toothpaste

Toothpaste has the same problem as sunscreen in that it is regulated like a drug in the US. Again, this means US toothpastes lag in technology by years.

The most notable ingredient that is missing from US toothpaste is NovaMin, a bioactive glass found, for example, in European Sensodyne but not US Sensodyne. This is especially egregious to me because the European Sensodyne reduces sensitivity by remineralizing teeth while the US Sensodyne includes a numbing agent (potassium nitrate) that just masks sensitivity!

(1) Bioactive glass (NovaMin) alone exhibited promising remineralization capabilities compared with a combination of fluoride and bioactive glass or just fluoride; (2) bioactive glass with fluoride seemed to potentiate the effect of fluoride alone

There is actually an interesting story behind NovaMin. To summarize, GSK bought the rights to NovaMin but the US trials were too expensive given the size of the toothpaste market. Realistically, not many people care enough to look into what ingredients US Sensodyne has vs European Sensodyne.

There is good news though. I don't understand the regulatory story here (it may be because they do not claim cavity prevention effects), but Amazon now sells toothpastes containing nano-hydroxyapatite, which seems to be as good as NovaMin at remineralization. Examples include Davids, which does not have fluoride, and Made by Dentists, which does. Note that fluoride also remineralizes, albeit via a different mechanism, and I believe using both is recommended.

While fluoride toothpastes work by depositing fluoride ions into your enamel, fluoride free toothpaste with hydroxyapatite works by depositing naturally beneficial materials like calcium and phosphate ions into your enamel.

So, while fluoride is renowned for its strength, hydroxyapatite stands out for its natural affinity with the teeth and its potential to excel in specific oral health aspects. That's why many people suggest that both nano hydroxyapatite and hydroxyapatite are better than fluoride.

Unlike supplements and sunscreen, the effects of NovaMin/nano-hydroxyapatite are quite visible. You can see the "dentine tubule occlusion" process with a Scanning Electron Microscope within just a few days.

Pre- and post-treatment with nano-hydroxyapatite

Biomin F

The strongest remineralizing agent of all appears to be "Biomin F", which deposits fluoroapatite, a bioglass claimed to be 10 times more acid resistant than hydroxyapatite. Of course, you cannot buy Biomin F in the US, even though it received FDA clearance in 2021 (note: clearance, not "FDA approval" as it says in the article!). You can buy Biomin C, but that is very similar to regular nano-hydroxyapatite. Not only that, but stores worldwide are low on Biomin F inventory for some reason. The Biomin Canada store is out of stock until "Q2 2025". Thankfully, at the time of writing, you can buy Biomin F on ProSmile for the reasonable price of $12 per tube. This is where I bought mine.

The British Dental Journal news article on Biomin F includes this hilariously mercantile take. Dentists really get away with some things doctors would not dare say:

BioMin F delivers controlled fluoride release to strengthen teeth against acid attack and reduce dentine hypersensitivity including scaling and post-bleaching sensitivity, a problem that reduces patient acceptance of this highly profitable revenue stream.

Floss

Floss is likely a significant source of PFAS (the Teflon-related "forever chemicals" that don't break down) and microplastics. A recent NHANES report found higher PFOAs in dental floss users, but, counterintuitively, lower levels of other PFAS. (NHANES is a massive survey that forms the basis for many observational health studies.)

Flossing with Oral-B Glide, having stain-resistant carpet or furniture, and living in a city served by a PFAS-contaminated water supply were also associated with higher levels of some PFASs.

These are not great studies, and there is nothing all that concerning in these reports, but since floss is something that can directly contact blood, it still seems worth addressing. Just last month, Consumer Reports reviewed PFAS-free flosses. Of those recommended, TreeBird seems to have the best reviews. I have been using a very similar product, Bambo Earth, and it works ok. However, it is quite thin and breaks easily.

Flosses tested for the presence of fluorine

Air filters

Despite the increase in awareness of air filtration due to recent forest fires (at least in the US), I think air pollution is still underappreciated as a cause of disease. In a large study published in NEJM in 2017, they found a very strong correlation between PM2.5 and all-cause mortality.

Increases of 10 μg per cubic meter in PM2.5 [was] associated with increases in all-cause mortality of 7.3%

Although this is an observational study, it was large and well-controlled. There are many cities in the USA with an average PM2.5 of 10μg/m³ or greater, and in India, some cities can even average >100μg/m³. Within cities there is also a lot of variation, with houses closer to freeways having much greater PM2.5 levels. The air filtration company IQAir publishes an interesting "most polluted cities" leaderboard.
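To get a feel for what "7.3% per 10 μg/m³" means for other differences in PM2.5, here is a minimal sketch that scales the quoted figure. It assumes the association compounds log-linearly (a hazard ratio of 1.073 for every 10 μg/m³), which is a modelling assumption on my part and not something the quote states. Extrapolating to very large differences, such as the >100 μg/m³ cities mentioned above, goes well beyond what a single coefficient can support.

```python
HAZARD_RATIO_PER_10 = 1.073  # 7.3% increase per 10 ug/m^3, from the quote above


def mortality_increase_pct(delta_pm25: float) -> float:
    """Percent increase in all-cause mortality for a PM2.5 difference.

    ASSUMPTION: the hazard ratio compounds log-linearly with exposure.
    """
    hazard_ratio = HAZARD_RATIO_PER_10 ** (delta_pm25 / 10.0)
    return (hazard_ratio - 1.0) * 100.0


# Illustrative differences in annual average PM2.5 (ug/m^3).
for delta in (5, 10, 20):
    print(f"+{delta} ug/m^3 -> +{mortality_increase_pct(delta):.1f}% all-cause mortality")
```

The same function works in reverse for filtration: pass a negative delta to estimate the change associated with lowering your average exposure, with the same caveats.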

IQAir's most polluted cities in the US


The content of the pollution matters. An estimated 30% of Bay Area pollution blows in from China, including a lot of lead from coal-fired power plants.

Thankfully, there are plenty of good air filters around these days, and the technology is very standard: essentially an air filter is just a fan and a HEPA filter (the exception is Molekule, which doesn't work well). We have BlueAir and use cheaper third-party filters. I have tested a few of these third-party filters with a particle counter, and they all appear to work fine. Coway and Winix also work fine and are cheaper than BlueAir. Winix even has its own air quality sensor built in, which is pretty amazing for a $180 device. To really go all out, IQAir will sell you a whole-house HEPA filter for around $3000 (many houses already have air filters, but these are usually around MERV 13, where HEPA is closer to MERV 17).

MERV vs HEPA ratings


Although HEPA filters are most important because they catch PM2.5 particles, you can also buy activated carbon filters. Activated carbon captures VOCs like benzene (which a gas range will emit, even when off!) Wildfires are also a major source of VOC pollution. Combination HEPA–activated carbon filters are easy to buy for BlueAir, Coway, and Winix, so I think it's worthwhile.

Nose sprays

Airborne infectious diseases mostly enter through the nose and mouth. Generally, breathing through the nose is thought to be healthier, since the nose has more active barriers than the mouth, including a mucous layer and increased nitric oxide concentration (an antimicrobial). The recent fitness influencer trend of taping your mouth shut at night to improve sleep quality and reduce sleep apnea may also prevent infection.

Studies indicate that NO may also help to reduce respiratory tract infection by inactivating viruses and inhibiting their replication in epithelial cells.

Studies indicate that individuals who snore or breathe through the mouth during sleep—conditions which are highly prevalent in males—are more likely to develop respiratory tract infections. Our anecdotal observations also suggest that favoring nasal breathing during sleep by sealing the mouth with adhesive tape reduces common colds. This phenomenon may be due to the filtration and humidifying effects of the nose on inhaled air and to increased NO levels in the airways, which may decrease viral load during sleep and allow the immune system more time to mount an effective antiviral response.

You can also potentially prevent even more infections by augmenting the nose's defenses. There are at least two products that claim to do this:

Carrageenan

Luca V-Defense is a Korean nose spray that deposits a layer of a type of carrageenan on the inside of the nose. As a common food thickener, carrageenan seems to be quite safe. There are several other carrageenan-based sprays on Amazon, like Betadine. Based on the ingredients I don't think there is too much to choose between them.

Strangely, even though the putative mechanism of action is prevention, the trials I found are for treatment post-infection (e.g., for the common cold, influenza, SARS-CoV-2). This is probably because real prevention trials would have to be extremely large and expensive.

The biological effect of carrageenan appears to prevent the virus from binding to cell surfaces or penetrating the cells

The antiviral effect of iota-carrageenan was previously shown in vitro for several other viruses, e.g., influenza A virus, human rhinovirus, endemic human coronaviruses and herpes simplex virus 2

None of the claims or evidence here is amazingly strong, but a physical barrier is a pretty simple and believable mechanism of action, especially for prevention, so I am inclined to believe some efficacy. Also, a common mechanism for infection is simply dried out mucous membranes, especially on airplanes, so carrageenan could also help with that.

Low indoor [relative humidity], as experienced during winter months or inside airplanes, directly or indirectly influences several mechanisms that increase the transmission of respiratory diseases.

Nitric oxide

Enovid (aka VirX) is a nitric oxide spray that is more expensive, more difficult to buy, and arguably less safe than carrageenan. You cannot buy it on Amazon for a reasonable price. Enovid has been tested in a Phase III trial in India for SARS-CoV-2:

Secondary endpoint assessments demonstrated a greater proportion of patients receiving [NO nasal spray] (82.8%) cleared SARS-CoV-2 (RT-PCR negative) by [end of treatment] compared to placebo (66.7%, p = 0.046), with no virus RNA detected a median of four days earlier compared to placebo (three vs seven days; p = 0.044).

The results for treatment are pretty good, if not amazing. Like carrageenan, it is possible that the NO effect is more relevant for prevention than treatment. It's honestly unclear to me why it helps at all post-infection.
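To put "pretty good, if not amazing" into numbers, the sketch below turns the two clearance proportions from the quote into an absolute difference, a relative ratio, and a number needed to treat (NNT, defined as 1 divided by the absolute difference). It uses only the two quoted percentages; group sizes are not given in the excerpt, so no confidence intervals are attempted.

```python
def absolute_difference(p_treated: float, p_control: float) -> float:
    """Difference in the proportion of patients who cleared the virus."""
    return p_treated - p_control


def relative_ratio(p_treated: float, p_control: float) -> float:
    """Ratio of clearance proportions (treated / control)."""
    return p_treated / p_control


def number_needed_to_treat(p_treated: float, p_control: float) -> float:
    """Patients treated per one additional clearance: 1 / absolute difference."""
    return 1.0 / absolute_difference(p_treated, p_control)


# Proportions RT-PCR negative by end of treatment, from the quote above.
cleared_no_spray = 0.828
cleared_placebo = 0.667

print(f"absolute difference: {absolute_difference(cleared_no_spray, cleared_placebo):.3f}")
print(f"relative ratio:      {relative_ratio(cleared_no_spray, cleared_placebo):.2f}")
print(f"NNT:                 {number_needed_to_treat(cleared_no_spray, cleared_placebo):.1f}")
```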

I use both of these products sometimes. In particular, for airplane trips, I generally try to use the Luca spray beforehand, a mask (3M Aura) on the plane (which also helps with the dry air), and maybe Enovid at some point after. Airplanes are a major source of infections, and anecdotally, this protocol helps.

All told, I think the carrageenan sprays are probably a better bet than the NO sprays. They are cheap, easy to buy, have decent evidence for being protective against all airborne infection, and seem very safe.

Gas and induction stoves

Gas stoves produce a lot of noxious fumes, including NO₂, benzene, and CO. Like asbestos, lead, and other once-common toxins, the ubiquity of gas stoves may hide how unhealthy they actually are. By one estimate, gas stoves cause 12% of childhood asthma cases in the US, though honestly this is a weak observational study and the claim is heavily disputed (12% is a number so high that it would require extraordinary evidence).

This increased exposure likely causes ~50,000 cases of current pediatric asthma from long-term NO₂ exposure alone. Short-term NO₂ exposure from typical gas stove use frequently exceeds both World Health Organization and U.S. Environmental Protection Agency benchmarks.

Even with the kitchen extractor hood turned on, NO₂ levels exceed health guidelines


As mentioned above, activated carbon filters can remove some of these VOC pollutants, while regular HEPA filters cannot. Indoor plants may also help a little bit!

Some plants, like snake plants, remove a small amount of VOCs from the air


Venting with an extractor fan helps a lot, but the only real solution is replacing gas with electric, which these days means induction. Consumer Reports recommends this LG Induction Stove ($2800) or the cheaper IKEA TVARSAKER stove ($1400). There are also new lithium battery-augmented stoves like the Copper Charlie and Impulse Labs (cooktop only).

These induction stoves are admittedly pretty expensive, but as a stopgap, the Wirecutter-recommended Duxtop portable induction cooktop is only around $120, and some alternatives are half as much.

Electric kettles

Microplastics and other leached chemicals appear when you add boiling water to plastic. Since I drink a lot of tea, I spent quite a bit of time looking for a kettle with no plastic parts exposed to water. The more I read the more it seemed like every kettle has some small amount of plastic somewhere, so I was not totally satisfied, but I ended up buying a Cosori gooseneck kettle, and I like it. This recent roundup from implasticfree.com recommends the Cosori, plus several other options if gooseneck does not suit.

Tea bags

Tea bags, especially the plasticky-looking ones, can also contain microplastics. We buy Republic of Tea bags, which are made from wood pulp, and sell in bulk bags of 250. The company claims no plastics are used, though this reddit thread throws some doubt on that. Apparently, "plant-based bioplastics" are commonly used to seal tea bags, and there's no doubt many companies will refer to this as "plastic free". implasticfree.com also has a list of approved tea bags. Loose leaf tea is probably safest given the confusion here, but it's too inconvenient for me.

AeroPress

I have been using an AeroPress for many years, and it always bugged me that it's plastic. As plastics go, polypropylene seems like a good one, but it's still not ideal. There have been handmade metal-and-glass versions available online for a few years, but I was never convinced that with welding etc. I wasn't trading plastic contamination for metal contamination. This year, AeroPress finally released a stainless steel and borosilicate glass version, which I bought. It costs $150 and ships in May 2025.

The new glass and stainless steel AeroPress is $150

Travel mugs

In 2024, the giant Stanley travel mugs hydration enthusiasts were buying made the news when they were found to contain lead. Since the lead is totally separated from the interior of the cup, the risk turned out to be probably pretty negligible. Separately, Lead Safe Mama, a blogger who tests household items for lead and other contaminants, investigated and noted that the interior is only recommended for use with water, which is kind of weird.

The HydroFlask equivalent is the same size as the Stanley cup and is all stainless steel, so it's a pretty straightforward swap-in. I bought a HydroFlask, and also ended up swapping the hard plastic part of the HydroFlask straw for a silicone straw since it comes in contact with hot liquid.

The HydroFlask straw has two separate parts: silicone on top and hard plastic on the bottom

Water filtration

There are two main ways to purify water: activated carbon filters and reverse osmosis. At one point I looked into Berkey water filters, as the only stainless steel activated carbon filter I could find. I decided against Berkey because it lacks certification and leached aluminum in at least one test.

We got PUR filters, which are similar to Brita, but with some minor performance improvements (when I checked 5+ years ago). I tested our PUR-filtered water with a GeneQuant, and in that test it removed a lot of impurities! The good news is that activated carbon filters work well. The bad news is they are annoying: they need to be filled manually, changed regularly, they slow down over time, and everything is plastic.

In retrospect, reverse osmosis filters are probably a better bet. Because they sit out of sight, they need deliberate maintenance to avoid bacterial and mold growth, and since reverse osmosis removes minerals, they need a separate remineralization filter for the water to taste ok. I have yet to install a reverse osmosis filter, so I don't have any particular recommendations.

The best way to test your water is probably TapScore but it's quite expensive at $200-300.

KitchenAid attachments

Many people own the classic KitchenAid stand mixer. In 2014, Lead Safe Mama found very high levels of lead in the standard KitchenAid attachments. There is a long backstory here and KitchenAid disagrees with the conclusion, but I think overall it was convincing that there was a concerning amount of lead.

The KitchenAid stand mixer and stainless steel attachments


In 2018, KitchenAid started selling stainless steel attachments in response to this finding. We own these and they work great. We had the standard enamel-coated attachments before, and these actually chipped a little bit due to misalignment, so independent of the lead problem, this is a big improvement. If you own this mixer, this is an upgrade I would highly recommend.

Food containers

Most paper and cardboard that touches food is lined with something. It's a pretty complicated zoo of plasticky materials, and I don't understand it very well. Some things are lined with PFAS (e.g., the paper in burger or sandwich wrappers, at least up until recently), some are lined with wax (e.g., the dry wax paper under pizza), some with silicone (e.g., parchment paper), some with polyethylene (e.g., paper coffee cups, butcher's paper).

There are plenty of exceptions to the above, and it's difficult to figure out exactly what is lining what, but I think it's pretty safe to say almost all paper and cardboard that could get soggy has some kind of lining. To my mind, dry wax and parchment paper are probably ok. The other random plastics are things I'd prefer to avoid, given the choice.

Since nothing is really labeled, I am pretty suspicious that there is more PFAS here than advertised, in the same way that BPA-free essentially means "contains something a lot like BPA", e.g., BPS.

The real issue is that the industry is replacing a toxic chemical with another, yet untested chemical, which will require large investments of research funds to carry out applicable studies

Despite the recent phase-outs of PFAS, Consumer Reports did a review in 2022 and found a ton of PFAS.

Identifying the exact type of PFAS in a product is complex: There are more than 9,000 known PFAS, yet common testing methods can identify only a couple dozen.

CR tested multiple samples of 118 products and calculated average organic fluorine levels for each. Overall, CR detected that element in more than half the food packaging tested. Almost a third—37 products—had organic fluorine levels above 20 ppm, and 22 were above 100 ppm.

In a 2022 paper, they found "trillions of sub-100 nm nanoparticles" being released into water from plastic-lined single-use coffee cups.

On a particle number density basis, particles released into water from a single 300 mL hot beverage cup equate to one particle for every seven cells in the human body in a size range available for cellular uptake.

Plastic-lined paper cups leach plastics
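As a sanity check that "one particle for every seven cells" and "trillions" of particles describe the same quantity, here is a back-of-the-envelope sketch. The cell count is an assumption: roughly 37 trillion cells is a commonly cited estimate for an adult human, but that figure does not appear in the excerpt quoted above.

```python
# ASSUMPTION: ~37 trillion cells in an adult human body, a commonly cited
# estimate that is not given in the post or in the quoted excerpt.
CELLS_IN_BODY = 3.7e13

CELLS_PER_PARTICLE = 7  # "one particle for every seven cells"
CUP_VOLUME_ML = 300     # "a single 300 mL hot beverage cup"

particles_per_cup = CELLS_IN_BODY / CELLS_PER_PARTICLE
particles_per_ml = particles_per_cup / CUP_VOLUME_ML

print(f"{particles_per_cup:.1e} particles per cup")
print(f"{particles_per_ml:.1e} particles per mL")
```

Under that assumption a single cup works out to a few trillion particles, which matches the "trillions of sub-100 nm nanoparticles" wording.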

SweetGreen

One interesting example is the "fiber bowls" you get from fancier fast food like SweetGreen. Fiber bowls are those brown bowls that look like recycled cardboard, and you'd be forgiven for thinking they are healthier than the classic plastic or styrofoam containers. They used to be lined with PFAS, but in 2020 SweetGreen partnered with FootPrint International to remove PFAS from their bowls.

There is not much information online, but FootPrint have several patents, for example:

Methods and apparatus for vacuum forming and subsequently applying topical coatings fiber-based food containers. The slurry includes an embedded moisture barrier and/or vapor barrier, and the topical coating comprises an oil barrier comprising acrylate, rice bran wax, pectin, and pea protein.

As far as I can tell, acrylate (also listed as acrylic in the patent) is safer than PFAS, since it's harder and so less prone to leach. However, though it might technically not be "plastic" for some reason, it's not far off.

SweetGreen also scored very high for plastic contamination on plasticlist.org, coming in at 4th highest for DEHP out of 300 foods tested. My understanding is that this is more likely to come from the chicken than the bowl.

The plasticlist.org top 7 by DEHP content

Mass spectrometry

One common theme here is that it's really hard to tell what products are actually made from. In most cases the information does not exist, and the manufacturers have little incentive to say anything. If the product is manufactured in China, they may not even know for sure. For example, is there "plastic" in SweetGreen bowls and Republic of Tea tea bags, or just something very similar to plastic? I still don't know.

There is a technology that can detect the constituent molecules of almost anything: mass spectrometry. I really wish someone with a mass spec could review products and tell me what they are made from. Unfortunately, mass specs are pretty expensive ($100k-$1M) and finicky to keep running, and the data analysis is not all that straightforward (as evidenced by the recent study claiming massive amounts of microplastics in the brain).

A rebuttal to the study claiming very large volume of microplastics in the brain

Amazon

Even though the majority of the links above are from Amazon, and I buy from there a lot, in general I prefer not to. It's fairly common for products on Amazon to be counterfeit, mainly because random third-party sellers get mixed in with the official stores. Supplements may be especially prone to counterfeiting, since they are so easy to fake. If you are buying products specifically to avoid contaminants, then the rampant counterfeiting might defeat the purpose. Target, Costco, etc. do not appear to have this problem.

tl;dr

Recommended product list:

Thanks to Darren Zhu for comments on this post!

Brian Naughton | Mon 30 December 2024 | ai | ai biotech proteindesign

This article is a deeper look at Adaptyv's binder design competition, and some thoughts on what we learned. If you are unfamiliar with the competition, there is background information on the Adaptyv blog and my previous article.

The data

Adaptyv did a really nice job of packaging up the data from the competition (both round 1 and round 2). They also did a comprehensive analysis of which metrics predicted successful binding in this blogpost.

The data from round 2 is more comprehensive than round 1 — it even includes Alphafolded structures — so I downloaded the round 2 csv and did some analysis.

Regressions

Unlike the Adaptyv blogpost, which does a deep dive on each metric in turn, I just wanted to see how well I could predict binding affinity (Kd) using the following features provided in the csv: pae_interaction, esm_pll, iptm, plddt, design_models (converted to one-hot), seq_len (inferred from sequence). Three of these metrics (pae_interaction, esm_pll, iptm) were used to determine each entry's rank in the competition's virtual leaderboard, which was used to prioritize entries going into the binding assay.
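As a rough sketch of what this feature preparation could look like, the snippet below one-hot encodes design_models and infers seq_len from the sequence. It assumes each csv row has been read into a dict with the column names listed above plus a `sequence` field; the two example rows and their values are made up for illustration, and the actual script may do this differently.

```python
NUMERIC = ["pae_interaction", "esm_pll", "iptm", "plddt"]

def build_features(rows):
    """Return (feature_names, matrix) from a list of row dicts.

    design_models is one-hot encoded (one column per distinct value) and
    seq_len is the length of the sequence string.
    """
    categories = sorted({r["design_models"] for r in rows})
    names = NUMERIC + ["seq_len"] + [f"design_models={c}" for c in categories]
    matrix = []
    for r in rows:
        vec = [float(r[k]) for k in NUMERIC]
        vec.append(float(len(r["sequence"])))
        vec.extend(1.0 if r["design_models"] == c else 0.0 for c in categories)
        matrix.append(vec)
    return names, matrix

# Hypothetical rows, not taken from the Adaptyv csv
rows = [
    {"pae_interaction": 8.1, "esm_pll": -1.9, "iptm": 0.82, "plddt": 88.0,
     "design_models": "BindCraft", "sequence": "MKTAYIAK"},
    {"pae_interaction": 14.3, "esm_pll": -2.4, "iptm": 0.61, "plddt": 79.5,
     "design_models": "RFdiffusion", "sequence": "GSHMLE"},
]
names, X = build_features(rows)
```

Because the design type becomes a set of indicator columns, tree-based models can learn different rules for different design types.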

I also added one more feature, prodigy_kd, which I generated from the PDB files provided using prodigy. Prodigy is an old-ish tool for predicting binding affinity that identifies all the major contacts (polar–polar, charged–charged, etc.) and reports a predicted Kd (prodigy_kd).
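For orientation, a Kd can be converted to a binding free energy with the textbook relation ΔG = RT ln(Kd). The sketch below is just that formula at 25°C, not code from Prodigy; the function names are my own.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def kd_to_delta_g(kd_molar, temp_c=25.0):
    """Binding free energy (kcal/mol) from a dissociation constant in M."""
    return R_KCAL * (temp_c + 273.15) * math.log(kd_molar)

def delta_g_to_kd(delta_g, temp_c=25.0):
    """Dissociation constant (M) from a binding free energy in kcal/mol."""
    return math.exp(delta_g / (R_KCAL * (temp_c + 273.15)))

# A 1 nM binder corresponds to roughly -12.3 kcal/mol at 25 C
dg_1nm = kd_to_delta_g(1e-9)
```

Each 10x change in Kd shifts ΔG by about 1.36 kcal/mol at this temperature, which is why errors are usually discussed in log units.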

I used the typical regression tools: Random Forest, Kaggle favorite XGBoost, SVR, linear regression, as well as just using the mean Kd as a baseline. There is not a ton of data here for cross-validation, especially if you split by submitter, which I think is fairest. If you do not split by submitter, then you can end up with very similar proteins in different folds.

# get data and script
git clone https://github.com/adaptyvbio/egfr_competition_2
cd egfr_competition_2/results
wget https://gist.githubusercontent.com/hgbrian/1262066e680fc82dcb98e60449899ff9/raw/regress_adaptyv_round_2.py
# run prodigy on all pdbs, munge into a tsv
find structure_predictions -name "*.pdb" | xargs -I{} uv run --with prodigy-prot prodigy {} > prodigy_kds.txt
(echo -e "name\tprodigy_kd"; rg "Read.+\.pdb|25.0˚C" prodigy_kds.txt | sed 's/.*\///' | sed 's/.*25.0˚C:  //' | paste - - | sed 's/\.pdb//') > prodigy_kds.tsv
# run regressions
uv run --with scikit-learn --with polars --with matplotlib --with seaborn --with pyarrow --with xgboost regress_adaptyv_round_2.py
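The split-by-submitter idea can be sketched in a few lines of standard-library Python. This is an illustration of grouped cross-validation in general, not an excerpt from the script above; scikit-learn's GroupKFold provides the same kind of split. The submitter names are hypothetical.

```python
from collections import defaultdict

def group_k_fold(groups, n_splits=5):
    """Yield (train_idx, test_idx) pairs in which no group spans both sides."""
    members = defaultdict(list)
    for i, g in enumerate(groups):
        members[g].append(i)
    folds = [[] for _ in range(n_splits)]
    # Greedy balancing: biggest groups first, each into the smallest fold so far
    for g in sorted(members, key=lambda g: (-len(members[g]), str(g))):
        smallest = min(range(n_splits), key=lambda k: len(folds[k]))
        folds[smallest].extend(members[g])
    for k in range(n_splits):
        test = sorted(folds[k])
        held_out = set(test)
        train = [i for i in range(len(groups)) if i not in held_out]
        yield train, test

# Hypothetical submitters, one label per design
submitters = ["alice", "alice", "bob", "carol", "carol", "carol", "dave"]
splits = list(group_k_fold(submitters, n_splits=3))
```

Every design from a given submitter lands in the same fold, so near-identical proteins from one person cannot leak between train and test.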

The results are not great! There are a few ways to slice the data (including replicates or not; including similarity_check or not; including non-binders or not). There is a little signal, but I think it's fair to say nothing was strongly predictive.


Model | R² | RMSE (log units) | Median Fold Error
Linear Regression | 0.150 | 0.729 | 1.8x
Random Forest Regression | 0.188 | 0.712 | 1.4x
SVM Regression | 0.022 | 0.781 | 1.2x
XGBoost | 0.061 | 0.766 | 1.2x
Mean Kd only | -0.009 | 0.794 | 1.9x

XGBoost performance looks ok here but is not much more predictive than just taking the mean Kd

Surprisingly, no one feature dominates in terms of predictive power
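For readers unfamiliar with these error measures, here is one plausible way to compute an RMSE in log units and a median fold error for Kd predictions. The definitions below (errors taken on log10 Kd, fold error as 10 to the median absolute log error) are my assumptions; the script may define them differently.

```python
import math
from statistics import median

def log10_errors(true_kd, pred_kd):
    """Per-sample error in log10 units (predicted minus measured)."""
    return [math.log10(p) - math.log10(t) for t, p in zip(true_kd, pred_kd)]

def rmse_log10(true_kd, pred_kd):
    errs = log10_errors(true_kd, pred_kd)
    return math.sqrt(sum(e * e for e in errs) / len(errs))

def median_fold_error(true_kd, pred_kd):
    """Median multiplicative error; 1.0 means a perfect prediction."""
    return 10 ** median(abs(e) for e in log10_errors(true_kd, pred_kd))

# Made-up Kd values in M: every prediction is exactly 10x too weak
measured = [1e-9, 5e-8, 2e-7]
predicted = [1e-8, 5e-7, 2e-6]
rmse = rmse_log10(measured, predicted)
fold = median_fold_error(measured, predicted)
```

Under these definitions an RMSE of about 0.7 to 0.8 log units corresponds to a typical error of roughly 5x to 6x in Kd.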

Virtual leaderboard rank vs competition rank

If there really is no predictive power in these computational metrics, there should be no correlation between rank in the virtual leaderboard and rank in the competition. In fact, there is a weak but significant correlation (Spearman correlation ~= 0.2). However, if you constrain to the top 200 (of 400 total), there is no correlation. My interpretation is that these metrics can discriminate no-hope-of-binding from some-hope-of-binding, but not more than that.
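The rank comparison can be sketched as follows: Spearman correlation is the Pearson correlation of the ranks, and the top-N check simply recomputes it on a subset. This is a standard-library illustration, and it assumes "top 200" means the best 200 entries by virtual leaderboard rank.

```python
import math

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def spearman(x, y):
    return pearson(ranks(x), ranks(y))

def spearman_top_n(virtual_rank, competition_rank, n):
    """Spearman correlation restricted to the n best virtual ranks."""
    keep = sorted(range(len(virtual_rank)), key=lambda i: virtual_rank[i])[:n]
    return spearman([virtual_rank[i] for i in keep],
                    [competition_rank[i] for i in keep])
```

Restricting to the top of the leaderboard shrinks the range of the predictor, so a weak overall correlation can easily vanish in the subset.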

It may be too much to ask one set of metrics to work for antibodies (poor PLL, poor PAE?), de novo binders (poor PLL), and EGF/TNFa-derived binders (natural, so excellent PLL). However, since I include design_models as a covariate, the regression models above can use different strategies for different design types, so at the very least we know there is not a trivial separation that can be made.

BindCraft's scoring heuristics

So how can BindCraft work if it's mostly using these same metrics as heuristics? I asked this on twitter and got an interesting response.

It is possible that PyRosetta's InterfaceAnalyzer is adding a lot of information. However, if this were the case, you might expect Prodigy's Kd prediction to also help, which it does not. It is also possible that by using AlphaFold2, the structures produced by BindCraft are inherently biased towards natural binding modes. In that case, part of the binding heuristics would be implicit in the weights of the model.

What did we learn?

I learned a couple of things:

  • Some tools, specifically BindCraft, can consistently generate decent binders, at least against targets and binding pockets present in its training set (PDB). (The BindCraft paper also shows success with at least one de novo protein not present in the PDB.)
  • We do not have a way to predict if a given protein will bind a given target.

I think this is pretty interesting, and a bit counterintuitive. More evidence that we cannot predict binding comes from the Dickinson lab's Prediction Challenges, where the goal is to match the binder to the target. Apparently no approach can (yet).

The Adaptyv blogpost ends by stating that binder design has not been solved yet. This is clearly true. So what comes next?

  • We could find computational metrics that work, based on the current sequence and structure data. For example, BindCraft includes "number of unsatisfied hydrogen bonds at the interface" in its heuristics. I am skeptical that we can do a lot better with this approach. For one thing, Adaptyv has already iterated once on its ranking metrics, with negligible improvement in prediction.
  • We could get better at Molecular Dynamics, which probably contains some useful information today (at exorbitant computational cost), and could soon be much better with deep learning approaches.
  • We could develop an "AlphaFold for Kd prediction". There are certainly attempts at this, e.g., ProAffinity-GNN and the PPB-Affinity dataset, to pick two recent examples, but I don't know if anything works that well. The big problem here, as with many biology problems, is a lack of data; PDBbind is not that big (currently ~2800 protein–protein affinities).

Luckily, progress in this field is bewilderingly fast so I'm sure we'll see a ton of developments in 2025. Kudos to Adaptyv for helping push things forward.

Brian Naughton | Sat 30 November 2024 | ai | ai biotech proteindesign

Alphafold 3 (AF3) came out in May 2024, and included several major advances over Alphafold 2 (AF2). In this post I will give a brief review of Alphafold 3, and compare the various open and less-open AF3-inspired models that have come out over the past six months. Finally, I will show some results from folding antibody complexes.

Alphafold 3

AF3 has many new capabilities compared to AF2: it can work with small molecules, nucleic acids, ions, and modified residues. It also has arguably a streamlined architecture compared to AF2 (pairformer instead of evoformer, no rotation invariance).

310.ai did a nice review and small benchmark of AlphaFold3 that is worth reading.

The AF3 paper shows hardly any data comparing AF3 to AF2, and focuses mainly on its new capabilities with non-amino-acid molecules. In all cases tested, it matched or exceeded the state of the art. For most regular protein folding problems, AF3 performs comparably to AF2 (more specifically, to Alphafold-Multimer (AF2-M), the AF2 revision that allowed for multiple protein chains), though for antibodies there is a jump in performance.

Still, despite being an excellent model, AF3 gets relatively little discussion. This is because the parameters are not available so nobody outside DeepMind/Isomorphic Labs really uses it. The open source AF2-M still dominates, especially when used via the amazing colabfold project.

Alphafold-alikes

As soon as AF3 was published, the race was on to reimplement the core ideas. The chronology so far:

Date | Software | Code available? | Parameters available? | Lines of Python code
2024-05 | Alphafold 3 | ❌ (CC-BY-NC-SA 4.0) | ❌ (you must request access) | 32k
2024-08 | HelixFold3 | ❌ (CC-BY-NC-SA 4.0) | ❌ (CC-BY-NC-SA 4.0) | 17k
2024-10 | Chai-1 | ❌ (Apache 2.0, inference only) | ✅ (Apache 2.0) | 10k
2024-11 | Protenix | ❌ (CC-BY-NC-SA 4.0) | ❌ (CC-BY-NC-SA 4.0) | 36k
2024-11 | Boltz | ✅ (MIT) | ✅ (MIT) | 17k

There are a few other models that are not yet of interest: Ligo's AF3 implementation is not finished and perhaps not under active development, while LucidRains' AF3 implementation is not finished but is still under active development.

It's been pretty incredible to see so many reimplementation attempts within the span of a few months, even if most are not usable due to license issues.

Code and parameter availability

As a scientist who works in industry, it's always annoying to try to figure out which tools are ok to use or not. It causes a lot of friction and wastes a lot of time. For example, I started using ChimeraX a while back, only to find out after sinking many hours into it that this was not allowed.

There are many definitions of "open" software. When I say open I really mean you can use it without checking with a lawyer. For example, even if you are in academia, if the license says the code is not free for commercial use, then what happens if you start a collaboration with someone in industry? What if you later want to commercialize? These are common occurrences.

In some cases (AF3, HelixFold3, Protenix, and Chai-1), they make a server available, which is nice for very perfunctory testing, but precludes testing anything proprietary or folding more than a few structures. If you have the code and the training set, it would cost around $100k to train one of these models (specifically, the Chai-1 and Protenix papers give numbers in this range, though that is just the final run). So in theory there is no huge blocker to retraining. In practice it does not seem to happen, perhaps for license issues.

The specific license matters. Before today, I thought MIT was just a more open Apache 2.0, but apparently there is an advantage to Apache 2.0 around patents! My non-expert conclusion is that unlicense, MIT and Apache are usable, GPL and CC-BY-NC-SA are not.

Which model to choose?

There are a few key considerations: availability; extensibility / support; performance.

1. Availability

In terms of availability, I think only Chai-1 and Boltz are in contention. The other models are not viable for any commercial work, and would only be worth considering if their capabilities were truly differentiated. As far as I know, they are not.

2. Extensibility and support

I think this one is maybe under-appreciated. If an open source project is truly open and gains enough mindshare, it can attract high quality bug reports, documentation, and improvements. Over time, this effect can compound. I think currently Boltz is the only model that can make this claim.

A big difference between Boltz and Chai-1 is that Boltz includes the training code and neural network architecture, whereas Chai-1 only includes inference code and uses pre-compiled models. I only realized this when I noticed the Chai-1 codebase is half the size of the Boltz codebase. Most users will not retrain or finetune the model, but the ability for others to improve the code is important.

To be clear, I am grateful to Chai for making their code and weights available for commercial purposes, and I intend to use the code, but from my perspective Boltz should be able to advance much quicker. There is maybe an analogy to Linux or Blender vs proprietary software.

3. Performance

It's quite hard to tell from the literature who has the edge in performance. You can squint at the graphs in each paper, but fundamentally all of these models are AF3-derivatives trained on the same data, so it's not surprising that performance is generally very similar.

Chai-1 and AF3 perform almost identically

Boltz and Chai-1 perform almost identically

Protenix and AF3 perform almost identically

Benchmarking performance

I decided to do my own mini-benchmark, by taking 10 recent (i.e., not in any training data) antibody-containing PDB entries and folding them using Boltz and Chai-1.

Both models took around 10 minutes per antibody fold on a single A100 (80GB for Boltz, 40GB for Chai-1). Chai-1 is a little faster, which is expected since it uses ESM embeddings instead of multiple sequence alignments (MSAs). (Note, I did not test Chai-1 in MSA mode, giving it a small disadvantage compared to Boltz.)

Tangentially, I was surprised I could not find a "pdb to fasta" tool that would output protein, nucleic acids, and ligands. Maybe we need a new file format? You can get protein and RNA/DNA from pdb, but it will be the complete sequence of the protein, not the sequence in the PDB file (this may or may not be what you want). Extracting ligands from PDB files is actually very painful since the necessary bond information is absent! The best code I know of to do this is a pretty buried old Pat Walters gist.

Most of the PDBs I tested were protein-only, one had RNA, and I skipped one glycoprotein. I evaluated performance using USalign, using either the average "local" subunit-by-subunit alignment (USalign -mm 1) or one "global" all-subunit alignment (USalign -mm 2). Both models do extremely well when judged on local subunit accuracy, but much worse for global accuracy; sadly this is quite relevant for an antibody model! It appears that these models understand well how antibodies fold, but not how they bind.
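A small wrapper for collecting these scores might look like the sketch below. It assumes USalign is on the PATH and prints TM-align-style lines ("TM-score= ..." and "RMSD= ..."); the sample output text is illustrative rather than copied from a real run, and only the -mm flag comes from the text above, so other flags may be needed in practice.

```python
import re
import subprocess

def usalign_command(predicted, reference, mm_mode):
    """Command line for one alignment; mm_mode is the value passed to -mm."""
    return ["USalign", "-mm", str(mm_mode), predicted, reference]

def parse_usalign(stdout):
    """Return (tm_score, rmsd) from the first matching lines of the output."""
    tm = re.search(r"TM-score=\s*([0-9.]+)", stdout)
    rmsd = re.search(r"RMSD=\s*([0-9.]+)", stdout)
    if tm is None or rmsd is None:
        raise ValueError("could not find TM-score/RMSD in USalign output")
    return float(tm.group(1)), float(rmsd.group(1))

def score(predicted, reference, mm_mode):
    """Run USalign and parse its output (requires USalign to be installed)."""
    result = subprocess.run(usalign_command(predicted, reference, mm_mode),
                            capture_output=True, text=True, check=True)
    return parse_usalign(result.stdout)

# Illustrative output in the assumed format
SAMPLE = """Aligned length= 100, RMSD=   1.58, Seq_ID=n_identical/n_aligned= 0.980
TM-score= 0.94490 (normalized by length of Structure_1: L=100, d0=3.63)
TM-score= 0.94490 (normalized by length of Structure_2: L=100, d0=3.63)
"""
tm_score, rmsd = parse_usalign(SAMPLE)
```

Only the first TM-score line is read, so the order of the two structures on the command line decides which one the score is normalized by.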

Conclusions

On my antibody benchmark, Boltz and Chai-1 perform eerily similarly, with a couple of cases where Boltz wins out. That, combined with all the data from the literature, makes the conclusion straightforward, at least for me. Boltz performs as well as or better than any of the models, has a clean, complete codebase with relatively little code, is hackable, and is by far the most open model. I am excited to see how Boltz progresses in 2025!

Technical details

I ran Boltz and Chai-1 on modal using my biomodals repo.

modal run modal_boltz.py --input-faa 8zre.fasta --run-name 8zre
modal run modal_chai1.py --input-faa 8zre.fasta --run-name 8zre

Here is a folder with all the pdb files and images shown below.

Addendum

On BlueSky, Diego del Alamo notes that Chai-1 outperformed Boltz in a head-to-head of antibody–antigen modeling.

On LinkedIn, Joshua Meier (co-founder of Chai Discovery) recommended running Chai-1 with msa_server turned on, to make for a fairer comparison. I reran the benchmark with Chai-1 using MSAs, and it showed improvements in 8ZRE (matching Boltz) and 9E6K (exceeding Boltz).

I think it is still fair to say that the results are very close.



Model | Local TM-Score | Local RMSD | Global TM-Score | Global RMSD

9CIA: T cell receptor complex
Boltz | 0.9449 | 1.5783 | 0.3928 | 6.6600
Chai-1 | 0.9411 | 1.3858 | 0.3980 | 7.3400

8ZRE: HBcAg-D4 Fab complex
Boltz | 0.9216 | 1.4688 | 0.3468 | 6.6200
Chai-1 | 0.9070 | 1.4062 | 0.2856 | 6.1000

9DF0: PDCoV S RBD bound to PD41 Fab (local refinement)
Boltz | 0.8733 | 1.1500 | 0.7020 | 2.7900
Chai-1 | 0.2957 | 2.2400 | 0.7022 | 2.7100

9CLP: Structure of ecarin from the venom of Kenyan saw-scaled viper in complex with the Fab of neutralizing antibody H11
Boltz | 0.9762 | 0.9667 | 0.6545 | 2.3700
Chai-1 | 0.9607 | 1.2233 | 0.6675 | 3.2100

9C45: SARS-CoV-2 S + S2L20 (local refinement of NTD and S2L20 Fab variable region)
Boltz | 0.9903 | 1.3600 | 0.5288 | 4.0900
Chai-1 | 0.9912 | 2.7033 | 0.5141 | 4.3500

9E6K: Fully human monoclonal antibody targeting the cysteine-rich substrate-interacting region of ADAM17 on cancer cells.
Boltz | 0.7462 | 2.4400 | 0.7732 | 4.2200
Chai-1 | 0.9676 | 1.3633 | 0.8015 | 2.7300

9CMI: Cryo-EM structure of human claudin-4 complex with Clostridium perfringens enterotoxin, sFab COP-1, and Nanobody
Boltz | 0.9307 | 2.1680 | 0.4448 | 5.6900
Chai-1 | 0.9307 | 2.3560 | 0.4464 | 4.4400

9CX3: Structure of SH3 domain of Src in complex with beta-arrestin 1
Boltz | 0.8978 | 1.3867 | 0.5045 | 2.3200
Chai-1 | 0.8916 | 1.2617 | 0.4487 | 2.6700

9DX6: Crystal structure of Plasmodium vivax (Palo Alto) PvAMA1 in complex with human Fab 826827
Boltz | 0.7870 | 3.1400 | 0.5551 | 5.6100
Chai-1 | 0.2757 | 2.3067 | 0.5861 | 4.7600

9DN4: Crystal structure of a SARS-CoV-2 20-mer RNA in complex with FAB BL3-6S97N.
Boltz | 0.9726 | 0.8500 | 0.9850 | 0.9300
Chai-1 | 0.9938 | 0.4550 | 0.9957 | 0.4900
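
To make the local-versus-global gap concrete, the TM-scores above can be averaged per model. The numbers are copied from the results listed above; the code is only a convenience sketch for summarizing them.

```python
from statistics import mean

# (PDB ID, Boltz local TM, Boltz global TM, Chai-1 local TM, Chai-1 global TM)
TM_SCORES = [
    ("9CIA", 0.9449, 0.3928, 0.9411, 0.3980),
    ("8ZRE", 0.9216, 0.3468, 0.9070, 0.2856),
    ("9DF0", 0.8733, 0.7020, 0.2957, 0.7022),
    ("9CLP", 0.9762, 0.6545, 0.9607, 0.6675),
    ("9C45", 0.9903, 0.5288, 0.9912, 0.5141),
    ("9E6K", 0.7462, 0.7732, 0.9676, 0.8015),
    ("9CMI", 0.9307, 0.4448, 0.9307, 0.4464),
    ("9CX3", 0.8978, 0.5045, 0.8916, 0.4487),
    ("9DX6", 0.7870, 0.5551, 0.2757, 0.5861),
    ("9DN4", 0.9726, 0.9850, 0.9938, 0.9957),
]

summary = {
    "Boltz": {"local": mean(r[1] for r in TM_SCORES),
              "global": mean(r[2] for r in TM_SCORES)},
    "Chai-1": {"local": mean(r[3] for r in TM_SCORES),
               "global": mean(r[4] for r in TM_SCORES)},
}

for model, scores in summary.items():
    print(f"{model}: local {scores['local']:.3f}, global {scores['global']:.3f}")
```

Both models average just under 0.59 on global TM-score, well below their local averages.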
Brian Naughton | Sat 07 September 2024 | biotech | biotech ai llm

Some notes on the Adaptyv binder design competition


A simulation of evolution and predator–prey dynamics


Using LLMs to search PubMed and summarize information on longevity drugs.

Brian Naughton | Sun 14 January 2024 | datascience | datascience ai llm

Using LLMs to search pubmed and summarize information.


An example and examination of using modal for bioinformatics

Brian Naughton | Mon 04 September 2023 | biotech | biotech machine learning ai

Molecular dynamics code for protein–ligand interactions


Using colab to chain computational drug design tools

