Brian Naughton | Tue 06 January 2015 | data | biobank

Over the past couple of months, I've formally applied to or contacted several biobanks to inquire about access to data and samples.

Kaiser RPGEH biobank

Kaiser's biobank is large, at 200k samples, but only 20k are blood samples; the rest are saliva. Eventually, they plan to reach 500k, but I assume most of these will be saliva too. Kaiser has a well-structured application process designed for collaborations. The one downside is that it's pretty long and complex, as the flowchart below shows. I submitted a "pre-application", which is a three- or four-page form, including a few thousand words describing the project. I now have to wait for Kaiser to match me with a researcher on their side.

[Figure: flowchart of the Kaiser application process]

Vanderbilt BioVU

Vanderbilt's biobank is also large at 180k samples and, since it's tied to Vanderbilt's EMR, the phenotype data is rich. However, to do any research with BioVU, you need to first identify a collaborator at Vanderbilt. It seems like one of the major goals of the biobank is to generate collaborations. BioVU is part of the eMERGE network, a group of biobanks that include EMR data and genotype data.

Mount Sinai BioME

Mount Sinai has 30k samples and, like Vanderbilt, it's part of the eMERGE network. They have a portal called BioSERVE, but it is down at the time of writing. As with Vanderbilt, you really need a collaborator to help you navigate this one.

Estonian Biobank

The Estonian Biobank has 50k samples and the population is skewed quite old (perhaps 10% over 80). Unfortunately, only 15k of their samples are genotyped. The Estonians have a refreshingly straightforward application form and process for data access.

[Figure: data]

China Kadoorie Biobank

The China Kadoorie Biobank (CKB) is one of several international biobanks that are administered from Oxford. The biobank holds 500k samples, and they had BGI genotype 100k of them with a 384-SNP panel. The content of the panel is sadly unspecified. The CKB application process is pretty straightforward and they have a lot of projects underway. The recommended first step with CKB is an informal inquiry by email, so that's what I did.

UK Biobank

The UK Biobank is large at 500k samples, and ostensibly open to biotech collaborations. It's also possibly the best phenotyped, although I believe no EMR data is included. However, UK Biobank has yet to genotype its samples, and the oldest participants are only 69, an odd restriction on an otherwise amazing dataset and the major reason I have not applied here.

Other biobanks

There are a number of other biobanks out there, with varying degrees of obfuscation of what's actually in the bank and how to get access. I think this will improve over time, but meanwhile, I'd love to figure out which biobanks are really open and which are not.

