1000 genotypings, and 3 years of openSNP!

In 2011, someone asked me how many genotypes I personally would expect users to upload, and if I remember correctly, I said 30. That was quite the understatement: Just a few days ago on the 30th of May, openSNP received its 1000th genotyping!

On this happy occasion we thank the users and participants for their trust in the project and their continued support and interest.

Since 2011, people have used openSNP in research, art, and their own projects, have written additional software to interact with openSNP’s API, written great comments, and much more. We published a paper on openSNP a few months ago, which is, for most of us, the first publication of our careers.

openSNP has come a long way. Here are the first three commits from June 2011, with Basti’s oldest at the bottom:


That was almost exactly 3 years ago! Since then we’ve (among other things) learned to write proper commit messages.

So what does the future hold for openSNP?

  1. A better server: We recently received a grant from Bayer HealthCare that lets us move to bigger servers, so the site should load and respond much faster. Maybe we can even start hosting bigger datasets?
  2. Pre-given phenotypings: One of the biggest problems with openSNP’s data is the large amount of variation in the phenotypes entered by users: researchers who want to work with the data still have some manual cleaning to do. We’ve prepared a set of phenotypes for which users can only choose their variation from a fixed list; this should greatly reduce the time it takes researchers to start working on the data.
  3. Faster parsing: We’ve replaced the Ruby-based genotyping parser with a 99% complete implementation written in Go. So far, it’s much, much faster, but still only marginally tested.
  4. A variety of smaller things: A stats-page, bug-fixes, genosets, etc. – have a look at the Issues page here.
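
To illustrate why free-text phenotypes need cleaning before they can be analyzed (point 2 above), here is a minimal Python sketch. The example answers, the list of allowed choices, and the `normalize` helper are all made up for illustration; none of this is taken from openSNP’s actual data or code.

```python
from collections import Counter

# Hypothetical free-text answers for one phenotype (eye color).
# These values are invented for illustration only.
free_text_answers = ["Blue", "blue ", "BLUE", "blue-grey", "Brown", "brown", "braun"]

# Hypothetical fixed list of choices that a pre-given phenotype could offer.
ALLOWED_EYE_COLORS = ("blue", "blue-grey", "brown", "green")


def normalize(answer):
    """Strip whitespace and lowercase; return None if the result is not an allowed choice."""
    cleaned = answer.strip().lower()
    return cleaned if cleaned in ALLOWED_EYE_COLORS else None


# Counting the raw answers treats every spelling as its own category.
raw_counts = Counter(free_text_answers)

# After normalization, spellings collapse; unmatched answers are counted under None.
clean_counts = Counter(normalize(a) for a in free_text_answers)

print(len(raw_counts), "distinct raw answers")
print(dict(clean_counts))
```

Even this simple cleanup leaves one answer (“braun”) that a researcher would have to resolve by hand. With a fixed list of choices, that step should not be needed at all.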

We thank you for your continued support and interest, and here’s to many more years! If you know of any other project that uses openSNP data, feel free to post it in the comments!

11 thoughts on “1000 genotypings, and 3 years of openSNP!”

  1. Anaro Lexnon says:

    Congratulations, thanks for all the hard work – and for rounding up these projects, hadn’t heard of most of them, very cool.

    I hope the ‘pre-given’ phenotypes will be in addition to, rather than replacing, the ‘Lanierian’ freedom we have now though; it’s one of the things that excites me about the platform.

    Be interesting to see how long until the next 1000…

  2. Anastasius says:

    In order to make any sense of it, you should definitely think about a more structured questionnaire à la 23andme for the environmental effects. Also think about pre-existing answer tags like on askubuntu.com, which can only be created by older users. Without consistency this project is no help.

  3. Anastasius says:

    Are there definitive cancer markers that can be found on the level of a 23andme-dna-file?

    • phi1ipp says:


      SNPedia has an overview of the SNPs linked to mutations in the BRCA1/BRCA2 genes, which are linked to breast cancer. http://www.snpedia.com/index.php/BRCA1

      You may remember when Angelina Jolie was in the news for having a double mastectomy: she had a family history and carries some risk alleles: http://www.nytimes.com/2013/05/14/opinion/my-medical-choice.html?_r=0

      At least some of these SNPs are on the 23andme array:
      https://opensnp.org/snps/rs28897696 – all users carrying the “no risk” allele
      https://opensnp.org/snps/rs55770810 – all carrying the “no risk” allele
      https://opensnp.org/snps/rs1799950 – a few carrying the “higher risk” allele
      and many more.

      Interestingly, all of these are on the opposite strand compared to the “known” risk alleles. That’s just from the SNP array, we can’t fix that.

      • Anastasius says:

        But if I understand correctly, that’s still just a correlation with a tendency/risk of getting a certain type of cancer, relative to the average population. Is there any way to tell with high certainty from your DNA that a certain type of cancer is already present?

      • phi1ipp says:

        Aah I see – as far as I know, to find out whether you are actually carrying cancer right now you’d have to, by chance, sequence an actual cancer cell (as in for example http://www.ncbi.nlm.nih.gov/pubmed/17418407 )
        That’s very hard to do, especially since the usual genotyping kits like 23andme just use oral swabs, and the chances of randomly picking up a few cancer cells there are slim. Furthermore, there are always cancer cells present in your body; they’re just being destroyed by your body. Any sequencing kit might pick up these cells, and you’d get false positive signals.

      • Anastasius says:

        I see. I guess I have to attend this genetics intro course on Coursera. So these phenotype correlations are nothing more than a horoscope, if the standard deviation is not extraordinarily high.

  4. […] What if such data were opened up? Everyone has access to data contributed to the openSNP genome sharing platform. The catch with openSNP is that the number of people willing to make their DNA public is still small, but growing. […]
