Support for 23andMe-exome-data

Over the last few weeks some people contacted us because they wanted to upload their exome-data to openSNP. While we started out with a focus on SNP-data (mainly because it’s much more abundant), providing support for exome-data has been on our ToDo-list for a couple of months now, so we did some work on it. Today we start with support for the data that 23andMe generates through their Exome-service.

So if you’ve got your exome sequenced through 23andMe you can now upload it to openSNP as well. Unfortunately there are some drawbacks right now:

  1. We only support upload for the VCF (Variant Call Format) and not for the BAM-file which contains the raw reads as well.
  2. We won’t parse all of your data. We will scan your exome for those SNPs which are already in the openSNP-dataset, but we won’t add any new ones to the web interface and won’t make other variants available for browsing on openSNP in the near future.

There are two main reasons for this limited feature-set:

  1. openSNP is our hobby project and we run it with limited financial and computational resources. Right now we simply don’t have the computing power and storage to store and parse complete exome-BAM-files in a way that would deliver any benefit.
  2. Mining literature for known SNPs is easy, because a unique identifier (the Rs-ID) exists for each SNP. This identifier is used in most publications and is seldom used for anything other than referencing the SNP. For Copy Number Variations etc. this isn’t that easy, and we still haven’t come up with a good way around this.

Nevertheless: If you’ve got your exome sequenced through 23andMe you can now upload your VCF-file. We will search for all known SNPs and make your genotypes available on the web-frontend of openSNP. And of course the complete VCF-file will be downloadable for anyone who is interested in it. Have fun uploading your data. And as always: Let us know if something doesn’t work as expected or if you have any further ideas for us.
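To give a rough idea of what "searching a VCF for known SNPs" means, here is a minimal Python sketch. It is not the parser that runs on openSNP. It assumes a single-sample VCF with the standard tab-separated columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT, sample), and the function name `known_snps_from_vcf` as well as the `known_rsids` set are made up for this illustration.

```python
def known_snps_from_vcf(vcf_lines, known_rsids):
    """Yield (rsid, chrom, pos, genotype) for records whose ID is already known.

    vcf_lines   -- iterable of text lines from a single-sample VCF
    known_rsids -- set of Rs-IDs to keep (hypothetical stand-in for the
                   SNPs that are already in the database)
    """
    for line in vcf_lines:
        if line.startswith("#"):          # meta-information and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:              # no FORMAT/sample columns: nothing to call
            continue
        chrom, pos, ids, ref, alt = fields[:5]
        format_keys = fields[8].split(":")
        if "GT" not in format_keys:
            continue
        gt = fields[9].split(":")[format_keys.index("GT")]
        calls = gt.replace("|", "/").split("/")
        if not all(c.isdigit() for c in calls):   # skip missing calls such as "./."
            continue
        alleles = [ref] + alt.split(",")          # allele index 0 is REF, 1.. are ALT
        genotype = "".join(alleles[int(c)] for c in calls)
        for rsid in ids.split(";"):               # the ID column may list several IDs
            if rsid in known_rsids:
                yield rsid, chrom, int(pos), genotype
```

Everything whose ID is not in `known_rsids` is simply skipped, which mirrors the limitation described above: variants without a known Rs-ID never reach the web interface, although they stay in the downloadable VCF-file.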


A petition on Open Access

We are strong supporters of the open-science movement and are happy to see that there is a new petition aimed at the Obama administration. The goal of this new petition is to make Open Access mandatory for all publications funded by tax-money.

While one of the aims of openSNP is to provide an Open Data platform, it would be impossible to provide this service without the great benefits of Open Access, of which we make extensive use when we link to external information and papers. If all research were still hidden behind paywalls, we could not provide you with information on any SNPs. If Open Access were mandatory, we could provide you with even more information and literature!

This is the full text of the petition:

WE PETITION THE OBAMA ADMINISTRATION TO:

Require free, timely access over the Internet to journal articles arising from taxpayer-funded research.

We believe in the power of the Internet to foster innovation, research, and education. Requiring the published results of taxpayer-funded research to be posted on the Internet in human and machine readable form would provide access to patients and caregivers, students and their teachers, researchers, entrepreneurs, and other taxpayers who paid for the research. Expanding access would speed the research process and increase the return on our investment in scientific research.

The highly successful Public Access Policy of the National Institutes of Health proves that this can be done without disrupting the research process, and we urge President Obama to act now to implement open access policies for all federal agencies that fund scientific research.

The best thing is that you don’t have to be a US citizen to sign the petition. Just register on the petition platform and add your signature. In total, 25,000 signatures are required before June 19, 2012. If this threshold is reached, the petition will be placed in the Executive Office of the President for review.

The campaign behind it, access2research, is driven by many open access advocates, including Michael Carroll, Heather Joseph, Mike Rossner, and John Wilbanks.


Replies to all genotyping-applications have been sent

It took us some time to read through all your great applications, and it was hard to decide which ones we should consider for the free genotypings. After heavy discussions and bargaining we had two things: a list of successful applications and a short list of applicants who would get a 23andMe-kit if 23andMe offered a DNADay-discount this year.

By now all applicants should have received an email from us with either the good or the bad news. If you haven’t received an email, please take a look into your spam-folder (look for a message from info@opensnp.org). A handful of applicants either had a typo in their address or have closed the account they used to register for the application. If you haven’t heard from us at all, this might be the reason. In this case, please contact us again and tell us your old or original e-mail-address.

For everybody on the shortlist: Unfortunately 23andMe doesn’t offer discounts this year, so we can’t consider your applications. We’re really sorry about this, as we would love to genotype all of you. Drop us an email if you want to be notified about a potential follow-up round of free genotypings. We’re still trying to get some funding for this.

Please give us your postal address if your application was successful. We need it to order you a spit kit.

Thanks again to everybody who participated.

Cheers,
the openSNP-team

Update on the free genotypings and new features on openSNP

The free genotypings
Last night the deadline for the free genotypings passed, and we are overwhelmed by the number of responses. In less than 24 hours after the first blog post over 200 people sent us their applications, and in total over 450 people applied. For us the real work now begins, and we will do our best to go through all the applications, select the most interesting, and then contact every applicant.
 
Entering Phenotypes
We also used the past weeks to implement some more features into openSNP. This time we focused on end-users entering their data into openSNP, especially on entering phenotypic variations. Entering a lot of phenotypes was a bit cumbersome and took too much time. Now, registered users will find a subtle change to their dashboard: The tab which shows phenotypes you haven’t yet provided now features a small button that allows you to easily fill in your information.
[Image]
A modal window (like the one in the screenshot above) will appear and you can choose your variation from the list of already known answers, or you can choose to add a new one. This should make entering data a lot faster.
 
Phenotype Recommendations
We also added a recommendation engine for phenotypes, which can be found on individual phenotype pages. An additional tab will show you which phenotypes have most frequently also been entered by users who have entered variations on the page you are looking at. 
[Image]
Additionally, you can use this system to get recommendations about similar phenotypes and even variations. If you enter your variation for a phenotype through the old phenotype view you will see something like this:
[Image]
The top row shows you up to three phenotypes, each with the variation that is often entered by users who have provided variations similar to yours. So let’s say you’ve just entered that your Body Mass Index is above 30: the system might point out that people with this variation have frequently entered that they have high blood pressure.
 
The lower row shows you up to three phenotypes which are often entered by users who have entered any information about the same phenotypes as you, irrespective of the variation provided. For example, if you’re interested in visible traits and have just provided information about your eye colour, you will see that people who are interested in this also provide information about their hair and skin colour.
 
Given the rising number of different phenotypes which you can enter in openSNP, this should help you to find those that are of interest to you. You might have noticed that you won’t get those recommendations if you are using the new quick-entry-feature on your dashboard. Our reasoning for this: Using the dashboard you can enter a large number of variations in rapid succession. Here, displaying recommendations after each phenotype entered would slow you down. But if you like to “browse” and don’t want to rapidly enter large amounts of data, you might benefit from some recommendations for what else might be of interest to you.
 
How do you feel about this? Would you like to see an option to get those recommendations also after entering variation using the dashboard? 

Apply now for a free genotyping

At the end of last year we announced that we’ve got some funding from the German Wikimedia foundation to get more people, who are willing to share their results, genotyped. We have now settled on a process that should allow us to perform the project without too many problems. Starting today, you can apply for one of the free genotypings. The deadline for applications is Sunday, 03/25/12 at 23:59, so you still have some time to think about an application. In the two weeks following the deadline, we will select as many participants as we can afford to get genotyped using the 5000 Euros we received from Wikimedia. We’ll get in contact with everybody who has sent an application to let all applicants know whether their application was successful or not.

The genotyping will be done through 23andMe. We will order you a gift kit which will be delivered to your address. These gift kits include prepaid access to the 23andMe website for 12 months, so you can check up on the latest findings about your genetic variation as well. After this 12-month period those features will expire automatically, so you don’t have to cancel any subscriptions.

Our application form contains some standard questions (Where do you live? Does 23andMe deliver to your country? etc.) but also some details about your motivation, why you want to make your dataset available to the public and why your data might be of great interest (For example: Do you have a rare disease where research is lacking?). Additionally, we will also try to get people genotyped who are currently under-represented in publicly available data sets. Most data up to now is from WEIRDs: Western, Educated, Industrialized, Rich and Democratic people (most are probably male, too).

We would like you to deposit the final raw data, which you will then be able to download from 23andMe, into the openSNP database, ideally along with some phenotypic information about yourself. So please think about the possible consequences which may arise from doing so before you apply for one of the genotypings. The application process has some questions about possible consequences as well (just so we get a feeling of whether you know what you are doing). If you get your results, but then find them too problematic to publish: That is fine. We are aware of this possibility and while it would suck for us as it means less data, you are the one who has the last word in this matter. Some information that might make you a bit more comfortable with the idea of sharing data: We won’t release the names of any applicants (whether successful or unsuccessful) and you can sign up to openSNP using a pseudonym. We also don’t log any IPs used to access openSNP.

tl;dr

We offer you the chance to get genotyped through 23andMe for free if you are willing to share the data with the public. Here’s the planned schedule:

  • Until 03/25/12 at 23:59 you can apply for a genotyping using this application form.
  • We select the lucky winners between 03/26/12 and 04/08/12 and get in contact with every applicant.
  • Mid-April: You should receive the 23andMe-kits in your mail.
  • End of May: You should receive the results of the genotyping, so you can upload the results to openSNP.

If you’ve got any questions regarding the application process, the schedule etc., just let us know in the comments or send an email to info@opensnp.org. We will try to answer all of your questions as fast as possible.

Good luck,
your openSNP-team

Videos on openSNP & DAS-support

Another thing we have been putting off for far too long: Creating videos on the idea of openSNP and some screencasts that show how you can use openSNP to enter data about yourself and how you can get the data out again for your own research. So here we go. The first video is a small “self-interview” I did to tell you a bit about how we started, what you should keep in mind privacy-wise before starting to use openSNP etc.

The next video shows you how you can use the openSNP-frontend to enter your phenotype-data, what kind of information the individual SNP-pages can show you and how you can subscribe to the openSNP-RSS-feeds to be notified about the latest genotyping-files etc. A small new feature which is missing from the video as we implemented it after recording the video: The news-page features a tab that includes the latest publications on the SNPs we have in the database. And for all info-junkies: You can also subscribe to the latest publications using RSS.

The last video shows how you can query the APIs which we have implemented. Bonus-Content: This video includes the first preview on how you can use the Distributed Annotation System to visualize individual genotyping files! In short: You can use http://opensnp.org/das/sources to get a list of all DAS-sources we have with openSNP. Each source represents all SNPs we have of a single user, regardless of how many genotyping files a user has provided.

If you want to use a DAS-source in a genome browser, for example in MyKaryoView, you can use the features command of DAS. The link for this is http://opensnp.org/das/$user_id/features, where you have to replace $user_id by the ID of the user you are interested in. If you want to query SNPs between chromosomal positions using DAS you can use http://opensnp.org/das/$user_id/features?segment=$chromosome_name:start,stop. So http://opensnp.org/das/1/features?segment=1:1,1000000 will give you all my SNPs on Chromosome 1 between position 1 and 1000000.
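If you would rather build these links in a script than by hand, a small helper could look like the following Python sketch. The function name `das_features_url` and its parameters are our own invention for this example; only the URL pattern itself comes from the description above.

```python
DAS_BASE = "http://opensnp.org/das"

def das_features_url(user_id, chromosome=None, start=None, stop=None):
    """Build the DAS features URL for one user, optionally limited to a segment."""
    url = f"{DAS_BASE}/{user_id}/features"
    if chromosome is None:
        return url                        # all SNPs of this user
    if start is None or stop is None:
        raise ValueError("a segment needs both start and stop")
    if start > stop:
        raise ValueError("start must not be larger than stop")
    # Segment syntax as described above: chromosome_name:start,stop
    return f"{url}?segment={chromosome}:{start},{stop}"
```

Calling `das_features_url(1, "1", 1, 1000000)` reproduces the example link for Chromosome 1 given above, and `das_features_url(1)` gives the plain features link.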

If you want to see an example for a visualization using DAS look at the video below. The DAS features are still experimental. I will attend this DAS workshop to get some help with the final implementation, so if you have suggestions: Please let us know!

Enjoy playing around with these features. As usual: Let us know if you find any bugs we have missed!

Some progress on the API: JSON endpoints

Some weeks ago we stated that we are working on implementing the Distributed Annotation System into openSNP. I’m sorry to announce that we are not finished with this yet. We simply underestimated the amount of time it would take. But to make up for this we just released some JSON (JavaScript Object Notation) endpoints which you can use to get data out of openSNP. JSON is easy to parse in software and is already widely used, especially in web applications. For a start we added JSON support for the user-index, for genotypes at single SNPs and for all phenotypes of a given user. I’ll briefly discuss how you can access the different JSON endpoints.

Let’s start with the user-index, which can easily be accessed at http://opensnp.org/users.json. This includes the complete list of all openSNP-users and their names, their unique user-IDs and all the genotyping-files (with the unique genotype-IDs and the download-links). We hope that this makes an ideal entry-point if you are looking for genotyping-files and the user-IDs to further query the openSNP-database.

If you want to get genotypes of single or multiple users for a given SNP you can use the JSON endpoint at http://opensnp.org/snps/json/$snpname/$userid.json. Just replace $snpname by the Rs-ID you are interested in and $userid by the unique ID of the user you are interested in. For example: http://opensnp.org/snps/json/rs9939609/1.json gives you my genotype at Rs9939609. If you are interested in the genotypes of multiple users you can concatenate this into a single query by either using commas to provide multiple User-IDs (for example http://opensnp.org/snps/json/rs9939609/1,6,8.json) or by giving a range of user-IDs (for example http://opensnp.org/snps/json/rs9939609/1-8.json).
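The three ways of addressing users (a single ID, a comma-separated list, a range) can be sketched in Python like this. The helper `snp_json_url` and its parameter names are hypothetical and only illustrate the URL pattern described above.

```python
def snp_json_url(rs_id, user_ids=None, user_range=None):
    """Build the genotype JSON URL for one SNP.

    rs_id      -- the Rs-ID, e.g. "rs9939609"
    user_ids   -- list of user-IDs, joined with commas (one ID is fine too)
    user_range -- (first, last) tuple, written as "first-last"
    Exactly one of user_ids / user_range must be given.
    """
    if (user_ids is None) == (user_range is None):
        raise ValueError("give exactly one of user_ids or user_range")
    if user_range is not None:
        first, last = user_range
        users = f"{first}-{last}"
    else:
        users = ",".join(str(u) for u in user_ids)
    return f"http://opensnp.org/snps/json/{rs_id}/{users}.json"
```

With this, `snp_json_url("rs9939609", user_ids=[1, 6, 8])` and `snp_json_url("rs9939609", user_range=(1, 8))` give the two multi-user example links from above.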

Similarly you can access all phenotypic information of a given user by using http://opensnp.org/phenotypes/json/$userid.json. Again: Just replace $userid by the unique ID of the user you are interested in. For example http://opensnp.org/phenotypes/json/1.json gives you all the phenotypic information I have entered about myself so far. Concatenating multiple users into one query works just as for the SNP/User-combinations by using commas (http://opensnp.org/phenotypes/json/1,6,8.json) or ranges (http://opensnp.org/phenotypes/json/1-8.json). In any case: If you request data of users or user/SNP-combinations that don’t exist the JSON-hash you will get back includes the key “error”, just like this.
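On the client side it makes sense to check for that “error” key before using a response. The Python sketch below shows one way to do this. Note the assumptions: we guess here that a response is either a single JSON object or a list of objects (one per user), and the names `phenotypes_json_url` and `split_errors` are made up for this example.

```python
import json

def phenotypes_json_url(user_ids):
    """Build the phenotype JSON URL for one or more user-IDs (comma-separated)."""
    users = ",".join(str(u) for u in user_ids)
    return f"http://opensnp.org/phenotypes/json/{users}.json"

def split_errors(body):
    """Split a JSON response body into usable records and error records.

    Assumption: the body is a single object or a list of objects, and a
    record describing a failed lookup contains the key "error".
    """
    data = json.loads(body)
    records = data if isinstance(data, list) else [data]
    ok, failed = [], []
    for record in records:
        if isinstance(record, dict) and "error" in record:
            failed.append(record)
        else:
            ok.append(record)
    return ok, failed
```

You would pass the text of the HTTP response to `split_errors` and then only work with the first list it returns, logging or ignoring the second.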

These are all the options our JSON-endpoints offer right now. There are no API-keys and no rate limits. We will just see how it turns out and whether any limiting of the access will be necessary in the future. We hope that this will make it easier to reuse the openSNP-data, and maybe you already have some nice ideas for remixes/browser plugins/younameit. If you have any requests for JSON-endpoints you need or would like us to add, just let us know. We are currently experimenting with this JSON-stuff and are open to any critique, comments, ideas etc. If you want to help us implement further features in openSNP: Please do so, we are open to everybody who wants to participate. The source code is freely available and there is a Google Group/mailinglist where we discuss bug-fixing, new features etc. So you might want to join us there.

Videos and Slides on the recent talks

A happy new year from the openSNP-team! Philipp and I are back from our talks. If you couldn’t make it to Berlin you can now watch the videos that were recorded during our talks. You can watch the recordings from our talk on crowdsourcing genome-wide association studies at the 28th Chaos Communication Congress on YouTube or in better quality here. If you are interested in our slides you can get them on SlideShare or as LaTeX-sources on GitHub.

For those of you who speak German: You might be interested in our talk on the privacy implications of the coming post-genomics era, which we gave at the 0. Spackeriade. You can watch it on YouTube as well or download the video. Again: The slides can be found on SlideShare or as LaTeX-sources on GitHub.

Thanks to all who helped with the slides, gave us their feedback and of course all of you who approached us after the talks – in real life or via email – and had some ideas for new features. We have already started to work on those. Stay tuned to see some changes on openSNP in the next weeks.

Happy Holidays

We have some last news before we leave for our holidays. Let’s start with the biggest news: We were able to secure a little funding through the WissensWert-contest of the German Wikimedia Foundation (sorry, the posting is in German as well). This means that we will have up to 5000 Euros that we can spend to get some more people, who are as in love with sharing as we are, genotyped. We will release more details on this as soon as possible.

Additionally, Philipp and I will be in Berlin between 12/27 and 12/31. As we have mentioned before, we will give a talk on openSNP and crowd-sourced genome-wide association studies at the 28th Chaos Communication Congress. The talk will be on 12/28 at 11 pm. This talk will be in English and there should be day passes, so if you are in town you can pay us a visit. If you’re able to speak or understand German you can also pay a visit to the 0th Spackeriade, which takes place on 12/29. We will talk about the implications of the post-genomics-era on privacy.

Thanks again for all your support, for voting for openSNP in the different contests we have entered, for sharing your data with us, for finding bugs, for spreading the word. Have some nice holidays and maybe we’ll see some of you for a beer in Berlin.

The WissensWert contest vote is now open!

Hello everyone!
The Mendeley/PLoS Binary Battle is now over, so it’s time for the next one  – the WissensWert-Contest (page is in German) by the German Wikimedia-Foundation.

They have pledged up to 7000€ for ideas that promote open knowledge and open science – naturally, we had to apply!

What we’re trying to do with the money (if we win) is to give out free genotypings to people who can’t afford them, but still want to participate in the research. We could give out up to 35 genotypings with the money. Of course, just because we give anyone the money to get genotyped doesn’t mean that they have to publish it – we can’t force you to reveal potentially damaging information about you.

To win this contest we need your vote. You can vote here; the page is in German, but you only have to activate the radio-button for project “02-Open (Citizen) Science durch mehr öffentlich verfügbare Genotypisierungen” (that’s us, translation: “Open (Citizen) Science through more openly available genotypings”) and then press the submit-button at the bottom of the page. You don’t necessarily have to supply any of the additional information (which seems to be for their statistics), but if you speak or write German I’m sure they would appreciate the input!

We’re thrilled to have won the Mendeley/PLoS Binary Battle and we’re sure we couldn’t have done it without you guys. Thanks for your votes & your continuing support!

The openSNP-team
