A good five days have passed since we started openSNP and it has been a lot of fun, even more work and not that much sleep since then. But it’s time to answer some questions and give you some feedback on what we have done so far and what we are up to:
Who is behind openSNP?
Whoops, this is definitely something we could have made clearer in the first posting. But here we go: Bastian, Fabian and Philipp did their undergraduate studies in Life Sciences and are currently doing their master's programmes. Bastian currently studies Ecology & Evolution, Fabian studies Biology and Philipp studies Computer Science. Helge is the only “real” web developer on the team and has helped us a lot in testing many of the things we did. We are not working full time on this project; it is more of a hobby. Please give us some time to answer your questions, fix bugs and so on, as we are doing this in our free time alongside our studies and day jobs.
Why all this?
OpenSNP is a non-profit, open-source project for sharing genetic and phenotypic information. The idea for this project came to Bastian after he was genotyped by 23andMe in May and started playing around with his data. During his research he became frustrated because it was not that easy to find more data, so he started working on openSNP to fix this. To be clear: this project is not about making money or selling data. Or, to quote Google: “We don’t wanna be evil”. We are just interested in making science more open and accessible.
So far 20 people have registered with openSNP and eight of them have uploaded their genotyping files. All genotyping files are now parsed into the openSNP database. Together, this already accounts for 1,327,142 different SNPs, and in total we have 7,672,504 SNPs in the database. Given our bumpy start, those numbers are great.
Many of you have found bugs in openSNP and we have tried hard to fix them all. For example, there were some bugs in the commenting/messaging system which could keep those pages from being displayed correctly. There was also a bad usability bug on the settings page. Those bugs should all be fixed by now. Thanks to all of you, especially to Nash, who totally deserves his “extremly high” on finding openSNP-bugs. If you discover any other bugs, just let us know and we will start hunting them down right away.
The performance, especially in the first days and especially when parsing your genotyping files, was horrible. Two factors caused this trouble: #1: our import script was not written with performance in mind. #2: our limited server capacity. We worked hard on the first issue and we have solved it (in tech-speak: we drastically reduced the number of database transactions). Now it should take a maximum of 3 hours to parse a file with 1.5 million SNPs. This is as fast as it gets, given our current server capacity.
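For the curious, here is a minimal sketch of what "fewer database transactions" can look like in general. It is not openSNP's actual import script: the `snps` table, its columns, the function names and the batch size are all assumptions made up for illustration, and it uses Python with SQLite only because that runs anywhere. The input is assumed to follow the tab-separated layout of a 23andMe raw data file (rsid, chromosome, position, genotype, with `#` comment lines). The point is that rows are collected into large batches and each batch is written in a single transaction, instead of committing once per SNP.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical 'snps' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE snps (rsid TEXT, chromosome TEXT, position INTEGER, genotype TEXT)"
    )
    return conn


def parse_genotype_lines(lines):
    """Yield (rsid, chromosome, position, genotype), skipping comments and blank lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        rsid, chromosome, position, genotype = line.split("\t")
        yield rsid, chromosome, int(position), genotype


def insert_batch(conn, batch):
    """Write one batch inside a single transaction."""
    # 'with conn' commits once at the end of the block (or rolls back on error).
    with conn:
        conn.executemany("INSERT INTO snps VALUES (?, ?, ?, ?)", batch)


def import_batched(conn, lines, batch_size=10000):
    """Import all rows using one transaction per batch; return the number of rows written."""
    batch = []
    total = 0
    for row in parse_genotype_lines(lines):
        batch.append(row)
        if len(batch) >= batch_size:
            insert_batch(conn, batch)
            total += len(batch)
            batch = []
    if batch:  # write whatever is left over
        insert_batch(conn, batch)
        total += len(batch)
    return total


if __name__ == "__main__":
    sample = [
        "# rsid\tchromosome\tposition\tgenotype",
        "rs4477212\t1\t72017\tAA",
        "rs3094315\t1\t742429\tAG",
        "rs3131972\t1\t742584\tGG",
    ]
    conn = make_db()
    print(import_batched(conn, sample, batch_size=2), "rows imported")
```

With a file of 1.5 million SNPs and a batch size of 10,000, this pattern needs about 150 commits rather than 1.5 million, which is usually where most of the time goes on a small machine.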
For the tech-savvy people: Right now, openSNP runs on a single-core machine with only 1 GB of RAM. Even now this is not enough power to deliver a good experience. But we are already looking for a larger machine (with more cores and much more RAM) to give you a better time using openSNP.
We already have a number of ideas and new features we want to implement in openSNP and we would like to present some of them to you:
- Adding support for Family Tree DNA, which is another service that provides DTC-testing. Nash was kind enough to provide us with a file which we can use to implement the file upload for this provider.
- Mail Notifications: Right now users don’t get notified about new content they may be interested in: New messages, new phenotypes, new replies to their comments. In the near future, we will implement those important mail notifications (Of course: you will be able to easily disable/enable them in the settings).
- Implementing Social Media. We know: People love to socialize and share, so we are probably going to implement support for Facebook, Twitter et al., so you can easily share the latest phenotypes you entered or the latest achievements you unlocked (if you want to).
- Making downloading genotyping files of users even easier. Right now, there is no easy way to download an annotated data-set of a single user. This will be fixed.
- Would you like to see a “following”-feature for phenotypes? The idea is that you could be notified by mail about all changes to a phenotype you follow: whenever there are new comments, new variations or new genotyping-files available, you would get an email. Is this something you’d like?
- Would you like to use openSNP to annotate papers? Say you read a paper which was linked on openSNP, would you like to link your comments to this paper and make them available for others? Of course, this could be linked using the PLoS or Mendeley API.
We really appreciate getting your feedback on the feature ideas we already have. And as usual: do you have an idea for a feature that is missing and not on this list? Please let us know.