Category Archives: Miscellaneous - Page 3

23andme Sale

23andme is having the DNA Day sale early.

On Monday, April 11, 2011, they are selling the kits for FREE with a $9/month, 1-year commitment. So basically a total of $108. This is compared to the regular price of $199 + $9/month (= $307 over the first year). It's even less than the Christmas sale (assuming you cancel the subscription after a year).
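The arithmetic behind those totals, as a quick sketch. The variable names are mine, and it assumes the subscription is cancelled after exactly 12 months:

```python
# First-year cost comparison using the prices quoted in the post.
KIT_REGULAR_PRICE = 199   # regular kit price in dollars
KIT_SALE_PRICE = 0        # kit is free during the sale
MONTHLY_FEE = 9           # subscription fee in dollars per month
COMMITMENT_MONTHS = 12    # one-year commitment

subscription_total = MONTHLY_FEE * COMMITMENT_MONTHS
sale_total = KIT_SALE_PRICE + subscription_total
regular_total = KIT_REGULAR_PRICE + subscription_total
savings = regular_total - sale_total

print(f"Sale total:    ${sale_total}")
print(f"Regular total: ${regular_total}")
print(f"Savings:       ${savings}")
```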

The sale starts at midnight Pacific time tonight (3am Eastern time or 7am GMT) and ends April 11 at 11:59pm Pacific time (2:59am Eastern time or 6:59am GMT on April 12).
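If you want to check the sale window for another time zone, here is a small sketch. It assumes daylight saving time is in effect on that date (Pacific = UTC-7, Eastern = UTC-4) and uses fixed offsets rather than a time zone database:

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets, assuming daylight saving time applies in April.
PDT = timezone(timedelta(hours=-7), "PDT")
EDT = timezone(timedelta(hours=-4), "EDT")
GMT = timezone.utc

sale_start = datetime(2011, 4, 11, 0, 0, tzinfo=PDT)
sale_end = datetime(2011, 4, 11, 23, 59, tzinfo=PDT)

def describe(label, moment):
    """Print one moment in Pacific, Eastern and GMT."""
    parts = [moment.astimezone(tz).strftime("%b %d %H:%M %Z") for tz in (PDT, EDT, GMT)]
    print(f"{label}: " + " | ".join(parts))

describe("Sale starts", sale_start)
describe("Sale ends", sale_end)
```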

Spread the word and get people to participate in our Harappa Project too.

Two Steps Forward, Two Steps Back

I got my daughter a netbook, so now my computer is doing Harappa Project work 24x7.

Also, Simranjit was nice enough to offer me the use of a server. For privacy reasons, I am not going to upload any of the participants' data there, but it is much faster than my machine and hence very useful for running Admixture on the reference data (especially with cross-validation).

As for steps back, I downloaded the current 1000genomes data (1,212 samples, 2.4 million SNPs). It's in vcf format. Using vcftools to convert it to ped format will take about 3 weeks. Yes, you heard that right. BTW, the good stuff from a South Asian point of view will come later this year with 100 Assamese Ahom, 100 Kayadtha from Calcutta, 100 Reddys from Hyderabad, 100 Maratha from Bombay and 100 Lahori Punjabis.

Also, I spent most of Sunday evening and night in the ER and got a diagnosis of ureterolithiasis for my efforts. All I can say is: Three cheers for Percocet!!

UPDATE: Dienekes was kind enough to send me his conversion code which, judging from the source, should run really fast.

I am still astonished that the vcftools conversion code is so slow. Maybe I should look at their source code.
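For a sense of what the conversion involves, here is a minimal sketch in Python. To be clear, this is neither Dienekes's code nor what vcftools does internally; it is just my illustration. It assumes diploid calls with a GT field, and it writes one row per SNP in PLINK's transposed (tped) layout rather than ped, since a vcf file is already organized one variant per line. The function name is made up for this example.

```python
def vcf_line_to_tped(line):
    """Convert one VCF data line into one PLINK tped row (illustrative only)."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, snp_id, ref, alt = fields[:5]
    alleles = [ref] + alt.split(",")           # allele code 0 = REF, 1.. = ALT
    gt_index = fields[8].split(":").index("GT")
    if snp_id == ".":                          # no rsID: fall back to chrom:pos
        snp_id = f"{chrom}:{pos}"
    row = [chrom, snp_id, "0", pos]            # genetic distance left as 0
    for sample in fields[9:]:
        genotype = sample.split(":")[gt_index]
        for code in genotype.replace("|", "/").split("/"):
            # "." is a missing call in VCF; PLINK uses "0" for missing
            row.append("0" if code == "." else alleles[int(code)])
    return " ".join(row)


example = "1\t1000\trs1\tA\tG\t.\tPASS\t.\tGT\t0|1\t1|1\t./."
print(vcf_line_to_tped(example))
```

Each line is handled independently in a single pass, so a conversion along these lines should scale with file size; header lines (those starting with "#") would need to be skipped and the sample IDs written to a separate tfam file.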

My Genetic Journey II

While my computer's busy running K=12 admixture on batch 7, K=17 admixture on batch 1, some MClust experiments and converting 1000genomes data from vcf to ped and I am reeling from the pollen count (3,939 yesterday), here are some links to my personal genetics blogging.

For the record, my daughter complains about all the "Trantor windows" open on the computer all the time. She calls the terminal windows "Trantor" because of the shell prompt. My desktop is named Trantor. Now who can guess what my laptop, my other desktop and my wireless network are named?

Dienekes on ANI/ASI

Dienekes has a word of caution about choosing reference populations and admixture results.

Consider a sample of 25 Mexicans from the HapMap and 25 Yoruba from the HapMap, 25 Iberian Spanish from the 1000 Genomes Project, and 25 Pima from the HGDP as parental populations. We obtain for our Mexican sample:

  • 59.7% European
  • 36.9% "Native American"
  • 3.4% African

Let's run a final experiment with just the Mexicans, Spanish, and Yoruba, i.e., with no Native American samples. At K=3 we obtain:

  • 70% "Native American"
  • 29.7% European
  • 0.4% African

The "Native American" component has increased again! The explanation is simple: as we exclude less admixed Native American groups, Mexicans appear (comparatively) more Native American. The "Native American pole" has shifted, and so has the relative position of populations between them.

In other terms, what is labeled "Native American" in the three experiments is not the same: in the first one it is anchored on the more unadmixed Pima, in the last one in the more admixed Mexicans.

Thus, it seems that unadmixed reference samples are much more useful in getting good results from Admixture.

Then he runs Admixture on the Reich et al dataset for South Asians and tries to estimate the relationship between the Ancestral North Indian percentage computed by Reich et al and his K=2 admixture results on the same data.

Dienekes then included South Asian Dodecad participants in the analysis and ran a K=4 admixture analysis on Reich et al + Dodecad South Asian data, including Yoruba and Beijing Chinese from the HapMap to catch any African or East Asian ancestry.

Here are the admixture results for the reference populations:

The R² between the West Eurasian admixture component and the Reich et al ANI component is 0.98, which is good. His regression equation comes out to:

ANI = 0.779*WestEurasian + 39.674

Using this relationship, he calculates the ANI and ASI (Ancestral South Indian) components for Dodecad project members. My results (DOD128) are as follows:

East Eurasian 0.0%
African 3.5%
Ancestral North Indian 75.9%
Ancestral South Indian 20.6%
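Here is a small sketch of how that regression can be applied. The slope and intercept are the ones quoted above. Two things are my assumptions, not Dienekes's stated method: that ASI is whatever remains after ANI, African and East Eurasian are subtracted from 100%, and the West Eurasian input of 46.5%, which I back-solved from my reported ANI rather than taking from any published value.

```python
# Regression coefficients quoted in the post: ANI = 0.779 * WestEurasian + 39.674
SLOPE = 0.779
INTERCEPT = 39.674

def ani_from_west_eurasian(west_eurasian_pct):
    """Estimate the ANI percentage from the West Eurasian admixture percentage."""
    return SLOPE * west_eurasian_pct + INTERCEPT

def ani_asi_split(west_eurasian_pct, african_pct, east_eurasian_pct):
    """Return (ANI, ASI) in percent, assuming ASI is the remainder (my assumption)."""
    ani = ani_from_west_eurasian(west_eurasian_pct)
    asi = 100.0 - ani - african_pct - east_eurasian_pct
    return round(ani, 1), round(asi, 1)

# 46.5 is a back-solved, illustrative input; 3.5 and 0.0 are the reported
# African and East Eurasian percentages for DOD128.
ani, asi = ani_asi_split(west_eurasian_pct=46.5, african_pct=3.5, east_eurasian_pct=0.0)
print(f"ANI {ani}%  ASI {asi}%")
```

Note that the intercept alone puts ANI at about 39.7% even with zero West Eurasian admixture, so the equation is only meaningful within the range of the populations it was fitted on.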

I should point out that due to my recent Egyptian ancestry, my ANI result is wrong, since it is absorbing all of my non-African Egyptian ancestry too.

Also, in the case of Razib, I don't think his East Asian 14.4% should be separated out from his ANI-ASI like that. At least some of it should form part of his ASI percentage in my opinion.

Otherwise, this seems like a very good exercise by Dienekes.

Your Genes, Regulated?

The FDA had a meeting the last two days:

FDA is convening this two-day meeting to seek the Panel’s expert opinion and input on scientific issues concerning Direct to Consumer (DTC) genetic tests that make medical claims.

This meeting is focused specifically on issues regarding clinical genetic tests that are marketed directly to consumers (DTC clinical genetic tests), where a consumer can order tests and receive test results without the involvement of a clinician.

The American Medical Association of course wants to limit genetic testing so that you would need a doctor to supervise everything.

We urge the Panel to offer clear findings and recommendations that genetic testing, except under the most limited circumstances, should be carried out under the personal supervision of a qualified health care professional, and provide individuals interested in obtaining genetic testing access to qualified health care professionals for further information.

23andme had two presentations at the meeting which they have posted on their blog.

In our presentations, we take the position that all genetic testing services, whether ordered by a physician or offered through direct access, should adhere to the same standards. We simultaneously request that the FDA consider redefining and establishing regulatory standards, including some fundamental definitions, to accommodate large-scale genetic testing and support innovation of its technologies and applications. We also request that regulation be based upon evidence and not fear of potential harm to individuals which, to date, has not been demonstrated. In fact, growing numbers of participating individuals and independent studies focused on this issue provide preliminary evidence that the vast majority of people understand the information presented and experience no significant negative effects.

Genomics Law Report had an overview of the issues beforehand as well as a Twitter roundup of the meeting. Here are the author's thoughts after the first day:

First and foremost, I fully expect the MCGP (Molecular and Clinical Genetics Panel) to note, likely more than once, that given the complexity of the questions put to it by the FDA it should be afforded far more time to deliberate and research prior to making any recommendations.

If taking time out for further debate isn’t an option, what is the MCGP likely to recommend? Based on today’s deliberations, I think it’s a safe bet that the MCGP will advise the FDA to (1) demand clear proof of analytical and clinical validity for all genetic tests and (2) require that most, or perhaps even all, genetic tests with demonstrated or potential clinical significance be (to use the FDA’s terminology) “routed through a clinician.”

In other words, I think the odds strongly favor an MCGP recommendation to the FDA that clinical (as defined by the FDA, which is itself a separate issue) direct-to-consumer genetic testing, when offered without a requirement that a clinician participate in the ordering, receipt and interpretation of the test, be removed from the marketplace. At least for the time being.

If you read my blog, you probably know my politics. I do, however, think that any regulation has to be shown to provide a tangible benefit or to prevent actual harm. Hypothetical harm from a regular joe simply misinterpreting genetic results is not enough justification.

So what can you do? Razib Khan is already on the task.

1) I am going to release my own 23andMe sequence into the public domain soon. I encourage everyone to download it. I would rather have someone off the street know my own genetic information than be made invisible by the government. That is my right. For now that right is not barred by law. I will exercise it.

2) Spread word of this video via social networking websites and twitter. The media needs to get the word out, but they only will if they know you care. Do you care? I hope you do. This is a power grab, this is not about safety or ethics. If it was, I assume that the “interpretative services” would be provided for free. I doubt they will be.

3) Contact your local representative in congress. I’ve never done this myself, but am going to draft a quick note. They need to be aware that people care, that this isn’t just a minor regulatory issue.

4) The online community needs to get organized. We’re not as powerful as a million doctors and a Leviathan government, but we have right on our side. They’re trying to take from us what is ours.

5) Plan B’s. We need to prepare for the worst. Which nations have the least onerous regulatory regimes? Is genomic tourism going to be necessary? How about DIYgenomics? The cost of the technology to genotype and sequence is going to crash. I know that the Los Angeles DIYbio group has a cheap cast-off sequencer. For those who can’t afford to go abroad soon we’ll be able to get access to our information in our homes. Let’s prepare for that day.

Here are the links to contact your House Representative and your Senators.

Google Charts

Here's a chart using the Google Visualization API.

In case you are wondering, the individuals are ordered by the sum of their South Asian, Pakistan/Caucasian and Kalash component percentages.
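The ordering can be sketched like this. The participant IDs and percentages below are made up purely for illustration, and sorting from largest to smallest sum is my assumption here; flip the `reverse` flag for the opposite direction.

```python
# Components whose percentages are summed to order the individuals.
ORDER_COMPONENTS = ("South Asian", "Pakistan/Caucasian", "Kalash")

# Hypothetical rows: (participant ID, {component: percentage}).
rows = [
    ("ID_A", {"South Asian": 40.0, "Pakistan/Caucasian": 30.0, "Kalash": 5.0, "European": 25.0}),
    ("ID_B", {"South Asian": 60.0, "Pakistan/Caucasian": 25.0, "Kalash": 10.0, "European": 5.0}),
    ("ID_C", {"South Asian": 20.0, "Pakistan/Caucasian": 35.0, "Kalash": 0.0, "European": 45.0}),
]

def ordering_sum(row):
    """Sum of the three ordering components for one individual."""
    _, components = row
    return sum(components.get(name, 0.0) for name in ORDER_COMPONENTS)

# Largest combined percentage first (an assumption about the direction).
ordered = sorted(rows, key=ordering_sum, reverse=True)

for participant_id, components in ordered:
    print(participant_id, ordering_sum((participant_id, components)))
```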

If it works well for everyone, using Internet Explorer, Firefox, Chrome, or Safari on Windows, Linux, iOS or Mac OS, then I'll start using these interactive bar charts instead of the ones I have been creating in R. These just use the data from the spreadsheet directly.

I am also looking into interactive scatter plots for the PCA plots, but I am not sure if it will handle a lot of data points without running your computer into the ground.

The Google Visualization API also has a geographical map feature that uses Flash. There is also a static map chart which I am looking into.

My Genetic Journey

It all started on DNA Day 2010 when Razib tweeted about a $99 sale for the DNA test at 23andme. I ordered one immediately. Over the next few months, a lot of my free time was spent poring over and analyzing my genomic results.

While the health and physical traits information was interesting, I found the ancestry information that can be deduced from your genome to be fascinating. That might be because I was working on collecting together and digitizing our family tree at the time.

So to beat Razib's record of writing about his personal genome, I have started blogging about mine:

There's much more to come, including: What's wrong with my chromosome 9 and who did I get it from?; my results from Doug McDonald, Dodecad and Eurogenes; why do I have low similarity scores with everyone?; where exactly was my great-grandmother from?; and more.

PS. Since it's Valentine's Day, I should probably mention that my top match among the people (excluding my sibling of course) I am sharing genomes with on 23andme is my dear wife, Amber.