Category Archives: Participation

A New South Asian Project

Life and work have prevented me from working on the Harappa Ancestry Project for a long time. Sadly, it looks like I won’t be able to get back to it in the near future.

My friend Razib has recently started his own South Asian Genotype Project. I’m curious to see what new results he comes up with.

Submission On Hold

Currently, the Harappa Ancestry Project is on hold as I am busy with other things. Therefore, please do not send me your data.

However, if you belong to a South Asian ethnic group that is significantly different from any of the groups for which I have data, please send me an email.


In my last Admixture run, there was a participant with results from the ancestryDNA service.

Most of the SNPs genotyped by ancestryDNA are in common with 23andme and FTDNA. Thus, there was a perfect overlap between ancestryDNA and HarappaWorld admixture.

So, I am now accepting participants with ancestryDNA results too.

23andme and FDA

The FDA has asked 23andme to stop its direct-to-consumer genetic testing, and as a result 23andme has issued the following statement:

After discussion with officials from the Food and Drug Administration today, 23andMe will comply with the FDA's directive and stop offering new consumers access to health-related genetic tests while the company moves forward with the agency's regulatory review processes.

Customers who purchased kits on or after the FDA's warning letter of November 22nd will not have access to health-related results. Those customers will have access to ancestry-related genetic information and their raw data without 23andMe's interpretation of that data. They may receive health-related results in the future, depending on FDA marketing authorization.

Customers who purchased kits before November 22, 2013 will continue to have access to all the reports they've always had.

While I am disappointed at this turn of events, for our project it does not change much since 23andme will still provide raw data downloads as well as ancestry information.


For various reasons, and to avoid hassle for myself, I'll no longer accept any Romany/Roma/Gypsy participants in the project.

23andme Now $99

Things have been very busy in meatspace in recent months, but I am finally back.

While the submission of data from new participants has really slowed, the release of new software continues unabated. I hope to try some of these tools (ALDER, MULTIMIX, ADMIXTOOLS, etc.) out and report anything interesting here.

Another disappointment has been the 1000genomes South Asian data which is still nowhere to be seen.

However, 23andme has reduced its regular price to $99 which should induce some of you to test and participate.

The Genographic Project has finally gotten into autosomal testing. If you are South Asian and have received their Geno 2.0 results, I would be interested in your raw data so that I can check how many SNPs it has in common with HarappaWorld.
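For anyone curious what such a check involves, here is a minimal Python sketch of counting shared SNPs between a raw data file and a reference SNP list. It assumes both inputs are text files with one SNP per line and the SNP ID (e.g. an rsID) in the first whitespace-separated column, with `#` comment lines. The function names and the file layout are illustrative assumptions, not the actual Geno 2.0 or HarappaWorld formats.

```python
from typing import Iterable, Set, Tuple


def read_snp_ids(lines: Iterable[str]) -> Set[str]:
    """Collect SNP IDs from the first column of each non-comment line."""
    ids = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        snp_id = line.split()[0]
        if snp_id.lower() == "rsid":
            continue  # assumption: an uncommented header row starts with "rsid"
        ids.add(snp_id)
    return ids


def overlap_report(sample_ids: Set[str], reference_ids: Set[str]) -> Tuple[int, float]:
    """Return (number of shared SNPs, share of the reference list covered)."""
    shared = sample_ids & reference_ids
    coverage = len(shared) / len(reference_ids) if reference_ids else 0.0
    return len(shared), coverage


# Tiny made-up example; real files would be opened with open(path).
sample = read_snp_ids(["# raw data", "rs1\t1\t10\tAA", "rs2\t1\t20\tAG", "rs9\t2\t5\tCC"])
reference = read_snp_ids(["rs1", "rs2", "rs3", "rs4"])
shared_count, coverage = overlap_report(sample, reference)
print(shared_count, coverage)
```

Matching on SNP ID alone ignores differences in genome build or strand between services, so this only answers the "how many SNPs in common" question, not whether the genotypes are directly comparable.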

23andme $50 Off

23andme has a $50 off coupon sale for three days. Here's the email I got from them:

Visiting family this summer? Are they part of 23andMe? Take advantage of our summer discount: $50 OFF each kit you purchase. This offer expires in 3 days (11:59PM PDT, Sunday August 12, 2012).

To use this code, visit our online store and add an order to your cart. Click "I have a discount code" and enter the code below.

$50 off Discount code: VMQ6KG

FTDNA Summer Sale

FTDNA is having a sale on its DNA tests till July 15, 2012.

Their autosomal test, Family Finder, which can be submitted to Harappa Ancestry Project, is on sale for $199 instead of a regular price of $289.

In addition, their mtDNA and Y-DNA products are also discounted till end of day July 15.

Participation Changes

Now that I have DIY HarappaWorld out, I am changing the participation requirements a little bit with somewhat different requirements for South Asians compared to other regions.

If you have any real ancestry from a South Asian origin, you are eligible to participate. Partial South Asian ancestry is okay. The countries of origin I count as South Asian are as follows:

  • Afghanistan
  • Bangladesh
  • Bhutan
  • India
  • Maldives
  • Nepal
  • Pakistan
  • Sri Lanka

Note that 2-3% South Asian from Dr. McDonald's BGA or Dodecad Project does not count as South Asian ancestry.

If you have all four of your grandparents from one of the following countries or regions, you can also send me your data.

  • Burma
  • Tibet
  • Uyghur from Xinjiang, China
  • Tajikistan
  • Kyrgyzstan
  • Kazakhstan
  • Uzbekistan
  • Turkmenistan
  • Iran
  • Turkey
  • Azerbaijan
  • Armenia
  • Georgia
  • North Caucasian Federal District, Russia
  • Iraq
  • Syria
  • Lebanon
  • Jordan

Relatives will only be accepted when they are a better replacement for current participants. For example, replacing a participant with his/her parents, or with his/her maternal uncle and paternal aunt, gets us two unrelated participants (assuming, of course, that the two sides of the family are not related by blood). Another example could be if a participant is of partial South Asian ancestry and they get replaced by a relative who has more South Asian ancestry.

Everyone else can use DIY HarappaWorld. It's fairly easy to use on both Windows and Linux. The only hard part right now is that you have to install R to standardize your genome file. I might look into creating an executable for that to make it easier.

Finally, please be honest.

Honesty in Participation

I am pretty lenient when it comes to participation in the Harappa Ancestry Project. I accept almost all comers, even those with no connection to South Asia and neighboring regions.

While I ask about ethnic background, I don't release it publicly unless I have consent from the participant.

I do expect a few things from project participants. One is that they will be honest about the information they share with me.

I see no need for anyone to lie about their ethnicity. It's better just to withhold that information if you are so concerned.

There is also no reason not to tell me if you have a close relative participating since I do accept data from relatives (with the proviso that only unrelated samples can be included in most analyses). It's possible that you might not know that a 1st or 2nd degree relative is already in the project. That's not a problem; knowing a relative is in the project and not telling me is.

Also if you send me genomes that are 95% identical (IBS2: 849,145 SNPs) under different names, I will know. And I will remove you from the project.

UPDATE: 849,145 SNPs being IBS2 (i.e. both alleles the same for the two individuals) is 92.5% of all the common SNPs in their data files. For comparison, my sister and I have an IBS2 percentage of only 76.8%.
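To make the IBS2 percentage concrete, here is a minimal Python sketch of the calculation as defined above: among SNPs present in both files, count those where both alleles match. It assumes tab-separated raw data lines of the form ID, chromosome, position, two-letter genotype, with `#` comment lines. It also skips no-calls and ignores strand differences. The function names are illustrative, and this is not necessarily how the figures above were computed.

```python
from typing import Dict, Iterable, Tuple


def parse_genotypes(lines: Iterable[str]) -> Dict[str, str]:
    """Map SNP ID -> two-letter genotype, skipping comments and no-calls."""
    genotypes = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # not a data row in the assumed four-column layout
        snp_id, genotype = fields[0], fields[3]
        if len(genotype) != 2 or "-" in genotype:
            continue  # assumption: drop no-calls ("--") and single-allele calls
        genotypes[snp_id] = genotype
    return genotypes


def ibs2_share(a: Dict[str, str], b: Dict[str, str]) -> Tuple[int, int, float]:
    """Return (IBS2 count, common SNP count, IBS2 count / common SNP count)."""
    common = a.keys() & b.keys()
    # IBS2: both alleles identical, regardless of the order they are written in.
    ibs2 = sum(1 for snp in common if sorted(a[snp]) == sorted(b[snp]))
    share = ibs2 / len(common) if common else 0.0
    return ibs2, len(common), share


# Tiny made-up example with four SNPs per person.
person_a = parse_genotypes(["# id\tchr\tpos\tgt", "rs1\t1\t100\tAG", "rs2\t1\t200\tCC",
                            "rs3\t1\t300\tTT", "rs4\t1\t400\t--"])
person_b = parse_genotypes(["rs1\t1\t100\tGA", "rs2\t1\t200\tCT",
                            "rs3\t1\t300\tTT", "rs5\t1\t500\tAA"])
ibs2_count, common_count, share = ibs2_share(person_a, person_b)
print(ibs2_count, common_count, round(100 * share, 1))
```

By the same arithmetic, 849,145 IBS2 SNPs at 92.5% implies roughly 918,000 SNPs in common between the two data files.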