Honesty in Participation

I am pretty lenient when it comes to participation in the Harappa Ancestry Project. I accept almost all comers, even those with no connection to South Asia and neighboring regions.

While I ask about ethnic background, I don't release it publicly unless I have consent from the participant.

I do expect a few things from project participants. One is that they will be honest about the information they share with me.

I see no need for anyone to lie about their ethnicity. It's better just to withhold that information if you are so concerned.

There is also no reason not to tell me if you have a close relative participating since I do accept data from relatives (with the proviso that only unrelated samples can be included in most analyses). It's possible that you might not know that a 1st or 2nd degree relative is already in the project. That's not a problem; knowing a relative is in the project and not telling me is.

Also, if you send me genomes that are 95% identical (IBS2: 849,145 SNPs) under different names, I will know. And I will remove you from the project.

UPDATE: 849,145 SNPs being IBS2 (i.e. both alleles the same for the two individuals) is 92.5% of all the common SNPs in their data files. For comparison, my sister and I have an IBS2 percentage of only 76.8%.
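For anyone curious how an IBS2 percentage like this can be computed, here is a minimal Python sketch. It is not the code used for the project. It assumes each genome has already been loaded into a dictionary mapping SNP IDs to two-letter genotype strings such as "AG", that no-calls are marked with "-", and that indels and other non-A/C/G/T codes are simply skipped. The function name `ibs2_counts` and the helper `is_called` are made up for illustration.

```python
def is_called(genotype):
    """Return True for a two-letter genotype made only of A, C, G, T."""
    return len(genotype) == 2 and all(base in "ACGT" for base in genotype)


def ibs2_counts(genotypes_a, genotypes_b):
    """Count IBS2 SNPs and common SNPs for two individuals.

    Both arguments are dicts mapping SNP ID -> genotype string (e.g. "AG").
    A SNP is "common" if both individuals have a called genotype for it.
    A SNP is IBS2 if both alleles match, regardless of allele order.
    Returns a tuple (ibs2, common).
    """
    ibs2 = 0
    common = 0
    for snp_id, geno_a in genotypes_a.items():
        geno_b = genotypes_b.get(snp_id)
        if geno_b is None:
            continue  # SNP missing from the second file
        if not (is_called(geno_a) and is_called(geno_b)):
            continue  # skip no-calls and unsupported genotype codes
        common += 1
        # Sort the alleles so that "AG" and "GA" compare as equal.
        if sorted(geno_a) == sorted(geno_b):
            ibs2 += 1
    return ibs2, common


if __name__ == "__main__":
    # Toy data, not real genotypes.
    person_a = {"rs1": "AG", "rs2": "CC", "rs3": "TT", "rs4": "--"}
    person_b = {"rs1": "GA", "rs2": "CT", "rs3": "TT", "rs4": "AA"}
    ibs2, common = ibs2_counts(person_a, person_b)
    print(f"IBS2: {ibs2} of {common} common SNPs ({100 * ibs2 / common:.1f}%)")
```

The percentage is just the IBS2 count divided by the number of SNPs called in both files, which is the sense in which 849,145 SNPs is 92.5% above.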


  1. Zack, what is this post motivated by? Also, does shared DNA of close to, but less than ~1% skew any of the results?

    • The fake genome, of course.

      1% IBD is not a problem at all.

      • So, you actually had someone send you their files twice, the second time under a false pretense and identity. Very unfortunate. I asked because I'm afraid I do have some extremely distant relatives (two, to be precise) participating in the project. But I seem to share no more than ~1% or so of my DNA with them according to the Relative Finder tool at 23andMe. Glad to hear 1% IBD does not skew the results.

  2. Very unfortunate to hear people are attempting to con their way through to your analyses, despite it being redundant to begin with. Dienekes had a similar experience with a Dodecad user claiming Spanish ancestry, with that individual ending up being removed because they showed Mexican admixture. For my part at least, I can confirm having no known relatives in your project and that my ancestry is as stated.

  3. Did he just change thousands of alleles? If so, which ones? Just curious what kind of agenda the imbecile might have had.

    • That's exactly what was done. Thousands of allele changes on mostly four chromosomes. I can only speculate about intentions.

      • Unfortunately - at least going by 23andMe and FTDNA - the genome results are just given back as text files and any idjit who wants to muck around with it could do so. Sanity-checking the input data is the best thing to do under the circs.

        Really don't understand the motivation for such acts, though. Or of those who unleash malware on innocent others. So it goes...

      • The possible intentions behind doing such a thing perplex me to no end. Why on earth would someone want to do something so easily detectable and also futile in nature?

  4. Zack

    I hope you are going to delete the person's fake data from the graph.

