Supervised Continental Admixture

Ever since version 1.1 of Admixture, with its supervised option, came out almost two months ago, I have been salivating over it.

My original use case for it is not possible (for now). I wanted to be able to assign a few of the K ancestral components to specific reference populations and let the other ancestral components fall where they may. But we can do supervised admixture only by assigning all K ancestral components.

So I decided to test this supervised option by mimicking the three continental percentages 23andme assigns you on their ancestry painting page. Mine are:

Europe 91.22%
Asia 8.69%
Africa 0.09%

You can get the extra precision (and false sense of accuracy) here.

Regarding the reference populations used for ancestry painting, 23andme says:

23andMe takes advantage of publicly available data for four populations studied extensively via the International HapMap project (hapmap.org). That project obtained the genotypes for 60 individuals of western European descent from Utah, 60 western African individuals from Nigeria, and 90 eastern Asian individuals, 45 from each of Japan and China. Because the two eastern Asian populations are geographically near one another and relatively similar at the genetic level, 23andMe combines these to form a single eastern Asian reference population.

So I dug up my reference admixture run at K=3 and, for each of these HapMap populations, picked the same number of samples that 23andMe uses by taking the samples with the highest percentage in the respective component.
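The selection step above can be sketched in a few lines of Python. This is only an illustration, not the script I actually used: the function name, the sample IDs and the Q values are all made up, and it assumes the K=3 admixture proportions have already been loaded into a dictionary keyed by sample ID.

```python
def pick_reference_samples(q_by_sample, component, n):
    """Return the n sample IDs with the highest fraction in `component`.

    q_by_sample maps a sample ID to its tuple of ancestry fractions
    (one entry per ancestral component) from an unsupervised run.
    """
    ranked = sorted(
        q_by_sample,
        key=lambda sample_id: q_by_sample[sample_id][component],
        reverse=True,
    )
    return ranked[:n]


# Hypothetical Q values (columns: European, Asian, African), illustration only.
q_by_sample = {
    "S1": (0.99, 0.01, 0.00),
    "S2": (0.95, 0.04, 0.01),
    "S3": (0.02, 0.97, 0.01),
    "S4": (0.90, 0.08, 0.02),
}

# Keep the two samples that are highest in the European component (index 0).
european_refs = pick_reference_samples(q_by_sample, component=0, n=2)
```

In practice you would restrict `q_by_sample` to the samples of one HapMap population at a time and use n=60, 60 and 90 to match the counts 23andMe quotes.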

Then I combined these 210 samples from the HapMap with 74 Harappa Project participants (HRP0001 to HRP0079, excluding 5 who are related to others).
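For anyone who wants to try something similar: as I understand the Admixture manual, supervised mode expects a .pop file alongside the genotype files, with one line per individual in the same order as the genotype data, giving the population label for reference samples and "-" for everyone whose ancestry is to be estimated. A minimal sketch of building that file's contents follows. The function name, sample IDs and labels are hypothetical, so check the manual for the exact format before relying on it.

```python
def build_pop_file_text(sample_ids, reference_labels):
    """Build the text of a .pop file for a supervised run.

    sample_ids: IDs in the same order as the genotype file.
    reference_labels: maps reference sample IDs to a population label.
    Samples without a label get "-" (ancestry to be estimated).
    """
    lines = [reference_labels.get(sample_id, "-") for sample_id in sample_ids]
    return "\n".join(lines) + "\n"


# Hypothetical IDs: three reference samples plus two project participants.
sample_ids = ["REF_EUR_1", "REF_ASN_1", "REF_AFR_1", "HRP0001", "HRP0002"]
reference_labels = {
    "REF_EUR_1": "Europe",
    "REF_ASN_1": "Asia",
    "REF_AFR_1": "Africa",
}

pop_text = build_pop_file_text(sample_ids, reference_labels)
```

The resulting text would be saved under the same base name as the genotype files, with a .pop extension.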

The results of the supervised admixture run are in a spreadsheet and also shown in a bar chart below.

Since I did run an unsupervised K=3 admixture analysis of the first Harappa batch with all of the reference I populations, you can compare these results to those.

13 Comments.

  1. I'm curious - because I see a non-zero African component in my case - do you have an idea of the error bars?

    Also, as a general question, does Admixture provide you with a way to get uncertainties in your analysis results?

    • At K=3, some South Asians get a small African component in admixture which disappears at K=4.

      • Interesting. My results for European and Asian components seem to be quite different from what 23andMe tells me (Asian is 27% here, compared to 15% on 23andMe). Could you please enlighten me as to why it is so? If I understand correctly, the reference populations are different?

  2. Vasishta (HRP0072)

    On 23andMe I am :-
    European - 78.62%
    Asian - 21.34%
    African - 0.04%

    Here, I am :-
    European - 71.61%
    Asian - 27.95%
    African - 0.44%

  3. my parents are elevated on asian too. ~5%

  4. On 23andMe, I am:
    100% "European" (West Asian)

    Harappa (K=3):
    Europe 94.76%
    Asia 4.15%
    Africa 1.08%

    Dodecad (K=10):
    W Asian 52.2
    SW Asian 22.7
    S European 21
    S Asian 4.1

  5. I meant to say West Eurasian.

  6. I've heard that 23andme classes certain indian references as asian instead of european, is that true?

  7. What is a Brazilian doing in the Harappa project?