Admixture: Supervised Zombies Vs Unsupervised

I wanted to see how supervised ADMIXTURE using zombies performs compared to regular unsupervised ADMIXTURE. "Zombies" here refers to genomes generated from allele frequencies using the --simulate option of PLINK.

To do this, I took the allele frequencies computed by ADMIXTURE at K=11 ancestral components for Reference 3 and generated 25 zombie individuals per ancestral component.
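To make the idea of a zombie concrete, here is a minimal sketch of drawing genotypes from one component's allele frequencies. It is not the PLINK --simulate run used for this post: it simply assumes each SNP is sampled independently as two allele draws per individual, and the function name and the five example frequencies are made up for illustration.

```python
import random

def simulate_zombies(freqs, n_individuals=25, seed=0):
    """Draw diploid genotypes (0, 1, or 2 copies of the counted allele)
    for each SNP, independently, from one component's allele frequencies."""
    rng = random.Random(seed)
    zombies = []
    for _ in range(n_individuals):
        # Two independent allele draws per SNP, each with probability p.
        genotype = [(rng.random() < p) + (rng.random() < p) for p in freqs]
        zombies.append(genotype)
    return zombies

# Hypothetical frequencies for one ancestral component at five SNPs.
component_freqs = [0.0, 0.1, 0.5, 0.9, 1.0]
zombies = simulate_zombies(component_freqs)
```

Repeating this for each of the 11 components gives the 11 x 25 = 275 zombies.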

Treating each of these 275 zombie samples as belonging 100% to its own ancestral component, I ran ADMIXTURE in supervised mode on the Reference 3 dataset. You can see the population average results here (compare to unsupervised results).
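As far as I understand it, supervised mode in ADMIXTURE reads a .pop file with one line per individual, in the same order as the genotype file, giving a population label for reference samples and "-" for everyone else. The sketch below builds those labels under that assumption. The sample IDs, the C1 to C11 label names, and the ordering with zombies first are all hypothetical.

```python
def pop_lines(sample_ids, zombie_component):
    """Return one label per individual: the component name for zombies,
    and '-' for samples whose ancestry is to be estimated."""
    return [zombie_component.get(sample_id, "-") for sample_id in sample_ids]

# Hypothetical IDs: 25 zombies for each of 11 components, then 3,886 real samples.
zombie_component = {
    f"Z{c}_{i}": f"C{c}" for c in range(1, 12) for i in range(1, 26)
}
sample_ids = list(zombie_component) + [f"S{i}" for i in range(1, 3887)]

lines = pop_lines(sample_ids, zombie_component)
# Writing "\n".join(lines) to a file named after the dataset prefix would
# give the labels file, assuming the order matches the genotype file.
```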

Since I was interested in the difference between the supervised zombie admixture and the unsupervised results, here are the histograms for the difference between the two for all 3,886 samples and each ancestral component. The histogram bins are 0.5% wide.
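For anyone who wants to reproduce this kind of comparison, here is a rough sketch of binning the per-sample differences for one component into 0.5%-wide bins. It assumes the proportions are given in percent and that both lists are in the same sample order. The function name and the four example values are invented for illustration and are not results from this run.

```python
import math
from collections import Counter

def difference_histogram(supervised, unsupervised, bin_width=0.5):
    """Count samples per bin of (supervised - unsupervised), in percent.
    Each key is the lower edge of a bin that is bin_width wide."""
    counts = Counter()
    for s, u in zip(supervised, unsupervised):
        diff = s - u
        bin_start = math.floor(diff / bin_width) * bin_width
        counts[bin_start] += 1
    return dict(sorted(counts.items()))

# Hypothetical percentages for one component in four samples.
supervised = [10.0, 20.3, 5.0, 0.0]
unsupervised = [9.8, 20.0, 5.6, 0.0]
hist = difference_histogram(supervised, unsupervised)
```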

Most of the results are within the usual error margins. The exceptions are the C7 West African component and the C10 San/Pygmy component, which show larger differences between the unsupervised and supervised zombie approaches. Basically, individuals with West African or San/Pygmy ancestry get ~5-8% more of the West African component in the supervised zombie case, with a corresponding decrease in the San/Pygmy component.


  1. These are good results and validate the use of zombies. Here is a suggestion. Run K=11 ADMIXTURE in the unsupervised mode with just the zombie populations and see how "pure" the zombies are!

    If it turns out that the zombies are themselves admixed, generate second order zombies which we can hope are purer. Such an iterative procedure might help to generate pure ASI zombies without recourse to Reich et al!