Tag Archives: somali

Pagani East African Dataset

Pagani et al analyzed Ethiopian genetics in their paper "Ethiopian Genetic Diversity Reveals Linguistic Stratification and Complex Influences on the Ethiopian Gene Pool". Their dataset consisting of Ethiopians and a few other East African populations is available online.

I have analyzed the Pagani dataset with my HarappaWorld admixture calculator and included the results in my regular spreadsheet.

The group (weighted mean) results are also shown in the usual interactive bar chart below. You can click on the component labels to sort by that ancestral component.

Because the East African component as computed in HarappaWorld is maximum among the Maasai and several of the Pagani dataset populations have a higher percentage of that component, we should be a bit careful with interpreting the HarappaWorld results for the Pagani groups. I'll likely include them in my next iteration of the admixture calculator.

HarappaWorld Tweaks

First of all, I wanted to draw your attention to the fact that I am using weighted means for population averages for HarappaWorld instead of just averaging all samples' results. The weighting gives less importance to outliers. I find this to be a better solution than a simple average or median. A median removes all outliers but it also rejects a lot of information.

An example of the weighted mean effect can be seen in the Behar et al Armenian samples. Four of the samples have higher NE European percentages than the rest. As you can see in the table below, the weighting makes their impact on the population results low.

Mean Weighted Mean
Ethnicity armenian armenian armenian armenian
Dataset behar yunusbayev behar yunusbayev
N 19 16 19 16
S Indian 0.37% 0.52% 0.41% 0.52%
Baloch 16.57% 17.73% 17.07% 17.65%
Caucasian 54.35% 56.43% 57.29% 56.61%
NE Euro 8.96% 2.98% 5.35% 2.95%
SE Asian 0.10% 0.12% 0.10% 0.13%
Siberian 0.49% 0.09% 0.29% 0.09%
NE Asian 0.14% 0.08% 0.16% 0.09%
Papuan 0.28% 0.27% 0.26% 0.27%
American 0.19% 0.18% 0.22% 0.18%
Beringian 0.26% 0.19% 0.23% 0.20%
Mediterranean 8.46% 8.37% 8.21% 8.40%
SW Asian 9.81% 13.03% 10.40% 12.91%
San 0.00% 0.00% 0.00% 0.00%
E African 0.02% 0.00% 0.01% 0.00%
Pygmy 0.00% 0.00% 0.00% 0.00%
W African 0.00% 0.00% 0.00% 0.00%

Another example is the Somali samples in Reich et al data. There is one sample (out of 6) who seems to be eastern Bantu. Let's compare the unweighted mean and weighted mean for Somalis in Reich et al and Harappa participants.

Mean Weighted Mean
Ethnicity somali somali somali somali
Dataset harappa reich harappa reich
N 2 6 2 6
S Indian 0.00% 1.62% 0.00% 1.49%
Baloch 0.00% 0.00% 0.00% 0.00%
Caucasian 2.76% 0.00% 2.76% 0.00%
NE Euro 0.00% 0.11% 0.00% 0.04%
SE Asian 0.27% 0.05% 0.27% 0.06%
Siberian 0.00% 0.04% 0.00% 0.05%
NE Asian 0.00% 0.41% 0.00% 0.46%
Papuan 0.26% 0.10% 0.26% 0.11%
American 0.14% 0.17% 0.14% 0.19%
Beringian 0.23% 0.33% 0.23% 0.38%
Mediterranean 2.12% 3.25% 2.12% 3.65%
SW Asian 31.73% 24.48% 31.73% 27.33%
San 1.96% 1.48% 1.96% 1.37%
E African 60.37% 56.75% 60.37% 60.13%
Pygmy 0.15% 1.78% 0.15% 1.23%
W African 0.00% 9.43% 0.00% 3.51%

Also, I have divided Singapore Indians into 4 groups (actually 3 groups and 1 outlier) since they are so heterogeneous. Here are the weighted mean admixture proportions for all Singapore Indians and the four subgroups.

Ethnicity singapore-indian singapore-indian-1 singapore-indian-2 singapore-indian-3 singapore-indian-4
Dataset sgvp sgvp sgvp sgvp sgvp
N 83 31 41 10 1
S Indian 53.57% 61.95% 50.39% 33.68% 27.81%
Baloch 33.97% 30.24% 36.00% 40.72% 14.27%
Caucasian 3.55% 1.92% 4.03% 9.32% 4.53%
NE Euro 2.93% 0.08% 3.89% 9.84% 35.38%
SE Asian 1.31% 1.30% 1.23% 0.63% 1.20%
Siberian 0.45% 0.47% 0.44% 0.43% 1.19%
NE Asian 0.92% 0.91% 0.80% 1.19% 3.26%
Papuan 0.72% 1.09% 0.50% 0.35% 0.62%
American 0.42% 0.35% 0.44% 0.69% 1.29%
Beringian 0.56% 0.38% 0.65% 0.76% 0.00%
Mediterranean 0.67% 0.40% 0.72% 1.33% 10.38%
SW Asian 0.90% 0.86% 0.87% 1.05% 0.06%
San 0.01% 0.00% 0.01% 0.00% 0.00%
E African 0.03% 0.02% 0.04% 0.00% 0.00%
Pygmy 0.00% 0.00% 0.00% 0.00% 0.00%
W African 0.01% 0.01% 0.00% 0.00% 0.00%

I have updated the spreadsheet as well as HarappaWorld Oracle.