HarappaWorld Tweaks

First of all, note that I am using weighted means for HarappaWorld population averages instead of simply averaging all samples' results. The weighting gives less importance to outliers. I find this a better solution than either a simple average or a median: a median removes the influence of outliers, but it also discards a lot of information.
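As a rough illustration of the idea (not necessarily the scheme used for HarappaWorld), here is a minimal Python sketch of a weighted mean that down-weights outlying samples. The weighting function, the names `sample_weights` and `weighted_mean_profile`, and the three-component toy samples are all assumptions made up for this example.

```python
from statistics import median

def sample_weights(samples):
    """Return one weight per sample (hypothetical scheme, not HarappaWorld's).

    Samples far from the component-wise median profile get smaller weights.
    """
    n_components = len(samples[0])
    center = [median(s[k] for s in samples) for k in range(n_components)]
    # Euclidean distance of each sample from the median profile.
    dists = [
        sum((s[k] - center[k]) ** 2 for k in range(n_components)) ** 0.5
        for s in samples
    ]
    # Typical distance; fall back to 1.0 if the median distance is zero.
    scale = median(dists) or 1.0
    return [1.0 / (1.0 + (d / scale) ** 2) for d in dists]

def weighted_mean_profile(samples):
    """Average the samples component by component using the per-sample weights."""
    weights = sample_weights(samples)
    total = sum(weights)
    n_components = len(samples[0])
    return [
        sum(w * s[k] for w, s in zip(weights, samples)) / total
        for k in range(n_components)
    ]

# Made-up percentages for three components; the last sample is an outlier.
samples = [
    [80.0, 15.0, 5.0],
    [82.0, 14.0, 4.0],
    [79.0, 16.0, 5.0],
    [81.0, 15.0, 4.0],
    [60.0, 10.0, 30.0],
]

simple = [sum(s[k] for s in samples) / len(samples) for k in range(3)]
weighted = weighted_mean_profile(samples)
print("simple:  ", [round(x, 2) for x in simple])
print("weighted:", [round(x, 2) for x in weighted])
```

Because each sample gets a single weight that is applied to all of its components, the weighted profile still sums to 100% whenever every individual sample does. Unlike a median, every sample still contributes something to the result.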

An example of the weighted mean effect can be seen in the Behar et al. Armenian samples. Four of the samples have higher NE European percentages than the rest. As you can see in the table below, the weighting reduces their impact on the population results: the NE Euro average for the Behar et al. Armenians drops from 8.96% to 5.35%.

| | Mean | Mean | Weighted Mean | Weighted Mean |
|---|---|---|---|---|
| Ethnicity | armenian | armenian | armenian | armenian |
| Dataset | behar | yunusbayev | behar | yunusbayev |
| N | 19 | 16 | 19 | 16 |
| S Indian | 0.37% | 0.52% | 0.41% | 0.52% |
| Baloch | 16.57% | 17.73% | 17.07% | 17.65% |
| Caucasian | 54.35% | 56.43% | 57.29% | 56.61% |
| NE Euro | 8.96% | 2.98% | 5.35% | 2.95% |
| SE Asian | 0.10% | 0.12% | 0.10% | 0.13% |
| Siberian | 0.49% | 0.09% | 0.29% | 0.09% |
| NE Asian | 0.14% | 0.08% | 0.16% | 0.09% |
| Papuan | 0.28% | 0.27% | 0.26% | 0.27% |
| American | 0.19% | 0.18% | 0.22% | 0.18% |
| Beringian | 0.26% | 0.19% | 0.23% | 0.20% |
| Mediterranean | 8.46% | 8.37% | 8.21% | 8.40% |
| SW Asian | 9.81% | 13.03% | 10.40% | 12.91% |
| San | 0.00% | 0.00% | 0.00% | 0.00% |
| E African | 0.02% | 0.00% | 0.01% | 0.00% |
| Pygmy | 0.00% | 0.00% | 0.00% | 0.00% |
| W African | 0.00% | 0.00% | 0.00% | 0.00% |
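To see which components the weighting moves most, the shift between the two Behar et al. columns can be computed directly. The short Python sketch below uses the six largest components, with percentages copied from the table above; the variable names are my own.

```python
# Behar et al. Armenians (N = 19), percentages copied from the table above.
mean = {
    "S Indian": 0.37, "Baloch": 16.57, "Caucasian": 54.35,
    "NE Euro": 8.96, "Mediterranean": 8.46, "SW Asian": 9.81,
}
weighted = {
    "S Indian": 0.41, "Baloch": 17.07, "Caucasian": 57.29,
    "NE Euro": 5.35, "Mediterranean": 8.21, "SW Asian": 10.40,
}

# Shift in percentage points when moving from the simple to the weighted mean.
shift = {name: round(weighted[name] - mean[name], 2) for name in mean}

# Components ordered by how much the weighting changed them.
largest = sorted(shift, key=lambda name: abs(shift[name]), reverse=True)
for name in largest:
    print(f"{name:14s} {shift[name]:+.2f}")
```

NE Euro shows the largest change (down 3.61 points), with most of the difference going to Caucasian (up 2.94 points).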

Another example is the Somali samples in the Reich et al. data. One of the 6 samples seems to be eastern Bantu. Let's compare the unweighted and weighted means for Somalis in Reich et al. and among Harappa participants.

| | Mean | Mean | Weighted Mean | Weighted Mean |
|---|---|---|---|---|
| Ethnicity | somali | somali | somali | somali |
| Dataset | harappa | reich | harappa | reich |
| N | 2 | 6 | 2 | 6 |
| S Indian | 0.00% | 1.62% | 0.00% | 1.49% |
| Baloch | 0.00% | 0.00% | 0.00% | 0.00% |
| Caucasian | 2.76% | 0.00% | 2.76% | 0.00% |
| NE Euro | 0.00% | 0.11% | 0.00% | 0.04% |
| SE Asian | 0.27% | 0.05% | 0.27% | 0.06% |
| Siberian | 0.00% | 0.04% | 0.00% | 0.05% |
| NE Asian | 0.00% | 0.41% | 0.00% | 0.46% |
| Papuan | 0.26% | 0.10% | 0.26% | 0.11% |
| American | 0.14% | 0.17% | 0.14% | 0.19% |
| Beringian | 0.23% | 0.33% | 0.23% | 0.38% |
| Mediterranean | 2.12% | 3.25% | 2.12% | 3.65% |
| SW Asian | 31.73% | 24.48% | 31.73% | 27.33% |
| San | 1.96% | 1.48% | 1.96% | 1.37% |
| E African | 60.37% | 56.75% | 60.37% | 60.13% |
| Pygmy | 0.15% | 1.78% | 0.15% | 1.23% |
| W African | 0.00% | 9.43% | 0.00% | 3.51% |

Also, I have divided the Singapore Indians into four groups (actually three groups and one outlier) since they are so heterogeneous. Here are the weighted mean admixture proportions for all Singapore Indians and for the four subgroups.

| Ethnicity | singapore-indian | singapore-indian-1 | singapore-indian-2 | singapore-indian-3 | singapore-indian-4 |
|---|---|---|---|---|---|
| Dataset | sgvp | sgvp | sgvp | sgvp | sgvp |
| N | 83 | 31 | 41 | 10 | 1 |
| S Indian | 53.57% | 61.95% | 50.39% | 33.68% | 27.81% |
| Baloch | 33.97% | 30.24% | 36.00% | 40.72% | 14.27% |
| Caucasian | 3.55% | 1.92% | 4.03% | 9.32% | 4.53% |
| NE Euro | 2.93% | 0.08% | 3.89% | 9.84% | 35.38% |
| SE Asian | 1.31% | 1.30% | 1.23% | 0.63% | 1.20% |
| Siberian | 0.45% | 0.47% | 0.44% | 0.43% | 1.19% |
| NE Asian | 0.92% | 0.91% | 0.80% | 1.19% | 3.26% |
| Papuan | 0.72% | 1.09% | 0.50% | 0.35% | 0.62% |
| American | 0.42% | 0.35% | 0.44% | 0.69% | 1.29% |
| Beringian | 0.56% | 0.38% | 0.65% | 0.76% | 0.00% |
| Mediterranean | 0.67% | 0.40% | 0.72% | 1.33% | 10.38% |
| SW Asian | 0.90% | 0.86% | 0.87% | 1.05% | 0.06% |
| San | 0.01% | 0.00% | 0.01% | 0.00% | 0.00% |
| E African | 0.03% | 0.02% | 0.04% | 0.00% | 0.00% |
| Pygmy | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| W African | 0.01% | 0.01% | 0.00% | 0.00% | 0.00% |

I have updated the spreadsheet as well as HarappaWorld Oracle.