Let's take a look at the Bengali participants of the Harappa Ancestry Project.

I have added a suffix to the IDs where B = Brahmin, V = Vaidya and M = Muslim.

Here are the HarappaWorld Admixture results for the Bengalis, which you can also see in a spreadsheet.

It's easy to see the difference between the Brahmins and others.

Razib wanted to know the origin of the East Asian ancestry among the Bengalis. So I ran a supervised ADMIXTURE with the following populations set as ancestral:

  • Altaian
  • Burmanese
  • Buryat
  • Cambodian
  • Chukchi
  • Dai
  • Daur
  • Dolgan
  • Evenki
  • Georgian
  • Gujarati-A
  • Han
  • Han-NChina
  • Hezhen
  • Japanese
  • Ket
  • Kinh
  • Koryak
  • Lahu
  • Miao
  • Mongola
  • Mongolian
  • Naxi
  • Nganassan
  • Oroqen
  • Selkup
  • She
  • Singapore-Malay
  • Tibet
  • Tu
  • Tujia
  • Tuvinian
  • Xibo
  • Yakut
  • Yi
  • Yukaghir

While most of these populations are various East Asian groups, I used the Gujarati-A as the South Asian group since it has the most South Indian + Baloch components without any East Asian influence. I used the Georgians as a proxy for West Asian ancestry.

Since K=36 is a large number of components, I ran ADMIXTURE 10 times with different seeds and averaged the percentages for the Bengali participants. The number of SNPs was 85,565. I did a similar analysis at K=35 after excluding the Tibetans, which got me 263,000 SNPs. The results were broadly similar.
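The averaging over seeds can be sketched in a few lines of Python. This is only an illustration, not the script used for the numbers here. It assumes each run's `.Q` output (one whitespace-separated row of K fractions per individual) has its columns in the same order in every run, which should hold in supervised mode where each column is tied to a reference population, but is worth checking. The function names and the toy numbers are made up.

```python
def parse_q(text):
    """Parse the contents of an ADMIXTURE .Q file into rows of floats."""
    return [[float(x) for x in line.split()]
            for line in text.splitlines() if line.strip()]


def average_runs(runs):
    """Element-wise mean over several Q matrices of identical shape."""
    n_rows, n_cols = len(runs[0]), len(runs[0][0])
    for q in runs:
        if len(q) != n_rows or any(len(row) != n_cols for row in q):
            raise ValueError("all runs must have the same shape")
    return [
        [sum(q[i][k] for q in runs) / len(runs) for k in range(n_cols)]
        for i in range(n_rows)
    ]


# Toy data: two individuals, three components, two runs (made-up numbers).
run_a = parse_q("0.70 0.20 0.10\n0.50 0.25 0.25\n")
run_b = parse_q("0.60 0.30 0.10\n0.50 0.35 0.15\n")
mean_q = average_runs([run_a, run_b])
```

Because every row of every run sums to 1, the averaged rows also sum to 1, so the means can still be read as percentages.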

I am showing only the first 12 ancestral components since all the rest were less than 0.5% for all the Bengalis (Spreadsheet).

Please do remember that in supervised ADMIXTURE, I assign the ancestral populations and the algorithm has to find the best fit using those populations. So it's not showing actual ancestry but broad affinity. Also, the exact percentages are not important and can vary when I change the parameters of the analysis. Just look at the broad trends.
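To make "I assign the ancestral populations" concrete: as far as I understand the ADMIXTURE manual, supervised mode reads a `.pop` file with one line per individual, in the same order as the `.fam` file, giving the population name for reference individuals and `-` for everyone whose ancestry is to be estimated. Here is a rough sketch of building those lines. The reference IDs and the helper name are hypothetical, and the real file would have one line for every reference sample.

```python
def make_pop_lines(fam_ids, reference_labels):
    """Return one .pop line per individual, in .fam order.

    reference_labels maps individual ID -> population name for reference
    samples only; everyone else gets "-" (ancestry to be estimated).
    """
    return [reference_labels.get(ind, "-") for ind in fam_ids]


# Hypothetical reference IDs; HRP0049 stands in for a project participant.
fam_ids = ["REF_GEO_1", "REF_GUJ_1", "REF_DAI_1", "HRP0049"]
reference_labels = {
    "REF_GEO_1": "Georgian",
    "REF_GUJ_1": "Gujarati-A",
    "REF_DAI_1": "Dai",
}
pop_lines = make_pop_lines(fam_ids, reference_labels)
pop_text = "\n".join(pop_lines) + "\n"
```

The participants never get a label of their own, which is why the output can only describe them in terms of the reference populations supplied.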

The general pattern is that Bengali Brahmins have the least Eastern Eurasian and the most West Asian ancestry. The Eastern Eurasian population most closely related to the Bengalis is the Burmese.

Interestingly, there is a pattern of a small amount of Siberian ancestry among these Bengalis. Let's add up the components for all the Siberian and Russian Far East groups.

ID        Ethnicity           Siberian
HRP0244   West Bengal Rajput  5.07%
HRP0077B  Bengali Brahmin     5.01%
HRP0049   Bengali             4.45%
HRP0252B  Bengali Brahmin     4.01%
HRP0268B  Bengali Brahmin     3.90%
HRP0023M  Bengali Muslim      3.54%
HRP0316B  Bengali Brahmin     3.45%
HRP0054B  Bengali Brahmin     3.41%
HRP0300M  Bengali Muslim      2.95%
HRP0240V  Bengali Vaidya      1.78%
HRP0293B  Bengali Brahmin     1.02%
HRP0291V  Bengali Vaidya      0.99%
HRP0317M  Bengali Muslim      0.89%
HRP0321M  Bengali Muslim      0.58%
HRP0322M  Bengali Muslim      0.41%
HRP0022M  Bengali Muslim      0.37%
HRP0091B  Bengali Brahmin     0.01%

I am not sure of the pattern here, but at least the first few are above noise thresholds.
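The Siberian total in the table is just a sum over the relevant supervised components. A minimal sketch follows. Which reference populations count as "Siberian and Russian Far East" is my guess from the list of ancestral populations, not something stated here, and the example percentages are invented.

```python
# Assumption: these are the groups treated as Siberian / Russian Far East.
SIBERIAN_GROUPS = {
    "Altaian", "Buryat", "Chukchi", "Dolgan", "Evenki", "Ket",
    "Koryak", "Nganassan", "Selkup", "Tuvinian", "Yakut", "Yukaghir",
}


def siberian_total(components):
    """Sum the percentages of all components belonging to SIBERIAN_GROUPS.

    components maps reference population name -> percentage for one person.
    """
    return sum(pct for pop, pct in components.items() if pop in SIBERIAN_GROUPS)


# Invented percentages for one hypothetical individual.
example = {
    "Gujarati-A": 70.0, "Georgian": 10.0, "Burmanese": 15.0,
    "Yakut": 2.5, "Ket": 1.5, "Evenki": 1.0,
}
total = siberian_total(example)
```

Moving a borderline group such as Buryat or Tuvinian in or out of the set changes the totals, so the ranking near the bottom of the table should not be read too closely.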

East Asian Admixture

Let's look at the East Asian admixture among South Asians and other surrounding populations from a previous admixture run (K=12).

I have listed the different kinds of East Asian admixture components among selected populations. The three relevant components are:

  1. Southeast Asian: Highest among the Dai, Cambodians, Lahu and Malay, this is the most common East Asian component among South Asians.
  2. Northeast Asian: Highest among the Naga, Nysha, Japanese and north Han.
  3. Siberian: Highest among the Nganassans and Evenkis, this is the lowest of the three among South Asians overall. While this is not quite a Turkic component, it is the one most closely related to the Turkic populations.

Let's look at the total East Asian percentage among South Asians.

As expected, the eastern part of South Asia is where we see most of the East Asian admixture.

Now instead of looking at the absolute percentages of Southeast Asian admixture, let's look at the Southeast Asian component as a percentage of total East Asian component.
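The calculation behind these proportion maps is simple: each component is divided by the sum of the three East Asian components. A small sketch, with a hypothetical function name and made-up percentages:

```python
def east_asian_shares(southeast, northeast, siberian):
    """Return the total East Asian percentage and each component's share of it.

    Inputs are admixture percentages for one population. Returns None when
    there is no East Asian admixture at all, since the shares are undefined.
    """
    total = southeast + northeast + siberian
    if total == 0:
        return None
    return {
        "total": total,
        "southeast": southeast / total,
        "northeast": northeast / total,
        "siberian": siberian / total,
    }


# Made-up population with a fair amount of East Asian admixture.
high = east_asian_shares(6.0, 3.0, 1.0)
# Made-up population with almost none: the shares are still computed,
# but they rest on only 0.5% of the genome.
low = east_asian_shares(0.1, 0.1, 0.3)
```

In the second example the Siberian share comes out larger than in the first even though the absolute amount is smaller, so the proportion maps are best read alongside the map of total East Asian percentage.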

In South and East India, the East Asian admixture seems to be mostly Southeast Asian.

Now the same map for Northeast Asian as a proportion of total East Asian:

The Northeast Asian component dominates along the northern border of South Asia.

Finally the Siberian:

Compared to the other two, the Siberian component is fairly low among South Asians, so it's difficult to separate the noise from real admixture here. Most of the peaks you see are among populations that have low East Asian admixture overall.