Harappa Oracle

Posted by Zack on March 23, 2012

Based on the Dodecad Oracle, here is Harappa Oracle, which uses the Reference 3 admixture results.

I am using Dienekes' code with a couple of changes. One of them is a weighted distance based on the Fst divergences between ancestral components. Because of that, it is several times slower than DodecadOracle. I plan to offer an option soon to switch between Euclidean distance and Fst-weighted distance.
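As an illustration only, here is one way an Fst-weighted distance can differ from a plain Euclidean one. The toy Fst values, the three-component vectors, and the function names euclid_dist and fst_dist are all made up for this sketch, and the exact weighting inside HarappaOracle may differ.

```r
# Toy Fst matrix between three hypothetical ancestral components
# (values are invented for illustration, not taken from Reference 3).
fst <- matrix(c(0.00, 0.10, 0.20,
                0.10, 0.00, 0.15,
                0.20, 0.15, 0.00),
              nrow = 3, byrow = TRUE)

# Plain Euclidean distance between two admixture vectors (in percent).
euclid_dist <- function(a, b) sqrt(sum((a - b)^2))

# One possible Fst-weighted distance (an assumption, not the tool's code):
# both vectors sum to 100, so the difference d sums to 0, and
# -0.5 * d' F d is non-negative whenever F behaves like a matrix of
# squared distances between the components.
fst_dist <- function(a, b, fst) {
  d <- a - b
  q <- -0.5 * as.numeric(t(d) %*% fst %*% d)
  sqrt(max(q, 0))  # guard against small negative values
}

me    <- c(50, 30, 20)
pop_a <- c(40, 40, 20)  # differs from me in components 1 and 2 (Fst 0.10)
pop_b <- c(40, 30, 30)  # differs from me in components 1 and 3 (Fst 0.20)

euclid_dist(me, pop_a); euclid_dist(me, pop_b)
fst_dist(me, pop_a, fst); fst_dist(me, pop_b, fst)
```

In this toy case the Euclidean distance treats both populations as equally far away, while the weighted version ranks pop_a closer because its difference lies between two less divergent components.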

You need to install R to use it. Then unzip the Oracle zip file and either double-click the extracted .RData file or run the following in R:

load('HarappaOracleR3fst.RData')

In R, you can look at the 385 populations included by typing:

X[,1]

To find your closest populations, you need your Harappa Reference 3 admixture results. Pass them as a comma-separated vector, like this (using my own results):

HarappaOracle(c(44,12,0,24,14,1,2,0,0,1,2))
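Before running a query, it can help to sanity-check the vector you typed. The helper below is a hypothetical sketch (check_admixture is not part of the Oracle); it assumes 11 components, as in the example above, and allows for a small rounding error in the total.

```r
# Hypothetical helper: check an admixture vector before querying the Oracle.
check_admixture <- function(x, k = 11, tol = 1) {
  stopifnot(is.numeric(x), length(x) == k, all(x >= 0))
  # Rounded percentages may not add up to exactly 100.
  abs(sum(x) - 100) <= tol
}

check_admixture(c(44,12,0,24,14,1,2,0,0,1,2))
```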

You will get a result, with the first column showing the closest populations and the 2nd column their distance to you.

[,1] [,2]
[1,] "balochi" "8.0242"
[2,] "bene-israel" "9.2843"
[3,] "brahui" "9.5158"
[4,] "pathan" "9.7034"
[5,] "makrani" "10.1014"
[6,] "sindhi" "10.9236"
[7,] "Bhatia" "11.8441"
[8,] "Sindhi" "12.1704"
[9,] "Kashmiri" "13.4229"
[10,] "punjabi-arain" "13.9192"
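To give a rough idea of what a lookup like this involves, here is a minimal sketch using plain Euclidean distance on a made-up reference table. The population names, percentages, and the function closest_pops are invented for illustration; the real Oracle uses the Reference 3 populations and the Fst-weighted distance.

```r
# Toy reference table: one row per population, one column per component.
# Population names and percentages are invented for illustration.
ref <- rbind(pop_a = c(60, 30, 10),
             pop_b = c(45, 35, 20),
             pop_c = c(10, 20, 70))

# Return the k closest populations and their distances as a character
# matrix, mirroring the two-column layout of the Oracle output.
closest_pops <- function(x, ref, k = 10) {
  d <- sqrt(rowSums(sweep(ref, 2, x)^2))  # distance from x to every row
  d <- sort(d)[seq_len(min(k, length(d)))]
  unname(cbind(names(d), as.character(round(d, 4))))
}

closest_pops(c(50, 30, 20), ref, k = 2)
```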

You can also find out the closest populations to one of the reference populations:

HarappaOracle("punjabi-arain")

By default, the Oracle shows the 10 closest populations. You can change that:

HarappaOracle("punjabi-arain",k=20)

Also, by default, the Oracle excludes the Pan-Asian dataset since the overlap is only 5,400 SNPs. You can include Pan-Asian populations:

HarappaOracle("punjabi-arain",panasian=T)

There is also a mixed mode where the individual (or mean reference population) is compared against all pairs of populations as ancestors.

HarappaOracle("Haryana Jatt",mixedmode=T)

which has the following output:

[1,] "Haryana Jatt" "0"
[2,] "15.4% lithuanians + 84.6% Punjabi Brahmin" "1.9553"
[3,] "10.6% russian + 89.4% Rajasthani Brahmin" "2.0626"
[4,] "14.7% finnish + 85.3% Punjabi Brahmin" "2.0863"
[5,] "9.2% finnish + 90.8% Rajasthani Brahmin" "2.1142"
[6,] "89.4% Rajasthani Brahmin + 10.6% mordovians" "2.1727"
[7,] "9.6% lithuanians + 90.4% Rajasthani Brahmin" "2.1989"
[8,] "10.1% belorussian + 89.9% Rajasthani Brahmin" "2.2938"
[9,] "16.8% russian + 83.2% Punjabi Brahmin" "2.3015"
[10,] "16.2% belorussian + 83.8% Punjabi Brahmin" "2.3656"
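The idea behind mixed mode can be sketched as follows: for every pair of populations, find the mixing proportion that brings the mixture closest to the target, then sort the pairs by distance. This is a simplified illustration with Euclidean distance and an invented reference table; best_mix and mixed_mode are hypothetical names, and the sketch assumes no two populations have identical values.

```r
# Toy reference table (invented values, three components).
ref <- rbind(pop_a = c(80, 10, 10),
             pop_b = c(20, 60, 20),
             pop_c = c(10, 10, 80))

# Best mix of populations p and q for target x under Euclidean distance:
# project x onto the segment between p and q, clamping the weight to [0, 1].
best_mix <- function(x, p, q) {
  w <- sum((x - q) * (p - q)) / sum((p - q)^2)
  w <- min(max(w, 0), 1)
  list(w = w, dist = sqrt(sum((x - (w * p + (1 - w) * q))^2)))
}

# Try every pair of reference populations and sort by distance.
mixed_mode <- function(x, ref, k = 10) {
  pairs <- combn(rownames(ref), 2)
  res <- apply(pairs, 2, function(pq) {
    m <- best_mix(x, ref[pq[1], ], ref[pq[2], ])
    c(label = sprintf("%.1f%% %s + %.1f%% %s",
                      100 * m$w, pq[1], 100 * (1 - m$w), pq[2]),
      dist = round(m$dist, 4))
  })
  res <- t(res)
  res <- res[order(as.numeric(res[, "dist"])), , drop = FALSE]
  unname(res[seq_len(min(k, nrow(res))), , drop = FALSE])
}

# The target here is an exact 50/50 mix of pop_a and pop_b.
mixed_mode(c(50, 35, 15), ref)
```

With three populations there are only three pairs to try; the number of pairs grows quadratically with the number of populations, which is why the search takes longer on the full reference set.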

You can of course combine any or all of the options.

Think of Harappa Oracle as a tool to help you interpret your admixture results by comparing who you are closest to. Do not think of it as giving you your real ancestry.


15 Comments.

Parasar March 23, 2012 at 9:19 am

Zack,
Could you explain the results below as to how to interpret the difference between HRP0003 and "Bihari Brahmin" mixed mode results.
Thanks.

For HRP0003
[1,] "Bihari Brahmin" "0.7657"
[2,] "UP Brahmin" "3.8766"
[3,] "Punjabi Brahmin" "3.9242"
[4,] "Punjabi" "4.6289"
[5,] "Rajasthani Brahmin" "5.2604"
[6,] "kashmiri-pandit" "5.3077"
[7,] "Brahmins_from_Uttar_Pradesh" "6.0628"
[8,] "Kashmiri" "6.372"
[9,] "Bengali Brahmin" "6.4095"
[10,] "Bihari Muslim" "6.4485"

Mixed Mode
HarappaOracle(c(52,18,2,7,17,0,0,1,2,0,0),mixedmode=T)

[1,] "99.4% Bihari Brahmin + 0.6% bolivian" "0.5006"
[2,] "0.6% maya + 99.4% Bihari Brahmin" "0.5017"
[3,] "0.5% pima + 99.5% Bihari Brahmin" "0.5044"
[4,] "99.5% Bihari Brahmin + 0.5% totonac" "0.5068"
[5,] "0.5% karitiana + 99.5% Bihari Brahmin" "0.5073"
[6,] "0.5% surui + 99.5% Bihari Brahmin" "0.5073"
[7,] "0.8% ecuadorian + 99.2% Bihari Brahmin" "0.5192"
[8,] "0.8% mexicans + 99.2% Bihari Brahmin" "0.532"
[9,] "0.8% colombian + 99.2% Bihari Brahmin" "0.5806"
[10,] "0.6% east-greenlanders + 99.4% Bihari Brahmin" "0.6203"

HarappaOracle("Bihari Brahmin",mixedmode=T)
[,1] [,2]
[1,] "Bihari Brahmin" "0"
[2,] "82.3% Rajasthani Brahmin + 17.7% ap-mala" "1.5283"
[3,] "82.7% Rajasthani Brahmin + 17.3% Tamil Vishwakarma" "1.5463"
[4,] "11% lithuanians + 89% Gujarati" "1.5744"
[5,] "17.4% sakilli + 82.6% Rajasthani Brahmin" "1.612"
[6,] "17.7% kamsali + 82.3% Rajasthani Brahmin" "1.6312"
[7,] "83.2% Rajasthani Brahmin + 16.8% tn-dalit" "1.6388"
[8,] "17.7% north-kannadi + 82.3% Rajasthani Brahmin" "1.6591"
[9,] "81.6% Rajasthani Brahmin + 18.4% Chenchus" "1.6629"
[10,] "81.4% Rajasthani Brahmin + 18.6% Chamar" "1.6762"
- SB March 23, 2012 at 2:56 pm
  
  Hi Parasar, here is my guess... HRP0003 has more East Asian admixture than "Bihari Brahmin", which is showing up as South American.
  
  For the "Bihari brahmin", all results except for [4,] seem obvious. To explain [4,] we have seen in the Eurogenes project that "Brahmins" in general seem to have something in common with Lithuanian/Baltic populations. Hence the Gujarati+Lithuanian combo makes sense.
  - AV March 23, 2012 at 11:16 pm
    
    What seems strange is that HRP0003 and Bihari Brahmin have different results, considering that HRP0003 is the sole individual in the Bihari Brahmin group composite.
AV March 23, 2012 at 11:14 pm

Thanks, Zack! Check out my oracle scores, too:

[1,] "7.3% Saudis + 92.7% Kanjars" "1.0774"
[2,] "6.9% Yemen-Jews + 93.1% Kanjars" "1.0993"
[3,] "14.1% Balochi + 85.9% Meghawal" "1.1112"
[4,] "7.4% Samaritians + 92.6% Kanjars" "1.1171"
[5,] "22% Bene-Israel + 78% Kanjars" "1.1402"
[6,] "8.3% Iraq-Jews + 91.7% Kanjars" "1.1491"
[7,] "8.7% Iraqi Arab + 91.3% Kanjars" "1.1493"
[8,] "7.1% Iranian-Jews + 92.9% Muslim" "1.1526"
[9,] "8.5% Iranian-Jews + 91.5% Kanjars" "1.1539"
[10,] "13.5% Brahui + 86.5% Meghawal" "1.1605"

The appreciable presence of the SW Asian component in me (~10%) in Ref 3 K=11 is being literally interpreted by the Harappa Oracle, yes? Doesn't the program take into account its even higher presence in some other South Asian populations?
- Balaji March 27, 2012 at 2:58 pm
  
  As you said, it looks like Harappa Oracle is trying to match up your SW Asian component by combining appropriate amounts from different populations. It is also trying to match up the total W. Eurasian components or ANI. Kanjars are about 53% ANI and Saudis are mostly W. Eurasian with some African admixture. 0.53*0.93+0.07 = 0.56 which is not too far from your ANI of ~0.6.
  
  However, we cannot take the values of SW Asian and European from the Ref. 3, K=11 analysis literally as ancestry from SW Asia or from Europe. More than a year ago, Zack found in the Ref. 1 analysis that the major W. Eurasian component was Baloch/Caucasian. Dienekes and Metspalu et al. have independently come to the same conclusion.
AV March 23, 2012 at 11:18 pm

As an aside, Zack, will you be releasing a DIY calculator for Ref 3 K=11? Many non-participants seem to be very interested in one.
Sakiusa March 24, 2012 at 10:12 am

This is awesome Zack, I played with it for a couple of hours. The concentrations are really interesting.

Cheers
Palisto March 26, 2012 at 2:02 pm

"I am using Dienekes' code with a couple of changes. One of them is using weighted distance based on Fst divergences between ancestral components."

I am glad that you are using a distance based on Fst divergences. This is what I have proposed all along; I just called it "adjusted distance", and I explained to you how to do it here:
http://www.harappadna.org/2012/02/admixture-ref3-k11-hrp0211-hrp0220/
>>
Zack February 24, 2012 at 10:13 pm

Looks interesting, though I don't have Excel on my home machine, so I haven't been able to test it.

Also, what's the difference between the normal and adjusted distance columns in the results?

Palisto February 25, 2012 at 4:38 pm

I explained it here.
http://www.forumbiodiversity.com/showpost.php?p=571226&postcount=5

The adjustment is to address the relation of the different components to each other based on the Fst divergences you provided here:
http://www.harappadna.org/2011/04/reference-3-admixture-k11/
<<
Aditya March 26, 2012 at 11:51 pm

Oracle observations for Kerala Nair

HarappaOracle("Kerala Nair",k=20)
[,1] [,2]
[1,] "Kerala Nair" "0"
[2,] "Brahmins_from_Tamil_Nadu" "1.4726"
[3,] "Kerala Brahmin" "2.1735"
[4,] "Iyer Brahmin" "2.3957"
[5,] "meghawal" "2.9648"
[6,] "Iyengar Brahmin" "3.0628"
[7,] "Goan" "3.1174"
[8,] "Maharashtrian" "3.2327"
[9,] "Karnataka Brahmin" "3.3282"
[10,] "Meena" "3.5734"
[11,] "Kerala Christian" "3.6895"
[12,] "singapore-indians" "3.8758"
[13,] "Kshatriya" "4.0627"
[14,] "Gujarati" "4.2323"
[15,] "UP" "4.8068"
[16,] "tn-brahmin" "5.0151"
[17,] "Lambadi" "5.4237"
[18,] "gujaratis-b" "5.5692"
[19,] "Sourastrian" "5.6119"
[20,] "Muslim" "5.8098"

> HarappaOracle("Kerala Nair",mixedmode=T)
[,1] [,2]
[1,] "Kerala Nair" "0"
[2,] "48.5% kashmiri-pandit + 51.5% Tamil_Nadu_Scheduled_Caste" "0.2834"
[3,] "74.1% Kerala Christian + 25.9% UP Brahmin" "0.4427"
[4,] "71.2% Andhra Pradesh + 28.8% Bhatia" "0.5272"
[5,] "65.1% Kerala Christian + 34.9% Brahmins_from_Uttar_Pradesh" "0.558"
[6,] "70.4% Andhra Pradesh + 29.6% Sindhi" "0.5715"
[7,] "26.6% pathan + 73.4% Andhra Pradesh" "0.587"
[8,] "31.3% pathan + 68.7% Tamil Vellalar" "0.6394"
[9,] "41.4% kashmiri-pandit + 58.6% velama" "0.647"
[10,] "2.2% onge + 97.8% Kerala Brahmin" "0.6953"

However, my individual mixed mode results are interesting.

HarappaOracle(c(56,24,0,9,7,2,0,1,0,0,0),mixedmode=T)
[,1] [,2]
[1,] "1% nganassans + 99% Kerala Nair" "1.3097"
[2,] "1% koryaks + 99% Kerala Nair" "1.3814"
[3,] "1% evenkis + 99% Kerala Nair" "1.3823"
[4,] "1.1% dolgans + 98.9% Kerala Nair" "1.3963"
[5,] "1% yakut + 99% Kerala Nair" "1.4097"
[6,] "1.1% yukaghirs + 98.9% Kerala Nair" "1.4293"
[7,] "1% kets + 99% Kerala Nair" "1.4367"
[8,] "0.9% chukchis + 99.1% Kerala Nair" "1.4449"
[9,] "0.6% papuan + 99.4% Kerala Nair" "1.4519"
[10,] "1% selkups + 99% Kerala Nair" "1.4565"
HRP15 March 28, 2012 at 7:21 am

my results in mixedmode

> HarappaOracle(c(20,6,1,31,40,0,0,1,1,0,0),mixedmode=T)
[,1] [,2]
[1,] "69.9% tuscans + 30.1% Bihari Brahmin" "1.0627"
[2,] "71.8% tuscans + 28.2% Bengali Brahmin" "1.4946"
[3,] "70.3% tuscans + 29.7% UP Brahmin" "1.5379"
[4,] "33.9% bene-israel + 66.1% bulgarians" "1.5631"
[5,] "72.6% tuscans + 27.4% ap-brahmin" "1.7342"
[6,] "71.4% tuscans + 28.6% vaish" "1.7753"
[7,] "72.9% tuscans + 27.1% Oriya" "1.7949"
[8,] "71.7% tuscans + 28.3% Brahmins_from_Uttar_Pradesh" "1.8043"
[9,] "67.8% tuscans + 32.2% Rajasthani Brahmin" "1.8924"
[10,] "66.6% tuscans + 33.4% punjabi-arain" "1.8939"
>
RanilB March 30, 2012 at 7:02 am

My results in mixed mode

HarappaOracle(c(60,30,3,5,1,0,0,0,0,0,0),mixedmode=T,k=20)
[,1] [,2]
[1,] "83.5% Sinhalese + 16.5% Tamil_Nadu_Scheduled_Caste" "0.6774"
[2,] "89.7% Sinhalese + 10.3% Velamas" "0.7537"
[3,] "9.1% velama + 90.9% Sinhalese" "0.8073"
[4,] "92.5% Sinhalese + 7.5% Kurumba" "0.8355"
[5,] "94.7% Sinhalese + 5.3% Piramalai_Kallars" "0.85"
[6,] "94% Sinhalese + 6% Tamil Nadar" "0.8531"
[7,] "5.4% Karnataka + 94.6% Sinhalese" "0.8625"
[8,] "93.7% Sinhalese + 6.3% Tamil Vellalar" "0.8635"
[9,] "0.2% yemen-jews + 99.8% Sinhalese" "0.867"
[10,] "0.2% saudis + 99.8% Sinhalese" "0.8705"
[11,] "2.2% vysya + 97.8% Sinhalese" "0.8727"
[12,] "0.2% bedouin + 99.8% Sinhalese" "0.8746"
[13,] "99.8% Sinhalese + 0.2% qatari" "0.8755"
[14,] "0.2% samaritians + 99.8% Sinhalese" "0.8755"
[15,] "0.2% iraq-jews + 99.8% Sinhalese" "0.8769"
[16,] "0.2% iranian-jews + 99.8% Sinhalese" "0.8773"
[17,] "98% Sinhalese + 2% Dusadh" "0.8779"
[18,] "0.1% Iraqi Arab + 99.9% Sinhalese" "0.879"
[19,] "0.1% palestinian + 99.9% Sinhalese" "0.8794"
[20,] "0.1% druze + 99.9% Sinhalese" "0.8796"

Oracle observations for Sinhalese

> HarappaOracle("Sinhalese",mixedmode=T,k=30)
[,1] [,2]
[1,] "Sinhalese" "0"
[2,] "32% Gond + 68% Velamas" "1.2599"
[3,] "22.9% Bengali + 77.1% Tamil_Nadu_Scheduled_Caste" "1.3262"
[4,] "49.8% Bengali + 50.2% Piramalai_Kallars" "1.3283"
[5,] "40.1% Kerala Christian + 59.9% Chenchus" "1.3329"
[6,] "41.7% Bengali + 58.3% Tamil_Nadu_Scheduled_Caste" "1.4593"
[7,] "28.4% Bengali + 71.6% Velamas" "1.5238"
[8,] "9.4% juang + 90.6% Velamas" "1.5415"
[9,] "25% satnami + 75% Velamas" "1.5487"
[10,] "9.6% bonda + 90.4% Velamas" "1.5512"
[11,] "53.4% Tamil Muslim + 46.6% Velamas" "1.5689"
[12,] "15.4% Dhurwa + 84.6% Velamas" "1.5762"
[13,] "17.8% sahariya + 82.2% Velamas" "1.5867"
[14,] "58.8% Kerala Muslim + 41.2% Piramalai_Kallars" "1.6186"
[15,] "45.6% Chenchus + 54.4% Lambadi" "1.6208"
[16,] "14.9% Bhunjia + 85.1% Velamas" "1.6226"
[17,] "11.4% kharia + 88.6% Velamas" "1.6338"
[18,] "10.3% gadaba + 89.7% Velamas" "1.662"
[19,] "11.3% savara + 88.7% Velamas" "1.6758"
[20,] "2.4% dai + 97.6% Tamil Vellalar" "1.7066"
HRP15 March 30, 2012 at 11:26 am

I just found out there is a "Romany" category, which can be queried like any other reference population.

Observations
HarappaOracle("Romany",mixedmode=T)
[,1] [,2]
[1,] "Romany" "0"
[2,] "26.6% Punjabi Jatt + 73.4% spain-basc" "0.8078"
[3,] "25.8% Kashmiri + 74.2% spain-basc" "1.0171"
[4,] "26.6% punjabi-arain + 73.4% spain-basc" "1.0517"
[5,] "26.3% Bhatia + 73.7% spain-basc" "1.0549"
[6,] "24.5% Punjabi + 75.5% spain-basc" "1.0877"
[7,] "72% basque + 28% Haryana Jatt" "1.112"
[8,] "26% Sindhi + 74% spain-basc" "1.1708"
[9,] "24.5% Punjabi Brahmin + 75.5% spain-basc" "1.1906"
[10,] "74.6% basque + 25.4% Rajasthani Brahmin" "1.1922"

-- I got Rajasthani Brahmin and Punjabi Arain as well in my results
