Xing to PED Conversion

Following mallu's request, here is the code I used to convert Xing et al data to Plink's PED format.

#!/bin/bash
dos2unix *.csv
 
head --lines=1  JHS_Genotype.csv > header.txt
awk -F, '{for (i=2;i<=NF;i++) print "0",$i,"0","0","0", "0"}' header.txt > xing.tfam
sed '1d' JHS_Genotype.csv > genotype.csv
sort -t',' -k 1b,1 genotype.csv > genotype_sorted.csv
sort -t',' -k 1b,1 JHS_SNP.csv > snp_sorted.csv
join -t',' -j 1 snp_sorted.csv genotype_sorted.csv > xing_compound.csv
awk -F, '{printf("%s %s 0 %s ",substr($2,4),$1,$3); 
        for (i=6;i<=NF;i++)
                printf("%s %s ",substr($i,1,1),substr($i,2,1));
        printf("\n")}' xing_compound.csv > xing.tped
 
plink --tfile xing --out xing --make-bed --missing-genotype N --output-missing-genotype 0

I make no guarantees that it will work for you. I used it on my Ubuntu box, but I am sure it'll have trouble on Mac OS.

2 Comments.

  1. wow! this is amazing. You are great. please keep on posting such info which will be quite helpful to the newcomers.

  2. please also post the code to convert Pan Asian data to plink format. No hurry, when time permits!

    regards,
    sarb