Even though the Pan-Asian dataset is not public, I was asked for the script I use to convert the data to Plink's PED format.
Here is how I convert the Pan-Asian data to Plink's transposed file format (a TPED/TFAM pair), which Plink reads in place of PED/MAP.
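
    #!/usr/bin/perl
    use strict;
    use warnings;

    # Input: tab-delimited genotype table; sample columns start at column 6.
    my $file = "Genotypes_All.txt";
    open(my $in,   "<", $file)           or die "Cannot open $file: $!";
    open(my $tfam, ">", "panasian.tfam") or die "Cannot write panasian.tfam: $!";
    open(my $tped, ">", "panasian.tped") or die "Cannot write panasian.tped: $!";

    # Header row: write one TFAM line per sample ID.
    my $line = <$in>;
    chomp $line;
    my @first = split /\t/, $line;
    foreach my $sample (5..$#first) {
        print $tfam "0 $first[$sample] 0 0 0 -9\n";
    }

    # One TPED row per SNP: chromosome, SNP ID, genetic distance (0), position.
    while (<$in>) {
        chomp;
        my @lines = split /\t/, $_;
        my ($major, $minor) = split m{/}, $lines[4];
        print $tped "$lines[2] $lines[1] 0 $lines[3]";
        foreach my $snp (5..$#lines) {
            my $g = $lines[$snp];
            my $alleles;
            # Non-numeric codes are written as missing instead of comparing equal to 0.
            if    ($g !~ /^\s*\d+\s*$/) { $alleles = "0 0"; }
            elsif ($g == 0)             { $alleles = "$major $major"; }
            elsif ($g == 1)             { $alleles = "$major $minor"; }
            elsif ($g == 2)             { $alleles = "$minor $minor"; }
            else                        { $alleles = "0 0"; }
            print $tped " $alleles";
        }
        print $tped "\n";
    }

    close($in);
    close($tfam);
    close($tped);
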
Again, no guarantees! It's Perl though, so it should be more stable across various operating systems.
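
To make the recoding step concrete, here is a minimal sketch of what the script does to a single row. The helper name `genotype_to_alleles`, the SNP ID and the sample values are all invented for illustration, and the column order is simply the one the script reads (SNP ID in column 2, chromosome in column 3, position in column 4, major/minor alleles in column 5). This sketch also assumes that any non-numeric code should be written as missing.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): turn one genotype
# code into the pair of alleles written to a TPED row.
sub genotype_to_alleles {
    my ($code, $major, $minor) = @_;
    # Assumption: anything that is not a plain non-negative integer is missing.
    return "0 0" unless defined $code && $code =~ /^\s*\d+\s*$/;
    return "$major $major" if $code == 0;
    return "$major $minor" if $code == 1;
    return "$minor $minor" if $code == 2;
    return "0 0";    # any other number is also treated as missing
}

# Made-up input row: (unused), SNP ID, chromosome, position, major/minor,
# then one genotype code per sample.
my @fields = ("1", "rs0000001", "1", "12345", "A/G", "0", "1", "2", "-1");
my ($major, $minor) = split m{/}, $fields[4];

# TPED row: chromosome, SNP ID, genetic distance (0), position, allele pairs.
my $row = join " ", $fields[2], $fields[1], 0, $fields[3],
    map { genotype_to_alleles($_, $major, $minor) } @fields[5 .. $#fields];
print "$row\n";    # prints: 1 rs0000001 0 12345 A A A G G G 0 0
```

Once the real files are written, `plink --tfile panasian` should pick up `panasian.tped` and `panasian.tfam` together.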



















