Hello,
I think this program gives you what you want. The output data set want contains these variables:
id = id of animal x
cid = id of animal y (the animal that is compared to animal x)
location_of_diff = names of the traits that differ between animal x and animal y
data have ;
input id (T1-T12) ($) ;
cards ;
1 E A E B A A E A C B E F
2 F A E F E E E E A E E E
3 A C C E E E A F D D E F
4 A F F D B B C A A A C C
5 B B C C D A A A B E E F
;
run ;
data have1(rename=(id=cid) drop=i T1-T12);
  set have;
  array origT{12} $ T1-T12;
  array copyT{12} $ cT1-cT12;
  do i=1 to dim(origT);
    copyT(i)=origT(i);
  end;
run;
data _NULL_;
  if 0 then set have1 nobs=count;
  call symput('numobs',strip(put(count,8.)));
  STOP;
run;
%PUT &=numobs;
data want(drop=j T1-T12 cT1-cT12);
  LENGTH id cid 8 location_of_diff $ 200;
  set have;
  array origT{12} $ T1-T12;
  array copyT{12} $ cT1-cT12;
  array equaT{12} $ eT1-eT12;
  do pointer=1 to &numobs.;
    set have1 point=pointer;
    location_of_diff='';
    do j=1 to dim(origT);
      if copyT(j)=origT(j) then equaT(j)='Y';
      else equaT(j)='N';
      if equaT(j)='N' then location_of_diff=strip(location_of_diff)!!strip(vname(origT(j)));
    end;
    if id ^= cid then output;
  end;
run;
/* end of program */
Note: if you have 10,000+ animals, it may be worthwhile (performance-wise) to load the data set have1 into a hash table and do a hash object table look-up instead of re-reading have1 with point= for every animal.
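In case it helps, here is a rough, untested sketch of what that hash version could look like. It assumes have and have1 were created exactly as in the program above. The names want_hash, h, hi and rc are just ones I made up for the illustration. Unlike the program above, the sketch does not create the eT1-eT12 flags, and it separates the trait names with a space (CATX), which is a choice of this sketch only.

```sas
data want_hash(drop=j rc T1-T12 cT1-cT12);
  length id cid 8 location_of_diff $ 200;

  /* Put cid and cT1-cT12 into the PDV without reading any row */
  if 0 then set have1;

  /* Load have1 into memory once, on the first iteration only */
  if _n_ = 1 then do;
    declare hash h(dataset: 'have1', ordered: 'a');
    h.defineKey('cid');
    h.defineData(all: 'yes');
    h.defineDone();
    declare hiter hi('h');   /* iterator to walk through all entries of h */
  end;

  set have;
  array origT{12} $ T1-T12;
  array copyT{12} $ cT1-cT12;

  /* Compare the current animal with every animal held in the hash table */
  rc = hi.first();
  do while (rc = 0);
    location_of_diff = '';
    do j = 1 to dim(origT);
      if copyT(j) ne origT(j) then
        location_of_diff = catx(' ', location_of_diff, vname(origT(j)));
    end;
    if id ne cid then output;
    rc = hi.next();
  end;
run;
```

The comparison logic is the same as before. The only difference is that have1 is held in memory and walked with a hash iterator, so the macro variable numobs is not needed here. Whether this is actually faster on your data is something you would have to measure.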
I haven't put any comments in the program above, hoping that you can grasp the code without them. If not, tell me!
Cheers,
Koen