Hello, I am new to the community.
I need a macro in SAS (but I have no experience) that does the following:
From a genealogy file (example):
data one;
input individual sire dam sex birth_year;
cards;
3 1 2 1 2005
10 3 4 2 2010
11 7 8 1 2011
13 11 10 1 2014
16 11 6 2 2015
;
I would like to retrieve the complete genealogy (all possible kinship relationships) for one or more individuals that I specify, so the output file should look as follows:
E.g., if looking for the complete genealogy of individual 13:
individual sire dam sex birth_year
3 1 2 1 2005
4 0 0 2 .
7 0 0 1 .
8 0 0 2 .
10 3 4 2 2010
11 7 8 1 2011
13 11 10 1 2014
I hope you can help me.
Thank you!
Alan
So you only care about its fathers or mothers?

data one;
input individual sire dam sex birth_year;
cards;
3 1 2 1 2005
10 3 4 2 2010
11 7 8 1 2011
13 11 10 1 2014
16 11 6 2 2015
;
run;

data have;
 if _n_=1 then do;
  if 0 then set one;
  declare hash h(dataset:'one');
  h.definekey('individual');
  h.definedone();
 end;
 set one;
 output;
 _sire=sire; _dam=dam;
 if h.check(key:_sire) ne 0 then do;
  individual=_sire; h.add();
  sire=0; dam=0; sex=1; birth_year=.;
  output;
 end;
 if h.check(key:_dam) ne 0 then do;
  individual=_dam; h.add();
  sire=0; dam=0; sex=2; birth_year=.;
  output;
 end;
 drop _:;
run;

data key;
 set one;
 start=individual; end=sire; output;
 end=dam; output;
 keep start end;
run;

data want;
 if 0 then set have;
 declare hash h(dataset:'have');
 h.definekey('individual');
 h.definedata('sire','dam','sex','birth_year');
 h.definedone();

 if 0 then set key;
 declare hash k(dataset:'key',multidata:'y');
 k.definekey('start');
 k.definedata('end');
 k.definedone();

 declare hash path(ordered:'y');
 declare hiter hi_pa('path');
 path.definekey('n');
 path.definedata('last');
 path.definedone();

 individual=13; /*<-----*/
 n=1;
 path.add(key:n,data:individual);
 do while(hi_pa.next()=0);
  individual=last;
  if h.find()=0 then output;
  _last=last;
  rc=k.find(key:_last);
  do while(rc=0);
   n+1; last=end; path.add();
   rc=k.find_next(key:_last);
  end;
 end;
 stop;
 drop rc last _last n start end;
run;
Hi Ksharp, it works! Thank you very much!!
I do not want to abuse your generosity, but I have another question:
- If I want to search for more than one individual (a block of individuals, for example 16 and 13), does this code still work?
Thank you!
Alan
P.S.: I do not understand your question about fathers and mothers...
If you are working with similar data frequently you may want to investigate Proc Inbreed.
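A minimal sketch of what that could look like on the sample data, assuming SAS/STAT is licensed and that animals whose parents never appear as individuals (1, 2, 4, 6, 7, 8 here) should be treated as unrelated founders. Note that PROC INBREED reports relationship coefficients; it does not by itself extract the pedigree subset asked for above.

```sas
/* Sketch only: covariance (relationship) coefficients for the sample pedigree. */
/* Uses the ONE data set from the question, where no parent is coded as 0;      */
/* a 0 code would be read as a real animal shared by every founder.             */
proc inbreed data=one covar matrix;
   var individual sire dam;   /* order: individual, first parent, second parent */
run;
```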
Change it to individual=16; and combine the results into one table. I mean, you only care about its ancestors, and don't care about its offspring or siblings?
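A minimal sketch of the "combine them together" step, assuming the ancestor search was run once per animal and the results were saved under the hypothetical names want_13 and want_16:

```sas
/* want_13 and want_16 are hypothetical names for the per-animal results */
data want_all;
   set want_13 want_16;        /* stack the two result tables */
run;

/* ancestors shared by both animals (e.g. 11, 7, 8) are kept only once */
proc sort data=want_all nodupkey;
   by individual;
run;
```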
OK. Here it is.

data one;
input individual sire dam sex birth_year;
cards;
3 1 2 1 2005
10 3 4 2 2010
11 7 8 1 2011
13 11 10 1 2014
16 11 6 2 2015
;
run;

data have;
 if _n_=1 then do;
  if 0 then set one;
  declare hash h(dataset:'one');
  h.definekey('individual');
  h.definedone();
 end;
 set one;
 output;
 _sire=sire; _dam=dam;
 if h.check(key:_sire) ne 0 then do;
  individual=_sire; h.add();
  sire=0; dam=0; sex=1; birth_year=.;
  output;
 end;
 if h.check(key:_dam) ne 0 then do;
  individual=_dam; h.add();
  sire=0; dam=0; sex=2; birth_year=.;
  output;
 end;
 drop _:;
run;

data key;
 set one;
 start=individual; end=sire; output;
 end=dam; output;
 keep start end;
run;

proc sql; /*<-------------------------*/
 create table individual as
 select distinct start as id
 from key
 where start in (13 16);
quit;

data want;
 if _n_=1 then do;
  if 0 then set have;
  declare hash h(dataset:'have');
  h.definekey('individual');
  h.definedata('sire','dam','sex','birth_year');
  h.definedone();

  if 0 then set key;
  declare hash k(dataset:'key',multidata:'y');
  k.definekey('start');
  k.definedata('end');
  k.definedone();

  declare hash path(ordered:'y');
  declare hiter hi_pa('path');
  path.definekey('n');
  path.definedata('last');
  path.definedone();
 end;
 set individual;
 individual=id; /*<-----*/
 n=1;
 path.add(key:n,data:individual);
 do while(hi_pa.next()=0);
  individual=last;
  if h.find()=0 then output;
  _last=last;
  rc=k.find(key:_last);
  do while(rc=0);
   n+1; last=end; path.add();
   rc=k.find_next(key:_last);
  end;
 end;
 path.clear();
 drop rc last _last n start end;
run;
It works! Thank you!
Now I understand your question about the fathers and mothers.
Your idea of including descendants and siblings is very interesting. I did not ask for it because it seemed complicated to do.
But do you think you could write some code that includes descendants in the search?
Concerning the siblings, note that this is a cattle population, so there are full siblings (offspring of the same sire and dam) and half-siblings (offspring of the same sire but not the same dam, or vice versa).
What do you say, can it be done?
Thanks!
Alan
"But you think you can achieve some code that includes descendants in the search?"
Yes. I can make that happen, including full siblings and half-siblings.
BTW, my code would not work if you have bad data (a dead loop), like:
id Father Mother
1 2 3
2 4 5
5 1 6
1->2->5->1
But I can fix that, though.
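On the full- and half-sibling distinction raised above, a minimal PROC SQL sketch for the direct siblings of a single animal. The macro variable target and the output names siblings and sibling_type are hypothetical, and unknown parents are assumed to be coded as missing or 0, so they are excluded from matching.

```sas
%let target = 13;   /* hypothetical: the animal whose siblings are wanted */

proc sql;
   create table siblings as
   select b.individual, b.sire, b.dam,
          case
             when a.sire = b.sire and a.dam = b.dam
                  and a.sire > 0 and a.dam > 0 then 'full'
             else 'half'
          end as sibling_type length=4
   from one as a, one as b
   where a.individual = &target
     and b.individual ne a.individual
     /* require a known (> 0) parent so unknown parents never match each other */
     and (   (a.sire = b.sire and a.sire > 0)
          or (a.dam  = b.dam  and a.dam  > 0));
quit;
```

With the sample data, animal 16 shares sire 11 with animal 13 but has a different dam, so it should be listed as a half sibling.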
OK. Sorry, but I do not entirely understand: what is a dead loop?
If you have data like the example I showed you, you will get this forever. My code will hang there and never finish: 1->2->5->1->2->5->1->2->5->1... forever.
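One common way to guard against such a loop is to remember which animals have already been queued and skip them the second time. A minimal sketch for the ancestors-only search, assuming the KEY data set (start, end) built earlier in the thread; the names ancestors, seen and id are hypothetical, and this is not necessarily the fix Ksharp had in mind.

```sas
data ancestors;
 if 0 then set key;                 /* put START and END in the PDV */
 declare hash k(dataset:'key',multidata:'y');
 k.definekey('start');
 k.definedata('end');
 k.definedone();

 declare hash seen();               /* animals already queued */
 seen.definekey('id');
 seen.definedone();

 declare hash path(ordered:'y');    /* work queue, read in order of N */
 declare hiter hi_pa('path');
 path.definekey('n');
 path.definedata('last');
 path.definedone();

 id=13; n=1; last=id;               /* starting animal */
 seen.add(); path.add();

 do while(hi_pa.next()=0);
  individual=last;
  output;
  _last=last;
  rc=k.find(key:_last);
  do while(rc=0);
   id=end;
   if seen.check() ne 0 then do;    /* queue each parent only once */
    seen.add();
    n+1; last=end; path.add();
   end;
   rc=k.find_next(key:_last);
  end;
 end;
 stop;
 keep individual;
run;
```

Because every animal enters the queue at most once, a cycle such as 1->2->5->1 ends as soon as animal 1 is reached for the second time.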
Sure. I take individual=16 as an example.

data one;
input individual sire dam sex birth_year;
cards;
3 1 2 1 2005
10 3 4 2 2010
11 7 8 1 2011
13 11 10 1 2014
16 11 6 2 2015
;
run;

data have;
 if _n_=1 then do;
  if 0 then set one;
  declare hash h(dataset:'one');
  h.definekey('individual');
  h.definedone();
 end;
 set one;
 output;
 _sire=sire; _dam=dam;
 if h.check(key:_sire) ne 0 then do;
  individual=_sire; h.add();
  sire=0; dam=0; sex=1; birth_year=.;
  output;
 end;
 if h.check(key:_dam) ne 0 then do;
  individual=_dam; h.add();
  sire=0; dam=0; sex=2; birth_year=.;
  output;
 end;
 drop _:;
run;

data key;
 set one;
 from=individual; to=sire; output;
 to=dam; output;
 keep from to;
run;

data full;
 set key end=last;
 if _n_ eq 1 then do;
  declare hash h();
  h.definekey('node');
  h.definedata('node');
  h.definedone();
 end;
 output;
 node=from; h.replace();
 from=to; to=node;
 output;
 node=from; h.replace();
 if last then h.output(dataset:'node');
 drop node;
run;

%let individual=16;

data want(keep=node household);
 declare hash ha(ordered:'a');
 declare hiter hi('ha');
 ha.definekey('count');
 ha.definedata('last');
 ha.definedone();

 declare hash _ha(hashexp: 20);
 _ha.definekey('key');
 _ha.definedone();

 if 0 then set full;
 declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata:'y');
 from_to.definekey('from');
 from_to.definedata('to');
 from_to.definedone();

 if 0 then set node;
 declare hash no(dataset:'node(where=(node=&individual))');
 declare hiter hi_no('no');
 no.definekey('node');
 no.definedata('node');
 no.definedone();

 do while(hi_no.next()=0);
  household+1; output;
  count=1;
  key=node; _ha.add();
  last=node; ha.add();
  rc=hi.first();
  do while(rc=0);
   from=last; rx=from_to.find();
   do while(rx=0);
    key=to; ry=_ha.check();
    if ry ne 0 then do;
     node=to; output; rr=no.remove(key:node);
     key=to; _ha.add();
     count+1;
     last=to; ha.add();
    end;
    rx=from_to.find_next();
   end;
   rc=hi.next();
  end;
  ha.clear(); _ha.clear();
 end;
 stop;
run;

data final_want;
 if _n_=1 then do;
  if 0 then set have;
  declare hash h(dataset:'have');
  h.definekey('individual');
  h.definedata('sire','dam','sex','birth_year');
  h.definedone();
 end;
 call missing(of _all_);
 set want;
 individual=node;
 h.find();
run;
Sorry for the delay.
It works perfectly!
Thank you very much for your help!
Alan
Hi Ksharp,
I need your help again!
With the last code, I cannot look up several individuals at once and combine the results into one table. That is, I want to replace "%let individual=16" with a table containing several individuals.
How can I do it?
Thank you!
Alan
Sure.

data one;
input individual sire dam sex birth_year;
cards;
3 1 2 1 2005
10 3 4 2 2010
11 7 8 1 2011
13 11 10 1 2014
16 11 6 2 2015
20 17 18 2 2015
;
run;

data have;
 if _n_=1 then do;
  if 0 then set one;
  declare hash h(dataset:'one');
  h.definekey('individual');
  h.definedone();
 end;
 set one;
 output;
 _sire=sire; _dam=dam;
 if h.check(key:_sire) ne 0 then do;
  individual=_sire; h.add();
  sire=0; dam=0; sex=1; birth_year=.;
  output;
 end;
 if h.check(key:_dam) ne 0 then do;
  individual=_dam; h.add();
  sire=0; dam=0; sex=2; birth_year=.;
  output;
 end;
 drop _:;
run;

data key;
 set one;
 from=individual; to=sire; output;
 to=dam; output;
 keep from to;
run;

data full;
 set key end=last;
 if _n_ eq 1 then do;
  declare hash h();
  h.definekey('node');
  h.definedata('node');
  h.definedone();
 end;
 output;
 node=from; h.replace();
 from=to; to=node;
 output;
 node=from; h.replace();
 if last then h.output(dataset:'node');
 drop node;
run;

/*Replace "% let individual = 16" with a table with several individuals*/
data n;
input node;
cards;
16
20
;
run;
/***********/

data want(keep=node household);
 declare hash ha(ordered:'a');
 declare hiter hi('ha');
 ha.definekey('count');
 ha.definedata('last');
 ha.definedone();

 declare hash _ha(hashexp: 20);
 _ha.definekey('key');
 _ha.definedone();

 if 0 then set full;
 declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata:'y');
 from_to.definekey('from');
 from_to.definedata('to');
 from_to.definedone();

 if 0 then set node;
 declare hash no(dataset:'n');/***<---*******/
 declare hiter hi_no('no');
 no.definekey('node');
 no.definedata('node');
 no.definedone();

 do while(hi_no.next()=0);
  household+1; output;
  count=1;
  key=node; _ha.add();
  last=node; ha.add();
  rc=hi.first();
  do while(rc=0);
   from=last; rx=from_to.find();
   do while(rx=0);
    key=to; ry=_ha.check();
    if ry ne 0 then do;
     node=to; output; rr=no.remove(key:node);
     key=to; _ha.add();
     count+1;
     last=to; ha.add();
    end;
    rx=from_to.find_next();
   end;
   rc=hi.next();
  end;
  ha.clear(); _ha.clear();
 end;
 stop;
run;

data final_want;
 if _n_=1 then do;
  if 0 then set have;
  declare hash h(dataset:'have');
  h.definekey('individual');
  h.definedata('sire','dam','sex','birth_year');
  h.definedone();
 end;
 call missing(of _all_);
 set want;
 individual=node;
 h.find();
run;