BookmarkSubscribeRSS Feed
wrosario
Calcite | Level 5

Friends, someone has already done recursion in sas?

example.

source table :

KEYNAMEFATHERDATE_STARTDATE_END
1SÃO PAULO201/01/190012/31/2015
2BRASIL01/01/190012/31/2015
3SANTOS101/01/190012/31/2015
4PRAIA GRANDE301/01/190012/31/2015
5ESPIRITO SANTO201/01/190012/31/2015
6VITORIA501/01/190012/31/2015
7MATA DA PRAIA601/01/190012/31/2015
8BAHIA201/01/190012/31/2015
9SALVADOR801/01/190012/31/2015

result table:

SKLEVELNAMEKEYFATHERDATE_STARTDATE_END
11BRASIL201/01/190012/31/2015
22-SÃO PAULO1201/01/190012/31/2015
33--SANTOS3101/01/190012/31/2015
44---PRAIA GRANDE4301/01/190012/31/2015
52-ESPIRITO SANTO5201/01/190012/31/2015
63--VITORIA6501/01/190012/31/2015
74---MATA DA PRAIA7601/01/190012/31/2015
82-BAHIA8201/01/190012/31/2015
93--SALVADOR9801/01/190012/31/2015

if possible like that ... result table two:


SK

DATE_START

DATE_ENDKEY_1NAME_1KEY_2NAME_2KEY_3NAME_3KEY_4NAME_4LEVEL
101/01/190012/31/20152BRASIL2BRASIL2BRASIL2BRASIL1
201/01/190012/31/20152BRASIL1SÃO PAULO1SÃO PAULO1SÃO PAULO2
301/01/190012/31/20152BRASIL1SÃO PAULO3SANTOS3SANTOS3
401/01/190012/31/20152BRASIL1SÃO PAULO3SANTOS4PRAIA GRANDE4
501/01/190012/31/20152BRASIL5ESPIRITO SANTO5ESPIRITO SANTO5ESPIRITO SANTO2
601/01/190012/31/20152BRASIL5ESPIRITO SANTO6VITORIA6VITORIA3
701/01/190012/31/20152BRASIL5ESPIRITO SANTO6VITORIA7MATA DA PRAIA4
801/01/190012/31/20152BRASIL8BAHIA8BAHIA8BAHIA2
901/01/190012/31/20152BRASIL8BAHIA9SALVADOR9SALVADOR3


Thank you!


12 REPLIES 12
gergely_batho
SAS Employee

In proc fcmp and proc ds2  you can define functions, that can call themselves.

link statement in datastep can be also used, but very carefully.

If you want to look for ancestors or descendants or traverse a graph, sometimes you can accomplish this using a loop.

Typically you load such data structures into a hash object but other alternatives techniques are also possible (array, set point=, set key=).

wrosario
Calcite | Level 5

you have any example?

RW9
Diamond | Level 26 RW9
Diamond | Level 26

wrosario
Calcite | Level 5

not yet worked out ..

PGStats
Opal | Level 21

You might also want to look at

and

and the documentation for proc optnet in SAS/OR.

PG

PG
wrosario
Calcite | Level 5

I did not understand

billfish
Quartz | Level 8

Your question has been addressed a few times in this forum.

One does not need recursion to solve the problem.

2 observations:

    1- The Father (or Parent) of a lineage does not show up in the Key column.
    2- The "end-of-lineage" child does not show in the Father (or Parent) column.
       If a child does not become Father, then the lineage stops. End-of-lineage.

So one finds fairly easily the "end-of-lineage" children and then one goes back to reach the Father.

From what you show as the intended result, you have repeated the "end-of-lineage" child several times to pad records.

The following is a solution (no padding of records). You need to replace

    Key    -> zChild
    Father -> zParent
    Level  -> T_gen

Your suggested output assumes no more than 4 generations per lineage. My proposed solution assumes no more than 9999 generations per lineage.

In the proposed solution:
   1- dataset t_a    is the input  dataset;
   2- dataset t_want is the output dataset.
   3- does not pad the records.
   4- array GG(10000) covers column KEY of your input; if you want the output to have the names, assign another array, say, HH(10000) to cover the names.
      Whatever is done to array GG(), do it to array HH().

/*************************/

/*** proposed solution ***/

/*************************/

data t_want(keep=t_gen P1 G1-G9999);
   length c_end 8.;
   t_max=0;
   array GG(10000) P1 G1-G9999;

   if _N_=1 then do;
      if 0 then set t_a;
      declare hash ha(dataset:'t_a',multidata:'N' );
      ha.definekey('zChild');
      ha.definedata('zChild','zParent');
      ha.definedone();

      declare hash hb(dataset:'t_a',multidata:'N' );
      hb.definekey('zParent');
      hb.definedata('zChild','zParent');
      hb.definedone();

      declare hash hd(multidata:'N');
      hd.definekey('c_end');
      declare hiter dIter('hd');
      hd.definedone();
   end;

   /******************************************/

   /*** make list of end-of-line children ***/

   /******************************************/

   do until (zDone);
      set t_a(rename=(zchild=zc zparent=zp)) nobs=n_last end=zDone;
      zParent=zC;
      if hb.find() then do;
         c_end=zParent;
         if hd.check() then do; hd.add(); end;
      end;
   end;

   /****************************************************/

   /*** for each end-of-line child, find the lineage ***/

   /****************************************************/

   rc = dIter.first();
   do while (rc = 0);
      aDone=0;
      t_gen=0;
      do until (aDone);
              if (t_gen=0)      then do; t_gen+1; GG(t_gen)= c_end;   zChild=c_end;   end;
         else if not(ha.find()) then do; t_gen+1; GG(t_gen)= zParent; zChild=zParent; end;
                                else do; aDone=1; end;
      end;

      /********************************/

      /*** reverse lineage sequence ***/

      /********************************/
      do i=1 to int(t_gen/2);
         x1= GG(i);
         GG(i)= GG(t_gen+1-i);
         GG(t_gen+1-i)=x1;
      end;
      output;

      /********************************************************/

      /*** with one lineage completed, clear out array GG() ***/

      /********************************************************/
      call missing(of GG(*));
      t_max=max(t_max,t_gen);
      rc = dIter.next();
   end;

   /***************************************************************/

   /*** remove the fields with only null values for all records ***/

   /***************************************************************/

   call execute('data t_want; set t_want(keep=t_gen P1 G1-'||compress('G'||put(t_max-1,6.))||');run;');
run;

wrosario
Calcite | Level 5

- ERROR: Not all variables in the list G1-G0 were found.
- WARNING: The data set WORK.W8B2NR may be incomplete. When this step was stopped there were 0 observations and 0 variables.
- WARNING: Data set WORK.W8B2NR was not replaced because this step was stopped.

/*************************/

/*** proposed solution ***/

/*************************/

data WORK.W8B2NR(keep=t_gen P1 G1-G9999);

   length c_end 8.;

   t_max=0;

   array GG(10000) P1 G1-G9999;

   if _N_=1 then do;

      if 0 then set ORA_BISI.TEMP_HSETOR;

      declare hash ha(dataset:'ORA_BISI.TEMP_HSETOR',multidata:'N' );

      ha.definekey('SETOR');

      ha.definedata('SETOR','PAISETOR');

      ha.definedone();

      declare hash hb(dataset:'ORA_BISI.TEMP_HSETOR',multidata:'N' );

      hb.definekey('PAISETOR');

      hb.definedata('SETOR','PAISETOR');

      hb.definedone();

      declare hash hd(multidata:'N');

      hd.definekey('c_end');

      declare hiter dIter('hd');

      hd.definedone();

   end;

   /******************************************/

   /*** make list of end-of-line children ***/

   /******************************************/

   do until (zDone);

      set ORA_BISI.TEMP_HSETOR(rename=(SETOR=zc PAISETOR=zp)) nobs=n_last end=zDone;

      PAISETOR=zC;

      if hb.find() then do;

         c_end=PAISETOR;

         if hd.check() then do; hd.add(); end;

      end;

   end;

   /****************************************************/

   /*** for each end-of-line child, find the lineage ***/

   /****************************************************/

   rc = dIter.first();

   do while (rc = 0);

      aDone=0;

      t_gen=0;

      do until (aDone);

              if (t_gen=0)      then do; t_gen+1; GG(t_gen)= c_end;   SETOR=c_end;   end;

         else if not(ha.find()) then do; t_gen+1; GG(t_gen)= PAISETOR; SETOR=PAISETOR; end;

                                else do; aDone=1; end;

      end;

      /********************************/

      /*** reverse lineage sequence ***/

      /********************************/

      do i=1 to int(t_gen/2);

         x1= GG(i);

         GG(i)= GG(t_gen+1-i);

         GG(t_gen+1-i)=x1;

      end;

      output;

      /********************************************************/

      /*** with one lineage completed, clear out array GG() ***/

      /********************************************************/

      call missing(of GG(*));

      t_max=max(t_max,t_gen);

      rc = dIter.next();

   end;

   /***************************************************************/

   /*** remove the fields with only null values for all records ***/

   /***************************************************************/

   call execute('data WORK.W8B2NR; set WORK.W8B2NR(keep=t_gen P1 G1-'||compress('G'||put(t_max-1,6.))||');run;');

run;

Ksharp
Super User

Here is an example :

 
data have;
infile datalines truncover;
input Key $ Name & $40. Father $;
datalines;
1   SO PAULO    2   
2   BRASIL      
3   SANTOS  1   
4   PRAIA GRANDE    3   
5   ESPIRITO SANTO  2   
6   VITORIA   5   
7   MATA DA PRAIA   6   
8   BAHIA   2   
9   SALVADOR    8
;
run;

data ancestor;
if _n_ eq 1 then do;
 if 0 then set have;
 declare hash h(dataset:'have',hashexp:20);
 h.definekey('Key');
 h.definedone();
end;
 set have;
 if h.check(key:Father) ne 0;
run;


data want (keep =  path);
if _n_ eq 1 then do;
length path _path  $ 4000 ;
if 0 then set have;
declare hash ha(hashexp:20,dataset:'have(where=(Father is not missing and Key is not missing))',multidata:'Y');
ha.definekey('Father');
ha.definedata('Key','Name');
ha.definedone();
declare hash pa(ordered:'Y');
declare hiter hi_path('pa');
pa.definekey('count');
pa.definedata('path');
pa.definedone();
end;
set ancestor;
count=1;
path=catx('|',Name,Key);
pa.add();
do while(hi_path.next()=0);
_path=path;   
Father=scan(path,-1,'|'); 
rc=ha.find();
if rc ne 0 then output;
do while(rc=0);
  if not find(path,strip(Key)) then do;
                                      count+1;
                                      path=catx('|',path,Name,Key);
                                      pa.add();
                                      path=_path;
  end;
  rc=ha.find_next();
end;
end;
pa.clear();
run;

data final_want;
set want;
length sk level 8 Name Key  Father $ 80 ;
level=0;
do i=1 to countw(path,'|') by 2;
 sk+1;level+1;
 Name=scan(path,i,'|');
 Key=scan(path,i+1,'|');
 output;
 Father=Key;
end;
drop i path ;
run;


Xia Keshan

wrosario
Calcite | Level 5

it is almost friend, need to get those who have no father, and also need to take up the last level information even if she does not have more children.

thank you very much, you are the best!

Ksharp
Super User

What else do you need more ? You can process my result to fit your demand later .

art297
Opal | Level 21

Methinks you were too quick in dismissing billfish's suggested code. Other than the dates, which you've never mentioned how you want to treat, the following additions to billfish's code results in the table you said you would like to attain:

libname ora_bsi "c:\temp";

data ora_bsi.temp_hsetor;

  informat name $20.;

  informat date_start date_end mmddyy10.;

  format date_start date_end mmddyy10.;

  infile cards dlm='09'x;

  input KEY NAME FATHER DATE_START DATE_END;

  cards;

1 SÃO PAULO 2 01/01/1900 12/31/2015

2 BRASIL . 01/01/1900 12/31/2015

3 SANTOS 1 01/01/1900 12/31/2015

4 PRAIA GRANDE 3 01/01/1900 12/31/2015

5 ESPIRITO SANTO 2 01/01/1900 12/31/2015

6 VITORIA 5 01/01/1900 12/31/2015

7 MATA DA PRAIA 6 01/01/1900 12/31/2015

8 BAHIA 2 01/01/1900 12/31/2015

9 SALVADOR 8 01/01/1900 12/31/2015

;

data t_a;

  set ora_bsi.temp_hsetor (rename=(key=zChild father=zParent));

run;

/*************************/

/*** proposed solution ***/

/*************************/

data t_want(keep=t_gen P1 G1-G9999);

   length c_end 8.;

   t_max=0;

   array GG(10000) P1 G1-G9999;

   if _N_=1 then do;

      if 0 then set t_a;

      declare hash ha(dataset:'t_a',multidata:'N' );

      ha.definekey('zChild');

      ha.definedata('zChild','zParent');

      ha.definedone();

      declare hash hb(dataset:'t_a',multidata:'N' );

      hb.definekey('zParent');

      hb.definedata('zChild','zParent');

      hb.definedone();

      declare hash hd(multidata:'N');

      hd.definekey('c_end');

      declare hiter dIter('hd');

      hd.definedone();

   end;

   /******************************************/

   /*** make list of end-of-line children ***/

   /******************************************/

   do until (zDone);

      set t_a(rename=(zchild=zc zparent=zp)) nobs=n_last end=zDone;

      zParent=zC;

      if hb.find() then do;

         c_end=zParent;

         if hd.check() then do; hd.add(); end;

      end;

   end;

   /****************************************************/

   /*** for each end-of-line child, find the lineage ***/

   /****************************************************/

   rc = dIter.first();

   do while (rc = 0);

      aDone=0;

      t_gen=0;

      do until (aDone);

              if (t_gen=0)      then do; t_gen+1; GG(t_gen)= c_end;   zChild=c_end;   end;

         else if not(ha.find()) then do; t_gen+1; GG(t_gen)= zParent; zChild=zParent; end;

                                else do; aDone=1; end;

      end;

      /********************************/

      /*** reverse lineage sequence ***/

      /********************************/

      do i=1 to int(t_gen/2);

         x1= GG(i);

         GG(i)= GG(t_gen+1-i);

         GG(t_gen+1-i)=x1;

      end;

      output;

      /********************************************************/

      /*** with one lineage completed, clear out array GG() ***/

      /********************************************************/

      call missing(of GG(*));

      t_max=max(t_max,t_gen);

      rc = dIter.next();

   end;

   /***************************************************************/

   /*** remove the fields with only null values for all records ***/

   /***************************************************************/

   call execute('data t_want; set t_want(keep=t_gen P1 G1-'||compress('G'||put(t_max-1,6.))||');run;');

run;

data ctrl;                                                                                                                           

  set ora_bsi.temp_hsetor (rename=(key=start

  name=label)) end=last;

  retain fmtname 'names' type 'n';

  output;

  if last then do;

    hlo='O';

    label='***ERROR***';

    output;

  end;

run;

proc format library=work cntlin=ctrl;

run;

data want (keep=key: name: level sk);

  set t_want;

  array oldkey(*) g:;

  array key_(4);

  array name_(4) $20.;

  retain mainkey;

  if _n_ eq 1 then do;

    do i=1 to t_gen-1;

      key_(i)=oldkey(1);

      name_(i)=put(oldkey(1),names.);

    end;

    mainkey=oldkey(1);

    sk=1;

    level=1;

    output;

  end;

  else do;

    key_(1)=mainkey;

    name_(1)=put(key_(1),names.);

    level=1;

  end;

  if 2 le t_gen-1 then do;

    do i=2 to 4;

      key_(i)=oldkey(2);

      name_(i)=put(key_(i),names.);

    end;

    sk+1;

    level+1;

    output;

  end;

  if 3 le t_gen-1 then do;

    do i=3 to 4;

      key_(i)=oldkey(3);

      name_(i)=put(key_(i),names.);

    end;

    sk+1;

    level+1;

    output;

  end;

  if 4 le t_gen-1 then do;

    key_(4)=oldkey(4);

    name_(4)=put(key_(4),names.);

    sk+1;

    level+1;

    output;

  end;

run;

data want;

  retain sk key_1 name_1 key_2 name_2 key_3 name_3 key_4 name_4 level;

  set want;

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 2066 views
  • 0 likes
  • 7 in conversation