DATA Step, Macro, Functions and more

help with hash table

Reply
Super Contributor
Posts: 336

help with hash table

[ Edited ]

Hi,

I want to use a hash table to create a lookup table.

Can someone help me correct the below?

 

Long story short, I am going to loop over the new and old contracts of a given customer. I got this working with data step, but it is too slow. Will give it a try with hash tables.

 

For the example below, once I iterate over these cases, I will notice that I have 5 customers:

C1 had contracts 1,2,3,4,5

C2 had contracts a,b,c,d

C3,C4,C5 respectively had contracts x,y,z

 

Thanks!

Miguel

 

data a;
contract='1';old='';output;
contract='2';old='1';output;
contract='x';old='';output;
contract='a';old='';output;
contract='3';old='2';output;
contract='y';old='';output;
contract='b';old='a';output;
contract='4';old='3';output;
contract='z';old='';output;
contract='5';old='4';output;
contract='c';old='b';output;
contract='d';old='c';output;
run;

data b(rename=(old=contract contract=contract1));
set a;
run;

data result;
length contract contract1 $ 1;
set a;
if _n_=1 then do;
declare hash sc (dataset: 'b');
sc.definekey('contract');
sc.definedata('contract1');
sc.defineDone();
end;
*if sc.find() ge 0;
run;

/*The results data set is not what I expected*/
/*work.expected below is what I am trying to get*/

proc sort data=a;
by contract;
run;
proc sort data=b;
by contract;
run;

data expected;
merge a b;
by contract;
run;

 

 

 

Super Contributor
Posts: 336

Re: help with hash table

Uncommenting this line did the trick:

*if sc.find() ge 0;

It also works if you assign that find() value to any variable. E.g.

code=sc.find(); 

Super User
Posts: 9,681

Re: help with hash table

Here is .

I noticed you used to be sas employee. Now you leave off sas company ?

 

And this forum will  eat ' : '  in my code .

Change 

multidata:'y'

into

m.....a : 'y'

 

One more question: the start points are alway when old is missing, right ?

 

data have;
contract='1';old='';output;
contract='2';old='1';output;
contract='x';old='';output;
contract='a';old='';output;
contract='3';old='2';output;
contract='y';old='';output;
contract='b';old='a';output;
contract='4';old='3';output;
contract='z';old='';output;
contract='5';old='4';output;
contract='c';old='b';output;
contract='d';old='c';output;
run;

data want;
if _n_ eq 1 then do;
length path _path  $ 4000 ;

if 0 then set have;
declare hash k(hashexp:20,dataset:'have(where=(old is not missing and contract is not missing))',multidata:'y');
k.definekey('old');
k.definedata('contract');
k.definedone();
 
declare hash pa(ordered:'y');
declare hiter hi_path('pa');
pa.definekey('n');
pa.definedata('path');
pa.definedone();
 
end;
 
set have(where=(old is missing));
n=1;
path=catx(' ',old,contract);
pa.add();
do while(hi_path.next()=0);
_path=path;   
old=scan(path,-1,' ');
rc=k.find();  

if rc ne 0 then do;
 c+1;output;
end;

do while(rc=0);
  if not findw(path,strip(contract)) then do;
   if length(path)+length(contract)+1 gt lengthc(path) then do; 
     putlog 'ERROR: The length of path and _path are set too short';
     stop;
   end;
   n+1;
   path=catx(' ',path,contract);
   pa.add();
   path=_path;
  end;
   
  rc=k.find_next();
end;
end;
pa.clear();
keep c path ;
run;
Super User
Posts: 9,681

Re: help with hash table

This code could be better.

 

 


data have(rename=(old=from contract=to));
contract='1';old='';output;
contract='2';old='1';output;
contract='x';old='';output;
contract='a';old='';output;
contract='3';old='2';output;
contract='y';old='';output;
contract='b';old='a';output;
contract='4';old='3';output;
contract='z';old='';output;
contract='5';old='4';output;
contract='c';old='b';output;
contract='d';old='c';output;
run;



data full;
  set have end=last;
  if _n_ eq 1 then do;
   declare hash h();
    h.definekey('node');
     h.definedata('node');
     h.definedone();
  end;
  output;
  node=from; h.replace();
  from=to; to=node;
  output;
  node=from; h.replace();
  if last then h.output(dataset:'node(where=(node is not missing))');
  drop node;
run;


data want(keep=node household);
declare hash ha(ordered:'a');
declare hiter hi('ha');
ha.definekey('count');
ha.definedata('last');
ha.definedone();
declare hash _ha(hashexp: 20);
_ha.definekey('key');
_ha.definedone();

if 0 then set full;
declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata:'y');
 from_to.definekey('from');
 from_to.definedata('to');
 from_to.definedone();

if 0 then set node;
declare hash no(dataset:'node');
declare hiter hi_no('no');
 no.definekey('node');
 no.definedata('node');
 no.definedone();
 

do while(hi_no.next()=0);
 household+1; output;
 count=1;
 key=node;_ha.add();
 last=node;ha.add();
 rc=hi.first();
 do while(rc=0);
   from=last;rx=from_to.find();
   do while(rx=0);
     key=to;ry=_ha.check();
      if ry ne 0 then do;
       node=to;output;rr=no.remove(key:node);
       key=to;_ha.add();
       count+1;
       last=to;ha.add();
      end;
      rx=from_to.find_next();
   end;
   rc=hi.next();
end;
ha.clear();_ha.clear();
end;
stop;
run;
Super Contributor
Posts: 336

Re: help with hash table

Hi Xia,

Nice program! I didn't get it to work but I wasn't sure is if I my SAS version is too old (9.1.3) or if &colon is a macro variable that I had to declare somewhere with a %let statement.

I will give it another try with SAS 9.4 which I am so grateful to get shortly. I'll start on 9.4 to leverage hpsplit, gampl, and logistic procedures for our predictive modeling. With the extra revenue we'll buy a full license of Enterprise Miner next year.

I call myself a SAS alumni. Part of SAS family once, means SAS family always.

 

Thanks again! Looking forward to trying your code on SAS 9.4.

 

Keeping a copy of the log here for my records...

 


1
2    data have(rename=(old=from contract=to));
3    contract='1';old='';output;
4    contract='2';old='1';output;
5    contract='x';old='';output;
6    contract='a';old='';output;
7    contract='3';old='2';output;
8    contract='y';old='';output;
9    contract='b';old='a';output;
10   contract='4';old='3';output;
11   contract='z';old='';output;
12   contract='5';old='4';output;
13   contract='c';old='b';output;
14   contract='d';old='c';output;
15   run;

NOTE: The data set WORK.HAVE has 12 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds


16
17
18
19   data full;
20     set have end=last;
21     if _n_ eq 1 then do;
22      declare hash h();
23       h.definekey('node');
24        h.definedata('node');
25        h.definedone();
26     end;
27     output;
28     node=from; h.replace();
29     from=to; to=node;
30     output;
31     node=from; h.replace();
32     if last then h.output(dataset:'node(where=(node is not missing))');
33     drop node;
34   run;

ERROR: Invalid data set name at line 32 column 16.
ERROR: An error has occurred during instance method OM_OUTPUT(505) of "DATASTEP.HASH".
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 12 observations read from the data set WORK.HAVE.
WARNING: The data set WORK.FULL may be incomplete.  When this step was stopped there were 24
         observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds


35
36
37   data want(keep=node household);
38   declare hash ha(ordered:'a');
39   declare hiter hi('ha');
40   ha.definekey('count');
41   ha.definedata('last');
42   ha.definedone();
43   declare hash _ha(hashexp: 20);
44   _ha.definekey('key');
45   _ha.definedone();
46
47   if 0 then set full;
48   declare hash from_to(dataset:'full(where=(from is not missing and to is not
48 ! missing))',hashexp:20,multidata&colon:'y');
                                          -
                                          22
                                          76
WARNING: Apparent symbolic reference COLON not resolved.
ERROR 22-322: Syntax error, expecting one of the following: <, <=, =, >, >=, EQ, GE, GT, LE, LT,
              NE, NG, NL, ^=, ~=.

ERROR 76-322: Syntax error, statement will be ignored.

49    from_to.definekey('from');
50    from_to.definedata('to');
51    from_to.definedone();
52
53   if 0 then set node;
ERROR: File WORK.NODE.DATA does not exist.
54   declare hash no(dataset:'node');
55   declare hiter hi_no('no');
56    no.definekey('node');
57    no.definedata('node');
58    no.definedone();
59
60
61   do while(hi_no.next()=0);
62    household+1; output;
63    count=1;
64    key=node;_ha.add();
65    last=node;ha.add();
66    rc=hi.first();
67    do while(rc=0);
68      from=last;rx=from_to.find();
69      do while(rx=0);
70        key=to;ry=_ha.check();
71         if ry ne 0 then do;
72          node=to;output;rr=no.remove(key:node);
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

ERROR: DATA STEP Component Object failure.  Aborted during the COMPILATION phase.
NOTE: The SAS System stopped processing this step because of errors.
73          key=to;_ha.add();
74          count+1;
75          last=to;ha.add();
76         end;
77         rx=from_to.find_next();
              -----------------
              558
ERROR 558-185: Reference find_next is not a member of object from_to.

78      end;
79      rc=hi.next();
80   end;
81   ha.clear();_ha.clear();
82   end;
83   stop;
84   run;
Super User
Posts: 9,681

Re: help with hash table

Yeah. I believe you SAS is too old to recognize the Hash Method  ha.output() .

And &colon; is not a macro vaiable. It is a mistake of SAS Forum .

You need 

Change 

multidata&colon;'y'

into

m.....a : 'y'

 

 

Copy the LOG here ,running under SAS University Edition .

Good Luck.

 

 

 

 

 
 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 51         
 52         data have(rename=(old=from contract=to));
 53         contract='1';old='';output;
 54         contract='2';old='1';output;
 55         contract='x';old='';output;
 56         contract='a';old='';output;
 57         contract='3';old='2';output;
 58         contract='y';old='';output;
 59         contract='b';old='a';output;
 60         contract='4';old='3';output;
 61         contract='z';old='';output;
 62         contract='5';old='4';output;
 63         contract='c';old='b';output;
 64         contract='d';old='c';output;
 65         run;
 
 NOTE: The data set WORK.HAVE has 12 observations and 2 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.03 seconds
       cpu time            0.01 seconds
       
 
 66         
 67         
 68         
 69         data full;
 70           set have end=last;
 71           if _n_ eq 1 then do;
 72            declare hash h();
 73             h.definekey('node');
 74              h.definedata('node');
 75              h.definedone();
 76           end;
 77           output;
 78           node=from; h.replace();
 79           from=to; to=node;
 80           output;
 81           node=from; h.replace();
 82           if last then h.output(dataset:'node(where=(node is not missing))');
 83           drop node;
 84         run;
 
 NOTE: The data set WORK.NODE has 12 observations and 1 variables.
 NOTE: There were 12 observations read from the data set WORK.HAVE.
 NOTE: The data set WORK.FULL has 24 observations and 2 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.07 seconds
       cpu time            0.04 seconds
       
 
 85         
 86         
 87         data want(keep=node household);
 88         declare hash ha(ordered:'a');
 89         declare hiter hi('ha');
 90         ha.definekey('count');
 91         ha.definedata('last');
 92         ha.definedone();
 93         declare hash _ha(hashexp: 20);
 94         _ha.definekey('key');
 95         _ha.definedone();
 96         
 97         if 0 then set full;
 98         declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata&colon;'y');
 99          from_to.definekey('from');
 100         from_to.definedata('to');
 101         from_to.definedone();
 102        
 103        if 0 then set node;
 104        declare hash no(dataset:'node');
 105        declare hiter hi_no('no');
 106         no.definekey('node');
 107         no.definedata('node');
 108         no.definedone();
 109        
 110        
 111        do while(hi_no.next()=0);
 112         household+1; output;
 113         count=1;
 114         key=node;_ha.add();
 115         last=node;ha.add();
 116         rc=hi.first();
 117         do while(rc=0);
 118           from=last;rx=from_to.find();
 119           do while(rx=0);
 120             key=to;ry=_ha.check();
 121              if ry ne 0 then do;
 122               node=to;output;rr=no.remove(key:node);
 123               key=to;_ha.add();
 124               count+1;
 125               last=to;ha.add();
 126              end;
 127              rx=from_to.find_next();
 128           end;
 129           rc=hi.next();
 130        end;
 131        ha.clear();_ha.clear();
 132        end;
 133        stop;
 134        run;
 
 NOTE: There were 14 observations read from the data set WORK.FULL.
       WHERE (from is not null) and (to is not null);
 NOTE: There were 12 observations read from the data set WORK.NODE.
 NOTE: The data set WORK.WANT has 12 observations and 2 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.07 seconds
       cpu time            0.07 seconds
       
 
 135        
 136        proc print;run;
 
 NOTE: There were 12 observations read from the data set WORK.WANT.
 NOTE: PROCEDURE PRINT used (Total process time):
       real time           0.06 seconds
       cpu time            0.03 seconds
       
 
 137        
 138        OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 148        

 

 

x.png

 

 

 

Ask a Question
Discussion stats
  • 5 replies
  • 493 views
  • 0 likes
  • 2 in conversation