## help with hash table

Super Contributor
Posts: 338

# help with hash table

[ Edited ]

Hi,

I want to use a hash table to create a lookup table.

Can someone help me correct the below?

Long story short, I am going to loop over the new and old contracts of a given customer. I got this working with data step, but it is too slow. Will give it a try with hash tables.

For the example below, once I iterate over these cases, I will notice that I have 5 customers:

Thanks!

Miguel

`data a;contract='1';old='';output;contract='2';old='1';output;contract='x';old='';output;contract='a';old='';output;contract='3';old='2';output;contract='y';old='';output;contract='b';old='a';output;contract='4';old='3';output;contract='z';old='';output;contract='5';old='4';output;contract='c';old='b';output;contract='d';old='c';output;run;data b(rename=(old=contract contract=contract1));set a;run;data result;length contract contract1 \$ 1;set a;if _n_=1 then do; declare hash sc (dataset: 'b'); sc.definekey('contract'); sc.definedata('contract1'); sc.defineDone();end;*if sc.find() ge 0;run;/*The results data set is not what I expected*//*work.expected below is what I am trying to get*/proc sort data=a;by contract;run;proc sort data=b;by contract;run;data expected;merge a b;by contract;run;`

Super Contributor
Posts: 338

## Re: help with hash table

Uncommenting this line did the trick:

`*if sc.find() ge 0;`

It also works if you assign that find() value to any variable. E.g.

code=sc.find();

Super User
Posts: 10,787

## Re: help with hash table

Here is .

I noticed you used to be sas employee. Now you leave off sas company ?

And this forum will  eat ' : '  in my code .

Change

multidata&colon;'y'

into

m.....a : 'y'

One more question: the start points are alway when old is missing, right ?

``````data have;
contract='1';old='';output;
contract='2';old='1';output;
contract='x';old='';output;
contract='a';old='';output;
contract='3';old='2';output;
contract='y';old='';output;
contract='b';old='a';output;
contract='4';old='3';output;
contract='z';old='';output;
contract='5';old='4';output;
contract='c';old='b';output;
contract='d';old='c';output;
run;

data want;
if _n_ eq 1 then do;
length path _path  \$ 4000 ;

if 0 then set have;
declare hash k(hashexp:20,dataset:'have(where=(old is not missing and contract is not missing))',multidata&colon;'y');
k.definekey('old');
k.definedata('contract');
k.definedone();

declare hash pa(ordered:'y');
declare hiter hi_path('pa');
pa.definekey('n');
pa.definedata('path');
pa.definedone();

end;

set have(where=(old is missing));
n=1;
path=catx(' ',old,contract);
do while(hi_path.next()=0);
_path=path;
old=scan(path,-1,' ');
rc=k.find();

if rc ne 0 then do;
c+1;output;
end;

do while(rc=0);
if not findw(path,strip(contract)) then do;
if length(path)+length(contract)+1 gt lengthc(path) then do;
putlog 'ERROR: The length of path and _path are set too short';
stop;
end;
n+1;
path=catx(' ',path,contract);
path=_path;
end;

rc=k.find_next();
end;
end;
pa.clear();
keep c path ;
run;
``````
Super User
Posts: 10,787

## Re: help with hash table

This code could be better.

``````
data have(rename=(old=from contract=to));
contract='1';old='';output;
contract='2';old='1';output;
contract='x';old='';output;
contract='a';old='';output;
contract='3';old='2';output;
contract='y';old='';output;
contract='b';old='a';output;
contract='4';old='3';output;
contract='z';old='';output;
contract='5';old='4';output;
contract='c';old='b';output;
contract='d';old='c';output;
run;

data full;
set have end=last;
if _n_ eq 1 then do;
declare hash h();
h.definekey('node');
h.definedata('node');
h.definedone();
end;
output;
node=from; h.replace();
from=to; to=node;
output;
node=from; h.replace();
if last then h.output(dataset:'node(where=(node is not missing))');
drop node;
run;

data want(keep=node household);
declare hash ha(ordered:'a');
declare hiter hi('ha');
ha.definekey('count');
ha.definedata('last');
ha.definedone();
declare hash _ha(hashexp: 20);
_ha.definekey('key');
_ha.definedone();

if 0 then set full;
declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata&colon;'y');
from_to.definekey('from');
from_to.definedata('to');
from_to.definedone();

if 0 then set node;
declare hash no(dataset:'node');
declare hiter hi_no('no');
no.definekey('node');
no.definedata('node');
no.definedone();

do while(hi_no.next()=0);
household+1; output;
count=1;
rc=hi.first();
do while(rc=0);
from=last;rx=from_to.find();
do while(rx=0);
key=to;ry=_ha.check();
if ry ne 0 then do;
node=to;output;rr=no.remove(key:node);
count+1;
end;
rx=from_to.find_next();
end;
rc=hi.next();
end;
ha.clear();_ha.clear();
end;
stop;
run;``````
Super Contributor
Posts: 338

## Re: help with hash table

Hi Xia,

Nice program! I didn't get it to work but I wasn't sure is if I my SAS version is too old (9.1.3) or if &colon is a macro variable that I had to declare somewhere with a %let statement.

I will give it another try with SAS 9.4 which I am so grateful to get shortly. I'll start on 9.4 to leverage hpsplit, gampl, and logistic procedures for our predictive modeling. With the extra revenue we'll buy a full license of Enterprise Miner next year.

I call myself a SAS alumni. Part of SAS family once, means SAS family always.

Thanks again! Looking forward to trying your code on SAS 9.4.

Keeping a copy of the log here for my records...

``````
1
2    data have(rename=(old=from contract=to));
3    contract='1';old='';output;
4    contract='2';old='1';output;
5    contract='x';old='';output;
6    contract='a';old='';output;
7    contract='3';old='2';output;
8    contract='y';old='';output;
9    contract='b';old='a';output;
10   contract='4';old='3';output;
11   contract='z';old='';output;
12   contract='5';old='4';output;
13   contract='c';old='b';output;
14   contract='d';old='c';output;
15   run;

NOTE: The data set WORK.HAVE has 12 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time           0.01 seconds
cpu time            0.01 seconds

16
17
18
19   data full;
20     set have end=last;
21     if _n_ eq 1 then do;
22      declare hash h();
23       h.definekey('node');
24        h.definedata('node');
25        h.definedone();
26     end;
27     output;
28     node=from; h.replace();
29     from=to; to=node;
30     output;
31     node=from; h.replace();
32     if last then h.output(dataset:'node(where=(node is not missing))');
33     drop node;
34   run;

ERROR: Invalid data set name at line 32 column 16.
ERROR: An error has occurred during instance method OM_OUTPUT(505) of "DATASTEP.HASH".
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 12 observations read from the data set WORK.HAVE.
WARNING: The data set WORK.FULL may be incomplete.  When this step was stopped there were 24
observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time           0.01 seconds
cpu time            0.01 seconds

35
36
37   data want(keep=node household);
38   declare hash ha(ordered:'a');
39   declare hiter hi('ha');
40   ha.definekey('count');
41   ha.definedata('last');
42   ha.definedone();
43   declare hash _ha(hashexp: 20);
44   _ha.definekey('key');
45   _ha.definedone();
46
47   if 0 then set full;
48   declare hash from_to(dataset:'full(where=(from is not missing and to is not
48 ! missing))',hashexp:20,multidata&colon:'y');
-
22
76
WARNING: Apparent symbolic reference COLON not resolved.
ERROR 22-322: Syntax error, expecting one of the following: <, <=, =, >, >=, EQ, GE, GT, LE, LT,
NE, NG, NL, ^=, ~=.

ERROR 76-322: Syntax error, statement will be ignored.

49    from_to.definekey('from');
50    from_to.definedata('to');
51    from_to.definedone();
52
53   if 0 then set node;
ERROR: File WORK.NODE.DATA does not exist.
54   declare hash no(dataset:'node');
55   declare hiter hi_no('no');
56    no.definekey('node');
57    no.definedata('node');
58    no.definedone();
59
60
61   do while(hi_no.next()=0);
62    household+1; output;
63    count=1;
66    rc=hi.first();
67    do while(rc=0);
68      from=last;rx=from_to.find();
69      do while(rx=0);
70        key=to;ry=_ha.check();
71         if ry ne 0 then do;
72          node=to;output;rr=no.remove(key:node);
NOTE: DATA statement used (Total process time):
real time           0.00 seconds
cpu time            0.00 seconds

ERROR: DATA STEP Component Object failure.  Aborted during the COMPILATION phase.
NOTE: The SAS System stopped processing this step because of errors.
74          count+1;
76         end;
77         rx=from_to.find_next();
-----------------
558
ERROR 558-185: Reference find_next is not a member of object from_to.

78      end;
79      rc=hi.next();
80   end;
81   ha.clear();_ha.clear();
82   end;
83   stop;
84   run;
``````
Super User
Posts: 10,787

## Re: help with hash table

Yeah. I believe you SAS is too old to recognize the Hash Method  ha.output() .

And &colon; is not a macro vaiable. It is a mistake of SAS Forum .

You need

Change

multidata&colon;'y'

into

m.....a : 'y'

Copy the LOG here ,running under SAS University Edition .

Good Luck.

``````
1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
51
52         data have(rename=(old=from contract=to));
53         contract='1';old='';output;
54         contract='2';old='1';output;
55         contract='x';old='';output;
56         contract='a';old='';output;
57         contract='3';old='2';output;
58         contract='y';old='';output;
59         contract='b';old='a';output;
60         contract='4';old='3';output;
61         contract='z';old='';output;
62         contract='5';old='4';output;
63         contract='c';old='b';output;
64         contract='d';old='c';output;
65         run;

NOTE: The data set WORK.HAVE has 12 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time           0.03 seconds
cpu time            0.01 seconds

66
67
68
69         data full;
70           set have end=last;
71           if _n_ eq 1 then do;
72            declare hash h();
73             h.definekey('node');
74              h.definedata('node');
75              h.definedone();
76           end;
77           output;
78           node=from; h.replace();
79           from=to; to=node;
80           output;
81           node=from; h.replace();
82           if last then h.output(dataset:'node(where=(node is not missing))');
83           drop node;
84         run;

NOTE: The data set WORK.NODE has 12 observations and 1 variables.
NOTE: There were 12 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.FULL has 24 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time           0.07 seconds
cpu time            0.04 seconds

85
86
87         data want(keep=node household);
88         declare hash ha(ordered:'a');
89         declare hiter hi('ha');
90         ha.definekey('count');
91         ha.definedata('last');
92         ha.definedone();
93         declare hash _ha(hashexp: 20);
94         _ha.definekey('key');
95         _ha.definedone();
96
97         if 0 then set full;
98         declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata&colon;'y');
99          from_to.definekey('from');
100         from_to.definedata('to');
101         from_to.definedone();
102
103        if 0 then set node;
104        declare hash no(dataset:'node');
105        declare hiter hi_no('no');
106         no.definekey('node');
107         no.definedata('node');
108         no.definedone();
109
110
111        do while(hi_no.next()=0);
112         household+1; output;
113         count=1;
116         rc=hi.first();
117         do while(rc=0);
118           from=last;rx=from_to.find();
119           do while(rx=0);
120             key=to;ry=_ha.check();
121              if ry ne 0 then do;
122               node=to;output;rr=no.remove(key:node);
124               count+1;
126              end;
127              rx=from_to.find_next();
128           end;
129           rc=hi.next();
130        end;
131        ha.clear();_ha.clear();
132        end;
133        stop;
134        run;

NOTE: There were 14 observations read from the data set WORK.FULL.
WHERE (from is not null) and (to is not null);
NOTE: There were 12 observations read from the data set WORK.NODE.
NOTE: The data set WORK.WANT has 12 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time           0.07 seconds
cpu time            0.07 seconds

135
136        proc print;run;

NOTE: There were 12 observations read from the data set WORK.WANT.
NOTE: PROCEDURE PRINT used (Total process time):
real time           0.06 seconds
cpu time            0.03 seconds

137
138        OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
148        ``````

Discussion stats
• 5 replies
• 895 views
• 0 likes
• 2 in conversation