Hi,
I want to use a hash table to create a lookup table.
Can someone help me correct the below?
Long story short, I am going to loop over the new and old contracts of a given customer. I got this working with data step, but it is too slow. Will give it a try with hash tables.
For the example below, once I iterate over these cases, I will notice that I have 5 customers:
C1 had contracts 1,2,3,4,5
C2 had contracts a,b,c,d
C3,C4,C5 respectively had contracts x,y,z
Thanks!
Miguel
data a;
contract='1';old='';output;
contract='2';old='1';output;
contract='x';old='';output;
contract='a';old='';output;
contract='3';old='2';output;
contract='y';old='';output;
contract='b';old='a';output;
contract='4';old='3';output;
contract='z';old='';output;
contract='5';old='4';output;
contract='c';old='b';output;
contract='d';old='c';output;
run;
data b(rename=(old=contract contract=contract1));
set a;
run;
data result;
length contract contract1 $ 1;
set a;
if _n_=1 then do;
declare hash sc (dataset: 'b');
sc.definekey('contract');
sc.definedata('contract1');
sc.defineDone();
end;
*if sc.find() ge 0;
run;
/*The results data set is not what I expected*/
/*work.expected below is what I am trying to get*/
proc sort data=a;
by contract;
run;
proc sort data=b;
by contract;
run;
data expected;
merge a b;
by contract;
run;
Uncommenting this line did the trick:
*if sc.find() ge 0;
It also works if you assign that find() value to any variable. E.g.
code=sc.find();
Here is .
I noticed you used to be sas employee. Now you leave off sas company ?
And this forum will eat ' : ' in my code .
Change
multidata:'y'
into
m.....a : 'y'
One more question: the start points are alway when old is missing, right ?
data have;
contract='1';old='';output;
contract='2';old='1';output;
contract='x';old='';output;
contract='a';old='';output;
contract='3';old='2';output;
contract='y';old='';output;
contract='b';old='a';output;
contract='4';old='3';output;
contract='z';old='';output;
contract='5';old='4';output;
contract='c';old='b';output;
contract='d';old='c';output;
run;
data want;
if _n_ eq 1 then do;
length path _path $ 4000 ;
if 0 then set have;
declare hash k(hashexp:20,dataset:'have(where=(old is not missing and contract is not missing))',multidata:'y');
k.definekey('old');
k.definedata('contract');
k.definedone();
declare hash pa(ordered:'y');
declare hiter hi_path('pa');
pa.definekey('n');
pa.definedata('path');
pa.definedone();
end;
set have(where=(old is missing));
n=1;
path=catx(' ',old,contract);
pa.add();
do while(hi_path.next()=0);
_path=path;
old=scan(path,-1,' ');
rc=k.find();
if rc ne 0 then do;
c+1;output;
end;
do while(rc=0);
if not findw(path,strip(contract)) then do;
if length(path)+length(contract)+1 gt lengthc(path) then do;
putlog 'ERROR: The length of path and _path are set too short';
stop;
end;
n+1;
path=catx(' ',path,contract);
pa.add();
path=_path;
end;
rc=k.find_next();
end;
end;
pa.clear();
keep c path ;
run;
This code could be better.
data have(rename=(old=from contract=to));
contract='1';old='';output;
contract='2';old='1';output;
contract='x';old='';output;
contract='a';old='';output;
contract='3';old='2';output;
contract='y';old='';output;
contract='b';old='a';output;
contract='4';old='3';output;
contract='z';old='';output;
contract='5';old='4';output;
contract='c';old='b';output;
contract='d';old='c';output;
run;
data full;
set have end=last;
if _n_ eq 1 then do;
declare hash h();
h.definekey('node');
h.definedata('node');
h.definedone();
end;
output;
node=from; h.replace();
from=to; to=node;
output;
node=from; h.replace();
if last then h.output(dataset:'node(where=(node is not missing))');
drop node;
run;
data want(keep=node household);
declare hash ha(ordered:'a');
declare hiter hi('ha');
ha.definekey('count');
ha.definedata('last');
ha.definedone();
declare hash _ha(hashexp: 20);
_ha.definekey('key');
_ha.definedone();
if 0 then set full;
declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata:'y');
from_to.definekey('from');
from_to.definedata('to');
from_to.definedone();
if 0 then set node;
declare hash no(dataset:'node');
declare hiter hi_no('no');
no.definekey('node');
no.definedata('node');
no.definedone();
do while(hi_no.next()=0);
household+1; output;
count=1;
key=node;_ha.add();
last=node;ha.add();
rc=hi.first();
do while(rc=0);
from=last;rx=from_to.find();
do while(rx=0);
key=to;ry=_ha.check();
if ry ne 0 then do;
node=to;output;rr=no.remove(key:node);
key=to;_ha.add();
count+1;
last=to;ha.add();
end;
rx=from_to.find_next();
end;
rc=hi.next();
end;
ha.clear();_ha.clear();
end;
stop;
run;
Hi Xia,
Nice program! I didn't get it to work but I wasn't sure is if I my SAS version is too old (9.1.3) or if &colon is a macro variable that I had to declare somewhere with a %let statement.
I will give it another try with SAS 9.4 which I am so grateful to get shortly. I'll start on 9.4 to leverage hpsplit, gampl, and logistic procedures for our predictive modeling. With the extra revenue we'll buy a full license of Enterprise Miner next year.
I call myself a SAS alumni. Part of SAS family once, means SAS family always.
Thanks again! Looking forward to trying your code on SAS 9.4.
Keeping a copy of the log here for my records...
1
2 data have(rename=(old=from contract=to));
3 contract='1';old='';output;
4 contract='2';old='1';output;
5 contract='x';old='';output;
6 contract='a';old='';output;
7 contract='3';old='2';output;
8 contract='y';old='';output;
9 contract='b';old='a';output;
10 contract='4';old='3';output;
11 contract='z';old='';output;
12 contract='5';old='4';output;
13 contract='c';old='b';output;
14 contract='d';old='c';output;
15 run;
NOTE: The data set WORK.HAVE has 12 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
16
17
18
19 data full;
20 set have end=last;
21 if _n_ eq 1 then do;
22 declare hash h();
23 h.definekey('node');
24 h.definedata('node');
25 h.definedone();
26 end;
27 output;
28 node=from; h.replace();
29 from=to; to=node;
30 output;
31 node=from; h.replace();
32 if last then h.output(dataset:'node(where=(node is not missing))');
33 drop node;
34 run;
ERROR: Invalid data set name at line 32 column 16.
ERROR: An error has occurred during instance method OM_OUTPUT(505) of "DATASTEP.HASH".
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 12 observations read from the data set WORK.HAVE.
WARNING: The data set WORK.FULL may be incomplete. When this step was stopped there were 24
observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
35
36
37 data want(keep=node household);
38 declare hash ha(ordered:'a');
39 declare hiter hi('ha');
40 ha.definekey('count');
41 ha.definedata('last');
42 ha.definedone();
43 declare hash _ha(hashexp: 20);
44 _ha.definekey('key');
45 _ha.definedone();
46
47 if 0 then set full;
48 declare hash from_to(dataset:'full(where=(from is not missing and to is not
48 ! missing))',hashexp:20,multidata&colon:'y');
-
22
76
WARNING: Apparent symbolic reference COLON not resolved.
ERROR 22-322: Syntax error, expecting one of the following: <, <=, =, >, >=, EQ, GE, GT, LE, LT,
NE, NG, NL, ^=, ~=.
ERROR 76-322: Syntax error, statement will be ignored.
49 from_to.definekey('from');
50 from_to.definedata('to');
51 from_to.definedone();
52
53 if 0 then set node;
ERROR: File WORK.NODE.DATA does not exist.
54 declare hash no(dataset:'node');
55 declare hiter hi_no('no');
56 no.definekey('node');
57 no.definedata('node');
58 no.definedone();
59
60
61 do while(hi_no.next()=0);
62 household+1; output;
63 count=1;
64 key=node;_ha.add();
65 last=node;ha.add();
66 rc=hi.first();
67 do while(rc=0);
68 from=last;rx=from_to.find();
69 do while(rx=0);
70 key=to;ry=_ha.check();
71 if ry ne 0 then do;
72 node=to;output;rr=no.remove(key:node);
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
ERROR: DATA STEP Component Object failure. Aborted during the COMPILATION phase.
NOTE: The SAS System stopped processing this step because of errors.
73 key=to;_ha.add();
74 count+1;
75 last=to;ha.add();
76 end;
77 rx=from_to.find_next();
-----------------
558
ERROR 558-185: Reference find_next is not a member of object from_to.
78 end;
79 rc=hi.next();
80 end;
81 ha.clear();_ha.clear();
82 end;
83 stop;
84 run;
Yeah. I believe you SAS is too old to recognize the Hash Method ha.output() .
And : is not a macro vaiable. It is a mistake of SAS Forum .
You need
Change
multidata:'y'
into
m.....a : 'y'
Copy the LOG here ,running under SAS University Edition .
Good Luck.
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
51
52 data have(rename=(old=from contract=to));
53 contract='1';old='';output;
54 contract='2';old='1';output;
55 contract='x';old='';output;
56 contract='a';old='';output;
57 contract='3';old='2';output;
58 contract='y';old='';output;
59 contract='b';old='a';output;
60 contract='4';old='3';output;
61 contract='z';old='';output;
62 contract='5';old='4';output;
63 contract='c';old='b';output;
64 contract='d';old='c';output;
65 run;
NOTE: The data set WORK.HAVE has 12 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds
66
67
68
69 data full;
70 set have end=last;
71 if _n_ eq 1 then do;
72 declare hash h();
73 h.definekey('node');
74 h.definedata('node');
75 h.definedone();
76 end;
77 output;
78 node=from; h.replace();
79 from=to; to=node;
80 output;
81 node=from; h.replace();
82 if last then h.output(dataset:'node(where=(node is not missing))');
83 drop node;
84 run;
NOTE: The data set WORK.NODE has 12 observations and 1 variables.
NOTE: There were 12 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.FULL has 24 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.07 seconds
cpu time 0.04 seconds
85
86
87 data want(keep=node household);
88 declare hash ha(ordered:'a');
89 declare hiter hi('ha');
90 ha.definekey('count');
91 ha.definedata('last');
92 ha.definedone();
93 declare hash _ha(hashexp: 20);
94 _ha.definekey('key');
95 _ha.definedone();
96
97 if 0 then set full;
98 declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata:'y');
99 from_to.definekey('from');
100 from_to.definedata('to');
101 from_to.definedone();
102
103 if 0 then set node;
104 declare hash no(dataset:'node');
105 declare hiter hi_no('no');
106 no.definekey('node');
107 no.definedata('node');
108 no.definedone();
109
110
111 do while(hi_no.next()=0);
112 household+1; output;
113 count=1;
114 key=node;_ha.add();
115 last=node;ha.add();
116 rc=hi.first();
117 do while(rc=0);
118 from=last;rx=from_to.find();
119 do while(rx=0);
120 key=to;ry=_ha.check();
121 if ry ne 0 then do;
122 node=to;output;rr=no.remove(key:node);
123 key=to;_ha.add();
124 count+1;
125 last=to;ha.add();
126 end;
127 rx=from_to.find_next();
128 end;
129 rc=hi.next();
130 end;
131 ha.clear();_ha.clear();
132 end;
133 stop;
134 run;
NOTE: There were 14 observations read from the data set WORK.FULL.
WHERE (from is not null) and (to is not null);
NOTE: There were 12 observations read from the data set WORK.NODE.
NOTE: The data set WORK.WANT has 12 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.07 seconds
cpu time 0.07 seconds
135
136 proc print;run;
NOTE: There were 12 observations read from the data set WORK.WANT.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.06 seconds
cpu time 0.03 seconds
137
138 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
148
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.