Hi,
I want to use a hash table to create a lookup table.
Can someone help me correct the below?
Long story short, I am going to loop over the new and old contracts of a given customer. I got this working with data step, but it is too slow. Will give it a try with hash tables.
For the example below, once I iterate over these cases, I will notice that I have 5 customers:
C1 had contracts 1,2,3,4,5
C2 had contracts a,b,c,d
C3,C4,C5 respectively had contracts x,y,z
Thanks!
Miguel
data a;
contract='1';old='';output;
contract='2';old='1';output;
contract='x';old='';output;
contract='a';old='';output;
contract='3';old='2';output;
contract='y';old='';output;
contract='b';old='a';output;
contract='4';old='3';output;
contract='z';old='';output;
contract='5';old='4';output;
contract='c';old='b';output;
contract='d';old='c';output;
run;
data b(rename=(old=contract contract=contract1));
set a;
run;
data result;
length contract contract1 $ 1;
set a;
if _n_=1 then do;
declare hash sc (dataset: 'b');
sc.definekey('contract');
sc.definedata('contract1');
sc.defineDone();
end;
*if sc.find() ge 0;
run;
/*The results data set is not what I expected*/
/*work.expected below is what I am trying to get*/
proc sort data=a;
by contract;
run;
proc sort data=b;
by contract;
run;
data expected;
merge a b;
by contract;
run;
Uncommenting this line did the trick:
*if sc.find() ge 0;
It also works if you assign that find() value to any variable. E.g.
code=sc.find();
Here is .
I noticed you used to be sas employee. Now you leave off sas company ?
And this forum will eat ' : ' in my code .
Change
multidata:'y'
into
m.....a : 'y'
One more question: the start points are alway when old is missing, right ?
data have;
contract='1';old='';output;
contract='2';old='1';output;
contract='x';old='';output;
contract='a';old='';output;
contract='3';old='2';output;
contract='y';old='';output;
contract='b';old='a';output;
contract='4';old='3';output;
contract='z';old='';output;
contract='5';old='4';output;
contract='c';old='b';output;
contract='d';old='c';output;
run;
data want;
if _n_ eq 1 then do;
length path _path $ 4000 ;
if 0 then set have;
declare hash k(hashexp:20,dataset:'have(where=(old is not missing and contract is not missing))',multidata:'y');
k.definekey('old');
k.definedata('contract');
k.definedone();
declare hash pa(ordered:'y');
declare hiter hi_path('pa');
pa.definekey('n');
pa.definedata('path');
pa.definedone();
end;
set have(where=(old is missing));
n=1;
path=catx(' ',old,contract);
pa.add();
do while(hi_path.next()=0);
_path=path;
old=scan(path,-1,' ');
rc=k.find();
if rc ne 0 then do;
c+1;output;
end;
do while(rc=0);
if not findw(path,strip(contract)) then do;
if length(path)+length(contract)+1 gt lengthc(path) then do;
putlog 'ERROR: The length of path and _path are set too short';
stop;
end;
n+1;
path=catx(' ',path,contract);
pa.add();
path=_path;
end;
rc=k.find_next();
end;
end;
pa.clear();
keep c path ;
run;
This code could be better.
data have(rename=(old=from contract=to));
contract='1';old='';output;
contract='2';old='1';output;
contract='x';old='';output;
contract='a';old='';output;
contract='3';old='2';output;
contract='y';old='';output;
contract='b';old='a';output;
contract='4';old='3';output;
contract='z';old='';output;
contract='5';old='4';output;
contract='c';old='b';output;
contract='d';old='c';output;
run;
data full;
set have end=last;
if _n_ eq 1 then do;
declare hash h();
h.definekey('node');
h.definedata('node');
h.definedone();
end;
output;
node=from; h.replace();
from=to; to=node;
output;
node=from; h.replace();
if last then h.output(dataset:'node(where=(node is not missing))');
drop node;
run;
data want(keep=node household);
declare hash ha(ordered:'a');
declare hiter hi('ha');
ha.definekey('count');
ha.definedata('last');
ha.definedone();
declare hash _ha(hashexp: 20);
_ha.definekey('key');
_ha.definedone();
if 0 then set full;
declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata:'y');
from_to.definekey('from');
from_to.definedata('to');
from_to.definedone();
if 0 then set node;
declare hash no(dataset:'node');
declare hiter hi_no('no');
no.definekey('node');
no.definedata('node');
no.definedone();
do while(hi_no.next()=0);
household+1; output;
count=1;
key=node;_ha.add();
last=node;ha.add();
rc=hi.first();
do while(rc=0);
from=last;rx=from_to.find();
do while(rx=0);
key=to;ry=_ha.check();
if ry ne 0 then do;
node=to;output;rr=no.remove(key:node);
key=to;_ha.add();
count+1;
last=to;ha.add();
end;
rx=from_to.find_next();
end;
rc=hi.next();
end;
ha.clear();_ha.clear();
end;
stop;
run;
Hi Xia,
Nice program! I didn't get it to work but I wasn't sure is if I my SAS version is too old (9.1.3) or if &colon is a macro variable that I had to declare somewhere with a %let statement.
I will give it another try with SAS 9.4 which I am so grateful to get shortly. I'll start on 9.4 to leverage hpsplit, gampl, and logistic procedures for our predictive modeling. With the extra revenue we'll buy a full license of Enterprise Miner next year.
I call myself a SAS alumni. Part of SAS family once, means SAS family always.
Thanks again! Looking forward to trying your code on SAS 9.4.
Keeping a copy of the log here for my records...
1
2 data have(rename=(old=from contract=to));
3 contract='1';old='';output;
4 contract='2';old='1';output;
5 contract='x';old='';output;
6 contract='a';old='';output;
7 contract='3';old='2';output;
8 contract='y';old='';output;
9 contract='b';old='a';output;
10 contract='4';old='3';output;
11 contract='z';old='';output;
12 contract='5';old='4';output;
13 contract='c';old='b';output;
14 contract='d';old='c';output;
15 run;
NOTE: The data set WORK.HAVE has 12 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
16
17
18
19 data full;
20 set have end=last;
21 if _n_ eq 1 then do;
22 declare hash h();
23 h.definekey('node');
24 h.definedata('node');
25 h.definedone();
26 end;
27 output;
28 node=from; h.replace();
29 from=to; to=node;
30 output;
31 node=from; h.replace();
32 if last then h.output(dataset:'node(where=(node is not missing))');
33 drop node;
34 run;
ERROR: Invalid data set name at line 32 column 16.
ERROR: An error has occurred during instance method OM_OUTPUT(505) of "DATASTEP.HASH".
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 12 observations read from the data set WORK.HAVE.
WARNING: The data set WORK.FULL may be incomplete. When this step was stopped there were 24
observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
35
36
37 data want(keep=node household);
38 declare hash ha(ordered:'a');
39 declare hiter hi('ha');
40 ha.definekey('count');
41 ha.definedata('last');
42 ha.definedone();
43 declare hash _ha(hashexp: 20);
44 _ha.definekey('key');
45 _ha.definedone();
46
47 if 0 then set full;
48 declare hash from_to(dataset:'full(where=(from is not missing and to is not
48 ! missing))',hashexp:20,multidata&colon:'y');
-
22
76
WARNING: Apparent symbolic reference COLON not resolved.
ERROR 22-322: Syntax error, expecting one of the following: <, <=, =, >, >=, EQ, GE, GT, LE, LT,
NE, NG, NL, ^=, ~=.
ERROR 76-322: Syntax error, statement will be ignored.
49 from_to.definekey('from');
50 from_to.definedata('to');
51 from_to.definedone();
52
53 if 0 then set node;
ERROR: File WORK.NODE.DATA does not exist.
54 declare hash no(dataset:'node');
55 declare hiter hi_no('no');
56 no.definekey('node');
57 no.definedata('node');
58 no.definedone();
59
60
61 do while(hi_no.next()=0);
62 household+1; output;
63 count=1;
64 key=node;_ha.add();
65 last=node;ha.add();
66 rc=hi.first();
67 do while(rc=0);
68 from=last;rx=from_to.find();
69 do while(rx=0);
70 key=to;ry=_ha.check();
71 if ry ne 0 then do;
72 node=to;output;rr=no.remove(key:node);
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
ERROR: DATA STEP Component Object failure. Aborted during the COMPILATION phase.
NOTE: The SAS System stopped processing this step because of errors.
73 key=to;_ha.add();
74 count+1;
75 last=to;ha.add();
76 end;
77 rx=from_to.find_next();
-----------------
558
ERROR 558-185: Reference find_next is not a member of object from_to.
78 end;
79 rc=hi.next();
80 end;
81 ha.clear();_ha.clear();
82 end;
83 stop;
84 run;
Yeah. I believe you SAS is too old to recognize the Hash Method ha.output() .
And : is not a macro vaiable. It is a mistake of SAS Forum .
You need
Change
multidata:'y'
into
m.....a : 'y'
Copy the LOG here ,running under SAS University Edition .
Good Luck.
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
51
52 data have(rename=(old=from contract=to));
53 contract='1';old='';output;
54 contract='2';old='1';output;
55 contract='x';old='';output;
56 contract='a';old='';output;
57 contract='3';old='2';output;
58 contract='y';old='';output;
59 contract='b';old='a';output;
60 contract='4';old='3';output;
61 contract='z';old='';output;
62 contract='5';old='4';output;
63 contract='c';old='b';output;
64 contract='d';old='c';output;
65 run;
NOTE: The data set WORK.HAVE has 12 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds
66
67
68
69 data full;
70 set have end=last;
71 if _n_ eq 1 then do;
72 declare hash h();
73 h.definekey('node');
74 h.definedata('node');
75 h.definedone();
76 end;
77 output;
78 node=from; h.replace();
79 from=to; to=node;
80 output;
81 node=from; h.replace();
82 if last then h.output(dataset:'node(where=(node is not missing))');
83 drop node;
84 run;
NOTE: The data set WORK.NODE has 12 observations and 1 variables.
NOTE: There were 12 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.FULL has 24 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.07 seconds
cpu time 0.04 seconds
85
86
87 data want(keep=node household);
88 declare hash ha(ordered:'a');
89 declare hiter hi('ha');
90 ha.definekey('count');
91 ha.definedata('last');
92 ha.definedone();
93 declare hash _ha(hashexp: 20);
94 _ha.definekey('key');
95 _ha.definedone();
96
97 if 0 then set full;
98 declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata:'y');
99 from_to.definekey('from');
100 from_to.definedata('to');
101 from_to.definedone();
102
103 if 0 then set node;
104 declare hash no(dataset:'node');
105 declare hiter hi_no('no');
106 no.definekey('node');
107 no.definedata('node');
108 no.definedone();
109
110
111 do while(hi_no.next()=0);
112 household+1; output;
113 count=1;
114 key=node;_ha.add();
115 last=node;ha.add();
116 rc=hi.first();
117 do while(rc=0);
118 from=last;rx=from_to.find();
119 do while(rx=0);
120 key=to;ry=_ha.check();
121 if ry ne 0 then do;
122 node=to;output;rr=no.remove(key:node);
123 key=to;_ha.add();
124 count+1;
125 last=to;ha.add();
126 end;
127 rx=from_to.find_next();
128 end;
129 rc=hi.next();
130 end;
131 ha.clear();_ha.clear();
132 end;
133 stop;
134 run;
NOTE: There were 14 observations read from the data set WORK.FULL.
WHERE (from is not null) and (to is not null);
NOTE: There were 12 observations read from the data set WORK.NODE.
NOTE: The data set WORK.WANT has 12 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.07 seconds
cpu time 0.07 seconds
135
136 proc print;run;
NOTE: There were 12 observations read from the data set WORK.WANT.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.06 seconds
cpu time 0.03 seconds
137
138 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
148
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.