BookmarkSubscribeRSS Feed
M_Maldonado
Barite | Level 11

Hi,

I want to use a hash table to create a lookup table.

Can someone help me correct the below?

 

Long story short, I am going to loop over the new and old contracts of a given customer. I got this working with data step, but it is too slow. Will give it a try with hash tables.

 

For the example below, once I iterate over these cases, I will notice that I have 5 customers:

C1 had contracts 1,2,3,4,5

C2 had contracts a,b,c,d

C3,C4,C5 respectively had contracts x,y,z

 

Thanks!

Miguel

 

data a;
contract='1';old='';output;
contract='2';old='1';output;
contract='x';old='';output;
contract='a';old='';output;
contract='3';old='2';output;
contract='y';old='';output;
contract='b';old='a';output;
contract='4';old='3';output;
contract='z';old='';output;
contract='5';old='4';output;
contract='c';old='b';output;
contract='d';old='c';output;
run;

data b(rename=(old=contract contract=contract1));
set a;
run;

data result;
length contract contract1 $ 1;
set a;
if _n_=1 then do;
declare hash sc (dataset: 'b');
sc.definekey('contract');
sc.definedata('contract1');
sc.defineDone();
end;
*if sc.find() ge 0;
run;

/*The results data set is not what I expected*/
/*work.expected below is what I am trying to get*/

proc sort data=a;
by contract;
run;
proc sort data=b;
by contract;
run;

data expected;
merge a b;
by contract;
run;

 

 

 

5 REPLIES 5
M_Maldonado
Barite | Level 11

Uncommenting this line did the trick:

*if sc.find() ge 0;

It also works if you assign that find() value to any variable. E.g.

code=sc.find(); 

Ksharp
Super User

Here is .

I noticed you used to be sas employee. Now you leave off sas company ?

 

And this forum will  eat ' : '  in my code .

Change 

multidata:'y'

into

m.....a : 'y'

 

One more question: the start points are alway when old is missing, right ?

 

data have;
contract='1';old='';output;
contract='2';old='1';output;
contract='x';old='';output;
contract='a';old='';output;
contract='3';old='2';output;
contract='y';old='';output;
contract='b';old='a';output;
contract='4';old='3';output;
contract='z';old='';output;
contract='5';old='4';output;
contract='c';old='b';output;
contract='d';old='c';output;
run;

data want;
if _n_ eq 1 then do;
length path _path  $ 4000 ;

if 0 then set have;
declare hash k(hashexp:20,dataset:'have(where=(old is not missing and contract is not missing))',multidata:'y');
k.definekey('old');
k.definedata('contract');
k.definedone();
 
declare hash pa(ordered:'y');
declare hiter hi_path('pa');
pa.definekey('n');
pa.definedata('path');
pa.definedone();
 
end;
 
set have(where=(old is missing));
n=1;
path=catx(' ',old,contract);
pa.add();
do while(hi_path.next()=0);
_path=path;   
old=scan(path,-1,' ');
rc=k.find();  

if rc ne 0 then do;
 c+1;output;
end;

do while(rc=0);
  if not findw(path,strip(contract)) then do;
   if length(path)+length(contract)+1 gt lengthc(path) then do; 
     putlog 'ERROR: The length of path and _path are set too short';
     stop;
   end;
   n+1;
   path=catx(' ',path,contract);
   pa.add();
   path=_path;
  end;
   
  rc=k.find_next();
end;
end;
pa.clear();
keep c path ;
run;
Ksharp
Super User

This code could be better.

 

 


data have(rename=(old=from contract=to));
contract='1';old='';output;
contract='2';old='1';output;
contract='x';old='';output;
contract='a';old='';output;
contract='3';old='2';output;
contract='y';old='';output;
contract='b';old='a';output;
contract='4';old='3';output;
contract='z';old='';output;
contract='5';old='4';output;
contract='c';old='b';output;
contract='d';old='c';output;
run;



data full;
  set have end=last;
  if _n_ eq 1 then do;
   declare hash h();
    h.definekey('node');
     h.definedata('node');
     h.definedone();
  end;
  output;
  node=from; h.replace();
  from=to; to=node;
  output;
  node=from; h.replace();
  if last then h.output(dataset:'node(where=(node is not missing))');
  drop node;
run;


data want(keep=node household);
declare hash ha(ordered:'a');
declare hiter hi('ha');
ha.definekey('count');
ha.definedata('last');
ha.definedone();
declare hash _ha(hashexp: 20);
_ha.definekey('key');
_ha.definedone();

if 0 then set full;
declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata:'y');
 from_to.definekey('from');
 from_to.definedata('to');
 from_to.definedone();

if 0 then set node;
declare hash no(dataset:'node');
declare hiter hi_no('no');
 no.definekey('node');
 no.definedata('node');
 no.definedone();
 

do while(hi_no.next()=0);
 household+1; output;
 count=1;
 key=node;_ha.add();
 last=node;ha.add();
 rc=hi.first();
 do while(rc=0);
   from=last;rx=from_to.find();
   do while(rx=0);
     key=to;ry=_ha.check();
      if ry ne 0 then do;
       node=to;output;rr=no.remove(key:node);
       key=to;_ha.add();
       count+1;
       last=to;ha.add();
      end;
      rx=from_to.find_next();
   end;
   rc=hi.next();
end;
ha.clear();_ha.clear();
end;
stop;
run;
M_Maldonado
Barite | Level 11

Hi Xia,

Nice program! I didn't get it to work but I wasn't sure is if I my SAS version is too old (9.1.3) or if &colon is a macro variable that I had to declare somewhere with a %let statement.

I will give it another try with SAS 9.4 which I am so grateful to get shortly. I'll start on 9.4 to leverage hpsplit, gampl, and logistic procedures for our predictive modeling. With the extra revenue we'll buy a full license of Enterprise Miner next year.

I call myself a SAS alumni. Part of SAS family once, means SAS family always.

 

Thanks again! Looking forward to trying your code on SAS 9.4.

 

Keeping a copy of the log here for my records...

 


1
2    data have(rename=(old=from contract=to));
3    contract='1';old='';output;
4    contract='2';old='1';output;
5    contract='x';old='';output;
6    contract='a';old='';output;
7    contract='3';old='2';output;
8    contract='y';old='';output;
9    contract='b';old='a';output;
10   contract='4';old='3';output;
11   contract='z';old='';output;
12   contract='5';old='4';output;
13   contract='c';old='b';output;
14   contract='d';old='c';output;
15   run;

NOTE: The data set WORK.HAVE has 12 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds


16
17
18
19   data full;
20     set have end=last;
21     if _n_ eq 1 then do;
22      declare hash h();
23       h.definekey('node');
24        h.definedata('node');
25        h.definedone();
26     end;
27     output;
28     node=from; h.replace();
29     from=to; to=node;
30     output;
31     node=from; h.replace();
32     if last then h.output(dataset:'node(where=(node is not missing))');
33     drop node;
34   run;

ERROR: Invalid data set name at line 32 column 16.
ERROR: An error has occurred during instance method OM_OUTPUT(505) of "DATASTEP.HASH".
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 12 observations read from the data set WORK.HAVE.
WARNING: The data set WORK.FULL may be incomplete.  When this step was stopped there were 24
         observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds


35
36
37   data want(keep=node household);
38   declare hash ha(ordered:'a');
39   declare hiter hi('ha');
40   ha.definekey('count');
41   ha.definedata('last');
42   ha.definedone();
43   declare hash _ha(hashexp: 20);
44   _ha.definekey('key');
45   _ha.definedone();
46
47   if 0 then set full;
48   declare hash from_to(dataset:'full(where=(from is not missing and to is not
48 ! missing))',hashexp:20,multidata&colon:'y');
                                          -
                                          22
                                          76
WARNING: Apparent symbolic reference COLON not resolved.
ERROR 22-322: Syntax error, expecting one of the following: <, <=, =, >, >=, EQ, GE, GT, LE, LT,
              NE, NG, NL, ^=, ~=.

ERROR 76-322: Syntax error, statement will be ignored.

49    from_to.definekey('from');
50    from_to.definedata('to');
51    from_to.definedone();
52
53   if 0 then set node;
ERROR: File WORK.NODE.DATA does not exist.
54   declare hash no(dataset:'node');
55   declare hiter hi_no('no');
56    no.definekey('node');
57    no.definedata('node');
58    no.definedone();
59
60
61   do while(hi_no.next()=0);
62    household+1; output;
63    count=1;
64    key=node;_ha.add();
65    last=node;ha.add();
66    rc=hi.first();
67    do while(rc=0);
68      from=last;rx=from_to.find();
69      do while(rx=0);
70        key=to;ry=_ha.check();
71         if ry ne 0 then do;
72          node=to;output;rr=no.remove(key:node);
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

ERROR: DATA STEP Component Object failure.  Aborted during the COMPILATION phase.
NOTE: The SAS System stopped processing this step because of errors.
73          key=to;_ha.add();
74          count+1;
75          last=to;ha.add();
76         end;
77         rx=from_to.find_next();
              -----------------
              558
ERROR 558-185: Reference find_next is not a member of object from_to.

78      end;
79      rc=hi.next();
80   end;
81   ha.clear();_ha.clear();
82   end;
83   stop;
84   run;
Ksharp
Super User

Yeah. I believe you SAS is too old to recognize the Hash Method  ha.output() .

And &colon; is not a macro vaiable. It is a mistake of SAS Forum .

You need 

Change 

multidata&colon;'y'

into

m.....a : 'y'

 

 

Copy the LOG here ,running under SAS University Edition .

Good Luck.

 

 

 

 

 
 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 51         
 52         data have(rename=(old=from contract=to));
 53         contract='1';old='';output;
 54         contract='2';old='1';output;
 55         contract='x';old='';output;
 56         contract='a';old='';output;
 57         contract='3';old='2';output;
 58         contract='y';old='';output;
 59         contract='b';old='a';output;
 60         contract='4';old='3';output;
 61         contract='z';old='';output;
 62         contract='5';old='4';output;
 63         contract='c';old='b';output;
 64         contract='d';old='c';output;
 65         run;
 
 NOTE: The data set WORK.HAVE has 12 observations and 2 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.03 seconds
       cpu time            0.01 seconds
       
 
 66         
 67         
 68         
 69         data full;
 70           set have end=last;
 71           if _n_ eq 1 then do;
 72            declare hash h();
 73             h.definekey('node');
 74              h.definedata('node');
 75              h.definedone();
 76           end;
 77           output;
 78           node=from; h.replace();
 79           from=to; to=node;
 80           output;
 81           node=from; h.replace();
 82           if last then h.output(dataset:'node(where=(node is not missing))');
 83           drop node;
 84         run;
 
 NOTE: The data set WORK.NODE has 12 observations and 1 variables.
 NOTE: There were 12 observations read from the data set WORK.HAVE.
 NOTE: The data set WORK.FULL has 24 observations and 2 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.07 seconds
       cpu time            0.04 seconds
       
 
 85         
 86         
 87         data want(keep=node household);
 88         declare hash ha(ordered:'a');
 89         declare hiter hi('ha');
 90         ha.definekey('count');
 91         ha.definedata('last');
 92         ha.definedone();
 93         declare hash _ha(hashexp: 20);
 94         _ha.definekey('key');
 95         _ha.definedone();
 96         
 97         if 0 then set full;
 98         declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata&colon;'y');
 99          from_to.definekey('from');
 100         from_to.definedata('to');
 101         from_to.definedone();
 102        
 103        if 0 then set node;
 104        declare hash no(dataset:'node');
 105        declare hiter hi_no('no');
 106         no.definekey('node');
 107         no.definedata('node');
 108         no.definedone();
 109        
 110        
 111        do while(hi_no.next()=0);
 112         household+1; output;
 113         count=1;
 114         key=node;_ha.add();
 115         last=node;ha.add();
 116         rc=hi.first();
 117         do while(rc=0);
 118           from=last;rx=from_to.find();
 119           do while(rx=0);
 120             key=to;ry=_ha.check();
 121              if ry ne 0 then do;
 122               node=to;output;rr=no.remove(key:node);
 123               key=to;_ha.add();
 124               count+1;
 125               last=to;ha.add();
 126              end;
 127              rx=from_to.find_next();
 128           end;
 129           rc=hi.next();
 130        end;
 131        ha.clear();_ha.clear();
 132        end;
 133        stop;
 134        run;
 
 NOTE: There were 14 observations read from the data set WORK.FULL.
       WHERE (from is not null) and (to is not null);
 NOTE: There were 12 observations read from the data set WORK.NODE.
 NOTE: The data set WORK.WANT has 12 observations and 2 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.07 seconds
       cpu time            0.07 seconds
       
 
 135        
 136        proc print;run;
 
 NOTE: There were 12 observations read from the data set WORK.WANT.
 NOTE: PROCEDURE PRINT used (Total process time):
       real time           0.06 seconds
       cpu time            0.03 seconds
       
 
 137        
 138        OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 148        

 

 

x.png

 

 

 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 3264 views
  • 0 likes
  • 2 in conversation