Hi there,
I recently started learning hash objects. While surfing the net, I found that we can also use a SET statement to pass the data and key values into a hash object, and I have been trying this approach as shown below. What I'm looking for is a way to use a SET statement instead of the dataset: argument tag, and to merge both datasets (ignore the duplicates in the hash object keys and values).
Raw datasets:
data participants;
   input name $ gender:$1. treatment $;
   datalines;
John M Placebo
Ronald M Drug-A
Barbara F Drug-B
Alice F Drug-A
;
data weight(drop=i);
   input date:DATE9. @;
   do i = 1 to 4;
      input name $ weight @;
      output;
   end;
   /* For brevity, only two dates are listed below */
   datalines;
05May2006 Barbara 125 Alice 130 Ronald 170 John 160
04Jun2006 Barbara 122 Alice 133 Ronald 168 John 155
;
Tried Code:
data results ;
   if _n_=1 then do ;
      declare hash h( ) ;
      h.definekey('name') ;
      h.definedata('name','weight') ;
      h.definedone() ;
   end;
   set weight end=eof ;
   if h.find() ne 0 then
      h.add() ;
   set participants ;
   if h.find()=0 then
      output ;
   h.output(dataset: 'a');
run;
Using this, I am getting only two records.
I'm getting correct results using the dataset: argument tag:
Code:
data results1 ;
   length weight 8;
   /*attrib weight length=8;*/
   /*retain weight . ;*/
   if _n_=1 then do ;
      declare hash h(dataset: 'weight') ;
      h.definekey('name') ;
      h.definedata('weight') ;
      h.definedone() ;
      call missing(weight) ;
   end;
   set participants ;
   if h.find()=0 then output ;
run;
Please help me understand both approaches.
Thanks in advance
Please post code using the running-man icon in Rich Text view to preserve formatting.
The first SET statement reads just one observation from work.weight (Barbara) per iteration of the data step and adds it to the hash. Before any other observations are read from work.weight, the first observation from work.participants (John) is read. At this point "John" is not in the hash, so no observation is written to work.results. To solve the problem, you have to move the code that fills the hash into the if _n_ = 1 block and write an explicit loop that reads all observations from work.weight before work.participants is processed. But I really can't recommend doing this.
data work.narf;
   if _n_=1 then
      do;
         declare hash h();
         h.definekey('name');
         h.definedata('name', 'weight');
         h.definedone();
         do until (jobDone);
            set work.weight end=jobDone;
            if h.check() ^=0 then
               do;
                  h.add();
               end;
         end;
      end;
   set work.participants;
   if h.find() = 0 then output;
run;
The first code gives you an incorrect result because you fill up the hash object with values from weight as you look for values in participants. You have to fill up your hash object lookup table entirely before you do the actual lookup.
As an example, in the first iteration of the data step you correctly declare the hash object. Then you add the key value Barbara to the hash object. Next, you read in the first observation from participants (John) and look for John in the hash object. However, John is not there yet because you only added Barbara so far.
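To see this interleaving directly, a trace step along the following lines may help. It is only a sketch: it assumes the work.weight and work.participants datasets from the question already exist, and the variables weight_name, rc and n_items are names made up here for illustration.

```sas
/* Trace sketch: assumes work.weight and work.participants from the question exist. */
/* weight_name, rc and n_items are hypothetical helper variables.                    */
data _null_;
   if _n_ = 1 then do;
      declare hash h();
      h.definekey('name');
      h.definedata('name', 'weight');
      h.definedone();
   end;

   set weight;                       /* reads ONE weight observation per iteration */
   weight_name = name;               /* remember which name was just read          */
   if h.find() ne 0 then h.add();    /* add it if the key is not in the hash yet   */

   set participants;                 /* reads ONE participant per iteration        */
   rc = h.find();                    /* 0 means the participant is already loaded  */
   n_items = h.num_items;            /* how many keys the hash holds right now     */
   put _n_= weight_name= name= rc= n_items=;
run;
```

With the sample data, the log should show rc as nonzero for John and Ronald (iterations 1 and 2) and 0 for Barbara and Alice (iterations 3 and 4), which matches the two records in work.results. The step ends as soon as the second SET statement runs out of participants, so the remaining weight observations are never read.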
It seems like you are well on your way to learning hash objects, and hats off to you, because it is an advanced topic and the syntax is funky at first 🙂 For an excellent book, check out Data Management Solutions Using SAS Hash Table Operations.
The code can also be written like this:
data results;
   if _n_=1 then do;
      declare hash h();
      h.definekey('name');
      h.definedata('name','weight');
      h.definedone();
      do until (eof);
         set weight end=eof;
         h.replace();
      end;
   end;
   set participants;
   if h.find()=0;
run;
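One difference between the two loading styles is worth checking when a key occurs more than once, as name does in work.weight: check() followed by add() keeps the first weight seen for a name, while replace() keeps the last one. The following sketch loads both ways side by side. It assumes work.weight from the question exists; the hash names first_seen and last_seen and the output datasets work.first_weights and work.last_weights are made up for illustration.

```sas
/* Sketch: assumes work.weight from the question exists.                  */
/* first_seen, last_seen and the two output datasets are hypothetical.    */
data _null_;
   declare hash first_seen(ordered: 'yes');
   first_seen.definekey('name');
   first_seen.definedata('name', 'weight');
   first_seen.definedone();

   declare hash last_seen(ordered: 'yes');
   last_seen.definekey('name');
   last_seen.definedata('name', 'weight');
   last_seen.definedone();

   do until (eof);
      set weight end=eof;
      /* check()/add(): only the first observation per name is stored */
      if first_seen.check() ne 0 then first_seen.add();
      /* replace(): each later observation overwrites the earlier one */
      last_seen.replace();
   end;

   first_seen.output(dataset: 'work.first_weights');
   last_seen.output(dataset: 'work.last_weights');
   stop;                             /* everything was read in the loop above */
run;
```

With the sample data, work.first_weights should hold the 05May2006 weights (for example Barbara 125) and work.last_weights the 04Jun2006 weights (Barbara 122). Also note that in the lookup steps above, date is not part of the hash data, so the date in the output is simply the value retained from the last observation read from weight; adding drop=date to the output dataset may be safer.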