BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Sorry for the noobish question. I'm a sql background trying to get a handle on SAS...

So I've been adding a variable based on matching observations to a second table...

proc sql;
create table matching as
select a.*, b.id_final_new as match_id
from people a left join responders b
on a.id_one=b.final_id_one and a.id_two=b.final_id_two;
quit;

This takes forever! Can I do the same in a quick data step?
4 REPLIES 4
LAP
Quartz | Level 8 LAP
Quartz | Level 8
I have found similiar problems with SQL joins of large data sets. While the data step is usually faster, merging requires that the joined data be presorted. I have have began experimenting using Hash tables in the data step to join records. While I'm not comfortable fully explaining them, they have provided the performance enhancements that I wanted. There are many good SAS Global forum papers that explain and give examples. Just search hash tables in tech support search.

Here's one paper....

http://www2.sas.com/proceedings/forum2008/029-2008.pdf
deleted_user
Not applicable
Thanks, looks like I've got a lot of reading to do...
Oleg_L
Obsidian | Level 7
You can test this code. If dataset people is not very large then should work very fast.

data responders(keep=id_one id_two id_final index=(cr1=(id_one id_two)));
set responders(rename=(final_id_one=id_one final_id_two=id_two));
run;

data matching;
set people;
set responders key=cr1/unique;
select(_iorc_);
when(%sysrc(_sok)) do; end;
when(%sysrc(_dsenom)) do; id_final= . /* if id_final is char then should be id_final=''*/; _error_ = 0; end;
otherwise do;
put 'ERROR_: Unexpected value for _IORC_= ' _iorc_ ' Program terminating. Data set accessed is responders';
put _all_; _error_ = 0; stop; end;
end;
run;
ChrisNZ
Tourmaline | Level 20
You could investigate using a hash table in a data step.
While the syntax is unusual at first glance, it is quite easy to grasp.
Hash tables are fast as they are loaded in memory, and don't require any prior sorting.
The pseudo code looks like:
data matching ;
if _N_=1 then load hash table;
set base table;
find match;
run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 994 views
  • 0 likes
  • 4 in conversation