BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Sorry for the noobish question. I'm a sql background trying to get a handle on SAS...

So I've been adding a variable based on matching observations to a second table...

proc sql;
create table matching as
select a.*, b.id_final_new as match_id
from people a left join responders b
on a.id_one=b.final_id_one and a.id_two=b.final_id_two;
quit;

This takes forever! Can I do the same in a quick data step?
4 REPLIES 4
LAP
Quartz | Level 8 LAP
Quartz | Level 8
I have found similiar problems with SQL joins of large data sets. While the data step is usually faster, merging requires that the joined data be presorted. I have have began experimenting using Hash tables in the data step to join records. While I'm not comfortable fully explaining them, they have provided the performance enhancements that I wanted. There are many good SAS Global forum papers that explain and give examples. Just search hash tables in tech support search.

Here's one paper....

http://www2.sas.com/proceedings/forum2008/029-2008.pdf
deleted_user
Not applicable
Thanks, looks like I've got a lot of reading to do...
Oleg_L
Obsidian | Level 7
You can test this code. If dataset people is not very large then should work very fast.

data responders(keep=id_one id_two id_final index=(cr1=(id_one id_two)));
set responders(rename=(final_id_one=id_one final_id_two=id_two));
run;

data matching;
set people;
set responders key=cr1/unique;
select(_iorc_);
when(%sysrc(_sok)) do; end;
when(%sysrc(_dsenom)) do; id_final= . /* if id_final is char then should be id_final=''*/; _error_ = 0; end;
otherwise do;
put 'ERROR_: Unexpected value for _IORC_= ' _iorc_ ' Program terminating. Data set accessed is responders';
put _all_; _error_ = 0; stop; end;
end;
run;
ChrisNZ
Tourmaline | Level 20
You could investigate using a hash table in a data step.
While the syntax is unusual at first glance, it is quite easy to grasp.
Hash tables are fast as they are loaded in memory, and don't require any prior sorting.
The pseudo code looks like:
data matching ;
if _N_=1 then load hash table;
set base table;
find match;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 708 views
  • 0 likes
  • 4 in conversation