BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Hi all,
can anybody tell me what is the fastest method to find some records from a dataset in another dataset?
I explain better: I have two dataset, the first one has 6000 records and differents columns; the second one has 500 records and just one column with the ID.
I need to select in the first dataset the records with the ID from the second dataset. I have tried with merge and inner join (the where in (,,,) woluld be crazy), but the program runs to slowly, around 4 minutes, then i'd want to know if there is another faster method to do this.

Thanks in advance,
Elena
3 REPLIES 3
Reeza
Super User
try something like

proc sql;
create table want as
select * from have1
where id in (select id from have2);
quit;

I have no idea if that will be faster....but unless you have hundreds of text columns I'm not sure why it would take 4 minutes.
ArtC
Rhodochrosite | Level 12
For data sets of your size (assuming a reasonable number of columns) the SQL step ought to do the trick. If the SQL look-up is too slow, there are other alternatives that generally require more coding, but can usually process quicker. The DATA Step Components (HASH) Objects are about the fastest. An overview of look-up techniques can be found at:
http://caloxy.com/papers/43-i_how_table_lookups_from_ift.pdf
Ksharp
Super User
Hi.
I think you should first index these two dataset by using proc sort.then that wil be faster.
And if you do not want sort.;
there are some code i copy from somewhere.Wish will help you .

[pre]
proc sort data=small(keep=id) nodupkey force;*small is your second dataset;
by id;
run;

data fmt(rename=(id=start));
retain fmtname 'key'
type 'N' *'N' means numerical type format,if id is character type then use 'C';
label 'Y';
set small end=eof;
output;
if eof then do;
HLO='o';
label='N';
output;
end;

proc format cntlin=fmt;
run;

data matched;
set test; *test is your first dataset;
where put(id,key.)='Y';
run;
[/pre]

Ksharp

Message was edited by: Ksharp

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1707 views
  • 0 likes
  • 4 in conversation