Hi guys, is there a way to remove rows in a file that are not present in a second file by matching the column ID?
Thank you in advance
For example suppose to have:
data set1
ID AGE Role
001 35-50 A
002 55-60 B
003 35-50 C
004 35-50 D
005 35-50 E
and data set2
ID Hours Vacancies
002 900 21
003 456 54
005 45 0
I would like the data set1 to become:
ID AGE Role
002 55-60 B
003 35-50 C
005 35-50 E
Use a hash:
data want;
set ds1;
if _n_ = 1
then do;
declare hash ds2 (dataset:"ds2");
ds2.definekey("id");
ds2.definedone();
end;
if ds2.check() = 0; /* CHECK method returns zero if successful */
run;
Untested; for tested code, provide example data in data steps with datalines.
Yes there is. Please show us some data. Makes it much easier to provide a usable code answer.
Use a hash:
data want;
set ds1;
if _n_ = 1
then do;
declare hash ds2 (dataset:"ds2");
ds2.definekey("id");
ds2.definedone();
end;
if ds2.check() = 0; /* CHECK method returns zero if successful */
run;
Untested; for tested code, provide example data in data steps with datalines.
Here is two different approaches with the same result
/* Example Data */
data set1;
input ID $ AGE $ Role $;
datalines;
001 35-50 A
002 55-60 B
003 35-50 C
004 35-50 D
005 35-50 E
;
data set2;
input ID $ Hours Vacancies;
datalines;
002 900 21
003 456 54
005 45 0
;
/* Proc SQL Approach */
proc sql;
create table want as
select set1.*
from set1, set2
where set1.ID = set2.ID
;
quit;
/* Data Step Hash Approach */
data want;
if _N_ = 1 then do;
dcl hash h(dataset : 'set2');
h.definekey('ID');
h.definedone();
end;
set set1;
if h.check() = 0;
run;
/* Example Data */
data set1;
input ID $ AGE $ Role $;
datalines;
001 35-50 A
002 55-60 B
003 35-50 C
004 35-50 D
005 35-50 E
;
data set2;
input ID $ Hours Vacancies;
datalines;
002 900 21
003 456 54
005 45 0
;
/* Proc SQL Approach */
proc sql;
create table want as
select set1.*
from set1
where ID in (select ID from set2)
;
quit;
Are the data sorted? If so, and if data set 2 never has more than one observation per id, then:
data want;
merge ds1 (in=in1) ds2 (keep=id in=in2);
by id;
if in1=in2;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.