Hi guys, suppose I have the following:
data DB;
input ID ID2;
cards;
0001 .
0001 .
0001 .
0001 .
0002 .
0002 0002
0002 0002
0003 0003
0003 .
...;
Is there a way to get the following?
data DB1;
input ID ID2;
cards;
0002 .
0002 0002
0002 0002
0003 0003
0003 .
...;
In other words, if an ID never appears in column ID2, delete all observations for that ID. I tried an IF statement without success, maybe because there are other missing values in ID2. Can anyone help me, please?
Hash Tables can help here:
data DB;
input ID ID2;
cards;
0001 .
0001 .
0001 .
0001 .
0002 .
0002 0002
0002 0002
0003 0003
0003 .
;
run;
proc print;run;
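data want;
  /* Load the non-missing ID2 values into a hash table keyed on ID2 */
  declare hash H(dataset:"DB(keep=ID2 where=(ID2 is not null))");
  H.defineKey('ID2');
  H.defineDone();
  do until (EOF);
    set db end=EOF;
    /* check() returns 0 when the current ID is found among the ID2 values */
    if 0 = H.check(key:ID) then output;
  end;
  stop;
run;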
proc print;run;
Bart
Since you are reading values like 0002 as numeric, simply counting the non-missing ID2 values within each ID will get you the answer. Did you really want them to be treated as numeric?
proc sql;
create table want as select * from DB
group by id
having count(id2)>0 ;
quit;
OR
proc summary data=db nway;
class id;
var id2;
output out=stats n=n_not_missing;
run;
data want;
merge db stats;
by id;
if n_not_missing>0;
drop _type_ _freq_ n_not_missing;
run;
If you want to preserve the order of the original data set, the SQL solution does not guarantee that, but the PROC SUMMARY solution does, provided DB is already sorted by ID (the MERGE with BY ID requires it).
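On the numeric question: if the IDs are really identifiers and the leading zeros matter, a minimal sketch of a character version could look like the following. The names DB_char and want_char are made up for illustration, and only a few of the posted rows are repeated. With list input, a lone period is still read as a missing value for a character variable, and COUNT() in PROC SQL counts non-missing character values as well, so the same HAVING clause applies. The PROC SUMMARY version would not carry over as-is, because the VAR statement accepts only numeric variables.

```sas
/* Sketch: ID and ID2 read as character so the leading zeros survive */
data DB_char;
  input ID $ ID2 $;
  cards;
0001 .
0001 .
0002 .
0002 0002
0003 0003
0003 .
;
run;

proc sql;
  /* COUNT(ID2) counts only the non-missing (non-blank) values in each ID group */
  create table want_char as
  select *
  from DB_char
  group by ID
  having count(ID2) > 0;
quit;
```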
It sounds like you want to keep an observation when its ID value appears somewhere in the ID2 column.
proc sql;
create table want as select a.*
from DB a
where a.id in (select b.id2 from DB b where b.id2 is not null)
;
quit;
But the example data you showed also supports a more restrictive condition: keep the observations for an ID only if, among the observations with that same ID, at least one has ID2 equal to ID.
proc sql;
create table want2 as select a.*
from DB a
where a.id in (select b.id2 from DB b where b.id=b.id2 and b.id2 is not null)
;
quit;
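Both conditions give the same result on the rows posted, so a small made-up example may help show where they differ. The data set name demo, the output names demo_loose and demo_strict, and the values below are hypothetical, chosen only so that one ID (0005) shows up in ID2 on a row that belongs to a different ID.

```sas
/* Hypothetical data: ID 5 appears in ID2, but only on a row of ID 4 */
data demo;
  input ID ID2;
  cards;
0004 0005
0005 .
0006 0006
;
run;

proc sql;
  /* Looser rule: keep an ID if its value appears anywhere in ID2.
     Keeps the rows for ID 5 and ID 6. */
  create table demo_loose as
  select a.*
  from demo a
  where a.id in (select b.id2 from demo b where b.id2 is not null);

  /* Stricter rule: keep an ID only if one of its own rows has ID2 = ID.
     Keeps only the row for ID 6. */
  create table demo_strict as
  select a.*
  from demo a
  where a.id in (select b.id2 from demo b
                 where b.id = b.id2 and b.id2 is not null);
quit;
```

Which of the two fits depends on whether an ID2 value recorded under one ID should be allowed to rescue the rows of another ID.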