I have a dataset with multiple records with the same ID. If a specific condition exists for one of the records, I need to delete all the records with that same ID. How can I do this?
Generally, you need to identify the records that meet the condition, select their IDs, and then delete every record with those IDs.
Here's one example:
data have;
input id value;
if value > 100 then Flag=1;
datalines;
1 10
1 15
1 25
2 12
2 125
2 16
3 .
3 1
;
run;
proc sql;
create table want as
select *
from have
/* NOT IN rather than NE: the subquery can return more than one ID */
where id not in (
select distinct id
from have
where flag=1
);
quit;
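The output could also be achieved using a merge step: split the flagged and unflagged records into two data sets, then keep only the IDs that never appear among the flagged records.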
data want1 want2 ;
input id value;
if value > 100 then output want1 ;
else output want2 ;
datalines;
1 10
1 15
1 130
2 12
2 125
2 16
3 .
3 1
;
run;
data want ;
merge want1 (in=a) want2 (in=b) ;
by id;
if (a=0) and (b=1) ;
run ;
Assuming your data set is in order by ID, here is a fairly standard approach:
data want;
ID_wanted='Y';
do until (last.id);
set have;
by id;
if /* some condition goes here */ then ID_wanted='N';
end;
do until (last.id);
set have;
by id;
if ID_wanted='Y' then output;
end;
drop ID_wanted;
run;
The top loop lets you examine all observations for an ID, and determine whether you want them or not. The bottom loop reads in the same observations and outputs if the top loop determines that is appropriate.
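As a concrete illustration, here is a minimal sketch of that two-loop approach with the placeholder filled in. The condition value > 100 and the sample data are assumptions borrowed from the earlier replies in this thread, not part of the original answer; substitute your own condition.

```sas
/* Sample data, already in order by ID (assumed for illustration) */
data have;
  input id value;
  datalines;
1 10
1 15
1 130
2 12
2 125
2 16
3 .
3 1
;
run;

data want;
  /* Reset the flag at the start of each ID group */
  ID_wanted = 'Y';

  /* First pass: check every observation for this ID */
  do until (last.id);
    set have;
    by id;
    if value > 100 then ID_wanted = 'N';  /* assumed condition */
  end;

  /* Second pass: re-read the same observations, output only if wanted */
  do until (last.id);
    set have;
    by id;
    if ID_wanted = 'Y' then output;
  end;

  drop ID_wanted;
run;
```

With this sample data, IDs 1 and 2 each contain a value above 100, so only the two records for ID 3 should remain in want. A missing value compares as lower than any number in SAS, so it does not trigger the condition.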