Hi guys,
I have a very large dataset. I know how to delete rows based on certain values, but I want to know how to delete specific rows. For example, take the following table:
ID | time | income |
1 | 1 | .. |
1 | 2 | .. |
1 | 3 | .. |
2 | 1 | .. |
2 | 2 | .. |
2 | 3 | .. |
3 | 1 | .. |
4 | 1 | .. |
4 | 2 | .. |
4 | 3 | .. |
5 | 1 | .. |
6 | 1 | .. |
6 | 2 | .. |
6 | 3 | .. |
I want to delete rows where an ID has only one time value, i.e. if driver_id occurs only once in the data, I want to delete that row. Please advise!
jessica
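Use BY-group processing with "if first.id and last.id then delete". A row that is both the first and the last in its ID group is the only row for that ID.
/* BY-group processing requires the data to be sorted by ID */
proc sort data=have;
  by id;
run;
data want;
  set have;
  by id;
  /* first.id and last.id are both 1 only when the ID has a single row */
  if first.id and last.id then delete;
run;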
Hi Scott,
This would work if I wanted to remove IDs that have only one row, but I need a condition that says: if a particular ID has fewer than 3 rows (i.e. it appears fewer than three times in the ID column), then I want that ID and all of its rows out of the sample.
jessica
Sorry, I must have been confused about what you were looking for. I thought you only wanted to remove IDs with one obs.
How about this DOW loop example? Just customize the condition on the OUTPUT statement to change how many obs a group needs in order to be kept or removed; the row count for each ID is held in the COUNT variable.
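DATA WANT;
  /* First pass: count the rows in the current ID group (HAVE must be sorted by ID) */
  DO UNTIL (LAST.ID);
    SET HAVE;
    BY ID;
    IF FIRST.ID THEN COUNT = 1;
    ELSE COUNT + 1;
  END;
  /* Second pass: re-read the same group and keep it only if it has at least 3 rows */
  DO UNTIL (LAST.ID);
    SET HAVE;
    BY ID;
    IF COUNT >= 3 THEN OUTPUT;
  END;
RUN;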
Hi,
Something like (not tested):
proc sql;
  create table WANT as
  select *
  from HAVE
  group by ID
  having count(ID) >= 3;  /* keep only IDs with at least 3 rows */
quit;
proc sql;
  select * from have group by id having count(*) >= 3;
quit;
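To try the GROUP BY / HAVING idea on the sample table from the question, here is a minimal sketch (not run here). It assumes the dataset is called HAVE, that the columns are named id, time and income, and that income can be left missing since the original post only shows ".." for it. The threshold of 3 comes from the follow-up requirement that IDs with fewer than 3 rows should go.

```sas
/* Sample data mirroring the table in the question.
   income is left missing (.) because the post does not give real values. */
data have;
  input id time income;
  datalines;
1 1 .
1 2 .
1 3 .
2 1 .
2 2 .
2 3 .
3 1 .
4 1 .
4 2 .
4 3 .
5 1 .
6 1 .
6 2 .
6 3 .
;
run;

/* Keep every row of an ID only when that ID has at least 3 rows.
   PROC SQL remerges the group count back onto the detail rows,
   so expect a remerging note in the log. */
proc sql;
  create table want as
  select *
  from have
  group by id
  having count(*) >= 3
  order by id, time;
quit;
```

With this sample, IDs 3 and 5 have a single row each and should be dropped, leaving the 12 rows for IDs 1, 2, 4 and 6. Changing the 3 in the HAVING clause changes the minimum number of rows an ID needs to stay in the sample.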