Hi guys,
I have a very big dataset. I know how to delete rows based on certain values, but I want to know how to delete specific rows. For example, take the following table:
ID | time | income |
1 | 1 | .. |
1 | 2 | .. |
1 | 3 | .. |
2 | 1 | .. |
2 | 2 | .. |
2 | 3 | .. |
3 | 1 | .. |
4 | 1 | .. |
4 | 2 | .. |
4 | 3 | .. |
5 | 1 | .. |
6 | 1 | .. |
6 | 2 | .. |
6 | 3 | .. |
I want to delete the rows where an ID has only one time value, i.e. if an ID (driver_id in my data) occurs only once, I want to delete that row. Please advise!
jessica
Use BY-group processing: sort by id, then delete any row that is both the first and the last row of its ID (if first.id and last.id then delete).
proc sort data=have;
by id;
run;
data want;
set have;
by id;
if first.id and last.id then delete;
run;
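For anyone following along, here is a minimal sketch of what first.id and last.id hold inside a BY group. It assumes a small made-up dataset named HAVE with a few rows from the example table (income omitted); the variable names is_first, is_last and single are hypothetical and only there for illustration.

```sas
/* Hypothetical sample data: a few rows based on the table in the question */
data have;
  input id time;
  datalines;
1 1
1 2
1 3
3 1
4 1
4 2
;
run;

/* The data above are already in id order; run PROC SORT first if yours are not */
data flags;
  set have;
  by id;
  is_first = first.id;                /* 1 on the first row of each ID */
  is_last  = last.id;                 /* 1 on the last row of each ID */
  single   = (first.id and last.id);  /* 1 only when the ID has exactly one row */
run;

proc print data=flags noobs;
run;
```

In this sample only ID 3 gets single = 1, which is why the delete condition removes IDs with exactly one row and nothing else.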
Hi Scott,
I could use this if I wanted to remove IDs that have only one row, but I need a condition that says: if a particular ID has fewer than 3 rows (i.e. it appears fewer than three times in the ID column), then I want that ID and all of its rows removed from the sample.
jessica
Sorry, I must have been confused about what you were looking for. I thought you were only after IDs with one obs.
How about this DOW loop example? Just customize the OUTPUT statement to change the number of obs you want to keep or remove; the number of rows per ID is held in the COUNT variable.
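DATA WANT;
  /* HAVE must be sorted by ID. First pass: count the rows in the current ID group. */
  DO UNTIL (LAST.ID);
    SET HAVE;
    BY ID;
    IF FIRST.ID THEN COUNT = 1;
    ELSE COUNT + 1;
  END;
  /* Second pass: re-read the same group and output it only if it has 3 or more rows. */
  DO UNTIL (LAST.ID);
    SET HAVE;
    BY ID;
    IF COUNT >= 3 THEN OUTPUT;
  END;
RUN;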
Hi,
Something like (not tested):
proc sql;
create table WANT as
select *
from HAVE
group by ID
having count(ID) >= 3;
quit;
proc sql;
select * from have group by id having count(*)>=3;
quit;
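A minimal sketch for trying the GROUP BY / HAVING idea against the sample table from the question, assuming the goal is to keep only IDs that have at least three rows. The dataset names HAVE and WANT follow the replies above; the income column is left out because its values were not given.

```sas
/* Sample data copied from the table in the question (income omitted) */
data have;
  input id time;
  datalines;
1 1
1 2
1 3
2 1
2 2
2 3
3 1
4 1
4 2
4 3
5 1
6 1
6 2
6 3
;
run;

proc sql;
  create table want as
  select *
  from have
  group by id
  having count(*) >= 3   /* keep only IDs with at least 3 rows */
  order by id, time;
quit;

proc print data=want noobs;
run;
```

With this sample, IDs 3 and 5 are dropped and the 12 rows for IDs 1, 2, 4 and 6 remain. Because select * is combined with GROUP BY, SAS writes a note to the log about remerging summary statistics with the original data; that is expected for this pattern.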