Re: Deleting single ID rows

Jessica98 · Posted 05-30-2014 01:48 AM

Hi guys,

I have a very big dataset. I know how to delete rows based on certain values, but I want to know how to delete specific rows. For example, take the following table:

ID	time	income
1	1	..
1	2	..
1	3	..
2	1	..
2	2	..
2	3	..
3	1	..
4	1	..
4	2	..
4	3	..
5	1	..
6	1	..
6	2	..
6	3	..

I want to delete the rows where an ID has only one time value, i.e. if driver_id (ID in the table above) occurs only once in the data, I want to delete that row. Please advise!

jessica

Scott_Mitchell · Posted 05-30-2014 02:12 AM

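Use BY-group processing with if first.id and last.id then delete. Both flags are true at the same time only when an ID has exactly one row.

proc sort data=have;
  by id;
run;

data want;
  set have;
  by id;
  /* first.id and last.id are both 1 only for an ID with a single row */
  if first.id and last.id then delete;
run;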

Jessica98 · Posted 05-30-2014 02:33 AM

Hi Scott,

I could use this if I wanted to remove IDs that have only one row, but I need a condition that says: if a particular ID has fewer than 3 rows (i.e. it appears fewer than three times in the ID column), then I want that ID and all its associated rows out of the sample.

jessica

Scott_Mitchell · Posted 05-30-2014 04:30 AM

Sorry, I must have been confused about what you were looking for. I thought you wanted to remove only IDs with one observation.

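How about this DOW loop example? The first loop counts the rows for each ID in the COUNT variable; the second loop re-reads the same rows and outputs them only if the count passes the test. As written, it keeps only IDs with 3 or more rows. Just customize the condition on the OUTPUT statement to change which IDs you keep or remove. HAVE must already be sorted by ID.

DATA WANT;
  /* First pass: count the rows in the current ID group */
  DO UNTIL (LAST.ID);
    SET HAVE;
    BY ID;
    IF FIRST.ID THEN COUNT = 1;
    ELSE COUNT + 1;
  END;
  /* Second pass: re-read the same group and keep it only if it has 3 or more rows */
  DO UNTIL (LAST.ID);
    SET HAVE;
    BY ID;
    IF COUNT >= 3 THEN OUTPUT;
  END;
RUN;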

RW9 · Posted 05-30-2014 04:30 AM

Hi,

Something like (not tested):

proc sql;
  create table WANT as
  select *
  from HAVE
  /* count rows per ID; without GROUP BY the count covers the whole table */
  group by ID
  having count(ID) >= 3;
quit;

slchen · Posted 05-30-2014 07:56 AM

proc sql;
  /* keep only IDs that have 3 or more rows */
  select * from have group by id having count(*) >= 3;
quit;
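
For anyone who wants to try the GROUP BY / HAVING approach on the sample table from the first post, here is a minimal self-contained sketch. It is an illustration under assumptions, not tested against the poster's real data: the income column is left out because its values were not given, and the macro variable min_rows is a hypothetical name introduced here for the threshold.

```sas
/* Sample IDs and times copied from the first post; income is omitted */
data have;
  input id time;
  datalines;
1 1
1 2
1 3
2 1
2 2
2 3
3 1
4 1
4 2
4 3
5 1
6 1
6 2
6 3
;
run;

/* Hypothetical threshold: minimum number of rows an ID needs to be kept */
%let min_rows = 3;

proc sql;
  create table want as
  select *
  from have
  group by id                     /* count rows within each ID */
  having count(*) >= &min_rows    /* drop IDs with fewer rows than the threshold */
  order by id, time;
quit;
```

With this sample, IDs 3 and 5 (one row each) are dropped and WANT holds the 12 rows for IDs 1, 2, 4 and 6. The SAS log notes that summary statistics are remerged with the original data, which is expected when select * is combined with GROUP BY.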
