delete duplicate based on value of a second variable

Kelli · Posted 03-07-2014 11:43 AM

I have a dataset where I want to delete duplicate entries for the same person. I have an employee ID field and then two values A and B for a second variable. In any case where there is a duplicate employee ID, I want to delete the entry that has a value of B for the second variable. Is there a simple way to code for this?

Peter_C · Posted 03-07-2014 12:24 PM

Assuming your data are sorted by empID, how about:

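data REDUCED ;
  /* interleave: within each empID, non-B rows are read before B rows */
  set yourdata( where=( secondvar NE 'B' ))
      yourdata( where=( secondvar EQ 'B' )) ;
  by empID ;
  /* keep the first row per empID; a B row survives only if there is no non-B row */
  if first.empID ;
run ;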
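To see how the interleaving step behaves, here is a minimal sketch on made-up data. The sample rows and the numeric empID values are assumptions for illustration only; the dataset and variable names follow the code above.

```sas
/* Hypothetical sample data, already sorted by empID:
   101 has both A and B, 102 has only B, 103 has only A */
data yourdata ;
  input empID secondvar $ ;
  datalines ;
101 A
101 B
102 B
103 A
;
run ;

/* Same interleave as above: non-B rows come first within each empID */
data REDUCED ;
  set yourdata( where=( secondvar NE 'B' ))
      yourdata( where=( secondvar EQ 'B' )) ;
  by empID ;
  if first.empID ;
run ;

proc print data=REDUCED noobs ;
run ;
```

With these rows, REDUCED should contain 101 A, 102 B and 103 A: the B row for 101 is dropped because an A row exists, while 102 keeps its only entry. Note that this also collapses an ID with several non-B rows down to one row.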

sas121987 · Posted 03-07-2014 12:36 PM

Hello, you can do this in a DATA step using first.variable processing (similar to what NODUPKEY does).

Please find the code logic:

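proc sort data = test ;
  by employeeID Column2 ;   /* A sorts before B within each employeeID */
run ;

data test2 ;
  set test ;
  by employeeID ;
  if first.employeeID then output ;   /* first. exists only for BY variables */
run ;

Here you will get the first row for each employeeID, which is the A row whenever both A and B exist.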

Thanks
