I have a dataset where I want to delete duplicate entries for the same person. I have an employee ID field and then two values A and B for a second variable. In any case where there is a duplicate employee ID, I want to delete the entry that has a value of B for the second variable. Is there a simple way to code for this?
Assuming your data are sorted by empID, how about interleaving the non-B rows with the B rows and keeping the first row per ID:
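data REDUCED ;
  /* interleave by empID: within each ID the non-B rows come first, then the B rows */
  set yourdata( where=( secondvar NE 'B' ))
      yourdata( where=( secondvar EQ 'B' )) ;
  by empID ;
  /* keep the first row per ID, so a B row survives only when it is the sole entry */
  if first.empID ;
run ;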
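To see what the interleave does, here is a minimal sketch on made-up data. The rows in the DATALINES block are hypothetical; only the names yourdata, empID, secondvar, and REDUCED come from the step above. Employee 101 has both an A and a B row, 102 has only a B row, and 103 has only an A row.

```sas
/* Hypothetical sample data, already sorted by empID */
data yourdata ;
  input empID secondvar $ ;
  datalines ;
101 A
101 B
102 B
103 A
;
run ;

/* Same logic as above: non-B rows first within each empID, then B rows */
data REDUCED ;
  set yourdata( where=( secondvar NE 'B' ))
      yourdata( where=( secondvar EQ 'B' )) ;
  by empID ;
  if first.empID ;
run ;

proc print data=REDUCED noobs ;
run ;
```

REDUCED should contain 101 A, 102 B, and 103 A: the B row for 101 is dropped because an A row exists, while 102 keeps its only row.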
Hello, you can do this by sorting and then using FIRST. processing in a DATA step, which has the same effect as NODUPKEY.
Here is the code logic:
proc sort data = test ;
by employeeID Column2;
run;
data test2;
set test;
by employeeID ;
if first.employeeID then output; /* FIRST. exists only for BY variables, so test employeeID, not column2 */
run;
This keeps the first row for each employeeID. Because the sort puts A before B, a B row is kept only when the employee has no A row.
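For comparison, here is a sketch of the NODUPKEY route mentioned above, assuming the same dataset test with variables employeeID and Column2. The intermediate dataset name sorted is made up for illustration. It relies on PROC SORT's default EQUALS behavior, which keeps rows with equal keys in their input order, so the first row per employeeID (the A row when one exists) should be the one NODUPKEY retains.

```sas
/* Step 1: order rows so that A comes before B within each employeeID */
proc sort data = test out = sorted ;
  by employeeID Column2 ;
run ;

/* Step 2: keep one row per employeeID; with the default EQUALS option */
/* the first row in input order is the one that is kept                */
proc sort data = sorted out = test2 nodupkey ;
  by employeeID ;
run ;
```

If NOEQUALS has been set in your session, the retained row is not guaranteed, and the DATA step with first.employeeID is the safer choice.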
Thanks