I have a dataset from which I want to delete duplicate entries. The problem is that although a person appears in it twice with identical demographic characteristics, my variable of interest needs to be averaged across the two rows before I delete the second row.
The data set looks as follows:
Do you actually want all those variables? SAS has plenty of ways to compute an average.
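/* Sort by person so that BY-group processing works */
proc sort data=have;
   by PersonID;
run;

/* Compute the mean of EXP for each person */
proc summary data=have;
   by PersonID;
   var EXP;
   output out=means (keep=PersonID MeanEXP) mean=MeanEXP;
run;

/* Attach the mean to the original data and keep only the first row per person */
data want;
   merge have means;
   by PersonID;
   if first.PersonID;
run;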
You don't have to merge the two together ... you could keep the two data sets separate if that fits your needs better.
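As one of the other ways to get the average, here is a minimal PROC SQL sketch that collapses each person to a single row in one step. The data set have_demo and the demographic variables Age and Sex are made up for illustration; substitute your own data set and list your real demographic variables in both the SELECT and GROUP BY clauses. It assumes the demographics really are identical across a person's rows; if they differ, you will get more than one row for that person.

```sas
/* Hypothetical sample data: Age and Sex stand in for your demographic variables */
data have_demo;
   input PersonID Age Sex $ EXP;
   datalines;
1 34 F 10
1 34 F 20
2 51 M 7
;
run;

/* One row per person: the demographics are carried along, EXP is averaged */
proc sql;
   create table want_demo as
   select PersonID, Age, Sex, mean(EXP) as MeanEXP
   from have_demo
   group by PersonID, Age, Sex;
quit;
```

With this sample data, person 1 ends up with MeanEXP = 15 and person 2 with MeanEXP = 7. No PROC SORT is needed beforehand, since GROUP BY does not require sorted input.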
Good luck.
You rock - been coding all day and could not figure out how to deal with this.
Thanks so much!