hi all,
I have a large data set that includes multiple rows for some subjects. Can anyone tell me how to (1) view all of the entries for which this is the case and (2) remove all entries except the first one for each subject, and ideally place the removed entries in a different data set?
tia for any help,
leslie
Proc Sort should give you an easy option to implement such logic.
/* Sample data: eight rows, some subjects repeated */
data have;
  do subject=3,2,4,1,1,3,2,2;
    otherVar+1;   /* running row counter, 1 through 8 */
    output;
  end;
  stop;
run;
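/* NODUPKEY keeps the first row per BY value in OUT=; */
/* DUPOUT= collects the rows that were dropped        */
proc sort nodupkey
  data=have
  out=firstSubj
  dupout=dupSubj
  ;
  by subject;
run;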
title 'firstSubj';
proc print data=firstSubj;
run;
title 'dupSubj';
proc print data=dupSubj;
run;
title;
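Note that dupSubj holds only the second and later rows for each subject, so it does not include the first row of a repeated subject. For part 1 (viewing every row of any subject that appears more than once), one possible sketch uses PROC SQL with a HAVING clause. The table name multiSubj is made up for illustration, and the code assumes the sample data set have created above.

```sas
/* Sketch: keep every row whose subject occurs more than once.   */
/* multiSubj is a hypothetical name, not part of the original.   */
proc sql;
  create table multiSubj as
  select *
  from have
  group by subject
  having count(*) > 1
  order by subject, otherVar;
quit;

title 'multiSubj';
proc print data=multiSubj;
run;
title;
```

With the sample data this should list the seven rows for subjects 1, 2 and 3 and leave out subject 4. SAS writes a log note about remerging summary statistics for this kind of query, which is expected here.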
For part 2:
/* UNTESTED CODE */
proc sql;
create table want as select * from have
group by subject having count(subject)=1;
quit;
Thank you! I am not sure I need to delete yet, so I have not tried your code. Does it remove all entries where count>1, including the first?
thanks again,
leslie
You are correct, and so I withdraw my solution.
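The query above keeps only subjects that have exactly one row, so every row of a repeated subject is dropped, including the first. If a DATA step is preferred, a minimal sketch of the "keep the first, set the rest aside" logic could look like the following. The names sorted, firstRows and laterRows are hypothetical, and the code relies on PROC SORT's default EQUALS behavior to preserve the original order of rows within each subject.

```sas
/* Sketch: split rows into first-per-subject and the remainder.   */
/* sorted, firstRows and laterRows are illustrative names only.   */
proc sort data=have out=sorted;
  by subject;   /* default EQUALS keeps input order within a subject */
run;

data firstRows laterRows;
  set sorted;
  by subject;
  if first.subject then output firstRows;  /* first row of each subject */
  else output laterRows;                   /* any additional rows       */
run;
```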
Thank you!