Hi All,
Is there a way to remove duplicates from all the data except for a few rows that I want to leave alone?
In the following table I want to remove the duplicates, but exclude rows where Course_ID = 'Eng', so duplicate Eng rows should be kept. Can somebody help, please? Thanks.
Table1:
Student_ID Course_ID
101 Eng
102 Bio
102 Geo
102 Bio
102 Geo
101 Eng
What I tried (this also removes the duplicate Eng row, which I want to keep):
PROC SORT DATA=table1
          OUT=want
          NODUPKEY DUPOUT=have_DUPDEL;
  BY Student_ID Course_ID;
RUN;
Desired output:
Student_ID Course_ID
101 Eng
101 Eng
102 Bio
102 Geo
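data have;
  input Student_ID Course_ID $;
  cards;
101 Eng
102 Bio
102 Geo
102 Bio
102 Geo
101 Eng
;
run;

/* Sort so that duplicates sit next to each other within each BY group */
PROC SORT DATA=have OUT=temp;
  BY Student_ID Course_ID;
RUN;

data want;
  set temp;
  by Student_ID Course_ID;
  /* Keep the first row of each Student_ID/Course_ID pair, plus every Eng row */
  if first.Course_ID or Course_ID='Eng';
run;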
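/* View that tags each Eng row with a unique value; keepdup stays missing for all other rows */
data vtemp / view=vtemp;
  set have;
  if course_id='Eng' then keepdup=_n_;
run;

/* With keepdup in the BY list, Eng rows are never duplicate keys, so NODUPKEY only removes the other duplicates */
proc sort data=vtemp out=want (drop=keepdup) nodupkey DUPOUT=have_DUPDEL;
  by student_id course_id keepdup;
run;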
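
To confirm that the Eng duplicates survive while the other duplicates are dropped, a quick check along these lines may help. It is only a sketch and assumes the data sets want and have_DUPDEL were created by the PROC SORT step above, from the six sample rows in the question.

/* Rows kept: both 101 Eng rows, plus one 102 Bio row and one 102 Geo row */
proc print data=want noobs;
run;

/* Rows removed as duplicates: the second 102 Bio and 102 Geo rows.
   keepdup still appears here because DROP= was applied only to the OUT= data set. */
proc print data=have_DUPDEL noobs;
run;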