Hi All,
Is there a way to remove duplicates from all of the data except for a few values that I want to leave untouched?
In the following table I want to remove the duplicates, but I want to exclude the rows where Course_ID=Eng. Can somebody help, please? Thanks.
Table1:
Student_ID Course_ID
101 Eng
102 Bio
102 Geo
102 Bio
102 Geo
101 Eng
PROC SORT DATA= table1
OUT = want
NODUPKEY DUPOUT= have_DUPDEL;
BY Student_ID Course_ID;
RUN;
Desired output:
Student_ID Course_ID
101 Eng
101 Eng
102 Bio
102 Geo
data have;
input Student_ID Course_ID $;
cards;
101 Eng
102 Bio
102 Geo
102 Bio
102 Geo
101 Eng
;
run;
PROC SORT DATA= have OUT = temp;
BY Student_ID Course_ID;
RUN;
data want;
set temp;
by Student_ID Course_ID;
if first.Course_ID or Course_ID='Eng';
run;
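/* View that tags each Eng row with its own row number; keepdup stays missing for every other row */
data vtemp / view=vtemp;
set have;
if course_id='Eng' then keepdup=_n_;
run;
/* keepdup makes every Eng row a unique key, so NODUPKEY drops duplicates only among the other courses */
proc sort data=vtemp out=want (drop=keepdup) nodupkey DUPOUT= have_DUPDEL ;
by student_id course_id keepdup;
run;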
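Another option, offered only as a sketch and not as something posted in this thread, is PROC SQL: deduplicate the non-Eng rows with SELECT DISTINCT and append the Eng rows unchanged with UNION ALL. It assumes the have data set created in the DATA step above; the output name want_sql is a hypothetical name chosen here so that it does not overwrite want.

```sas
proc sql;
   /* want_sql is a hypothetical output name, not from the original posts */
   create table want_sql as
   /* Non-Eng rows: keep one copy of each Student_ID/Course_ID pair */
   select distinct Student_ID, Course_ID
      from have
      where Course_ID ne 'Eng'
   union all
   /* Eng rows: pass through as-is, duplicates included */
   select Student_ID, Course_ID
      from have
      where Course_ID = 'Eng'
   /* Sort the combined result the same way as the PROC SORT answers */
   order by Student_ID, Course_ID;
quit;
```

With the sample data this should return the same four rows as the output shown in the question. Unlike the PROC SORT version, it does not write the removed duplicates to a DUPOUT= data set.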