Duplicate removal with condition

Hi All,

Is there a way to remove duplicates from a data set while leaving a few specific values alone?

In the following table I want to remove the duplicate rows, except where Course_ID=Eng; those rows should all be kept. Can somebody help, please? Thanks.

 

Table1:

Student_ID  Course_ID
101         Eng
102         Bio
102         Geo
102         Bio
102         Geo
101         Eng

 

/* My attempt: NODUPKEY keeps one row per BY group, so it also drops the duplicate Eng row */
PROC SORT DATA=table1
          OUT=want
          NODUPKEY DUPOUT=have_DUPDEL;
  BY Student_ID Course_ID;
RUN;

 

Desired output:

Student_ID  Course_ID
101         Eng
101         Eng
102         Bio
102         Geo

 

 


Accepted Solutions
Solution
‎06-25-2016 04:12 PM
Super User
Posts: 10,023

Re: Duplicate removal with condition

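/* Sample data from the question */
data have;
  input Student_ID Course_ID $;
  cards;
101             Eng
102             Bio
102             Geo
102             Bio
102             Geo
101             Eng
;
run;

/* Sort so that identical Student_ID/Course_ID pairs sit next to each other */
proc sort data=have out=temp;
  by Student_ID Course_ID;
run;

/* Keep the first row of every BY group, plus every row where Course_ID is 'Eng' */
data want;
  set temp;
  by Student_ID Course_ID;
  if first.Course_ID or Course_ID='Eng';
run;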
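
For comparison, the same result can probably be reached without the sort plus BY-group step. The following is only a sketch under the assumption that the data looks like the sample in the question; the output name want_sql is made up for illustration. It stacks all Eng rows (untouched) on top of the de-duplicated remaining rows:

```sas
/* Sample data, repeated here so the sketch runs on its own */
data have;
  input Student_ID Course_ID $;
  datalines;
101 Eng
102 Bio
102 Geo
102 Bio
102 Geo
101 Eng
;
run;

proc sql;
  /* want_sql is a hypothetical output name, not from the thread */
  create table want_sql as
    /* every Eng row is kept as-is, duplicates included */
    select Student_ID, Course_ID
      from have
      where Course_ID = 'Eng'
    union all
    /* all other rows are reduced to distinct pairs */
    select distinct Student_ID, Course_ID
      from have
      where Course_ID ne 'Eng'
    order by Student_ID, Course_ID;
quit;
```

Unlike the PROC SORT attempt in the question, this does not produce a DUPOUT= data set of the dropped rows, so it only fits when the removed duplicates are not needed afterwards.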



All Replies
Trusted Advisor
Posts: 1,019

Re: Duplicate removal with condition

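/* View that tags each Eng row with its row number; keepdup stays missing for all other rows */
data vtemp / view=vtemp;
  set have;
  if course_id='Eng' then keepdup=_n_;
run;

/* keepdup makes every Eng row a unique key, so NODUPKEY only drops duplicates of the other courses */
proc sort data=vtemp out=want (drop=keepdup) nodupkey DUPOUT=have_DUPDEL;
  by student_id course_id keepdup;
run;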
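
Since the question mentions "a few" values to leave alone, here is a sketch of how the same view trick might extend to more than one exempt course. Treating 'Geo' as a second exempt value is purely an assumption for illustration, and the names vtemp_multi and want_multi are made up:

```sas
/* Sample data, repeated here so the sketch runs on its own */
data have;
  input Student_ID Course_ID $;
  datalines;
101 Eng
102 Bio
102 Geo
102 Bio
102 Geo
101 Eng
;
run;

/* Tag every row whose course is on the exempt list with a unique row number. */
/* The list ('Eng', 'Geo') is a hypothetical example, not from the thread.   */
data vtemp_multi / view=vtemp_multi;
  set have;
  if course_id in ('Eng', 'Geo') then keepdup = _n_;
run;

/* Exempt rows all have distinct keys; the other rows share a missing keepdup */
/* and are therefore de-duplicated by NODUPKEY.                               */
proc sort data=vtemp_multi out=want_multi (drop=keepdup) nodupkey;
  by student_id course_id keepdup;
run;
```

With the sample data this should keep both Eng rows and both Geo rows and only collapse the two Bio rows into one.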
