mlogan
Lapis Lazuli | Level 10

Hi All,

Is there a way to remove duplicates from all of the data except for a few rows that I want to leave untouched?

In the following table I want to remove the duplicates, but exclude the rows where Course_ID=Eng from the deduplication. Can somebody help, please? Thanks.

 

Table1:

Student_ID Course_ID
101        Eng
102        Bio
102        Geo
102        Bio
102        Geo
101        Eng

 

PROC SORT DATA=table1
          OUT=want
          NODUPKEY DUPOUT=have_DUPDEL;
  BY Student_ID Course_ID;
RUN;

 

Desired output:

Student_ID Course_ID
101        Eng
101        Eng
102        Bio
102        Geo

 

 

Accepted Solutions
Ksharp
Super User
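/* Sample data from the question */
data have;
input Student_ID Course_ID $;
cards;
101             Eng
102             Bio
102             Geo
102             Bio
102             Geo
101             Eng
;
run;

/* Sort so that duplicate Student_ID/Course_ID rows are adjacent */
PROC SORT DATA=have
          OUT=temp;
BY Student_ID Course_ID;
RUN;

/* Keep the first row of each Student_ID/Course_ID group, plus every Eng row */
data want;
 set temp;
 by Student_ID Course_ID;
 if first.Course_ID or Course_ID='Eng';
run;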
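
If you also want the discarded rows, as with DUPOUT= in the original PROC SORT attempt, the last step can be written with explicit OUTPUT statements. This is only a sketch of one possible variation, not part of the accepted answer. It assumes temp has already been sorted by Student_ID Course_ID as above, and it reuses the have_DUPDEL name from the question.

```sas
/* Sketch: same keep rule as above, but dropped rows are written to have_DUPDEL. */
/* Assumes temp is sorted by Student_ID Course_ID.                               */
data want have_DUPDEL;
  set temp;
  by Student_ID Course_ID;
  /* first row of each group, or any Eng row, is kept */
  if first.Course_ID or Course_ID = 'Eng' then output want;
  /* every other row is a non-Eng duplicate */
  else output have_DUPDEL;
run;
```

With the sample data, want should match the desired output in the question and have_DUPDEL should hold the second Bio and Geo rows for student 102.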


2 REPLIES
mkeintz
PROC Star

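/* View that adds a helper variable: set only on Eng rows, missing elsewhere */
data vtemp / view=vtemp;
  set have;
  if course_id='Eng' then keepdup=_n_;
run;

/* Include keepdup in the BY key so NODUPKEY never treats Eng rows as duplicates */
proc sort data=vtemp out=want (drop=keepdup) nodupkey dupout=have_DUPDEL;
  by student_id course_id keepdup;
run;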
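
To see what the helper variable looks like before the sort, the view can be printed. This is just an illustrative check, not part of the original reply, and it assumes the have data set from the accepted solution.

```sas
/* Illustrative check: list the rows of the view with the helper variable. */
proc print data=vtemp noobs;
  var student_id course_id keepdup;
run;
```

Each Eng row should carry its own observation number in keepdup, while the Bio and Geo rows should all show a missing value, which is why only the latter collapse under NODUPKEY.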
