
Below is a snippet from a very large data set. My problem is with duplicates. As you can see below, I have duplicated values in the 'ID' column, but the values in the other columns are not duplicates. What I want to do is:

  • Remove the row with the 'first' duplicate, where there is a zero entry in Var1 and blanks in Var2 and Var3.
  • Keep the row where there is information for all variables.

How can I achieve this in SAS?

 

Thanks

 

[Attached image: data.PNG]

Accepted Solutions
ballardw
Super User

One way:

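/* Sort so that, within each ID, the row with the smallest Var1 comes first. */
/* This puts the Var1=0 row first as long as the other Var1 values are positive. */
proc sort data=have;
   by id Var1;
run;

data want;
   set have;
   by id;   /* makes the first.id flag available */
   /* Drop the first row of an ID group when it is the empty placeholder row */
   if first.id and var1=0 and missing(var2) and missing(var3) then delete;
run;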
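
A possible variation, offered only as a sketch: the step above also deletes an ID's only row if that row happens to be the zero/blank one. If such single-row IDs should be kept (an assumption, since the original question does not say), adding `not last.id` restricts the delete to IDs that really have more than one row. The sample data below is invented for illustration, and Var2 and Var3 are assumed to be numeric here; `missing()` works for character variables as well.

```sas
/* Hypothetical sample data; the values are invented for illustration */
data have;
   input id var1 var2 var3;
   datalines;
101 0 . .
101 5 2 7
102 0 . .
103 3 1 4
;
run;

proc sort data=have;
   by id var1;
run;

data want;
   set have;
   by id;
   /* Delete the placeholder row only when the ID has at least one other row */
   if first.id and not last.id
      and var1 = 0 and missing(var2) and missing(var3) then delete;
run;
```

With this sample, the zero/blank row for ID 101 is dropped, while ID 102, which has only the zero/blank row, is kept.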

