Below is a snippet from a very large data set. My problem concerns duplicates. As you can see below, the 'ID' column contains duplicated values, but the values in the other columns are not duplicates. What I want to do is:

  • Remove the row with the 'first' duplicate, where there is a zero entry in Var1 and blanks in Var2 and Var3.
  • Keep the row where there is information for all variables.

How can I achieve this in SAS?

 

Thanks

 

[Attached image: data.PNG]

1 ACCEPTED SOLUTION

ballardw
Super User

One way:

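/* Sort by id, then var1 ascending, so that a var1=0 row comes first */
/* within its id (assuming var1 is never negative or missing).       */
proc sort data=have;
   by id var1;
run;

data want;
   set have;
   by id;   /* creates the first.id flag for each id group */
   /* Drop the first row of an id when it is the "empty" one: */
   /* var1 is zero and var2 and var3 are both missing.        */
   if first.id and var1=0 and missing(var2) and missing(var3) then delete;
run;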
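
To try this out without the original data, here is a minimal sketch on a small made-up data set. The values, and the choice of character type for var2 and var3, are assumptions for illustration only; the real columns are only visible in the attached image. The sketch also adds `not last.id` to the condition, which is a variation on the answer above rather than part of it: it keeps an ID that occurs only once even if its single row is the "empty" one.

```sas
/* Hypothetical sample data: values are made up for illustration. */
/* var2 and var3 are assumed to be character variables.           */
data have;
   infile datalines dsd truncover;
   input id var1 var2 :$10. var3 :$10.;
   datalines;
101,0,,
101,5,A,X
102,7,B,Y
103,0,,
103,3,C,Z
104,0,,
;
run;

proc sort data=have;
   by id var1;
run;

data want;
   set have;
   by id;
   /* "not last.id" restricts the delete to ids with more than one row, */
   /* so the single "empty" row for id 104 is kept.                     */
   if first.id and not last.id
      and var1=0 and missing(var2) and missing(var3) then delete;
run;

proc print data=want noobs;
run;
```

With this sample, the "empty" rows for IDs 101 and 103 are removed and four rows remain (101, 102, 103 and 104). If single-row IDs with an empty row should be removed as well, drop the `not last.id` condition to get the behaviour of the code above.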

