BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
deega
Quartz | Level 8

 I want to remove the duplicate records but the ones that have missing values as follows

Input

123412
1234 
1234 
123513
123513
123611
1237 
1237 
1237 
123710
1238 
1238 

 

Desired Output

123412
123513
123611
123710
1238 
1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

data have;
infile cards truncover expandtabs;
input id x;
cards;
1234	12
1234	 
1234	 
1235	13
1235	13
1236	11
1237	 
1237	 
1237	 
1237	10
1238	 
1238	 
;
run;
proc sort data=have;
 by id descending x;
run;
data want;
 set have;
 by id;
 if first.id;
run;

View solution in original post

5 REPLIES 5
Ksharp
Super User

data have;
infile cards truncover expandtabs;
input id x;
cards;
1234	12
1234	 
1234	 
1235	13
1235	13
1236	11
1237	 
1237	 
1237	 
1237	10
1238	 
1238	 
;
run;
proc sort data=have;
 by id descending x;
run;
data want;
 set have;
 by id;
 if first.id;
run;

PGStats
Opal | Level 21

Use the fact that missing values are inferior to all non-missing but are not included in the evaluation of MIN:

 

data test;
infile datalines missover;
input a b;
datalines;
1234	12
1234	 
1234	 
1235	13
1235	13
1236	11
1237	 
1237	 
1237	 
1237	10
1238	 
1238	 
;

proc sql;
select unique *
from test
group by a
having b >= min(b);
quit;
PG
deega
Quartz | Level 8

@PGStats : Its not working for the records where 'b' in all the duplicate records is missing

PGStats
Opal | Level 21

I get the result

 

                                        a         b
                                 --------------------------
                                     1234        12
                                     1235        13
                                     1236        11
                                     1237        10
                                     1238         .

case where a=1238 seems to be handled properly.

 

BTW, @Ksharp's solution would not return the case 1235 12, which is not a duplicate, if it was present in your data. 

PG
Ksharp
Super User
OK. If there are other non missing value in a group , which you want keep:



data have;
infile cards truncover expandtabs;
input id x;
cards;
1234	12
1234	 
1234	 
1235	13
1235	13
1235    12
1235    
1236	11
1237	 
1237	 
1237	 
1237	10
1238	 
1238	 
;
run;
proc sort data=have;
 by id descending x;
run;
data want;
 set have;
 by id descending x ;
 if first.x and not missing(x) or first.id;
run;


sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 3244 views
  • 3 likes
  • 3 in conversation