I want to remove the duplicate records but the ones that have missing values as follows
Input
| 1234 | 12 |
| 1234 | |
| 1234 | |
| 1235 | 13 |
| 1235 | 13 |
| 1236 | 11 |
| 1237 | |
| 1237 | |
| 1237 | |
| 1237 | 10 |
| 1238 | |
| 1238 |
Desired Output
| 1234 | 12 |
| 1235 | 13 |
| 1236 | 11 |
| 1237 | 10 |
| 1238 |
data have; infile cards truncover expandtabs; input id x; cards; 1234 12 1234 1234 1235 13 1235 13 1236 11 1237 1237 1237 1237 10 1238 1238 ; run; proc sort data=have; by id descending x; run; data want; set have; by id; if first.id; run;
data have; infile cards truncover expandtabs; input id x; cards; 1234 12 1234 1234 1235 13 1235 13 1236 11 1237 1237 1237 1237 10 1238 1238 ; run; proc sort data=have; by id descending x; run; data want; set have; by id; if first.id; run;
Use the fact that missing values are inferior to all non-missing but are not included in the evaluation of MIN:
data test;
infile datalines missover;
input a b;
datalines;
1234 12
1234
1234
1235 13
1235 13
1236 11
1237
1237
1237
1237 10
1238
1238
;
proc sql;
select unique *
from test
group by a
having b >= min(b);
quit;
@PGStats : Its not working for the records where 'b' in all the duplicate records is missing
I get the result
a b
--------------------------
1234 12
1235 13
1236 11
1237 10
1238 .
case where a=1238 seems to be handled properly.
BTW, @Ksharp's solution would not return the case 1235 12, which is not a duplicate, if it was present in your data.
OK. If there are other non missing value in a group , which you want keep: data have; infile cards truncover expandtabs; input id x; cards; 1234 12 1234 1234 1235 13 1235 13 1235 12 1235 1236 11 1237 1237 1237 1237 10 1238 1238 ; run; proc sort data=have; by id descending x; run; data want; set have; by id descending x ; if first.x and not missing(x) or first.id; run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.