I have a dataset like this (below). The variable code takes 4 possible values
The ID variable is not unique. There can be an ID with only code of the 4 possible, or with more than one code. I want to keep only the unique IDs. So if an ID is repeated I want to remove it. Objective is to keep all IDs that have only one code.
Obs ID code
1 AE0000037163 C1
2 BN0000037282 U2
3 CD0000037693 U1
4 RS0000037738 U2
5 RS0000037738 C2
. . .
. . .
. . .
Use PROC SORT
proc sort data=have nouniquekey uniqueout=want;
by id;
run;
Use PROC SORT
proc sort data=have nouniquekey uniqueout=want;
by id;
run;
proc sql; create table data1
as select distinct
*
from data;
quit;
proc sql; create table data2
as select
*,
count(code) as k
from data1
group by id;
quit;
data data3; set data2;
if k = 1;
run;
data have;
input obs id $ code $;
datalines;
1 AE0000037163 C1
2 BN0000037282 U2
3 CD0000037693 U1
4 RS0000037738 U2
5 RS0000037738 C2
;
proc sql;
create table want as
select distinct id
from have
group by id
having count(code) eq 1;
quit;
data want;
set have;
by id;
if first.id and last.id;
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.