I have a dataset like the one below. The variable code takes 4 possible values.
The ID variable is not unique: an ID can appear with only one of the 4 possible codes, or with more than one code. I want to keep only the unique IDs, so if an ID is repeated I want to remove it entirely. The objective is to keep all IDs that have only one code.
Obs ID code
1 AE0000037163 C1
2 BN0000037282 U2
3 CD0000037693 U1
4 RS0000037738 U2
5 RS0000037738 C2
. . .
. . .
. . .
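Use PROC SORT with the NOUNIQUEKEY and UNIQUEOUT= options:
/* out=dups (a name chosen here) receives the repeated IDs, so HAVE is not overwritten */
proc sort data=have nouniquekey out=dups uniqueout=want;
by id;
run;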
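/* Step 1: drop rows that are exact duplicates */
proc sql;
  create table data1 as
  select distinct *
  from data;
quit;
/* Step 2: count the codes per ID (SAS remerges the count onto every row) */
proc sql;
  create table data2 as
  select *, count(code) as k
  from data1
  group by id;
quit;
/* Step 3: keep only the IDs that have exactly one code */
data data3;
  set data2;
  if k = 1;
run;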
data have;
input obs id :$12. code $; /* :$12. reads the full 12-character ID; a plain $ would truncate it to 8 */
datalines;
1 AE0000037163 C1
2 BN0000037282 U2
3 CD0000037693 U1
4 RS0000037738 U2
5 RS0000037738 C2
;
proc sql;
create table want as
select distinct id
from have
group by id
having count(code) eq 1;
quit;
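The PROC SQL query above returns only the id column. If the code value is needed as well, a minimal sketch under the same assumptions (the have table from this post; want_with_code is a made-up table name for illustration) could look like this:

```sas
/* Sketch: keep both id and code for IDs that occur exactly once.    */
/* want_with_code is a hypothetical table name, not from the thread. */
proc sql;
  create table want_with_code as
  select id, code
  from have
  group by id
  having count(code) = 1;
quit;
```

Because code is selected but not grouped, SAS writes a note to the log about remerging summary statistics with the original data; that is expected here.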
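/* BY-group processing: HAVE must already be sorted by ID */
data want;
set have;
by id;
if first.id and last.id; /* keep an ID only when it occurs on a single row */
run;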