I have a dataset like this (below). The variable code takes 4 possible values
The ID variable is not unique. There can be an ID with only code of the 4 possible, or with more than one code. I want to keep only the unique IDs. So if an ID is repeated I want to remove it. Objective is to keep all IDs that have only one code.
Obs ID code
1 AE0000037163 C1
2 BN0000037282 U2
3 CD0000037693 U1
4 RS0000037738 U2
5 RS0000037738 C2
. . .
. . .
. . .
Use PROC SORT
proc sort data=have nouniquekey uniqueout=want;
by id;
run;
Use PROC SORT
proc sort data=have nouniquekey uniqueout=want;
by id;
run;
proc sql; create table data1
as select distinct
*
from data;
quit;
proc sql; create table data2
as select
*,
count(code) as k
from data1
group by id;
quit;
data data3; set data2;
if k = 1;
run;
data have;
input obs id $ code $;
datalines;
1 AE0000037163 C1
2 BN0000037282 U2
3 CD0000037693 U1
4 RS0000037738 U2
5 RS0000037738 C2
;
proc sql;
create table want as
select distinct id
from have
group by id
having count(code) eq 1;
quit;
data want;
set have;
by id;
if first.id and last.id;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.