Hello,
I have this dataset:
data have;
input ID Case_Dx $;
datalines;
1 S72080
2 812
3 S72100
4 813.2
5 820.2
6 808.4
7 805.6
8 S5251
9 S220
10 S320
11 806
12 S5262
;
I want to add a column to the dataset that assigns each 'Case_Dx' value to group A, B, or C.
The groups are defined as follows:
Group A = anything that starts with 'S720', 'S721' or 'S722' (up to 8 characters)
Group B = anything that starts with 'S525' or 'S526' (up to 8 characters)
Group C = anything that starts with '805', '806', 'S220', 'S320' or 'S221' (up to 8 characters)
I usually use a format like this:
proc format;
value $Casetype
'S720', 'S721', 'S722' = 'A'
'S525', 'S526' = 'B'
'805', '806', 'S220', 'S320', 'S221' = 'C';
run;
data want;
set have;
Type = put(Case_Dx, $Casetype.);
RUN;
But in this case it doesn't work, because the format only matches whole values exactly and my codes need to be matched on their leading characters.
How can I go about this?
Thanks