Hello all,
I am trying to create sets of dummy variables, as shown by my example below. I'd prefer not to hard code this process and was thinking an array and/or do loop might be the best option but not exactly sure how to go about this. I just want the dummy variables to flag if the diagnosis occurred, not to count the number of a specific diagnosis per ID. I have multiple other variables that I will be creating dummy variables, hence why I thought an array or do loop would help. Any help is appreciated!
Dataset:
| ID | Sex | Diagnosis |
|
01 |
1 | 4 |
| 01 | 1 | 4 |
| 01 | 1 | 2 |
| 02 | 1 | 3 |
| 02 | 1 | 3 |
| 03 | 2 | 2 |
| 03 | 2 | 2 |
| 03 | 2 | 1 |
| 04 | 2 | 1 |
Want:
| ID | Sex | Diag_1 | Diag_2 | Diag_3 | Diag_4 |
| 01 | 1 | 0 | 1 | 0 | 1 |
| 02 | 1 | 0 | 0 | 1 | 0 |
| 03 | 2 | 1 | 1 | 0 | 0 |
| 04 | 2 | 1 | 0 | 0 | 0 |
data have;
input id $ sex $ diag $;
dummy=1;
datalines;
01 1 4
01 1 4
01 1 2
02 1 3
02 1 3
03 2 2
03 2 2
03 2 1
04 2 1
;
run;
proc glmselect data=have outdesign(addinputvars)=temp noprint;
class diag;
model dummy=diag/selection=none noint;
run;
proc summary data=temp nway;
class id sex;
var diag_:;
output out=want(drop=_freq_ _type_) max=;
run;
Regarding frequencies, definitely no dummy coding required for that:
proc freq data=have;
table diag / out=diag_counts;
run;
proc print data=diag_counts;
run;
To create dummy variables automatically see this post as well as the linked posts below. One of those options will work for you.
data have;
input id $ sex $ diag $;
datalines;
01 1 4
01 1 4
01 1 2
02 1 3
02 1 3
03 2 2
03 2 2
03 2 1
04 2 1
;
run;
data have2;
set have;
diag_1=diag=1;
diag_2=diag=2;
diag_3=diag=3;
diag_4=diag=4;
run;
proc sql;
select id,sex,max(diag_1) as diag_1,
max(diag_2) as diag_2,
max(diag_3) as diag_3,
max(diag_4) as diag_4
from have2
group by id,sex;
quit;
data have;
input id $ sex $ diag $;
dummy=1;
datalines;
01 1 4
01 1 4
01 1 2
02 1 3
02 1 3
03 2 2
03 2 2
03 2 1
04 2 1
;
run;
proc glmselect data=have outdesign(addinputvars)=temp noprint;
class diag;
model dummy=diag/selection=none noint;
run;
proc summary data=temp nway;
class id sex;
var diag_:;
output out=want(drop=_freq_ _type_) max=;
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.