Hello!
I'm trying to create some categorical and binary variables from a couple of continuous ones in my data set. The binary ones turn out fine, and so did one of the categorical ones, but the last two did not, and I have no idea why. I have attached my code and log. Please help!
CODE:
DATA recodes;
SET project;
IF drugs EQ 0 THEN drugsBIN = 0;
IF drugs NE 0 THEN drugsBIN = 1;
IF complications EQ 0 THEN compBIN = 0;
IF complications NE 0 THEN compBIN = 1;
IF ervisits EQ 0-4 THEN visitsCAT = 0;
IF ervisits EQ 5-9 THEN visitsCAT = 1;
IF ervisits EQ 10-14 THEN visitsCAT = 2;
IF ervisits EQ 15-19 THEN visitsCAT = 3;
IF ervisits GE 20 THEN visitsCAT = 4;
IF interventions EQ 0-9 THEN interCAT = 0;
IF interventions EQ 10-19 THEN interCAT = 1;
IF interventions EQ 20-29 THEN interCAT = 2;
IF interventions EQ 30-39 THEN interCAT = 3;
IF interventions EQ 40-49 THEN interCAT = 4;
IF comorbidities EQ 0-9 THEN comorbCAT = 0;
IF comorbidities EQ 10-19 THEN comorbCAT = 1;
IF comorbidities EQ 20-29 THEN comorbCAT = 2;
IF comorbidities EQ 30-39 THEN comorbCAT = 3;
IF comorbidities EQ 40-49 THEN comorbCAT = 4;
IF comorbidities EQ 50-59 THEN comorbCAT = 5;
IF comorbidities EQ 60-69 THEN comorbCAT = 6;
RUN;
PROC UNIVARIATE DATA=recodes;
VAR drugsBIN compBIN visitsCAT interCAT comorbCAT;
HISTOGRAM drugsBIN compBIN visitsCAT interCAT comorbCAT;
RUN;
LOG:
52 DATA recodes;
53 SET project;
54
55 IF drugs EQ 0 THEN drugsBIN = 0;
56 IF drugs NE 0 THEN drugsBIN = 1;
57
58 IF complications EQ 0 THEN compBIN = 0;
59 IF complications NE 0 THEN compBIN = 1;
60
61 IF ervisits EQ 0-4 THEN visitsCAT = 0;
62 IF ervisits EQ 5-9 THEN visitsCAT = 1;
63 IF ervisits EQ 10-14 THEN visitsCAT = 2;
64 IF ervisits EQ 15-19 THEN visitsCAT = 3;
65 IF ervisits GE 20 THEN visitsCAT = 4;
66
67 IF interventions EQ 0-9 THEN interCAT = 0;
68 IF interventions EQ 10-19 THEN interCAT = 1;
69 IF interventions EQ 20-29 THEN interCAT = 2;
70 IF interventions EQ 30-39 THEN interCAT = 3;
71 IF interventions EQ 40-49 THEN interCAT = 4;
72
73 IF comorbidities EQ 0-9 THEN comorbCAT = 0;
74 IF comorbidities EQ 10-19 THEN comorbCAT = 1;
75 IF comorbidities EQ 20-29 THEN comorbCAT = 2;
76 IF comorbidities EQ 30-39 THEN comorbCAT = 3;
77 IF comorbidities EQ 40-49 THEN comorbCAT = 4;
78 IF comorbidities EQ 50-59 THEN comorbCAT = 5;
79 IF comorbidities EQ 60-69 THEN comorbCAT = 6;
80
81 RUN;
NOTE: There were 788 observations read from the data set WORK.PROJECT.
NOTE: The data set WORK.RECODES has 788 observations and 15 variables.
NOTE: DATA statement used (Total process time):
real time 0.05 seconds
cpu time 0.06 seconds
82 PROC UNIVARIATE DATA=recodes;
NOTE: Writing HTML Body file: sashtml.htm
83 VAR drugsBIN compBIN visitsCAT interCAT comorbCAT;
84 HISTOGRAM drugsBIN compBIN visitsCAT interCAT comorbCAT;
85 RUN;
WARNING: Insufficient number of nonmissing observations to create a histogram for interCAT.
WARNING: Insufficient number of nonmissing observations to create a histogram for comorbCAT.
NOTE: PROCEDURE UNIVARIATE used (Total process time):
real time 2.49 seconds
cpu time 1.12 seconds
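Your IF statements are not testing ranges. SAS evaluates 0-4 as a subtraction, so IF ervisits EQ 0-4 checks whether ervisits equals -4, and IF interventions EQ 10-19 checks whether interventions equals -9. None of the interCAT or comorbCAT conditions is ever true, so both variables are missing for every observation, which is what the two histogram warnings report. visitsCAT only looked fine because the GE 20 line is a valid comparison.
I recommend using formats instead of IF statements. Here is an example for visitCat: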
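proc format;
   value visitCategory
      0 - 4     = '0'
      5 - 9     = '1'
      10 - 14   = '2'
      15 - 19   = '3'
      20 - HIGH = '4'
   ;
run;

/* Test data: 20 random integer visit counts between 0 and 30 */
data fmttest;
   do i = 1 to 20;
      ervisits = rand('integer', 0, 30);
      /* PUT applies the format and returns the label as text */
      visitCat = put(ervisits, visitCategory.);
      output;
   end;
   drop i;
run;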
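To check that the format maps each count to the intended group, a quick listing of the raw value against the recoded value can help. This is only a sketch, and it assumes the fmttest data set created above:

```sas
/* One row per (ervisits, visitCat) pair that occurs in the test data. */
/* MISSING keeps missing values in the listing instead of dropping them. */
proc freq data=fmttest;
   tables ervisits*visitCat / list missing;
run;
```
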
Note that visitCat is a character variable here, because PUT always returns a character value. In your data step (where the variable is called visitsCAT), replace
IF ervisits EQ 0-4 THEN visitsCAT = 0;
IF ervisits EQ 5-9 THEN visitsCAT = 1;
IF ervisits EQ 10-14 THEN visitsCAT = 2;
IF ervisits EQ 15-19 THEN visitsCAT = 3;
IF ervisits GE 20 THEN visitsCAT = 4;
with
visitsCAT = put(ervisits, visitCategory.);
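One caveat: PROC UNIVARIATE only accepts numeric variables in its VAR statement, so a character recode cannot be passed to it. If the recoded variable has to stay numeric, one option is to wrap the PUT call in INPUT. The following is a minimal sketch, assuming the visitCategory format from the example above, the project data set and visitsCAT name from the question, and that ervisits holds nonnegative integer counts (values that fall between or outside the format ranges would not be mapped cleanly):

```sas
data recodes;
   set project;
   /* PUT returns the category label as text ('0' to '4');      */
   /* INPUT with the 8. informat reads that text back as a number. */
   /* A missing ervisits is formatted as '.', which INPUT reads   */
   /* as a missing value.                                         */
   visitsCAT = input(put(ervisits, visitCategory.), 8.);
run;
```
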
And for many purposes you don't even need to add the variable. Most of the SAS analysis procedures will honor the groups created by a format. Example:
proc freq data=fmttest; tables ervisits; format ervisits visitcategory.; run;
The same works in graphing procedures:
proc sgplot data=fmttest; vbar ervisits / stat=freq; format ervisits visitcategory.; run;
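If you would rather keep IF statements than switch to formats, the range tests need to be written as comparisons, since a hyphen between two numbers is a subtraction in a SAS expression. Here is a sketch for interCAT, assuming the project data set and variable names from the question; the same pattern should carry over to visitsCAT and comorbCAT:

```sas
data recodes;
   set project;
   /* 0 LE x LE 9 is a compound comparison: true when x lies in [0, 9]. */
   /* ELSE IF stops checking once one range has matched.                */
   if      0  le interventions le 9  then interCAT = 0;
   else if 10 le interventions le 19 then interCAT = 1;
   else if 20 le interventions le 29 then interCAT = 2;
   else if 30 le interventions le 39 then interCAT = 3;
   else if 40 le interventions le 49 then interCAT = 4;
   /* Values outside 0-49, including missing, leave interCAT missing. */
run;
```
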