Posted 03-11-2019 06:45 PM

Hello!

I'm trying to create some categorical and binary variables from a couple of continuous ones in my data set. The binary ones turn out fine, and one of the categorical ones did as well, but the last two did not, and I have no idea why. My code and log are below. Please help!

**CODE:**

DATA recodes;

SET project;

IF drugs EQ 0 THEN drugsBIN = 0;

IF drugs NE 0 THEN drugsBIN = 1;

IF complications EQ 0 THEN compBIN = 0;

IF complications NE 0 THEN compBIN = 1;

IF ervisits EQ 0-4 THEN visitsCAT = 0;

IF ervisits EQ 5-9 THEN visitsCAT = 1;

IF ervisits EQ 10-14 THEN visitsCAT = 2;

IF ervisits EQ 15-19 THEN visitsCAT = 3;

IF ervisits GE 20 THEN visitsCAT = 4;

IF interventions EQ 0-9 THEN interCAT = 0;

IF interventions EQ 10-19 THEN interCAT = 1;

IF interventions EQ 20-29 THEN interCAT = 2;

IF interventions EQ 30-39 THEN interCAT = 3;

IF interventions EQ 40-49 THEN interCAT = 4;

IF comorbidities EQ 0-9 THEN comorbCAT = 0;

IF comorbidities EQ 10-19 THEN comorbCAT = 1;

IF comorbidities EQ 20-29 THEN comorbCAT = 2;

IF comorbidities EQ 30-39 THEN comorbCAT = 3;

IF comorbidities EQ 40-49 THEN comorbCAT = 4;

IF comorbidities EQ 50-59 THEN comorbCAT = 5;

IF comorbidities EQ 60-69 THEN comorbCAT = 6;

RUN;

PROC UNIVARIATE DATA=recodes;

VAR drugsBIN compBIN visitsCAT interCAT comorbCAT;

HISTOGRAM drugsBIN compBIN visitsCAT interCAT comorbCAT;

RUN;

**LOG:**

52 DATA recodes;

53 SET project;

54

55 IF drugs EQ 0 THEN drugsBIN = 0;

56 IF drugs NE 0 THEN drugsBIN = 1;

57

58 IF complications EQ 0 THEN compBIN = 0;

59 IF complications NE 0 THEN compBIN = 1;

60

61 IF ervisits EQ 0-4 THEN visitsCAT = 0;

62 IF ervisits EQ 5-9 THEN visitsCAT = 1;

63 IF ervisits EQ 10-14 THEN visitsCAT = 2;

64 IF ervisits EQ 15-19 THEN visitsCAT = 3;

65 IF ervisits GE 20 THEN visitsCAT = 4;

66

67 IF interventions EQ 0-9 THEN interCAT = 0;

68 IF interventions EQ 10-19 THEN interCAT = 1;

69 IF interventions EQ 20-29 THEN interCAT = 2;

70 IF interventions EQ 30-39 THEN interCAT = 3;

71 IF interventions EQ 40-49 THEN interCAT = 4;

72

73 IF comorbidities EQ 0-9 THEN comorbCAT = 0;

74 IF comorbidities EQ 10-19 THEN comorbCAT = 1;

75 IF comorbidities EQ 20-29 THEN comorbCAT = 2;

76 IF comorbidities EQ 30-39 THEN comorbCAT = 3;

77 IF comorbidities EQ 40-49 THEN comorbCAT = 4;

78 IF comorbidities EQ 50-59 THEN comorbCAT = 5;

79 IF comorbidities EQ 60-69 THEN comorbCAT = 6;

80

81 RUN;

NOTE: There were 788 observations read from the data set WORK.PROJECT.

NOTE: The data set WORK.RECODES has 788 observations and 15 variables.

NOTE: DATA statement used (Total process time):

real time 0.05 seconds

cpu time 0.06 seconds

82 PROC UNIVARIATE DATA=recodes;

NOTE: Writing HTML Body file: sashtml.htm

83 VAR drugsBIN compBIN visitsCAT interCAT comorbCAT;

84 HISTOGRAM drugsBIN compBIN visitsCAT interCAT comorbCAT;

85 RUN;

WARNING: Insufficient number of nonmissing observations to create a histogram for interCAT.

WARNING: Insufficient number of nonmissing observations to create a histogram for comorbCAT.

NOTE: PROCEDURE UNIVARIATE used (Total process time):

real time 2.49 seconds

cpu time 1.12 seconds

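5-9 does not mean 5 through 9. It means 5 minus 9, so `ervisits EQ 5-9` tests whether ervisits equals -4. In your code, 0-4, 5-9, 10-14 and 15-19 all evaluate to -4, and every range for interventions and comorbidities evaluates to -9, so none of those conditions is ever true for a non-negative count. visitsCAT only looks fine because the `GE 20` line still assigns 4; interCAT and comorbCAT are missing on every observation, which is what the histogram warnings are telling you. Here is a way to check a range of values: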

if (5 <= ervisits <= 9) then visitcat = 1;
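
Applied to the data step in the question, the same pattern could look like the sketch below. It reuses the dataset and variable names from the original post and assumes the counts are whole numbers (a value such as 4.5 would fall between two bands and stay missing). comorbidities would follow the same pattern, with bands up to 60-69.

```sas
data recodes;
    set project;

    /* ELSE IF stops testing as soon as one band matches */
    if      0  <= ervisits <= 4  then visitsCAT = 0;
    else if 5  <= ervisits <= 9  then visitsCAT = 1;
    else if 10 <= ervisits <= 14 then visitsCAT = 2;
    else if 15 <= ervisits <= 19 then visitsCAT = 3;
    else if ervisits >= 20       then visitsCAT = 4;

    /* Values outside 0-49 (including missing) leave interCAT missing */
    if      0  <= interventions <= 9  then interCAT = 0;
    else if 10 <= interventions <= 19 then interCAT = 1;
    else if 20 <= interventions <= 29 then interCAT = 2;
    else if 30 <= interventions <= 39 then interCAT = 3;
    else if 40 <= interventions <= 49 then interCAT = 4;
run;

/* N and NMISS show at a glance whether a recode came out all missing */
proc means data=recodes n nmiss;
    var visitsCAT interCAT;
run;
```

Checking NMISS right after a recode is a cheap way to catch this kind of problem before it surfaces as a warning in a later procedure.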

Using formats instead of if-statements is recommended.

Here is an example for visitCat:

```
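proc format;
   /* Map ranges of ervisits to category labels */
   value visitCategory
      0 - 4     = '0'
      5 - 9     = '1'
      10 - 14   = '2'
      15 - 19   = '3'
      20 - HIGH = '4'
   ;
run;

/* Test data: 20 random visit counts between 0 and 30 */
data fmttest;
   do i = 1 to 20;
      ervisits = rand('integer', 0, 30);
      /* put() applies the format and returns a character value */
      visitCat = put(ervisits, visitCategory.);
      output;
   end;
   drop i;
run;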
```

Note that visitCat is now a character variable, because put() returns a character value. In your data step, replace

```
IF ervisits EQ 0-4 THEN visitsCAT = 0;
IF ervisits EQ 5-9 THEN visitsCAT = 1;
IF ervisits EQ 10-14 THEN visitsCAT = 2;
IF ervisits EQ 15-19 THEN visitsCAT = 3;
IF ervisits GE 20 THEN visitsCAT = 4;
```

with

`visitCat = put(ervisits, visitCategory.);`
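
One caveat if you keep the PROC UNIVARIATE step from the question: its VAR and HISTOGRAM statements accept only numeric variables, and put() returns a character value. A possible workaround, sketched here with a small made-up dataset (`demo` and its values are illustrative only, and the counts are assumed to be whole numbers), is to wrap the put() in input() so the category is read back as a number.

```sas
proc format;
    value visitCategory
        0 - 4     = '0'
        5 - 9     = '1'
        10 - 14   = '2'
        15 - 19   = '3'
        20 - high = '4';
run;

/* Hypothetical sample data, not taken from the original post */
data demo;
    input ervisits @@;
    /* put() yields '0'..'4' as text; input() converts it back to numeric */
    visitsCAT = input(put(ervisits, visitCategory.), 8.);
    datalines;
0 3 7 12 18 25
;
run;

proc print data=demo;
run;
```

With the numeric version, visitsCAT can go straight into the VAR and HISTOGRAM statements as before.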

And for many purposes you don't even need to add the variable. Most of the SAS analysis procedures will honor the groups created by a format. Example:

proc freq data=fmttest; tables ervisits; format ervisits visitcategory.; run;

or graphing procedures

proc sgplot data=fmttest; vbar ervisits / stat=freq; format ervisits visitcategory.; run;
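
The same idea extends to the two variables that came out empty in the question. This is a minimal sketch, assuming whole-number counts and the band limits from the original IF statements. The format names `interBand` and `comorbBand` and the `demo2` dataset are made up for illustration.

```sas
proc format;
    value interBand
        0 - 9   = '0-9'
        10 - 19 = '10-19'
        20 - 29 = '20-29'
        30 - 39 = '30-39'
        40 - 49 = '40-49';
    value comorbBand
        0 - 9   = '0-9'
        10 - 19 = '10-19'
        20 - 29 = '20-29'
        30 - 39 = '30-39'
        40 - 49 = '40-49'
        50 - 59 = '50-59'
        60 - 69 = '60-69';
run;

/* Hypothetical sample data: pairs of interventions and comorbidities */
data demo2;
    input interventions comorbidities @@;
    datalines;
3 12 15 8 27 45 41 66
;
run;

/* PROC FREQ groups the raw values by their formatted labels */
proc freq data=demo2;
    tables interventions comorbidities;
    format interventions interBand. comorbidities comorbBand.;
run;
```

Because the grouping lives in the format rather than in a new variable, changing the band limits later only means editing the PROC FORMAT step.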

