I have a dataset with age ... values being like 15, 16, 17, ..., 90 (all whole numbers), but SAS makes this continuous (numerical). Is there a way for me to make this categorical?
Natural numbers are not continuous.
But I digress. What categories do you want?
So I'd want 21 to be 21 and 22 to be 22 and so on, but I just don't want them to be treated as integers. I'd like them to be treated as different/independent groups.
By "treated as categories", I presume you mean the equivalent of a CLASS variable in a GLM, or other statistical procedures. If so, you don't have to modify the age variable. Instead just tell your analysis/tabulation procedure to treat age as a class var. In the GLM proc the statement would be
CLASS age;
which you would associate with the appropriate MODEL statement.
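As a minimal sketch of what that looks like in a full step (the dataset WORK.STUDY and the response variable Y are hypothetical names, not from the original question):

```sas
/* Hypothetical input: WORK.STUDY with numeric AGE and numeric response Y */
proc glm data=work.study;
   class age;        /* each distinct AGE value becomes its own level */
   model y = age;    /* AGE enters as a categorical effect, not a slope */
run;
quit;
```

AGE stays numeric in the dataset; only the procedure's treatment of it changes.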
And if we're talking about AGE, then yes, the CLASS statement would cause AGE to be treated as a categorical variable. But @toesockshoe, this is most likely a poor way to analyze this variable.
Variables are numerical or character. Whether a variable is categorical is mostly defined by how it is used, and it can typically be of either type. So why do you think you need a categorical variable?
If this is for a regression using GLM, LOGISTIC, or a similar procedure, you need to place the variable in a CLASS statement or create dummy variables manually. It won't matter whether it's character or numerical when creating the dummy variables.
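For the LOGISTIC case, a sketch along these lines may help. The dataset WORK.STUDY and the binary variable OUTCOME (coded 0/1) are assumptions made for illustration:

```sas
/* Hypothetical input: WORK.STUDY with numeric AGE and a 0/1 variable OUTCOME */
proc logistic data=work.study;
   class age / param=ref ref=first;   /* reference coding; lowest age is the baseline */
   model outcome(event='1') = age;    /* model the probability that OUTCOME = 1 */
run;
```

With ages 15 through 90 all present, this fits 75 separate age parameters, which is part of why treating single years of age as categories is often not a good idea.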
Here's one example format:
proc format;
   value age10yr
        0 - 9    = ' 0-9'
       10 - 19   = '10-19'
       20 - 29   = '20-29'
       30 - 39   = '30-39'
       40 - 49   = '40-49'
       50 - 59   = '50-59'
       60 - 69   = '60-69'
       70 - 79   = '70-79'
       80 - 89   = '80-89'
       90 - 99   = '90-99'
      100 - high = '100+'
   ;
run;
Associate the format with the variable in PROC PRINT, PROC FREQ, or whatever procedure you use, for example:
Format age age10yr. ;
Note that the name of a format cannot end in a number. A trailing number is read as the width, which avoids headaches when using that notation to specify print width (as in age10yr8.).
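Here is a short sketch of the format in use, assuming the age10yr format above has already been created and that a dataset named WORK.STUDY (a hypothetical name) holds the AGE variable:

```sas
/* Hypothetical input: WORK.STUDY with numeric AGE; AGE10YR. defined earlier */
proc freq data=work.study;
   tables age;            /* counts are reported per formatted value */
   format age age10yr.;   /* applies only within this step; stored data are unchanged */
run;
```

PROC FREQ groups on the formatted values, so the output has one row per 10-year band rather than one row per year of age.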
As has already been mentioned, the CLASS statement does this in most cases.
If you find a case where the CLASS statement doesn't work, just create your own categorical variable based on AGE:
agecat = put(age, z3.);
Then use AGECAT instead of AGE in your program. Note that AGECAT will be character (not numeric).
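That assignment belongs in a DATA step. A minimal sketch, with WORK.STUDY and WORK.STUDY_CAT as hypothetical dataset names:

```sas
/* Hypothetical input: WORK.STUDY with numeric AGE */
data work.study_cat;
   set work.study;
   length agecat $3;          /* character variable, three bytes wide */
   agecat = put(age, z3.);    /* zero-padded text, e.g. 21 becomes '021' */
run;

proc freq data=work.study_cat;
   tables agecat;             /* one row per distinct age, as character levels */
run;
```

The Z3. format pads with leading zeros, so the character values sort in the same order as the original numbers.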