I have a dataset with age ... the values are like 15, 16, 17, ... 90 (all whole numbers), but SAS treats this as continuous (numeric). Is there a way for me to make it categorical?
Natural numbers are not continuous.
But I digress. What categories do you want?
So I'd want 21 to be 21 and 22 to be 22 and so on, but I just don't want them to be treated as integers. I'd like them to be treated as different/independent groups.
By "treated as categories", I presume you mean the equivalent of a CLASS variable in a GLM, or other statistical procedures. If so, you don't have to modify the age variable. Instead just tell your analysis/tabulation procedure to treat age as a class var. In the GLM proc the statement would be
CLASS age;
which you would associate with the appropriate MODEL statement.
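For context, a minimal sketch of where that statement sits. The dataset name HAVE and the response variable Y are hypothetical, since the thread does not name them:

```sas
/* Hypothetical names: dataset HAVE, numeric response Y */
proc glm data=have;
   class age;        /* each distinct value of AGE becomes its own level */
   model y = age;    /* AGE enters the model as a categorical effect */
run;
quit;
```

Without the CLASS statement, the same MODEL statement would fit AGE as a single continuous (linear) term.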
And if we're talking about AGE, then yes, the CLASS statement would cause AGE to be treated as a categorical variable. But @toesockshoe, this is most likely a poor way to analyze this variable: with ages from 15 to 90 you could end up with as many as 76 separate levels, and the model would ignore the ordering of the ages.
Variables are either numeric or character. Whether a variable is categorical is mostly defined by how it is used, and it can be of either type. So why do you think you need a categorical variable?
If this is for a regression using GLM, LOGISTIC, or a similar procedure, you need to place the variable in a CLASS statement or create dummy variables manually. It won't matter whether it's character or numeric when creating the dummy variables.
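Both routes can be sketched roughly as follows. The dataset HAVE, the 0/1 outcome EVENT, and the dummy variable names AGE21 and AGE22 are hypothetical, and only two dummies are shown to keep it short:

```sas
/* Route 1: let the procedure build the design columns.          */
/* Hypothetical: dataset HAVE with a binary outcome EVENT (0/1). */
proc logistic data=have;
   class age / param=ref ref=first;   /* reference coding, lowest age as baseline */
   model event(event='1') = age;
run;

/* Route 2: create dummy variables by hand in a DATA step.  */
/* A comparison evaluates to 1 when true and 0 when false,  */
/* so a missing AGE yields 0 in every dummy.                */
data have_dummies;
   set have;
   age21 = (age = 21);
   age22 = (age = 22);
   /* ...one dummy per level you want, leaving one level out as the reference */
run;
```

The CLASS route is usually less error-prone, since the procedure keeps track of the levels and the reference category for you.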
Here's one example format:
Proc format;
   value age10yr
        0 - 9    = ' 0-9'
       10 - 19   = '10-19'
       20 - 29   = '20-29'
       30 - 39   = '30-39'
       40 - 49   = '40-49'
       50 - 59   = '50-59'
       60 - 69   = '60-69'
       70 - 79   = '70-79'
       80 - 89   = '80-89'
       90 - 99   = '90-99'
      100 - high = '100+'
   ;
run;
Associate the format with the variable in Proc Print, Proc Freq, or whatever procedure you are using, for example:
Format age age10yr. ;
Note that the name of a format cannot end in a number. A trailing number in a format reference is read as the display width, so this rule avoids ambiguity.
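As a rough illustration, here is the format applied in a frequency table. This assumes the age10yr format above has already been created in the session, and the dataset name HAVE is hypothetical:

```sas
/* Assumes PROC FORMAT above has been run; HAVE is a hypothetical dataset */
proc freq data=have;
   tables age;              /* counts are grouped by the formatted value */
   format age age10yr.;     /* 15-19 fall into '10-19', 20-29 into '20-29', and so on */
run;
```

The stored values of AGE are unchanged; only the grouping and display differ. By default, procedures such as GLM and LOGISTIC also use the formatted values to define CLASS levels, so the same FORMAT statement there should give ten-year age groups rather than single years.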
As has already been mentioned, the CLASS statement does this in most cases.
If you find a case where the CLASS statement doesn't work, just create your own categorical variable based on AGE:
agecat = put(age, z3.);
Then use AGECAT instead of AGE in your program. Note that AGECAT will be character (not numeric).
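In a full DATA step that might look like the sketch below. The dataset names HAVE and WANT are hypothetical:

```sas
/* Hypothetical names: input dataset HAVE, output dataset WANT */
data want;
   set have;
   length agecat $3;
   agecat = put(age, z3.);   /* Z3. pads with leading zeros: 21 becomes '021' */
run;

proc freq data=want;
   tables agecat;            /* character values sort as '015', '016', ..., '090' */
run;
```

The leading zeros from the Z3. format keep the character values in the same order as the original numbers, which matters once ages of 100 or more appear.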