Hi,
I have a character variable, SmokingStatus, with these values: "Never Smoker", "Former Smoker", "Light Smoker", and "Heavy Smoker". I am trying to assign them numbers so I can interpret them more easily in a regression:
DATA SubQ.dataclean3;
LENGTH SmokingStatus $22.
Smoking 8.;
SET SubQ.combined;
IF SmokingStatus="*Unknown" THEN Smoking_Status=' '; /* Highlight missing values */
IF SmokingStatus="Never Smoker" THEN Smoking=0; /* Change smoking group to easier variable to use in regression */
IF SmokingStatus="Former Smoker" THEN Smoking=1;
IF Smoking_Status="Light Smoker" THEN Smoking=2;
IF Smoking_Status="Heavy Smoker" THEN Smoking=3;
RUN;
The problem is that 0 and 1 are assigned correctly but 2 and 3 are not; for light and heavy smokers, Smoking is shown as missing. Does anyone know why this is? I have also tried making Smoking a character variable with a length of $22. and quotation marks around the number values.
Thanks.
@leackell13 wrote:
That is a good point; I think I was afraid that the values would be too large. So under the CLASS statement, I can just put SmokingStatus (REF='Never Smoker')?
Yes, that is how you'd specify a reference level.
I would also add PARAM=REF to request reference (dummy variable) parameterization rather than relying on the procedure's default, which varies by procedure (GLM coding in some, effect coding in PROC LOGISTIC).
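As a rough sketch of what that could look like, here is a CLASS statement with the reference level and PARAM=REF. The thread does not say which procedure or outcome is being used, so PROC LOGISTIC and the binary variable Outcome are assumptions for illustration only:

```sas
/* Sketch only: PROC LOGISTIC and the variable Outcome are hypothetical, */
/* chosen for illustration; substitute your own procedure and response.  */
proc logistic data=SubQ.dataclean3;
  /* Use the character variable directly; "Never Smoker" is the baseline */
  class SmokingStatus (ref='Never Smoker') / param=ref;
  model Outcome (event='1') = SmokingStatus;
run;
```

With this setup each estimate for SmokingStatus is labeled with its category name and compares that category to "Never Smoker".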
IF SmokingStatus="*Unknown" THEN Smoking_Status=.;
Numeric missing is . (a dot)
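{EDIT:} Sorry, I didn't spot the real issue at first. The variable name is spelled inconsistently: some statements use SmokingStatus and others use Smoking_Status (with an underscore).
SmokingStatus is not the same as Smoking_Status. Smoking_Status is a new variable that is blank on every row, so the comparisons to "Light Smoker" and "Heavy Smoker" are never true and Smoking stays missing for those groups.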
{EDIT2:} The length statement should be like:
LENGTH SmokingStatus $ 22 Smoking 8;
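Putting both fixes together, the step might look something like this. It is a sketch that assumes the "*Unknown" line was meant to blank out SmokingStatus itself, not to create a second variable:

```sas
DATA SubQ.dataclean3;
  LENGTH SmokingStatus $ 22 Smoking 8;
  SET SubQ.combined;
  /* Assumption: the intent was to set unknown values to character missing */
  IF SmokingStatus = "*Unknown" THEN SmokingStatus = ' ';
  /* Same variable name in every comparison */
  IF SmokingStatus = "Never Smoker" THEN Smoking = 0;
  ELSE IF SmokingStatus = "Former Smoker" THEN Smoking = 1;
  ELSE IF SmokingStatus = "Light Smoker" THEN Smoking = 2;
  ELSE IF SmokingStatus = "Heavy Smoker" THEN Smoking = 3;
  /* Any other value, including blank, leaves Smoking as numeric missing (.) */
RUN;
```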
Bart
The spelling must match exactly, including any non-printable characters that might be contained in the string. Print the string with a $HEX format to reveal such characters.
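One way to do that check, as a sketch, is a frequency table of the hex-formatted values. The width $HEX44. assumes the variable is 22 characters long (two hex digits per character):

```sas
/* List each distinct stored value of SmokingStatus in hexadecimal.  */
/* A trailing or embedded blank shows as 20; a tab shows as 09.      */
proc freq data=SubQ.combined;
  tables SmokingStatus / missing;
  format SmokingStatus $hex44.;
run;
```

If two rows that look identical in a normal listing show different hex strings here, the extra bytes are what is breaking the comparison.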
@leackell13 wrote:
Hi,
I have a character variable, SmokingStatus, with these values: "Never Smoker", "Former Smoker", "Light Smoker", and "Heavy Smoker". I am trying to assign them numbers so I can interpret them more easily in a regression:
How do you think coding these as numbers will make your results easier to interpret? It's a categorical variable, so it needs to go in a CLASS statement either way, and the output would then be labeled "Heavy Smoker" vs "Never Smoker" rather than 3 vs 0. The labeled version would be much easier to read.
Are you planning to treat these as a continuous ordinal variable instead of categorical?