04-24-2017 04:26 PM
I've got a dataset imported from SPSS. Patients' smoking status is stored in the variable "smoke", which has 3 categories: "Current smoker", "Ex smoker" and "Non-smoker". I wanted to recode this categorical variable as 0, 1, 2, so I wrote the code below. However, it did not work: the new variable smoke1 shows "." (missing) for every observation.
if smoke='Current smoker' then smoke1=2;
else if smoke='Ex smoker' then smoke1=1;
else if smoke='Non-smoker' then smoke1=0;
else if smoke='No Answer' then smoke1=.;
04-24-2017 04:36 PM
This section of the program looks OK. The problem might lie in another section of the program, or it might lie in the data.
Example: The data actually contains all uppercase values for SMOKE.
Example: The length of SMOKE is actually less than 10 characters (for whatever reason), so the stored values are truncated and never match the full strings in the comparisons.
Example: The DATA step that contains this code forgot to use a SET statement to read in the data source.
The steps you can take to help:
Run a PROC FREQ on the variable SMOKE to verify the actual values it contains.
Post the log from your DATA step (not the program). That will contain key results to help diagnose the source of the problem.
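The checks above could be sketched roughly as follows. This is only an illustration under assumptions: the data set names have and want, the helper variable smoke_clean, and the sample values are all hypothetical stand-ins for the real SPSS import, and it assumes SMOKE really is a character variable.

```sas
/* Hypothetical sample data standing in for the SPSS import */
data have;
  infile datalines truncover;
  input smoke $char20.;
  datalines;
CURRENT SMOKER
Ex smoker
Non-smoker
No Answer
;
run;

/* Step 1: list the values SMOKE actually contains */
proc freq data=have;
  tables smoke / missing;
run;

/* Step 2: recode after normalizing case and surrounding blanks */
data want;
  set have;  /* without SET, SMOKE is never read and SMOKE1 stays missing */
  length smoke_clean $20;
  smoke_clean = upcase(strip(smoke));
  if smoke_clean = 'CURRENT SMOKER' then smoke1 = 2;
  else if smoke_clean = 'EX SMOKER' then smoke1 = 1;
  else if smoke_clean = 'NON-SMOKER' then smoke1 = 0;
  else smoke1 = .;  /* 'No Answer' and anything unexpected */
run;
```

Comparing on the upcased, stripped copy means differences in case or stray blanks no longer make every comparison fail. It would not help if the stored values are truncated or if SMOKE is numeric.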
04-24-2017 05:23 PM
Please run Proc Contents on that data set and share the results for the smoke variable.
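When you say you get ".", it may mean that smoke is already numeric and what you are seeing is a format applied to the underlying numeric values. In that case none of your comparisons against text strings would match. To see the unformatted values: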
proc freq data=<your data set name>;
  tables smoke;
  format smoke best.;
run;
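If smoke does turn out to be numeric with a format attached, one possible way to recode it is to compare against the formatted text rather than the stored number. The sketch below is only an assumption-laden illustration: the format name smokefmt, the codes 1, 2, 3 and 9, and the data set names have and want are all made up, and the real codes would have to be read off the PROC FREQ output above.

```sas
/* Hypothetical format mimicking value labels carried over from SPSS */
proc format;
  value smokefmt 1 = 'Current smoker'
                 2 = 'Ex smoker'
                 3 = 'Non-smoker'
                 9 = 'No Answer';
run;

/* Hypothetical data: SMOKE is numeric but displays as text */
data have;
  input smoke;
  format smoke smokefmt.;
  datalines;
1
2
3
9
;
run;

/* Confirm the variable type and the attached format */
proc contents data=have;
run;

data want;
  set have;
  length smoke_label $40;
  /* VVALUE returns the formatted value of SMOKE as character text */
  smoke_label = strip(vvalue(smoke));
  if smoke_label = 'Current smoker' then smoke1 = 2;
  else if smoke_label = 'Ex smoker' then smoke1 = 1;
  else if smoke_label = 'Non-smoker' then smoke1 = 0;
  else smoke1 = .;  /* 'No Answer' and anything unexpected */
run;
```

Once the underlying numeric codes are known, comparing against them directly (for example, if smoke=1 then smoke1=2;) would be simpler than going through the labels.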