Hello,
I am trying to recode a 5 level categorical variable into a dichotomous variable and SAS is not recognizing the last three levels: "Dont Know/ Unsure", 'Parents not married' , and 'Refused'.
data probably;
set temp3;
IF ACEDIVRC = 1 THEN test = 1;
IF ACEDIVRC = 'No' OR ACEDIVRC = 'Dont Know/ Unsure' OR ACEDIVRC = 'Parents not married' OR ACEDIVRC = 'Refused' THEN test = 2;
IF ACEDIVRC = ' ' THEN test = '.';
Run;
When I run a PROC FREQ, the second level only contains the frequency of the "No" level. I wrote another step to look at just one of the other levels and chose "Refused" because it doesn't have any spaces in its name. This is the code:
data probably;
set temp3;
IF ACEDIVRC = 1 THEN test1 = 1;
IF ACEDIVRC = 2 THEN test2 = 2;
IF ACEDIVRC = 'Refused' THEN test3 = 3;
Run;
proc freq data = probably;
tables ACEDIVRC test1 test2 test3 /MISSING;
run;
The proc freq automatically defaults the third level to the frequency of the missing values (See below, test3). Any idea how to fix this issue?
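ACEDIVRC | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
. | 8626 | 8.36 | 8626 | 8.36 |
Yes | 18292 | 17.72 | 26918 | 26.08 |
No | 74957 | 72.63 | 101875 | 98.71 |
Dont Know/ Unsure | 305 | 0.30 | 102180 | 99.01 |
Parents not married | 519 | 0.50 | 102699 | 99.51 |
Refused | 504 | 0.49 | 103203 | 100.00 |

test1 | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
. | 84911 | 82.28 | 84911 | 82.28 |
1 | 18292 | 17.72 | 103203 | 100.00 |

test2 | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
. | 28246 | 27.37 | 28246 | 27.37 |
2 | 74957 | 72.63 | 103203 | 100.00 |

test3 | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
. | 94577 | 91.64 | 94577 | 91.64 |
3 | 8626 | 8.36 | 103203 | 100.00 |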
Thank you!
I am using SAS 9.4.
Could you give us a PROC CONTENTS listing of "PROBABLY"? Specifically, what is the variable type of ACEDIVRC, and does it have a format assigned to it?
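For anyone following along, a minimal sketch of that check, assuming the data set is WORK.PROBABLY as created in the question. The Type and Format columns of the variable listing are the ones to look at for ACEDIVRC.

```sas
/* List the variable attributes of PROBABLY.                     */
/* Check the Type (Num or Char) and Format columns for ACEDIVRC. */
proc contents data=probably;
run;
```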
Ah PhilC! I think you just solved my problem! I combined 20 state-level BRFSS datasets with modular data and one of the states had formats for specific variables, including ACEDIVRC. If I remove the format, it should work fine.
Thank you!
Glad to help. You don't need to remove the format: when you run PROC FREQ, you can add a FORMAT statement that names the variable without a format, so that you see the unformatted values.
proc freq data = probably;
FORMAT ACEDIVRC ;
tables ACEDIVRC test1 test2 test3 /MISSING;
run;
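If ACEDIVRC turns out to be numeric with a format attached, labels such as 'Refused' are display text only. A comparison like ACEDIVRC = 'Refused' then most likely makes SAS try to convert the string to a number, which gives a missing value, and that would explain why test3 matched the 8626 missing rows. One possible sketch of the recode, under that assumption, compares the formatted text through the VVALUE function instead of guessing the underlying codes. The labels are copied from the PROC FREQ output in the question, and the helper variable fmtval is made up for this illustration.

```sas
data probably;
  set temp3;
  length fmtval $40;                     /* hypothetical helper variable */
  /* VVALUE returns the formatted value of ACEDIVRC as character text */
  fmtval = strip(vvalue(ACEDIVRC));
  if missing(ACEDIVRC) then test = .;
  else if fmtval = 'Yes' then test = 1;
  else if fmtval in ('No', 'Dont Know/ Unsure', 'Parents not married', 'Refused')
    then test = 2;
  drop fmtval;
run;
```

Once the unformatted PROC FREQ shows the actual numeric codes, comparing against those codes directly is an equally reasonable choice and does not depend on the format being available.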
For text comparisons the words are case sensitive.
data probably;
set temp3;
IF ACEDIVRC = '1' THEN test = 1;
ELSE IF upper(trim(ACEDIVRC)) in ('NO' 'DONT KNOW/ UNSURE' 'PARENTS NOT MARRIED' 'REFUSED') THEN test = 2;
else IF missing(ACEDIVRC) THEN test = .;
else test = 99;
Run;
You're mixing up data types in both ACEDIVRC and TEST. Be sure you know which variables are numeric and which are character. Also, the IN operator would make this a bit easier to read:
data temp3;
length acedivrc $20;
acedivrc = '1'; output;
acedivrc = 'No'; output;
acedivrc = "Don't know/Unsure"; output;
acedivrc = 'Parents not married'; output;
acedivrc = 'Refused'; output;
run;
data probably;
set temp3;
if acedivrc = '1' then test = 1;
else if acedivrc in ('No',"Don't know/Unsure",'Parents not married', 'Refused') then test = 2;
else if missing(acedivrc) then test = .;
run;
Hope this helps.