Hello,
I am trying to recode a 5-level categorical variable into a dichotomous variable, and SAS is not recognizing the last three levels: 'Dont Know/ Unsure', 'Parents not married', and 'Refused'.
data probably;
set temp3;
IF ACEDIVRC = 1 THEN test = 1;
IF ACEDIVRC = 'No' OR ACEDIVRC = 'Dont Know/ Unsure' OR ACEDIVRC = 'Parents not married' OR ACEDIVRC = 'Refused' THEN test = 2;
IF ACEDIVRC = ' ' THEN test = '.';
Run;
When I run a proc freq, only the frequency of the "No" level shows up in the second level. I wrote another piece of code to look at just one of the other levels and chose "Refused" because it doesn't have any spaces in its name. This is the code:
data probably;
set temp3;
IF ACEDIVRC = 1 THEN test1 = 1;
IF ACEDIVRC = 2 THEN test2 = 2;
IF ACEDIVRC = 'Refused' THEN test3 = 3;
Run;
proc freq data = probably;
tables ACEDIVRC test1 test2 test3 /MISSING;
run;
The proc freq automatically defaults the third level to the frequency of the missing values (See below, test3). Any idea how to fix this issue?
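ACEDIVRC | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
. | 8626 | 8.36 | 8626 | 8.36 |
Yes | 18292 | 17.72 | 26918 | 26.08 |
No | 74957 | 72.63 | 101875 | 98.71 |
Dont Know/ Unsure | 305 | 0.30 | 102180 | 99.01 |
Parents not married | 519 | 0.50 | 102699 | 99.51 |
Refused | 504 | 0.49 | 103203 | 100.00 |

test1 | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
. | 84911 | 82.28 | 84911 | 82.28 |
1 | 18292 | 17.72 | 103203 | 100.00 |

test2 | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
. | 28246 | 27.37 | 28246 | 27.37 |
2 | 74957 | 72.63 | 103203 | 100.00 |

test3 | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
. | 94577 | 91.64 | 94577 | 91.64 |
3 | 8626 | 8.36 | 103203 | 100.00 |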
Thank you!
I am using SAS 9.4.
Could you give us a PROC CONTENTS listing of "PROBABLY"? Specifically, what is the variable type of ACEDIVRC, and does it have a format assigned to it?
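A minimal sketch of that check, assuming the data set sits in the WORK library as in the code above:

```sas
/* List the variable attributes of PROBABLY. In the output, look at the
   Type and Format columns for ACEDIVRC. */
proc contents data=probably;
run;
```

If Type is Num and a format is listed, the labels PROC FREQ displays (Yes, No, ...) are only display text, and comparisons in a DATA step have to be made against the stored numeric codes.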
Ah PhilC! I think you just solved my problem! I combined 20 state-level BRFSS datasets with modular data and one of the states had formats for specific variables, including ACEDIVRC. If I remove the format, it should work fine.
Thank you!
Glad to help. You don't need to remove the format; when you run PROC FREQ, you can add a FORMAT statement with no format name so that you see the unformatted values.
proc freq data = probably;
FORMAT ACEDIVRC ;
tables ACEDIVRC test1 test2 test3 /MISSING;
run;
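If ACEDIVRC does turn out to be numeric with a format attached, one way to keep the label-based logic is to compare against the formatted value rather than the stored value. The following is only a sketch under that assumption: VVALUE returns a variable's formatted value as text, `fmt_value` is a hypothetical helper name, and the label spellings are copied from the PROC FREQ output earlier in the thread.

```sas
data probably;
    set temp3;
    length fmt_value $40;                 /* hypothetical helper variable */
    fmt_value = strip(vvalue(ACEDIVRC));  /* formatted text, e.g. 'Refused' */

    if missing(ACEDIVRC) then test = .;
    else if fmt_value = 'Yes' then test = 1;
    else if fmt_value in ('No', 'Dont Know/ Unsure',
                          'Parents not married', 'Refused') then test = 2;

    drop fmt_value;                       /* keep only the recoded variable */
run;
```

The comparison is an exact text match, so if the labels in the format are spelled differently from the strings above, those rows will be left with a missing test value.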
For text comparisons the words are case sensitive.
data probably;
set temp3;
IF ACEDIVRC = '1' THEN test = 1;
ELSE IF upper(trim(ACEDIVRC)) in ('NO' 'DONT KNOW/ UNSURE' 'PARENTS NOT MARRIED' 'REFUSED') THEN test = 2;
else IF missing(ACEDIVRC) THEN test = .;
else test = 99;
Run;
You're mixing up data types in both ACEDIVRC and TEST. Be sure to know what is numeric and what is character. Also the IN operator would make this a bit easier to read:
data temp3;
length acedivrc $20;
acedivrc = '1'; output;
acedivrc = 'No'; output;
acedivrc = "Don't know/Unsure"; output;
acedivrc = 'Parents not married'; output;
acedivrc = 'Refused'; output;
run;
data probably;
set temp3;
if acedivrc = '1' then test = 1;
else if acedivrc in ('No',"Don't know/Unsure",'Parents not married', 'Refused') then test = 2;
else if missing(acedivrc) then test = .;
run;
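To confirm that a recode like this did what was intended, a cross-tabulation of the original variable against the new one can help. A minimal sketch, using the data set and variable names from the code above:

```sas
/* One row per combination of original level and recoded value;
   MISSING keeps missing values in the table, LIST prints it as a flat list. */
proc freq data=probably;
    tables acedivrc*test / missing list;
run;
```

Any original level that ends up paired with a missing test value stands out immediately in this listing.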
Hope this helps.