GS2
Obsidian | Level 7

Using SAS 9.4

 

Hello,

 

I have defined a format for the variable Race:

proc format;
   value $Race 'C'   = 'Caucasian'
               'AA'  = 'African American'
               other = 'Unknown/Other';
run;

This works for some analyses, for example:

proc freq data = mylib.a (where=(Used_recommendation = 'Yes'));
   tables Race / chisq list missing nocum;
   format Race $Race.;
   title "Table 1. Demographics";
run;

However, my format does not work when I run Fisher's exact test with Monte Carlo estimation, for example:

proc freq data = mylib.a;
   tables Used_recommendation*Race;
   format Race $Race.;
   exact fisher / mc;
run;

The output shows my Caucasian and African American groups but leaves out the Unknown/Other category. Am I doing something wrong? Is there a better way to write my code so that it works? Thank you

5 REPLIES
Rick_SAS
SAS Super FREQ

Do you use missing values to code the "Other" category, or are the values nonmissing? (By default, PROC FREQ ignores missing values, even if they are formatted.)

Do you see any warnings in the SAS log? How many observations are in the data? Exact tests can be computationally intensive, so you might get an error message for huge data sets.

Your PROC FREQ statement looks correct. It runs on the following simulated data. I conclude that there is something specific to your data that is causing the issue:

 

data Have;
array RaceCode[3] $2 ("C" "AA" "O");
call streaminit(321);
do i = 1 to 1000;
   k = rand("Table", 0.45, 0.35, 0.2);
   Race = RaceCode[k];
   Used_recommendation = rand("Bernoulli", 0.6*k/4);
   output;
end;
keep i Race Used_recommendation;
run;

proc format;
value $Race 'C' = 'Caucasian'
'AA' = 'African American'
other = 'Unknown/Other';
run;
 
proc freq data = Have;
tables Used_recommendation*Race;
format Race $Race.;
exact fisher /mc;
run;
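
One way to answer the first question above is to tabulate the raw values of Race without the format applied. This is only a sketch: it assumes the data set mylib.a and the character variable Race from the original post. A blank row in the output would indicate missing values.

```sas
/* Sketch: inspect the unformatted values of Race.           */
/* Assumes mylib.a and Race as named in the original post.   */
proc freq data = mylib.a;
   tables Race / missing;   /* MISSING shows blank values as their own row */
   format Race;             /* a FORMAT statement with no format name removes the format for this step */
run;
```

If a blank row appears with a nonzero count, those observations are the ones that drop out of the two-way table when the MISSING option is not specified.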
GS2
Obsidian | Level 7

I am using 'other' to code for missing data. Would it be more effective to use '.' to code for the missing data?

Rick_SAS
SAS Super FREQ

Ah! Well, that explains it. By default, PROC FREQ drops observations that have missing values. You need to tell it to treat missing values as a valid category:

proc freq data = Have;
   tables Used_recommendation*Race / MISSING; /* <== */
   format Race $Race.;
   exact fisher /mc;
run;
GS2
Obsidian | Level 7

Is there a way to write that into my SAS format code so that I can name the missing values? Thank you

Rick_SAS
SAS Super FREQ

You already have it covered. The missing values will be labeled as 'Unknown/Other', according to your format.
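
If a separate label for missing values is wanted, rather than sharing 'Unknown/Other' with any unlisted code, one option is to list the blank value explicitly in the format. The following is a minimal sketch, not part of the original answer: the format name $RaceM and the label 'Missing' are hypothetical, and it assumes Race is a character variable whose missing values are blank.

```sas
/* Sketch: give blank (missing) character values their own label. */
/* $RaceM and the 'Missing' label are hypothetical names.         */
proc format;
   value $RaceM 'C'   = 'Caucasian'
                'AA'  = 'African American'
                ' '   = 'Missing'          /* blank = missing character value */
                other = 'Unknown/Other';   /* any other nonmissing code */
run;

proc freq data = mylib.a;
   tables Used_recommendation*Race / missing;  /* still needed to keep missing values in the table */
   format Race $RaceM.;
   exact fisher / mc;
run;
```

The format only controls how values are labeled. The MISSING option on the TABLES statement is still what keeps the missing category in the table and in the test.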
