Adding Chi-Square P-values to tables using PROC TABULATE

Contributor
Posts: 67

Hello,

I have a bunch of variables, but I'll focus on just 3 here. I have two cities, CITYA and CITYB, and I am looking at differences in the education levels and ethnicities of providers in CITYA vs. CITYB. Each provider is in one city or the other, so city is a dichotomous variable. Ethnicity is a categorical variable (1=Caucasian, 2=African American, etc.), and so is education level (1=college, 2=graduate school, etc.).

For simplicity's sake, here is just the table of city by ethnicity:

proc tabulate data=h.april17;
    class city ethnicity;
    format ethnicity ethnicity.;
    table city all,
          (ethnicity='ethnicity (count)' all)*n*f=4.
          (ethnicity='ethnicity%' all)*PCTN
          / BOX='spa by ethnicity';
run;

proc freq data=h.april17;
    title 'chisquare: city by ethnicity';
    tables city*ethnicity / chisq;
run;

What I end up with is a table of counts and percentages. I then ran PROC FREQ to get the chi-square p-value. However, I'd like to produce a table with a column on the side showing the chi-square test statistic and the p-value. How can I do this? What I really want is a Word document with one big, simple table that's easy to read, but I'm not sure how to do that (would I need a macro?), so for now I'm trying this.

Thanks!

Super Contributor
Posts: 291

Re: Adding Chi-Square P-values to tables using PROC TABULATE

Perhaps you could have TABULATE and FREQ each output a data set and then merge them together into one data set. From there, a simple PROC PRINT via ODS RTF should take it most of the way ...
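
A rough, untested sketch of that idea, reusing the data set and variables from the question. The data set names chisq_stats, tab_out and combined and the RTF file name are made up for illustration. It assumes the OUTPUT statement in PROC FREQ writes the Pearson chi-square results as _PCHI_, DF_PCHI and P_PCHI, and that the ethnicity. format already exists:

```sas
/* Pearson chi-square statistic, df and p-value as a one-row data set */
proc freq data=h.april17 noprint;
    tables city*ethnicity / chisq;
    output out=chisq_stats pchi;
run;

/* Counts and percentages from TABULATE as a data set */
proc tabulate data=h.april17 out=tab_out;
    class city ethnicity;
    format ethnicity ethnicity.;
    table city all, (ethnicity all)*(n pctn);
run;

/* Attach the single row of statistics to every row of the table data */
data combined;
    if _n_ = 1 then set chisq_stats(keep=_pchi_ df_pchi p_pchi);
    set tab_out(drop=_type_ _page_ _table_);
run;

/* Send the combined data set to a file that Word can open */
ods rtf file='city_by_ethnicity.rtf';
proc print data=combined noobs;
    format p_pchi pvalue6.4;
run;
ods rtf close;
```

Because there is only one test per table, the statistic and p-value repeat on every row. Blanking them on all but the first row is a layout choice left out here.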

Super User
Posts: 10,871

Re: Adding Chi-Square P-values to tables using PROC TABULATE

You only need a macro if you are going to reuse the code with different data sets and variables that you would pass as parameters.
Since the CHISQ results are only going to give you ONE p-value for city*ethnicity, I'm not sure that you gain much by adding a "column" to the table.

I have done something remotely similar. I used an output data set from FREQ (add a line like output out=<your dataset name> chisq; ) and extracted the statistics I wanted into a macro variable. I then used that macro variable in a POSTTEXT= style attribute on the TABULATE table to create a short sentence immediately after the table.

Something like:

data _null_;
     set <chisq output dataset name>;
     length string $ 300; /* or however long a phrase you may want */
     /* P_PCHI should be the name of the Pearson chi-square p-value in the FREQ output data set */
     if p_pchi le <your critical value> then string="The distribution of ethnicities varies significantly between cities with a p-value of "||strip(put(p_pchi,z6.4))||".";
     else string="There was no statistically significant difference detected.";
     /* SYMPUTX strips the trailing blanks that SYMPUT would keep */
     call symputx("Pstring",string);
run;

and in your tabulate code, after the BOX= option, add something like: style={posttext="&pstring"}
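
Put together with the TABLE statement from the question, that might look like the sketch below. It is untested and assumes the DATA step above has already created &pstring. The ODS RTF wrapper and the file name are additions for illustration, since POSTTEXT is a style attribute and would not be expected to show up in plain listing output:

```sas
ods rtf file='city_by_ethnicity.rtf';   /* hypothetical file name */

proc tabulate data=h.april17;
    class city ethnicity;
    format ethnicity ethnicity.;
    table city all,
          (ethnicity='ethnicity (count)' all)*n*f=4.
          (ethnicity='ethnicity%' all)*pctn
          / box='spa by ethnicity'
            style={posttext="&pstring"};   /* sentence printed right after the table */
run;

ods rtf close;
```

The double quotes around &pstring matter: a macro variable does not resolve inside single quotes.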
