I am trying to determine whether there is an easier way to run post hoc tests for a chi-square test of independence. This is the situation where the overall chi-square is significant and the explanatory variable has more than 2 levels. I have been using the code below, but it is tedious, and my students typically get confused because you have to repeat the entire block of code for each comparison you want to make.
data comparison1;
  set withage;
  if hinsur=1 or hinsur=2;   /* keep only the two levels being compared */
run;
proc sort data=comparison1;
  by AID;
run;
proc freq data=comparison1;
  tables notwanted*hinsur / chisq;
run;
I have also been exploring Tasks and Utilities. Under Statistics and Table Analysis I found how to run a chi-square test, but I do not see an option for post hoc tests in any of the menus. How do I run post hoc tests from Tasks and Utilities?
Thank You!
This code should produce equivalent results for the specific chi-square in our example:
Proc freq data=withage; where hinsur in (1 2); tables notwanted*hinsur/chisq; Run;
The WHERE statement filters the data, and the IN operator supplies a list of values to select, much like a series of OR comparisons.
Some questions:
Why the SORT? You aren't using the sorted data in any way, such as for a BY group in Proc Freq.
By "my students typically get confused as you have to do the entire code for each comparison", do you mean when they want to compare other variables instead of notwanted? You can compare many variables at once by using () to group them:
Proc freq data=withage; where hinsur in (1 2); tables ( notwanted var1 othervar somevar varetc )*hinsur/chisq; Run;
which would produce a table of each variable in the parentheses crossed with hinsur.
If it is something else, perhaps you could provide more narrative about the students' confusion.
Thank you for the help. The sort statement was in the example code I found about 4 years ago, so I continued to use it.
As for the students, with the previous code they would just get mixed up about entering the explanatory variable and the dummy codes correctly.
Here is what I have now, which lets me test all 6 pairwise comparisons for this 4x2 chi-square. Is there a simpler way to test all 6 comparisons at once?
proc freq data=withage;
  where hinsur in (1 2);
  tables notwanted*hinsur / chisq nocol nopercent;
run;
proc freq data=withage;
  where hinsur in (1 3);
  tables notwanted*hinsur / chisq nocol nopercent;
run;
proc freq data=withage;
  where hinsur in (1 4);
  tables notwanted*hinsur / chisq nocol nopercent;
run;
proc freq data=withage;
  where hinsur in (2 3);
  tables notwanted*hinsur / chisq nocol nopercent;
run;
proc freq data=withage;
  where hinsur in (2 4);
  tables notwanted*hinsur / chisq nocol nopercent;
run;
proc freq data=withage;
  where hinsur in (3 4);
  tables notwanted*hinsur / chisq nocol nopercent;
run;
At this point I am a little concerned about splitting one sample and running all of these tests on subsets, because multiple tests on the same data increase the chance of false positives.
Can you describe the purpose of these multiple tests, or the overall research question(s) being answered?
Depending on the question, Proc Logistic is often used to model a dichotomous dependent variable as a function of one or more categorical variables with multiple levels.
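As a rough illustration of the Proc Logistic suggestion, the sketch below fits notwanted on hinsur and requests every pairwise comparison of the hinsur levels in one step, with a Bonferroni adjustment. It assumes the data set is withage and that notwanted is coded so that '1' is the event of interest; that coding is a guess, not something stated in this thread. Note that this tests differences on the log-odds scale rather than repeating the chi-square test, so the p-values will not match the pairwise Proc Freq results exactly.

```sas
/* Sketch only: assumes data set WITHAGE and that NOTWANTED = 1 is the event. */
proc logistic data=withage;
  class hinsur / param=glm;              /* GLM coding is required for LSMEANS */
  model notwanted(event='1') = hinsur;
  /* All pairwise differences among the HINSUR levels, Bonferroni-adjusted,   */
  /* reported as odds ratios with confidence limits.                          */
  lsmeans hinsur / diff adjust=bon oddsratio cl;
run;
```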
The multiple tests are post hoc tests to determine where the significance lies once the overall chi-square has been found significant. To control Type I error, we calculate the Bonferroni adjustment and use that adjusted p-value threshold when determining which of the post hoc tests are significant.
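The approach described here (six pairwise tests, then a Bonferroni correction, i.e. comparing each raw p-value with 0.05/6 ≈ 0.0083) could be automated roughly as follows. This is a sketch under assumptions: the data set is withage, hinsur is coded 1 through 4, and the macro name pairwise_chisq and the data sets _chi_i_j and all_pairs are made up for illustration. It relies on the ChiSq ODS table from Proc Freq, where the Pearson test is expected on the row labelled "Chi-Square"; check that label in your own output.

```sas
/* Hypothetical macro: one 2x2 chi-square per pair of levels of &col. */
%macro pairwise_chisq(data=withage, row=notwanted, col=hinsur, nlevels=4);
  %local i j;
  %do i = 1 %to %eval(&nlevels - 1);
    %do j = %eval(&i + 1) %to &nlevels;
      ods output ChiSq=_chi_&i._&j;      /* save this pair's test statistics */
      title "Post hoc comparison: &col = &i vs &j";
      proc freq data=&data;
        where &col in (&i &j);
        tables &row*&col / chisq nocol nopercent;
      run;
    %end;
  %end;
  title;
%mend pairwise_chisq;

%pairwise_chisq()

/* Stack the Pearson chi-square rows from all pairs into one data set.   */
/* Any older WORK data sets whose names start with _chi_ are also read,  */
/* so clear them first if the macro has been run with other settings.    */
data all_pairs;
  length comparison $ 32;
  set _chi_: indsname=src;
  where Statistic = 'Chi-Square';
  comparison = scan(src, 2, '.');        /* e.g. _CHI_1_2 */
run;

/* Bonferroni-adjusted p-values for the six comparisons. */
proc multtest inpvalues(Prob)=all_pairs bonferroni;
run;
```

Comparing the adjusted p-values from Proc Multtest with 0.05 is equivalent to comparing the raw p-values with 0.05/6, so students only need to change the macro parameters rather than copy the Proc Freq step six times.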