kristin_flaming
Fluorite | Level 6

I am trying to determine whether there is an easier way to run post hoc tests for a chi-square test of independence. This is the scenario where the overall chi-square is significant and the explanatory variable has more than 2 levels. I have been using the code below, but it is tedious, and my students typically get confused because the entire block has to be repeated for each comparison you want to make.

 

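/* Keep only the two insurance groups being compared */
data comparison1;
   set withage;
   if hinsur=1 or hinsur=2;
run;
proc sort data=comparison1; by AID; run;
proc freq data=comparison1;
   tables notwanted*hinsur / chisq;
run;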

 

I have also been exploring Tasks and Utilities. Under Statistics and Table Analysis I found how to run a chi-square test, but I do not see an option for post hoc tests in any of the menus. How do I run post hoc tests from Tasks and Utilities?


Thank You!

ballardw
Super User

This code should produce equivalent results for the specific chi-square in your example:

Proc freq data=withage;
   where hinsur in (1 2);
   tables notwanted*hinsur/chisq;
Run;

The WHERE statement filters the data, and the IN operator supplies a list of values to select, much like a series of OR comparisons.

Some questions:

Why the SORT? You aren't using the sorted data in any way, such as with a BY statement in PROC FREQ.

By "my students typically get confused as you have to do the entire code for each comparison", do you mean when they want to compare other variables instead of notwanted? You can cross many variables with hinsur in one TABLES statement by grouping them in parentheses:

 

Proc freq data=withage;
   where hinsur in (1 2);
   tables ( notwanted  var1 othervar somevar varetc )*hinsur/chisq;
Run;

This produces a table for each of the variables in the parentheses crossed with hinsur.

If it is something else, perhaps you could describe the students' confusion in more detail.

kristin_flaming
Fluorite | Level 6

Thank you for the help. The sort statement was in the example code I found about 4 years ago, so I continued to use it.

As for the students, with the previous code they would just get mixed up about entering the explanatory variable and dummy codes correctly.

Here is what I have now, which lets me test all 6 pairwise comparisons for this 4x2 chi-square. Is there a simpler way to test all 6 comparisons at once?

 

Proc freq; tables notwanted*hinsur/chisq  nocol nopercent ;
where hinsur in (1 2);
Proc freq; tables notwanted*hinsur/chisq  nocol nopercent ;
where hinsur in (1 3);
Proc freq; tables notwanted*hinsur/chisq  nocol nopercent ;
where hinsur in (1 4);
Proc freq; tables notwanted*hinsur/chisq  nocol nopercent ;
where hinsur in (2 3);
Proc freq; tables notwanted*hinsur/chisq  nocol nopercent ;
where hinsur in (2 4);
Proc freq; tables notwanted*hinsur/chisq  nocol nopercent ;
where hinsur in (3 4);
Run;
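
One way to avoid typing six nearly identical steps is to let a macro loop over every pair of levels. The following is a minimal sketch rather than a tested solution for this data: the macro name pairwise_chisq and its parameters are made up for illustration, and it assumes the data set is withage and that hinsur takes the numeric values 1 through 4.

```sas
/* Hypothetical helper: runs one PROC FREQ per pair of levels.      */
/* Defaults assume data set withage and hinsur levels 1, 2, 3, 4.   */
%macro pairwise_chisq(data=withage, row=notwanted, col=hinsur, levels=1 2 3 4);
    %local i j n a b;
    %let n = %sysfunc(countw(&levels, %str( )));   /* number of levels */
    %do i = 1 %to %eval(&n - 1);
        %do j = %eval(&i + 1) %to &n;
            %let a = %scan(&levels, &i, %str( ));
            %let b = %scan(&levels, &j, %str( ));
            title "Post hoc chi-square: &col &a vs &b";
            proc freq data=&data;
                where &col in (&a &b);             /* keep only this pair */
                tables &row*&col / chisq nocol nopercent;
            run;
        %end;
    %end;
    title;                                         /* clear the title */
%mend pairwise_chisq;

/* Runs all 6 comparisons for the 4 levels */
%pairwise_chisq()
```

With four levels the loop generates the same six WHERE clauses as the code above, so students only need to change the data set and variable names in the macro call.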

ballardw
Super User

At this point I am a little concerned about splitting one sample and running all these tests on subpopulations, because multiple tests on the same data like this increase the chance of false positives.

 

Can you describe what the purpose of these multiple tests actually may be? Or the overall research question(s) being answered?

 

PROC LOGISTIC is often used to model a dichotomous variable as a function of one or more categorical variables with multiple levels, depending on the type of question to be answered.
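
As a rough illustration of that alternative, a logistic model can test the overall effect of hinsur and then compare its levels pairwise in a single step. This is only a sketch under assumptions the thread does not confirm: that notwanted is coded 0/1 with 1 as the event of interest, and that hinsur is numeric with levels 1 through 4.

```sas
/* Sketch only: assumes notwanted is coded 0/1 (1 = event of interest) */
/* and hinsur has levels 1-4 in data set withage.                      */
proc logistic data=withage;
    class hinsur / param=glm;              /* GLM coding is required by LSMEANS */
    model notwanted(event='1') = hinsur;   /* overall effect of hinsur */
    lsmeans hinsur / diff adjust=bon;      /* all pairwise differences, Bonferroni-adjusted */
run;
```

The pairwise differences reported by LSMEANS are on the logit scale, so they answer a closely related but not identical question to the pairwise chi-square tests.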

 

kristin_flaming
Fluorite | Level 6

The multiple tests are post hoc tests to determine where the significant differences are, after determining that the overall chi-square is significant. We calculate the Bonferroni adjustment and use that adjusted p-value threshold when determining which of the post hoc tests are significant, to control the Type I error rate.
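
With 6 comparisons, the Bonferroni threshold is 0.05 / 6, or about 0.0083. Equivalently, each raw p-value can be multiplied by 6 (capped at 1) and compared with 0.05. The sketch below shows one possible way to collect the six chi-square p-values and apply that adjustment in SAS. The data set names pairs, chisq_all and pvals and the variable pair are made up for illustration, and the code assumes hinsur has numeric levels 1 through 4 and that withage has no existing variables named a, b or pair.

```sas
/* Stack one copy of each observation per pair of levels it belongs to */
data pairs;
    set withage;
    where hinsur in (1 2 3 4);
    length pair $8;
    do a = 1 to 3;
        do b = a + 1 to 4;
            if hinsur = a or hinsur = b then do;
                pair = catx('-', a, b);    /* e.g. 1-2, 1-3, ... */
                output;
            end;
        end;
    end;
    drop a b;
run;

proc sort data=pairs;
    by pair;
run;

/* One PROC FREQ step, one 2x2 table per pair; save the test results */
ods output ChiSq=chisq_all;
proc freq data=pairs;
    by pair;
    tables notwanted*hinsur / chisq nocol nopercent;
run;

/* Keep the Pearson chi-square p-value for each pair */
data pvals;
    set chisq_all;
    where Statistic = 'Chi-Square';
    raw_p = Prob;                          /* PROC MULTTEST looks for raw_p by default */
    keep pair raw_p;
run;

/* Bonferroni-adjusted p-values for the six comparisons */
proc multtest inpvalues=pvals bonferroni;
run;
```

The adjusted p-values from PROC MULTTEST can then be compared directly with 0.05, which saves students from computing the adjusted threshold by hand.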
