Hi,
I ran a chi-square test. My code is below:
proc format;
   /* Collapse the five-point scale into three categories */
   value $category
      ''                          = ''
      'Agree'                     = 'Agree1'
      'Disagree'                  = 'Disagree1'
      'Neither agree or disagree' = 'Neither agree or disagree1'
      'Strongly agree'            = 'Agree1'
      'Strongly disagree'         = 'Disagree1'
   ;
run;

/* Chi-square and exact tests on the collapsed response by program */
proc freq data=red.survey2;
   tables My_compensation_should_be_augmen*Program__select_one_ / chisq exact;
   format My_compensation_should_be_augmen $category.;
run;
I received the following output:
Table of My_compensation_should_be_augmen by Program__select_one_
My_compensation_should_be_augmen (My compensation should be augmented due to the associated risks)
Program__select_one_ (Program: select one#)

Frequency                    Anesthesia  Emergency Medicine  General Surgery  Total
Agree1                               28                  24               34     86
Disagree1                            12                   1                9     22
Neither agree or disagree1           17                   4               18     39
Total                                57                  29               61    147
Frequency Missing = 4 
Statistic                    DF    Value     Prob
Chi-Square                    4  10.0471   0.0396
Likelihood Ratio Chi-Square   4  11.1415   0.0250
Mantel-Haenszel Chi-Square    1   0.1559   0.6929
Phi Coefficient                   0.2614
Contingency Coefficient           0.2529
Cramer's V                        0.1849
I get a p-value of 0.0396.
This tells me that My_compensation_should_be_augmen differs significantly by program.
However, it does not tell me which specific differences are significant.
I want to say something like "the proportion of respondents who agreed was significantly higher in Emergency Medicine than in Anesthesia," or maybe "the proportion of respondents who agreed was significantly higher in Emergency Medicine than in Anesthesia and General Surgery." I am not sure which statement is correct.
How can I do that?
Thanks,
When you perform a two-way frequency analysis, the syntax
   tables X*Y / chisq;
performs a chi-square test for association between X and Y. The null hypothesis is that X and Y are independent. If the p-value is small, you reject the null hypothesis and conclude that there is an association between X and Y. See the section "Pearson Chi-Square Test for Two-Way Tables" in the PROC FREQ documentation.
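For readers who do not have the survey data, the test can be reproduced from the cell counts in the posted table. The following is only a sketch: the data set name counts and the variable names Response, Program, and Count are made up for illustration, and the counts are the ones shown in the question after the $category. recoding.

```sas
/* Hypothetical data set: one row per cell of the posted 3x3 table. */
data counts;
   length Response $26 Program $18;
   infile datalines dsd;              /* comma-delimited, so embedded blanks are kept */
   input Response $ Program $ Count;
   datalines;
Agree,Anesthesia,28
Agree,Emergency Medicine,24
Agree,General Surgery,34
Disagree,Anesthesia,12
Disagree,Emergency Medicine,1
Disagree,General Surgery,9
Neither agree or disagree,Anesthesia,17
Neither agree or disagree,Emergency Medicine,4
Neither agree or disagree,General Surgery,18
;

proc freq data=counts;
   weight Count;                      /* each row represents Count respondents */
   tables Response*Program / chisq expected;
run;
```

The WEIGHT statement tells PROC FREQ to treat each row as Count observations, so the Pearson chi-square should match the 10.0471 (DF = 4) reported in the question.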
Thank you for the reply. How can I determine which two numbers in the chi-square analysis are statistically different and are driving the association between the two variables?
Thanks,
There might be more than two numbers. It is the sum of the squared deviations that is important.
But to answer the spirit of your question, read the article "Color cells in a mosaic plot by deviation from independence," which shows how to create a mosaic plot in PROC FREQ that color-codes each cell according to its deviation from independence.
   tables X*Y / norow cellchi2 expected stdres crosslist
          missing plots=MosaicPlot(colorstat=StdRes);
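Applied to the data in the question, that TABLES statement might look like the sketch below. It assumes the $category. format from the original post has already been defined and that ODS Graphics is available. The MISSING option is left out here on purpose so that the 4 missing responses are not treated as a category of their own; that is a choice made for this sketch, not part of the suggestion above.

```sas
ods graphics on;   /* needed for the mosaic plot */

proc freq data=red.survey2;
   tables My_compensation_should_be_augmen*Program__select_one_
          / norow cellchi2 expected stdres crosslist
            plots=MosaicPlot(colorstat=StdRes);
   /* Reuse the three-category recoding from the question */
   format My_compensation_should_be_augmen $category.;
run;
```

The CELLCHI2 column shows how much each cell adds to the overall chi-square, and a common rule of thumb is to look first at cells whose standardized residual is larger than about 2 in absolute value. Working by hand from the posted counts, the expected count for Disagree1 in Emergency Medicine is 22*29/147, or about 4.34, against an observed count of 1, and the three Emergency Medicine cells contribute roughly 2.9, 2.6, and 1.8 of the total 10.05.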
A couple of things to note. You have 2 cells out of 9 where the counts are less than 5, so be wary of the standard chi-square test. Your code requests the EXACT option, so the exact test results should inform your next step.
Assume you have established that at least one column has a different association with the response variable than another. You get 2 chances to test this, so you need to clearly identify what question you are trying to answer. One way might be to ask: "Is anesthesia different from general surgery?" and "Is EM different from general surgery?" In this case, you have identified general surgery as the reference category, so you run follow-up analyses where you subset the data to include only anesthesia and general surgery in the first case and only EM and general surgery in the second. Those are your 2 chances to test under the reference scenario.
Another scenario might be to test (1) whether the collapsed incidence of anesthesia and EM is different from general surgery and (2) whether anesthesia is different from EM. What you shouldn't do is run all possible comparisons: anesthesia vs. general surgery, EM vs. general surgery, and anesthesia vs. EM.
So how do you go about this? The reference scenario can be done in two separate PROC FREQ calls, using the WHERE= data set option on DATA= to select the two columns of interest in each case. The second scenario involves some DATA step preprocessing to collapse the observations in two categories into a single category. You then compare the "collapsed" column to the third in one PROC FREQ call, and compare the two categories involved in the collapse in a second PROC FREQ call.
SteveDenham
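A rough sketch of both approaches described above, for anyone who wants a starting point. It assumes that Program__select_one_ is a character variable whose stored values are exactly the column headings in the posted output, and that the $category. format is already defined. The data set survey_collapsed and the variable ProgramGroup are hypothetical names introduced only for this example.

```sas
/* Reference scenario, comparison 1: Anesthesia vs. General Surgery */
proc freq data=red.survey2(where=(Program__select_one_ in ('Anesthesia', 'General Surgery')));
   tables My_compensation_should_be_augmen*Program__select_one_ / chisq exact;
   format My_compensation_should_be_augmen $category.;
run;

/* Reference scenario, comparison 2: Emergency Medicine vs. General Surgery */
proc freq data=red.survey2(where=(Program__select_one_ in ('Emergency Medicine', 'General Surgery')));
   tables My_compensation_should_be_augmen*Program__select_one_ / chisq exact;
   format My_compensation_should_be_augmen $category.;
run;

/* Alternative scenario, step 1: collapse Anesthesia and Emergency Medicine. */
data survey_collapsed;                 /* hypothetical data set name */
   set red.survey2;
   length ProgramGroup $32;            /* hypothetical variable name */
   if Program__select_one_ in ('Anesthesia', 'Emergency Medicine')
      then ProgramGroup = 'Anesthesia or Emergency Medicine';
   else if Program__select_one_ = 'General Surgery'
      then ProgramGroup = 'General Surgery';
   /* Any other value leaves ProgramGroup blank, so PROC FREQ drops it */
run;

/* Comparison 1: collapsed group vs. General Surgery */
proc freq data=survey_collapsed;
   tables My_compensation_should_be_augmen*ProgramGroup / chisq exact;
   format My_compensation_should_be_augmen $category.;
run;

/* Comparison 2: Anesthesia vs. Emergency Medicine */
proc freq data=red.survey2(where=(Program__select_one_ in ('Anesthesia', 'Emergency Medicine')));
   tables My_compensation_should_be_augmen*Program__select_one_ / chisq exact;
   format My_compensation_should_be_augmen $category.;
run;
```

Only one of the two scenarios should be run, chosen before looking at the results, so that the number of follow-up tests stays at two. If the stored program values differ from the displayed headings (for example, in case or trailing text), the WHERE= conditions need to be adjusted to match.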