I'm having trouble interpreting the results from PROC NPAR1WAY. Hopefully someone here can help me out.
I'm working on a project investigating whether there is any difference in physical therapy received by Whites and Blacks. The dependent variable is minutes of therapy received within a year, and the independent variable is race (White or Black).
Neither the Black nor the White group is normally distributed, so I used the Wilcoxon-Mann-Whitney (WMW) test instead. Below is the result.
My question is:
1. The p value for the F-test is 0.4398, indicating that race does not account for a significant portion of the variability in 'sum_PT_mins_1yr' (minutes of therapy). However, the result from the Kruskal-Wallis test (p<0.0001) seems to reject the null hypothesis (H0) of no difference in therapy received between the racial groups. I'm wondering whether these results contradict each other... Any thoughts?
2. I also tried the t test and PROC GLM with LSMEANS. Both suggest there is no racial difference in therapy, but I can't figure out which result I should trust. In addition, the SD from PROC GLM is much greater than the one from PROC MEANS. Does anyone know the reason for that? Thank you for sharing your valuable input. I'd appreciate it.
The results from GLM and the ANOVA part of NPAR1WAY agree completely, and point out the danger of not meeting the assumption of normality of errors when doing linear modeling. The Wilcoxon test, by contrast, is able to detect the large difference in location because it does not assume normality of errors. Since you know the assumptions for ANOVA are violated, I would not place a lot of trust in those p values.
Steve Denham
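For anyone who wants to see the two sets of p values side by side, here is a minimal sketch of the kind of call that produces both the ANOVA F-test and the Wilcoxon/Kruskal-Wallis output. The dataset name `therapy` and the class variable name `race` are assumptions for illustration; only `sum_PT_mins_1yr` comes from the original post.

```sas
/* Assumed names: dataset THERAPY, grouping variable RACE. */
/* ANOVA requests the one-way F-test on the raw values;    */
/* WILCOXON requests the rank-sum and Kruskal-Wallis tests */
proc npar1way data=therapy anova wilcoxon;
  class race;
  var sum_PT_mins_1yr;
run;
```

The F-test compares means and leans on normality of errors, while the Wilcoxon scores are based on ranks, so the two can disagree when the data are heavily skewed.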
You should not only test the distribution of your variable, but also look at it in both groups. Use PROC UNIVARIATE; simple histograms can be very informative.
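A minimal sketch of that by-group look, assuming the data sit in a dataset called `therapy` with a grouping variable called `race` (both names are placeholders, not taken from the original post):

```sas
/* Assumed names: dataset THERAPY, grouping variable RACE.    */
/* CLASS produces separate statistics and one histogram panel */
/* per race group, so the two shapes can be compared directly */
proc univariate data=therapy;
  class race;
  var sum_PT_mins_1yr;
  histogram sum_PT_mins_1yr;
run;
```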
To test for normality of errors, run your data in GLM and use the OUTPUT statement to get the residuals for each observation in a dataset. You can then use that as input to PROC UNIVARIATE and get graphical (QQ plot) and inferential test information about the residuals.
Steve Denham
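The residual check described above could be sketched roughly as follows. The dataset names `therapy` and `glm_resid`, the class variable `race`, and the output variable names `resid` and `pred` are all assumptions made for illustration.

```sas
/* Assumed names: THERAPY (input), RACE (group), GLM_RESID (output). */
/* Step 1: fit the one-way model and save residuals per observation. */
proc glm data=therapy;
  class race;
  model sum_PT_mins_1yr = race;
  output out=glm_resid r=resid p=pred;
run;
quit;

/* Step 2: examine the residuals. The NORMAL option adds tests of */
/* normality; QQPLOT and HISTOGRAM give the graphical checks.     */
proc univariate data=glm_resid normal;
  var resid;
  qqplot resid / normal(mu=est sigma=est);
  histogram resid / normal;
run;
```

If the points on the QQ plot bend well away from the reference line, that supports relying on the rank-based test rather than the ANOVA p values.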