Dear SAS experts,
I have a summary data set with means and standard deviations (SD) for a continuous variable, force (Fx), measured on about 5000 subjects. Fx is measured at 10 different time points (baseline, 8 hours, 16 hours, 24 hours, 48 hours, 3 days, 4 days, 5 days, 6 days, 7 days and 14 days). What is the best statistical analysis for checking whether there is any significant difference between the mean values? Should I use linear regression or something like a Cochran-Armitage trend test? Any help would be much appreciated.
Data have:
Time      Fx_Mean  Fx_SD
8 Hours   2.706    1.725
16 Hours  1.868    1.637
24 Hours  2.103    2.064
48 Hours  2.482    2.186
Day 3     2.328    2.179
Day 4     2.490    2.076
Day 5     2.308    2.000
Day 6     2.217    2.005
Day 7     2.644    2.019
Day 14    2.546    1.596
Thank you so much,
SM
Do you mean that you want to compare the mean at 8 hours with the mean at 16 hours? Or do you mean you want to compare all the means to each other to see if there is even one pair of means that differ? Or do you mean something else? Please be specific and detailed.
I want to compare all the means to each other to see if there is even one pair of means that differ, and at the same time compare the mean at 8 hours with the mean at 16 hours, etc. Do I need to do ANOVA?
ANOVA on summarized data should work.
Here is an explanation of how to do ANOVA on summarized data in SAS. Code is provided; scroll down in the Results tab to see the ANOVA performed on the summarized data:
https://support.sas.com/kb/25/020.html
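As a rough illustration of the idea (not necessarily the approach used in the linked note), one common way to run a one-way ANOVA from means and SDs is to build two pseudo-observations per group that reproduce each group's mean and sum of squares, and weight them with a FREQ statement. The per-time-point sample size `n = 500` is an assumption (the thread mentions both about 5000 and 500 subjects), it must be even for this trick, and the data set and variable names are made up for the sketch. Note that this treats the time points as independent groups, so it ignores the within-subject correlation of repeated measurements.

```sas
/* Summary statistics copied from the original post. */
data summary;
  length time $8;
  infile datalines dlm=',';
  input time $ fx_mean fx_sd;
  n = 500;  /* ASSUMPTION: same, even sample size at every time point */
  datalines;
8 Hours,2.706,1.725
16 Hours,1.868,1.637
24 Hours,2.103,2.064
48 Hours,2.482,2.186
Day 3,2.328,2.179
Day 4,2.490,2.076
Day 5,2.308,2.000
Day 6,2.217,2.005
Day 7,2.644,2.019
Day 14,2.546,1.596
;
run;

/* Two pseudo-observations per time point, each standing for n/2 subjects. */
/* They reproduce the group mean and the sum of squares (n-1)*SD**2.       */
data pseudo;
  set summary;
  d = fx_sd * sqrt((n - 1) / n);
  count = n / 2;
  fx = fx_mean + d; output;
  fx = fx_mean - d; output;
  keep time fx count;
run;

/* One-way ANOVA with Tukey-adjusted pairwise comparisons. */
proc glm data=pseudo;
  class time;
  freq count;
  model fx = time;
  means time / tukey cldiff;
run;
quit;
```

The overall F test answers whether any means differ; the Tukey table gives the pairwise comparisons, including 8 hours versus 16 hours.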
Your data look like repeated measures data.
I think you should use PROC MIXED or PROC GLIMMIX with an LSMEANS statement.
@SteveDenham or @StatDave could give you the exact syntax.
While this is repeated data, I would hesitate to try mixed model methods because it would be nearly impossible to estimate any variance-covariance structure. My reasoning there is that the mixed model method requires a subject level for estimating the variance-covariance matrix. I think a generalized estimating equation (GEE) approach would have the same difficulty regarding specification of a subject variable.
All that takes us back to the macro that @PaigeMiller referred to for the analysis of summary data.
SteveDenham
Hi experts,
I also have the individual-level data for this. I have 500 subjects who had the force (Fx) measured over time (variable: Time): baseline, 8 hours, 16 hours, 24 hours, 48 hours, 3 days, 4 days, 5 days, 6 days, 7 days and 14 days. What is the best way to compare Fx over time?
Thanks,
SM
Well, that would have been good to know earlier. If you have homogeneity of variance across the time points, you can use PROC GEE or PROC GLIMMIX for the analysis. If the variances are heterogeneous across time then GLIMMIX allows for the fitting of heterogeneous variance using a heterogeneous compound symmetry variance-covariance matrix or an unstructured covariance matrix. The following assumes heterogeneity by time.
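proc glimmix data=yourdata plots=all;
  class time subjectid;
  model force = time;  /* force is the response (Fx); time is a fixed effect */
  /* R-side (residual) covariance within subject; CSH allows a different variance at each time point */
  random time / subject=subjectid type=csh residual;
  lsmeans time / diff cl;  /* all pairwise differences with confidence limits */
run;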
This fits a model where the residuals are heterogeneous but normally distributed. The plots will enable you to decide whether this is a reasonable assumption. The lsmeans statement will yield comparisons between all 10 timepoints. Note that this involves 45 comparisons, which are unlikely to be independent. Control for multiplicity can be added to the lsmeans statement. For GLIMMIX, adjust=simulate(seed=<pick a number>) usually provides the best control of Type I error, especially when combined with the stepdown option.
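For illustration, the multiplicity control described above could be added to the same model roughly as follows. The seed value is an arbitrary placeholder, and `yourdata`, `force`, and `subjectid` are the same stand-in names used in the code above.

```sas
proc glimmix data=yourdata plots=all;
  class time subjectid;
  model force = time;
  random time / subject=subjectid type=csh residual;
  /* Simulation-based adjustment of the 45 pairwise p-values, */
  /* further adjusted in a step-down fashion.                 */
  /* The seed is arbitrary; fixing it makes results repeatable. */
  lsmeans time / diff cl adjust=simulate(seed=12345) stepdown;
run;
```

The adjusted p-values appear alongside the unadjusted ones in the table of least squares means differences.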
Since this is fitting an R-side variance-covariance matrix with no random effects, it could also be fit using PROC GEE. I will leave this for @StatDave to suggest some starting code.
SteveDenham
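As a possible starting point for the PROC GEE alternative mentioned above, here is a minimal sketch. It is an assumption about how one might set this up, not code supplied by anyone in the thread. It uses an exchangeable working correlation, which does not model heterogeneous variances explicitly; GEE relies on empirical (robust) standard errors instead. Data set and variable names are the same placeholders as in the GLIMMIX code.

```sas
proc gee data=yourdata;
  class time subjectid;
  model force = time / dist=normal;
  /* Repeated measurements within subject, ordered by time, */
  /* with an exchangeable working correlation structure.    */
  repeated subject=subjectid / within=time type=exch;
  lsmeans time / diff cl;  /* pairwise comparisons of time-point means */
run;
```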