Adam1
Calcite | Level 5

Hi everyone

I'm trying to model a continuous outcome variable (blood pressure) against categorical and continuous predictors. The study is longitudinal, with several observations for each individual and between 5 and 10 years of follow-up. I am interested in examining how treatment group affects blood pressure. For all treatment groups, the outcome (blood pressure) decreases during the first 3-4 years and then increases steadily for the remaining years of the study. Initially I used PROC MIXED with random effects for person (repeated measurements), treatment group (individuals were nested within treatment groups) and follow-up time. Here is the code:

proc mixed data = DATASET covtest noclprint method=reml;

class Person_ID Treatment_Group;

model bloodpressure = followup followup*followup Treatment_Group / solution ddfm = satterthwaite;

random intercept / sub=Person_ID type=ar(1);

random intercept / sub=Person_ID (Treatment_Group) type=ar(1);

run;

Thus, to handle the nonlinear outcome I squared the time variable (followup). This made the parameter estimates more credible, so the nonlinear outcome could be accounted for in PROC MIXED by this method.
Should I prefer doing this in PROC NLMIXED?
Note that SAS ran my analysis, which included almost a million observations, in 20 minutes; extremely fast compared to other software.

SteveDenham
Jade | Level 19

I would stay with MIXED, unless the quadratic nature doesn't capture the actual nonlinearity.  I would offer the following though:

Center followup before fitting it in the model.  The quadratic term may be generating extremely large leverage relative to the other terms.
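
As a minimal sketch of the centering step, assuming grand-mean centering is what is wanted: DATASET and followup are the names from the original post, while DATASET_C and followup_c are illustrative names chosen here. PROC SQL remerges the overall mean back onto every row and writes a note to the log saying so.

```sas
/* Grand-mean center follow-up time.                          */
/* DATASET and followup come from the original post;          */
/* DATASET_C and followup_c are hypothetical names.           */
proc sql;
  create table DATASET_C as
  select *,
         followup - mean(followup) as followup_c
  from DATASET;
quit;
```

The squared term is then built from followup_c in the MODEL statement, which usually reduces the correlation between the linear and quadratic time terms.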

Use ddfm=kenwardroger. It applies the Satterthwaite approximation to the degrees of freedom and also adjusts the standard errors of the fixed effects for the fact that the covariance parameters are estimated.

And that brings up the random statements.  You could replace the two statements with:

random intercept Treatment_Group / subject=Person_ID;

I don't understand how type=ar(1) applies correctly here, though.  This doesn't look like a repeated measures structure, as I don't see the intercept as something that varies with time (I could be wrong here, and may be about to learn something new, though).

Finally, this model assumes that the time course for all treatment groups is identical, shifting only the intercept as measured by Treatment_Group.  I would think that if there were a group effect, it would also change the time course.  To look for that, try the following model statement:

model bloodpressure = followup_c followup_c*followup_c Treatment_Group Treatment_Group*followup_c Treatment_Group*followup_c*followup_c / solution ddfm=kenwardroger; /* followup_c is the centered follow-up time */
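
Put together as a full step, the suggestion might look like the sketch below. It assumes a data set (called DATASET_C here, a hypothetical name) that already holds the centered time variable followup_c, and it keeps only a random intercept per person as a simplification; that is one reading of the advice, not necessarily the exact model that was run.

```sas
/* Quadratic time course allowed to differ by treatment group. */
/* DATASET_C is a hypothetical data set containing followup_c, */
/* the centered follow-up time.                                */
proc mixed data=DATASET_C covtest noclprint method=reml;
  class Person_ID Treatment_Group;
  model bloodpressure = followup_c followup_c*followup_c
                        Treatment_Group
                        Treatment_Group*followup_c
                        Treatment_Group*followup_c*followup_c
                        / solution ddfm=kenwardroger;
  /* Simplifying assumption: one random intercept per person.  */
  random intercept / subject=Person_ID;
run;
```

The type 3 tests for the two Treatment_Group by time interactions are the ones that indicate whether the groups follow different curves.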

Your method fits a smooth curve.  If the followup times fit a reasonable number of discrete time points, you may be able to fit this as follows:

proc mixed data = DATASET covtest noclprint method=reml;

class Person_ID Treatment_Group followup_cat;

model bloodpressure = Treatment_Group|followup_cat / solution ddfm=kenwardroger;

random intercept / sub=Person_ID ;

repeated followup_cat / sub=Person_ID (Treatment_Group) type=ar(1);

run;

This allows for a "non-smooth" time effect, and eliminates the need to construct a second-order time term.

Hope this helps.

Steve Denham

Adam1
Calcite | Level 5

Hi

I sincerely appreciate the detailed and educational reply, Steve. Thank you.

I took your advice: I grand-mean centered the continuous predictors (including followup), but I did not center the outcome (assuming interpretation would be more straightforward that way).

Since I applied all your advice at once, I could not conclude which one of these resulted in:

1) Running the models now takes 5-10 minutes (500,000 observations), which is several times faster.

2) 'Followup' and 'Followup*Followup' are significant as fixed effects (solutions), however the type 3 test results in a non significant 'followup'.

3) There were several interactions.

I did not fit your last model, since the repeated measurements are spaced very unequally in time for the individuals in the study.

I will run the models tomorrow and post some results here, in case that would be interesting.

Thanks again!

&

Happy New Year!

SteveDenham
Jade | Level 19

Two points.  First, watch out for comparing the results from the solution to the type 3 tests when there are interactions.  Trust the type 3 tests to tell you what is going on.  Also, the interactions are more important than the main effects when it comes to analysis of covariance (which is what this is).

Second, unequal spacing can be handled a couple of ways.  The spatial power structure is one that is used when the subjects have different spacing.  Another possibility is fitting a spline to the data, with missing values inserted for subjects as needed to complete the time effect.  However, with a very large dataset, that seems too time consuming to even consider.

Steve Denham
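
For the spatial power option, a hedged sketch follows. It assumes followup_c is a numeric, centered time variable that is not listed in the CLASS statement, that DATASET_C is a hypothetical data set holding it, and that each person has at most one observation per time value. With type=sp(pow), the correlation between two residuals from the same person is rho raised to the power of the time distance between them, so unequal spacing is handled directly.

```sas
/* Spatial power covariance for unequally spaced visits.       */
/* DATASET_C and followup_c are assumed names; followup_c must */
/* be numeric and must not appear in the CLASS statement.      */
proc mixed data=DATASET_C covtest noclprint method=reml;
  class Person_ID Treatment_Group;
  model bloodpressure = followup_c followup_c*followup_c
                        Treatment_Group
                        Treatment_Group*followup_c
                        Treatment_Group*followup_c*followup_c
                        / solution ddfm=kenwardroger;
  random intercept / subject=Person_ID;
  /* Within-person correlation = rho ** |time difference|      */
  repeated / subject=Person_ID type=sp(pow)(followup_c);
run;
```

If the random intercept and the sp(pow) parameters turn out to be hard to estimate together, dropping the RANDOM statement and keeping only the REPEATED statement is one thing to try.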


Adam1
Calcite | Level 5

I implemented all your suggestions, Steve; it worked great and the work is done.

Using only one random statement reduced computing times markedly.

sp(pow) brought parameter estimates closer to what descriptive data indicate.

Thanks.
