Accounting for regression to the mean

05-01-2013 08:46 AM

Apologies if this is a duplicate question, but I can't find anything similar using search.

I have a data set containing a continuous variable, isotope GFR (which should remain stable), measured at time 0 and at 4 months. Each subject is assigned one of three categorical outcome states at four months (stable, improved, deteriorated). I'm trying to compare the baseline variable between outcome groups.

Scatter and Galton plots suggest a degree of regression to the mean in the measured variable. My thought on accounting for this is to use ANCOVA as below. However, my stats background is limited, and I'd be very grateful for any comments on the appropriateness of this method and/or advice on how to handle this better.

Many thanks

Jime

PROC GLM DATA=Date;

CLASS RESPONSE;

MODEL GFR_0 =RESPONSE (GFR_4-GFR_0) ;

LSMEANS RESPONSE / ADJUST=TUKEY PDIFF ;

RUN;

Accepted Solutions

05-01-2013 09:01 AM

Try changing the model statement to:

MODEL GFR_4=RESPONSE GFR_0;

This will give the four-month value adjusted for the time 0 value, which is the usual ANCOVA way of allowing for regression to the mean.
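Dropped into the rest of the original step, the revised analysis might look like the sketch below. It assumes the data set name (Date), the variable names, and the LSMEANS options carry over unchanged from the question; the SOLUTION option and the QUIT statement are additions for illustration, not part of the original post.

```sas
/* ANCOVA sketch: four-month GFR compared across outcome groups,
   adjusted for the time 0 value (names taken from the question) */
proc glm data=Date;
   class RESPONSE;
   /* SOLUTION (an addition here) prints the parameter estimates,
      including the common slope on GFR_0 */
   model GFR_4 = RESPONSE GFR_0 / solution;
   /* By default the LS-means are evaluated at the mean of GFR_0 */
   lsmeans RESPONSE / adjust=tukey pdiff;
run;
quit;
```

The LS-means from this step are the group means of GFR_4 that would be expected if every group started from the same average GFR_0.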

You should probably have a preliminary step, to check the homogeneity of slopes across the RESPONSE categories (see Milliken and Johnson's *Analysis of Messy Data III: Analysis of Covariance*).

So first fit:

MODEL GFR_4=RESPONSE GFR_0 GFR_0*RESPONSE;

and check the significance of the interaction term. If it is non-significant, then the MODEL statement I gave at first is appropriate. If it is significant, the slopes differ between groups, so the group differences depend on the baseline value and need to be calculated at a minimum of three time 0 values (low, median, high) using multiple LSMEANS statements, each with the AT option (check the documentation on how to do this).
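A rough sketch of that two-step check follows, again using the names from the question. The baseline values 60, 80 and 100 are placeholders chosen only for illustration; in practice they would be replaced by low, median and high values taken from the observed GFR_0 distribution (the PROC MEANS step is one assumed way to get them).

```sas
/* Summary statistics to choose low / median / high baseline values */
proc means data=Date p10 median p90;
   var GFR_0;
run;

/* Unequal-slopes model: the GFR_0*RESPONSE test is the
   homogeneity-of-slopes check */
proc glm data=Date;
   class RESPONSE;
   model GFR_4 = RESPONSE GFR_0 GFR_0*RESPONSE;
   /* Only needed if the interaction is significant: compare groups
      at several baseline values (60, 80, 100 are placeholders) */
   lsmeans RESPONSE / at GFR_0=60  adjust=tukey pdiff;
   lsmeans RESPONSE / at GFR_0=80  adjust=tukey pdiff;
   lsmeans RESPONSE / at GFR_0=100 adjust=tukey pdiff;
run;
quit;
```

If the interaction is not significant, the LSMEANS statements here can be dropped and the equal-slopes model used instead.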

Steve Denham

All Replies

05-01-2013 09:15 AM

Many thanks Steve!