Suppose we have a regression model relating two variables, Yvar and Xvar, and the observations are grouped by a variable with 3 levels (say race: asian, black, white). If I color the scatterplot by group and see that all 3 groups appear to have different slopes, how do I test this using proc glm?
That is, if I want to test whether the slope of Yvar on Xvar differs across the levels of the race variable, how do I do this with proc glm? I'm pretty sure it involves partial F-tests, but I only have experience with those using proc reg.
Thanks in advance.
Accepted Solutions
The parallel slopes testing model is:
Y = B0 + B1*Asian + B2*Black + B3*White + B4*X + B5*X*Asian + B6*X*Black + B7*X*White + e
The CLASS RACE statement creates dummy variables (Asian, Black, White) with values (1,0,0) for RACE="Asian", (0,1,0) for RACE="Black" and (0,0,1) for RACE="White". SAS will force parameters B3 and B7 to zero because they are redundant, so that the linear equations are:
Asian : Y = (B0+B1) + (B4+B5)*X
Black : Y = (B0+B2) + (B4+B6)*X
White : Y = B0 + B4*X
The F test for effect RACE*X tests the hypothesis B5=B6=B7=0 (in practice B5=B6=0, since B7 is already fixed at zero) which, if true, would mean that all slopes are equal.
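For anyone who wants to try this end to end, here is a minimal sketch. The data set name slopes_demo, the seed, and the simulated slopes are all made up for illustration; only the PROC GLM step reflects the model described above.

```sas
/* Simulated data, for illustration only: three groups with different true slopes. */
/* The data set name, seed, slopes and sample sizes are assumptions, not from the thread. */
data slopes_demo;
   call streaminit(2024);
   length race $5;
   do race = 'asian', 'black', 'white';
      if race = 'asian' then slope = 1.0;
      else if race = 'black' then slope = 1.5;
      else slope = 2.0;
      do i = 1 to 30;
         xvar = rand('uniform') * 10;                    /* covariate on [0, 10) */
         yvar = 2 + slope * xvar + rand('normal', 0, 1); /* group-specific slope plus noise */
         output;
      end;
   end;
   drop i slope;
run;

/* Separate-slopes model: the xvar*race line in the ANOVA table is the equal-slopes test. */
proc glm data=slopes_demo;
   class race;
   model yvar = race xvar xvar*race / solution;
run;
quit;
```

With the SOLUTION option, the last level in sort order (white here) should show up as the reference level with its RACE and xvar*race estimates set to zero, which corresponds to B3 and B7 above.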
hth
PG
Message was edited by: PG corrected typo reported by Steve Denham.
I could be entirely wrong here:
1. Isn't that the parameter estimated for the categorical variable?
2. You could use contrast statements to test specific hypotheses.
The indicator (which has 3 levels here) is categorical, but the Y and X variables are numerical.
What do contrast statements look like?
Take a look at the Example in the documentation:
http://www.ats.ucla.edu/stat/sas/output/sas_glm_output.htm
It's a bit more complex than what you're doing, but it should give you an idea of how to specify the class statement for your categorical variable and use contrasts for testing.
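As a rough idea of what contrast statements could look like for the slopes question, here is a sketch. The data set name mydata is a placeholder, and the coefficient order assumes the race levels sort as asian, black, white.

```sas
proc glm data=mydata;   /* mydata is a placeholder data set name */
   class race;
   model yvar = race xvar xvar*race / solution;
   /* 2-df contrast: all three slopes equal (should agree with the xvar*race F test) */
   contrast 'equal slopes' xvar*race 1 -1  0,
                           xvar*race 0  1 -1;
   /* 1-df contrast: asian slope versus white slope */
   contrast 'asian vs white slope' xvar*race 1 0 -1;
run;
quit;
```

Each row of coefficients lines up with the levels of race in their sorted order, so it is worth checking the Class Level Information table in the output before trusting the labels.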
Testing for differences in slopes is done routinely prior to analysis of covariance (check the example in SAS/STAT(R) 9.3 User's Guide). You simply need to include a slope-by-class interaction term in the ANCOVA model to fit separate slopes for each class (Race):
proc glm data=mydata; /* mydata: placeholder data set name */
class Race;
model VarY = Race|VarX / solution;
estimate 'Asian vs Black' Race*VarX 1 -1 0;
run;
quit;
The Race*VarX term in the analysis of variance table tests for overall slope homogeneity. ESTIMATE statements test for individual differences between slopes.
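In partial F test terms, the overall test compares the separate-slopes model with a common-slope model. A sketch, using the B numbering from the accepted solution with B3 and B7 dropped; n (number of observations), SSE_F and SSE_R (error sums of squares of the full and reduced models) are notation introduced here:

```latex
\begin{aligned}
\text{Full:}\quad    & Y = \beta_0 + \beta_1\,\mathrm{Asian} + \beta_2\,\mathrm{Black} + \beta_4 X
                         + \beta_5\,X\cdot\mathrm{Asian} + \beta_6\,X\cdot\mathrm{Black} + \varepsilon \\
\text{Reduced:}\quad & Y = \beta_0 + \beta_1\,\mathrm{Asian} + \beta_2\,\mathrm{Black} + \beta_4 X + \varepsilon \\
H_0:\quad            & \beta_5 = \beta_6 = 0 \\
F \;=\;              & \frac{(SSE_R - SSE_F)/2}{SSE_F/(n-6)} \;\sim\; F_{2,\,n-6} \quad \text{under } H_0
\end{aligned}
```

The numerator has 2 degrees of freedom because the full model has six parameters and the reduced model four. Since the interaction is the highest-order term, SSE_R - SSE_F should match the Type III sum of squares that PROC GLM reports for Race*VarX.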
PG
Thanks. I've been a little confused, and I think it might stem from my understanding of the partial F test rather than how to use SAS. So I would like to check that I have the right approach:
I used proc glm to get the various sums of squares needed for a partial F test.
Now, for testing the hypothesis that the slopes are all different, would it be correct to describe the model as follows?
Let R be an indicator variable with 3 levels.
Y = B_0 + B_1 X + B_2 R + B_3 XR
Then, would I test the hypothesis that B_1 = B_2 = B_3 by setting up the model Y = B_0 + B (X + R + (X)(R))? And finally, would I just use the sum of squares information, with this as my reduced model?
PG, I always believed that the test for the hypothesis of equal slopes (B5=B6=B7=0) was RACE*VarX, as you mentioned in your first post, and the F test for effect RACE tested B1=B2=B3=0, or equality of intercepts.
Steve Denham
Thanks Steve. I corrected the typo. - PG