rfrancis
Obsidian | Level 7

Hi everyone, I'm working with a dataset that is both cross-sectional and longitudinal, which creates clustering issues for the standard errors (clustered by subject and by time). The goal is to test for differences between regression coefficients from two different models. For example, say Model 1 is Y = X1 + X2 and Model 2 is Y' = X1' + X2'. The goal is a test of equality between the coefficients for X1 and X1', and between X2 and X2'. PROC SYSLIN seems like the perfect vehicle for this scenario, but the clustered data creates a problem for the standard errors used in SYSLIN. I guess two-way SEs for SYSLIN would be a solution. I have access to two-way SEs using SURVEYREG, so I guess I could use those, but I need a bridge to incorporate them into SYSLIN. I'm ALL ears for other approaches, and I'm grateful for any ideas. Thank you! Rick

13 REPLIES
StatDave
SAS Super FREQ

The meaning of your notation is not clear... but if the variables in two models are the same variables just measured on independent sets of data, such as observations on two independent groups, then you can use the method discussed in this note. You could use a GEE model to deal with the clustering. 

rfrancis
Obsidian | Level 7

The two models are fit to the same dataset. However, the measurements of Y, X1 and X2 are different (I tried to highlight this idea with prime notation: Y', X1' and X2'). Thank you for your suggestion.

PaigeMiller
Diamond | Level 26

@rfrancis wrote:

The two models are fit to the same dataset. However, the measurements of Y, X1 and X2 are different (I tried to highlight this idea with prime notation: Y', X1' and X2'). Thank you for your suggestion.


I'm still not clear on this. Y is a different variable than Y' and so on? Same variable measured by a different method/different device? Can you explain further what the differences are?

 

If these are all truly different variables, I don't see why you would expect the slopes of X1 and X2 to predict Y would be the same as the slopes of X1' and X2' to predict Y', nor do I see a reason why a statistical test of the slopes would be valid.

--
Paige Miller
rfrancis
Obsidian | Level 7

Hi Paige, the difference between the variables in the two models is simply how we scale them (i.e., deflate them). Thanks for your help!  Rick

Rick_SAS
SAS Super FREQ

So you are fitting one model in PROC SYSLIN and the other in PROC SURVEYREG? Can you post the SAS code? 

These procedures have very different assumptions. How was the data collected? For example, is your data from a survey or from a time series?

rfrancis
Obsidian | Level 7

Probably best to purge this issue of any existing statistical procedures and start with a clean slate. Think of the variables (both Y and X) for both models as consisting of a numerator and a denominator. The numerators are identical for both models. The denominators represent an attempt to scale the numerators, and the denominators are different for the two models. The goal is to compare the coefficients between the two models. Does this help frame the issue?  Thanks for your help, guys!  Rick

Rick_SAS
SAS Super FREQ

You can write down the relationship between regression coefficients for one set of variables as compared to a scaled version of those variables. See the section "The effect of standardizing variables on regression estimates" in the article "Standardized regression coefficients", which not only handles scaling but also recentering the variables.

 

For simple scaling, the regression coefficients will rescale according to the denominator that you are using. Suppose that the predictive regression model is

Y = b0 + b1*X1 + b2*X2.

You are interested in what the regression would be if you define Z1 = X1/c1 and Z2 = X2/c2. Solving for X1 and X2 and plugging in gives

Y = b0 + (b1*c1)*Z1 + (b2*c2)*Z2.

So the regression coefficient for Z1 is c1 times as big as for X1, and the regression coefficient for Z2 is c2 times as big as for X2.
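A quick numeric check of this rescaling, sketched in plain Python rather than SAS. The simulated data, the constants `c1` and `c2`, and the helper `ols` are all made up for illustration; nothing here comes from the original poster's dataset.

```python
import random

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

random.seed(1)
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
y = [1.0 + 2.0 * u + 3.0 * v + random.gauss(0, 0.5) for u, v in zip(x1, x2)]

c1, c2 = 4.0, 10.0  # hypothetical scaling constants

# Fit Y on (X1, X2), then on (Z1, Z2) = (X1/c1, X2/c2).
b = ols([[1.0, u, v] for u, v in zip(x1, x2)], y)
g = ols([[1.0, u / c1, v / c2] for u, v in zip(x1, x2)], y)

print("original coefficients:", b)
print("rescaled coefficients:", g)
print("b1*c1, b2*c2         :", b[1] * c1, b[2] * c2)
```

The intercept is unchanged, and the slopes on Z1 and Z2 match b1*c1 and b2*c2 up to floating-point error.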

rfrancis
Obsidian | Level 7

Not sure if that works if you also rescale the dependent variable.

PaigeMiller
Diamond | Level 26

Is the scaling that transforms Y into Y' the same scaling that transforms X1 into X1' and X2 into X2'? In other words, to do a scaling, you multiply or divide by a constant. Is it the same constant for all three variables, or is the constant different for each variable?

--
Paige Miller
rfrancis
Obsidian | Level 7

Yes Paige, you are correct. There may be a much simpler solution (famous last words) ... I'm thinking now that I could use two samples: one for each measurement.  Then simply use a dummy variable to pick up the difference in the coefficients. Thanks for your time, I really appreciate it!  Rick
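A rough sketch of the dummy-variable idea in plain Python (not SAS), under assumptions that are not from the thread: the data are simulated, the deflators `z` and `w` are hypothetical, and each model keeps an ordinary intercept. It only illustrates the point estimates: in a stacked regression with a dummy D and its interactions with every regressor, the interaction coefficients equal the differences between the two separately fitted coefficient vectors. It says nothing about the standard errors, which in the clustered setting described at the top of the thread would still need a cluster-robust estimator.

```python
import random

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

random.seed(3)
n = 300
# Hypothetical numerators and two per-observation deflators.
x1 = [random.gauss(5, 1) for _ in range(n)]
x2 = [random.gauss(5, 1) for _ in range(n)]
y = [1.0 + 2.0 * u + 3.0 * v + random.gauss(0, 1) for u, v in zip(x1, x2)]
z = [random.uniform(1, 3) for _ in range(n)]
w = [random.uniform(2, 5) for _ in range(n)]

def deflate(s):
    """Return (design rows, response) with Y, X1, X2 each divided by s."""
    rows = [[1.0, u / k, v / k] for u, v, k in zip(x1, x2, s)]
    resp = [t / k for t, k in zip(y, s)]
    return rows, resp

rows_a, y_a = deflate(z)   # measurement A
rows_b, y_b = deflate(w)   # measurement B
coef_a = ols(rows_a, y_a)
coef_b = ols(rows_b, y_b)

# Stack both samples; columns are [1, X1, X2, D, D*X1, D*X2] with D = 1 for B.
stacked = ([r + [0.0, 0.0, 0.0] for r in rows_a]
           + [r + [1.0, r[1], r[2]] for r in rows_b])
coef_s = ols(stacked, y_a + y_b)

print("separate fits, B minus A:", [cb - ca for ca, cb in zip(coef_a, coef_b)])
print("stacked dummy terms     :", coef_s[3:])
```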

PaigeMiller
Diamond | Level 26

@rfrancis wrote:

Yes Paige, you are correct. 


Which of my questions are you saying that I am correct about?

--
Paige Miller
rfrancis
Obsidian | Level 7

Paige, in detail, Model 1 looks like this: Y/Z = X1/Z + X2/Z, where X, Y and Z are variables (no constants). So, for a given observation, Y, X1 and X2 are scaled by the same value. Same for Model 2 just a different scalar (i.e., instead of Z, substitute say W, which is a variable not a constant). I think I have a solution that will work.  Thank you!!  Rick

Rick_SAS
SAS Super FREQ

Of course it works. Any linear transformation of variables in a linear equation can be explicitly solved.

If V = Y/d (in addition to the above), then

V = b0/d + (b1*c1/d)*Z1 + (b2*c2/d)*Z2
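Whatever constant multiplies the response multiplies every coefficient, intercept included. The sketch below checks this numerically in plain Python (not SAS), assuming the response is divided by a constant, V = Y/d, in the same way that Z1 = X1/c1 and Z2 = X2/c2. The data and the values of `c1`, `c2`, and `d` are invented for illustration.

```python
import random

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

random.seed(2)
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
y = [1.0 + 2.0 * u + 3.0 * v + random.gauss(0, 0.5) for u, v in zip(x1, x2)]

c1, c2, d = 4.0, 10.0, 5.0  # hypothetical scaling constants

# Original fit: Y on X1, X2.
b = ols([[1.0, u, v] for u, v in zip(x1, x2)], y)

# Rescaled fit: V = Y/d on Z1 = X1/c1, Z2 = X2/c2.
v_resp = [t / d for t in y]
g = ols([[1.0, u / c1, v / c2] for u, v in zip(x1, x2)], v_resp)

expected = [b[0] / d, b[1] * c1 / d, b[2] * c2 / d]
print("rescaled fit :", g)
print("from formula :", expected)
```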
