Re: Compare regression coefficients from two models using clustered data

rfrancis · Posted 12-07-2020 04:31 PM

Hi everyone, I'm working with a dataset that is both cross-sectional and longitudinal, which creates clustering issues for the standard errors (clustered by subject and by time). The goal is to test for differences between regression coefficients from two different models. For example, say Model 1 is Y = X1 + X2 and Model 2 is Y' = X1' + X2'. I want to test equality between the coefficients for X1 and X1', and between X2 and X2'.

PROC SYSLIN seems like the perfect vehicle for this scenario, but the clustered data creates a problem for the standard errors that SYSLIN uses. I guess two-way SEs for SYSLIN would be a solution. I have access to two-way SEs through SURVEYREG, so I could use those, but I need a bridge to incorporate them into SYSLIN. I'm all ears for other approaches and grateful for any ideas. Thank you! Rick

StatDave · Posted 12-07-2020 04:56 PM

The meaning of your notation is not clear... but if the variables in two models are the same variables just measured on independent sets of data, such as observations on two independent groups, then you can use the method discussed in this note. You could use a GEE model to deal with the clustering.

rfrancis · Posted 12-07-2020 06:41 PM

The two models are fit to the same dataset. However, the measurements of Y, X1 and X2 are different (I tried to highlight this idea with prime notation, Y', X1' and X2'). Thank you for your suggestion.

PaigeMiller · Posted 12-08-2020 07:03 AM

@rfrancis wrote:

The two models are fit to the same dataset. However, the measurements of Y, X1 and X2 are different (I tried to highlight this idea with prime notation, Y', X1' and X2'). Thank you for your suggestion.

I'm still not clear on this. Y is a different variable than Y' and so on? Same variable measured by a different method/different device? Can you explain further what the differences are?

If these are all truly different variables, I don't see why you would expect the slopes of X1 and X2 to predict Y would be the same as the slopes of X1' and X2' to predict Y', nor do I see a reason why a statistical test of the slopes would be valid.

--
Paige Miller

rfrancis · Posted 12-08-2020 10:40 AM

Hi Paige, the difference between the variables in the two models is simply how we scale them (i.e., deflate them). Thanks for your help! Rick

Rick_SAS · Posted 12-08-2020 09:06 AM

So you are fitting one model in PROC SYSLIN and the other in PROC SURVEYREG? Can you post the SAS code?

These procedures have very different assumptions. How was the data collected? For example, is your data from a survey or from a time series?

rfrancis · Posted 12-08-2020 10:53 AM

Probably best to purge this issue of any existing statistical procedures and start with a clean slate. Think of the variables (both Y and X) in both models as consisting of a numerator and a denominator. The numerators are identical for both models. The denominators represent an attempt to scale the numerators, and the denominators are different for the two models. The goal is to compare the coefficients between the two models. Does this help frame the issue? Thanks for your help, guys! Rick

Rick_SAS · Posted 12-08-2020 11:13 AM

You can write down the relationship between regression coefficients for one set of variables as compared to a scaled version of those variables. See the section "The effect of standardizing variables on regression estimates" in the article "Standardized regression coefficients", which not only handles scaling but also recentering the variables.

For simple scaling, the regression coefficients will rescale according to the denominator that you are using. Suppose that the predictive regression model is

Y = b0 + b1*X1 + b2*X2.

You are interested in what the regression would be if you define Z1 = X1/c1 and Z2 = X2/c2. Solving for X1 and X2 and plugging in gives

Y = b0 + (b1*c1)*Z1 + (b2*c2)*Z2.

So the regression coefficient for Z1 is c1 times as big as for X1, and the regression coefficient for Z2 is c2 times as big as for X2.
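A quick numerical check of this rescaling, as a minimal sketch on simulated data. The data, the constants c1 and c2, and the small `ols` helper (normal equations solved by Gauss-Jordan elimination) are all made up for illustration; in practice you would use your regression procedure of choice.

```python
import random

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X)b = X'y.
    Each row must already contain a leading 1 for the intercept."""
    k = len(rows[0])
    a = []
    for i in range(k):
        xtx = [sum(r[i] * r[j] for r in rows) for j in range(k)]
        xty = sum(r[i] * yi for r, yi in zip(rows, y))
        a.append(xtx + [xty])          # augmented matrix [X'X | X'y]
    for col in range(k):               # Gauss-Jordan with partial pivoting
        piv = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

random.seed(1)
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
y = [2 + 3 * u - 1.5 * v + random.gauss(0, 1) for u, v in zip(x1, x2)]

c1, c2 = 10.0, 0.5                     # hypothetical scaling constants
z1 = [u / c1 for u in x1]              # Z1 = X1/c1
z2 = [v / c2 for v in x2]              # Z2 = X2/c2

bx = ols([[1, u, v] for u, v in zip(x1, x2)], y)   # fit on X1, X2
bz = ols([[1, u, v] for u, v in zip(z1, z2)], y)   # fit on Z1, Z2

print("coefficients on X:", bx)
print("coefficients on Z:", bz)
```

The intercept should agree between the two fits, and the slopes on Z1 and Z2 should equal c1 and c2 times the slopes on X1 and X2, up to floating-point error.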

rfrancis · Posted 12-08-2020 11:30 AM

Not sure if that works if you also rescale the dependent variable.

PaigeMiller · Posted 12-08-2020 11:37 AM

Is the scaling that transforms Y into Y' the same scaling that transforms X1 into X1' and X2 into X2'? In other words, if scaling means multiplying or dividing by a constant, is it the same constant for all three variables, or a different constant for each variable?

--
Paige Miller

rfrancis · Posted 12-08-2020 12:17 PM

Yes Paige, you are correct. There may be a much simpler solution (famous last words) ... I'm thinking now that I could use two samples: one for each measurement. Then simply use a dummy variable to pick up the difference in the coefficients. Thanks for your time, I really appreciate it! Rick
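The dummy-variable idea can be sketched as follows, assuming the two measurements are stacked into one dataset with an indicator D (0 for the first measurement, 1 for the second) and D is interacted with both predictors. Everything here is hypothetical: the simulated data, the deflators z and w, and the small `ols` helper. The sketch covers point estimates only. Because each subject appears in both halves of the stacked data, the standard errors would still need to be clustered before any test of the interaction terms is trusted.

```python
import random

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X)b = X'y.
    Each row must already contain a leading 1 for the intercept."""
    k = len(rows[0])
    a = []
    for i in range(k):
        xtx = [sum(r[i] * r[j] for r in rows) for j in range(k)]
        xty = sum(r[i] * yi for r, yi in zip(rows, y))
        a.append(xtx + [xty])          # augmented matrix [X'X | X'y]
    for col in range(k):               # Gauss-Jordan with partial pivoting
        piv = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

random.seed(7)
rows_a, y_a, rows_b, y_b = [], [], [], []
stack_rows, stack_y = [], []
for _ in range(300):
    x1, x2 = random.gauss(5, 1), random.gauss(3, 1)
    y = 1.0 + 2.0 * x1 - 0.5 * x2 + random.gauss(0, 1)
    z, w = random.uniform(1, 3), random.uniform(1, 3)  # hypothetical deflators
    for d, s in ((0, z), (1, w)):      # d = 0: deflate by z; d = 1: deflate by w
        u1, u2, v = x1 / s, x2 / s, y / s
        # Stacked design: intercept, X1, X2, D, D*X1, D*X2
        stack_rows.append([1, u1, u2, d, d * u1, d * u2])
        stack_y.append(v)
        (rows_b if d else rows_a).append([1, u1, u2])
        (y_b if d else y_a).append(v)

fit_a = ols(rows_a, y_a)               # Model 1 on its own
fit_b = ols(rows_b, y_b)               # Model 2 on its own
fit_s = ols(stack_rows, stack_y)       # stacked model with dummy interactions

print("X1 difference from separate fits:", fit_b[1] - fit_a[1])
print("D*X1 coefficient in stacked fit: ", fit_s[4])
print("X2 difference from separate fits:", fit_b[2] - fit_a[2])
print("D*X2 coefficient in stacked fit: ", fit_s[5])
```

With a fully interacted model, the D*X1 and D*X2 coefficients reproduce the differences between the separately fitted slopes, so a test on those two terms is a test of equality of the coefficients across the two measurements.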

PaigeMiller · Posted 12-09-2020 06:54 AM

@rfrancis wrote:

Yes Paige, you are correct.

Which of my questions are you saying that I am correct about?

--
Paige Miller

rfrancis · Posted 12-09-2020 10:16 AM

Paige, in detail, Model 1 looks like this: Y/Z = X1/Z + X2/Z, where X, Y and Z are variables (no constants). So, for a given observation, Y, X1 and X2 are scaled by the same value. Same for Model 2 just a different scalar (i.e., instead of Z, substitute say W, which is a variable not a constant). I think I have a solution that will work. Thank you!! Rick
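One way to read Model 1, assuming the deflated regression has no intercept (an assumption; the post does not say either way): since every term of observation i is divided by the same Z_i, the least-squares criterion for the deflated model is a weighted version of the criterion for the undeflated one.

```latex
\sum_i \left( \frac{Y_i}{Z_i} - b_1 \frac{X_{1i}}{Z_i} - b_2 \frac{X_{2i}}{Z_i} \right)^2
  = \sum_i \frac{1}{Z_i^2} \left( Y_i - b_1 X_{1i} - b_2 X_{2i} \right)^2
```

Under that reading, Model 1 and Model 2 estimate the same b1 and b2 by weighted least squares with weights 1/Z_i^2 and 1/W_i^2 respectively. If the deflated model does include an intercept a, it corresponds to a term a*Z_i (or a*W_i) on the undeflated scale.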

Rick_SAS · Posted 12-08-2020 12:59 PM

Of course it works. Any linear transformation of variables in a linear equation can be explicitly solved.

If V = Y/d (in addition to the above), then

V = b0/d + (b1*c1/d)*Z1 + (b2*c2/d)*Z2
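A numerical check of this formula, as a minimal sketch on simulated data, taking V = Y/d so that the response is divided by its constant in the same way that Z1 = X1/c1 and Z2 = X2/c2 are. The data, the constants c1, c2 and d, and the `ols` helper are all hypothetical.

```python
import random

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X)b = X'y.
    Each row must already contain a leading 1 for the intercept."""
    k = len(rows[0])
    a = []
    for i in range(k):
        xtx = [sum(r[i] * r[j] for r in rows) for j in range(k)]
        xty = sum(r[i] * yi for r, yi in zip(rows, y))
        a.append(xtx + [xty])          # augmented matrix [X'X | X'y]
    for col in range(k):               # Gauss-Jordan with partial pivoting
        piv = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

random.seed(3)
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
y = [2 + 3 * u - 1.5 * w + random.gauss(0, 1) for u, w in zip(x1, x2)]

c1, c2, d = 10.0, 0.5, 4.0             # hypothetical scaling constants
z_rows = [[1, u / c1, w / c2] for u, w in zip(x1, x2)]  # Z1 = X1/c1, Z2 = X2/c2
v = [yi / d for yi in y]               # V = Y/d

bx = ols([[1, u, w] for u, w in zip(x1, x2)], y)  # Y on X1, X2
bv = ols(z_rows, v)                               # V on Z1, Z2

print("predicted from formula:", [bx[0] / d, bx[1] * c1 / d, bx[2] * c2 / d])
print("fitted directly:       ", bv)
```

When c1 = c2 = d, the slope factors c1/d and c2/d equal 1, so the slopes are unchanged and only the intercept is divided by d.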
