Re: Proc Surveyreg, F-test of coefficients from separate models

lezgin · Posted 09-18-2020 12:10 AM

I am running two separate regressions using proc surveyreg as follows.

proc surveyreg data=have1; cluster date; model y= x1 x2 x1*x2 x3 x4 x5/ adjrsq;run;quit;

proc surveyreg data=have2; cluster date; model y= x1 x2 x1*x2 x3 x4 x6 x7/ adjrsq;run;quit;

I need to do an F-test to determine whether the coefficient on the interaction term x1*x2 from the first model equals the coefficient on x1*x2 from the second model. Please note that the two models do not share the same set of independent variables. I appreciate any help.
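For reference, the hypothesis being asked about can be written out as below. This is only a sketch of the usual Wald form, and it assumes the two estimates can be treated as independent (plausible only if the two data sets share no clusters); the symbols are introduced here for illustration and do not come from SAS output.

```latex
% \hat\beta_k : estimated coefficient on x1*x2 in model k (k = 1, 2)
% \widehat{SE}_k : its cluster-robust standard error from PROC SURVEYREG
H_0:\ \beta_1 = \beta_2
\qquad
t = \frac{\hat\beta_1 - \hat\beta_2}
         {\sqrt{\widehat{SE}_1^{\,2} + \widehat{SE}_2^{\,2}}},
\qquad
F = t^2 \quad \text{(one numerator degree of freedom)}
```

The denominator degrees of freedom depend on the number of clusters in each sample, so the reference distribution should be chosen with that in mind.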

PaigeMiller · Posted 09-18-2020 06:11 AM

What is the difference between HAVE1 and HAVE2? Is it just two different sub-populations (such as males and females) of a larger population? Are the date clusters the same?

--
Paige Miller

lezgin · Posted 09-18-2020 09:15 AM

HAVE1 and HAVE2 have different date clusters. The year and month dummies and the sample sizes are also different.

PaigeMiller · Posted 09-18-2020 01:03 PM

Repeating my question:

What is the difference between HAVE1 and HAVE2? Is it just two different sub-populations (such as males and females) of a larger population?

--
Paige Miller

lezgin · Posted 09-18-2020 01:16 PM

Sorry if it wasn't clear. Have1 and have2 are subsamples based on the cluster variable.

PaigeMiller · Posted 09-18-2020 02:03 PM

@lezgin wrote:
Sorry if it wasn't clear. Have1 and have2 are subsamples based on the cluster variable.

The problem I am trying to understand is whether the two data sets, HAVE1 and HAVE2, can be combined in some reasonable fashion, for example if they are two sub-populations. If the cluster dates are different, it's not clear to me how to proceed. It's also not clear to me how to proceed if one model has a different set of predictor variables than the other model. So I think I will leave it there.

--
Paige Miller

lezgin · Posted 09-18-2020 02:10 PM

Yes, the cluster dates are different, and this causes differences in the independent variables, but only for the time dummies. Otherwise, I could concatenate the two samples, create a binary variable to distinguish one sample from the other, and expand the model by interacting all variables with this dummy. The triple interaction between this new binary variable and the x1*x2 interaction would give me the difference and its t-statistic. However, this results in an error because, I believe, some of the dummy variables are missing in one subsample.
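The stacked approach described above might look roughly like the following. This is an untested sketch: the data set name BOTH and the indicator GRP are made up for illustration, and it assumes that x5, x6 and x7 are the time dummies that exist in only one subsample, so that setting them to 0 where they do not apply is meaningful. Zero-filling is one possible way around observations being dropped for missing values; it may or may not be the cause of the error mentioned.

```sas
/* Stack the two subsamples and flag where each row came from. */
data both;
   set have1(in=in1) have2;
   grp = in1;                      /* 1 = row from HAVE1, 0 = row from HAVE2 */
   /* Assumption: dummies not defined in a subsample are set to 0 */
   if in1 then do;
      x6 = 0;
      x7 = 0;
   end;
   else x5 = 0;
run;

/* Fully interacted model; grp*x1*x2 estimates the difference      */
/* between the two x1*x2 coefficients (HAVE1 minus HAVE2).         */
proc surveyreg data=both;
   cluster date;
   model y = grp x1 x2 x1*x2 x3 x4
             grp*x1 grp*x2 grp*x1*x2 grp*x3 grp*x4
             x5 x6 x7;
run;
```

Under these assumptions the t-test reported for grp*x1*x2 is the test of equality, and its square is the corresponding single-degree-of-freedom F statistic.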

SteveDenham · Posted 09-21-2020 08:41 AM

This is convoluted and perhaps unnecessarily complicated, but what would happen if:

You create a 'shell' for have1 and have2 so that all of the observations in have1 have a corresponding record in have2 and vice versa. There will be missing values for both the independent and dependent variables.

Use PROC MI to impute the missing values. This is likely the most complicated part - selecting the best method for imputation.

Run SURVEYREG on several imputed data sets, with the interaction you are considering in the model.

Use PROC MIANALYZE to get a combined analysis for the imputed datasets.
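The mechanics of the last three steps could be sketched as follows. This is untested and rests on several assumptions: SHELL is a hypothetical data set that already holds the stacked 'shell' with missing values, the number of imputations and the seed are arbitrary, and the default PROC MI method is used only as a placeholder (it may not suit binary dummies).

```sas
/* Impute the missing values; method choice is the hard part. */
proc mi data=shell out=shell_mi nimpute=20 seed=20200921;
   var y x1 x2 x3 x4 x5 x6 x7;
run;

/* Build the product term after imputation so that it has a */
/* simple name for PROC MIANALYZE.                           */
data shell_mi;
   set shell_mi;
   x1x2 = x1*x2;
run;

/* Fit the model once per imputed data set. */
proc surveyreg data=shell_mi;
   by _Imputation_;
   cluster date;
   model y = x1 x2 x1x2 x3 x4 x5 x6 x7;
   ods output ParameterEstimates=pe;
run;

/* Combine the per-imputation estimates. */
proc mianalyze parms=pe;
   modeleffects Intercept x1 x2 x1x2 x3 x4 x5 x6 x7;
run;
```

This only shows how the procedures chain together; which terms belong in the model to compare the two subsamples is a separate decision.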

SteveDenham
