braam
Quartz | Level 8

Dear All,

 

I was wondering how I can run a fixed-effects regression with clustered standard errors. I have panel data on individuals who are each observed multiple times. I would like to run the regression with individual fixed effects and with standard errors clustered by individual. Since I have several thousand individuals, putting the ID in a CLASS statement with PROC SURVEYREG is really inefficient, and SAS reports insufficient memory. So I don't think I can use PROC SURVEYREG.

 

Can I achieve this using PROC GLM or PROC MODEL? I searched, but didn't find a clear way to do so. Thanks in advance.

13 REPLIES
PaigeMiller
Diamond | Level 26

Maybe PROC GLM with a WEIGHT statement? https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.4&docsetId=statug&docsetTarget=statu...

 

From the documentation: "If the weights for the observations are proportional to the reciprocals of the error variances, then the weighted least squares estimates are best linear unbiased estimators (BLUE)"

--
Paige Miller
braam
Quartz | Level 8
Isn't WLS about heteroscedasticity (i.e., unequal variances), while clustering standard errors is about covariance among the multiple observations within a unit? I think they are two different issues.
PaigeMiller
Diamond | Level 26

How are you thinking about including cluster in any model you would fit?

--
Paige Miller
braam
Quartz | Level 8
I'm not sure if I understand your suggestion.

What I would like to do is include IDs as fixed effects and, at the same time, get standard errors clustered by ID. I know it's possible with PROC SURVEYREG, but when there are many ID values it's practically impossible. So I'm looking for another procedure.
PaigeMiller
Diamond | Level 26

@braam wrote:
... and get standard errors clustered by IDs at the same time.

Now this implies that the standard errors clustered by IDs are the output of the regression. Is that correct? I thought the standard errors were inputs to a regression.

--
Paige Miller
braam
Quartz | Level 8
Sorry for the confusion. Yes, I would like to 1) have clustered standard errors and 2) include individual-fixed effects.
PaigeMiller
Diamond | Level 26

Show us the SURVEYREG code you were thinking of using, even if it doesn't work because there are too many individuals.

--
Paige Miller
braam
Quartz | Level 8

This is the code that you requested. In this example, Origin plays the role of the ID. With too many values for Origin, this type of regression becomes really inefficient; with my data it takes several hours.

 

Below is the GLM code, where I cannot cluster standard errors. Here I absorb Origin rather than estimating its fixed effects. I expected the same coefficient on Cylinders from the two approaches, but they differ, which is strange to me.

 

proc surveyreg data= sashelp.cars;
	cluster Origin;
	class Origin Type;
	model EngineSize= Cylinders Origin Type/ solution;
	run;

proc glm data= sashelp.cars;
	absorb Origin;
	class Type;
	model EngineSize= Cylinders Type/ solution;
	run;
SURVEYREG RESULT

Estimated Regression Coefficients

Parameter        Estimate   Standard Error   t Value   Pr > |t|
Intercept      -0.2423962       0.24823069     -0.98     0.4318
Cylinders       0.6195316       0.03299998     18.77     0.0028
Origin Asia    -0.2473363       0.02963121     -8.35     0.0141
Origin Europe  -0.4510775       0.00538821    -83.72     0.0001
Origin USA      0.0000000       0.00000000       .          .
Type Hybrid    -0.1485498       0.10737472     -1.38     0.3007
Type SUV        0.2723754       0.09245885      2.95     0.0985
Type Sedan     -0.0206628       0.05296500     -0.39     0.7341
Type Sports     0.1480223       0.17265540      0.86     0.4816
Type Truck      0.5319361       0.11385004      4.67     0.0429
Type Wagon      0.0000000       0.00000000       .          .

 

GLM RESULT

Parameter          Estimate      Standard Error   t Value   Pr > |t|
Cylinders      0.6292556337          0.01473441     42.71     <.0001
Type Hybrid    -.1535480401 B        0.23825545     -0.64     0.5196
Type SUV       0.2436920500 B        0.08982120      2.71     0.0070
Type Sedan     -.0144629620 B        0.07368536     -0.20     0.8445
Type Sports    0.0949267303 B        0.09199753      1.03     0.3028
Type Truck     0.4970593441 B        0.10899489      4.56     <.0001
Type Wagon     0.0000000000 B         .               .         .

 

PaigeMiller
Diamond | Level 26

This seems to be a problem that I will have to think about, as I don't see an obvious path forward right now. A large number of levels in any class variable does cause this problem: either you run out of memory or the run takes a very long time.

 

How were you going to handle the issue that SAS always assigns a standard error of zero to one (or more) of the class levels?

 

 

--
Paige Miller
Rick_SAS
SAS Super FREQ

To get the same parameter estimates, you need to specify NOINT in the SURVEYREG procedure:

 

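/* ABSORB requires the data to be sorted by the absorbed variable */
proc sort data=sashelp.cars out=cars;
	by Origin;
	run;

/* Origin as explicit fixed effects, no intercept, SEs clustered by Origin */
proc surveyreg data=cars;
	cluster Origin;
	class Origin Type;
	model EngineSize= Cylinders Origin Type/ noint solution;
	ods select parameterestimates;
	run;

/* Origin absorbed instead of estimated; SEs are the usual model-based ones */
proc glm data=cars;
	absorb Origin;
	class Type;
	model EngineSize= Cylinders Type/ solution;
	ods select parameterestimates;
	quit;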
braam
Quartz | Level 8
Thanks! I confirmed it! One thing that is interesting to me is that the coefficient on Cylinders is 0.619 either way, but the t-statistics differ a lot: 18.77 for SURVEYREG versus 46.32 for GLM.

Is it because absorbing fixed effects (conceptually, demeaning) affects the variance-covariance matrix?
Rick_SAS
SAS Super FREQ

It is because the variance estimation formulas for survey statistics (like in PROC SURVEYREG) are different from the variance estimation formulas in linear modeling. Although the point estimates are the same, the standard errors are not. The survey variance is inflated because you need to account for the sample design.
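Since absorbing is conceptually demeaning, one way to see this without a large CLASS variable is to demean by hand and then let PROC SURVEYREG do only the clustering. The following is a minimal sketch under stated assumptions, not a definitive recipe: it drops Type to keep the example short, it restricts to complete cases so that both variables are demeaned over the same rows, and the data set names cars_cc and cars_dm are made up for illustration. With only three Origin clusters, the clustered standard errors here are purely illustrative.

```sas
/* Illustrative only: cars_cc and cars_dm are hypothetical data set names. */
/* Keep complete cases so both variables are demeaned over the same rows.  */
data cars_cc;
	set sashelp.cars;
	where not missing(Cylinders) and not missing(EngineSize);
	keep Origin EngineSize Cylinders;
	run;

/* BY-group processing needs the data sorted by the grouping variable */
proc sort data=cars_cc;
	by Origin;
	run;

/* METHOD=MEAN subtracts the BY-group mean and leaves the scale unchanged, */
/* which is the within (demeaning) transformation                          */
proc stdize data=cars_cc out=cars_dm method=mean;
	by Origin;
	var EngineSize Cylinders;
	run;

/* No CLASS statement for Origin: the fixed effects are already removed. */
/* CLUSTER Origin gives design-based (clustered) standard errors.        */
proc surveyreg data=cars_dm;
	cluster Origin;
	model EngineSize= Cylinders/ noint solution;
	run;
```

The slope on Cylinders should match what PROC GLM gives with ABSORB Origin for the same one-regressor model on the same rows, while the standard error comes from the clustered variance formula. The degrees-of-freedom handling for the removed group means is something to check before relying on the result with real panel data.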

Ksharp
Super User

If you have panel data, try posting it in the Forecast forum. Also try PROC PANEL.
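A hedged sketch of what the PROC PANEL route could look like for the original question (individual fixed effects plus standard errors clustered by individual). All names here are hypothetical: mypanel, person_id, period, y, x1 and x2 do not come from this thread. It also assumes a SAS/ETS release in which the MODEL statement accepts the CLUSTER option together with HCCME=; please check the PROC PANEL documentation for your version.

```sas
/* Hypothetical names: mypanel, person_id, period, y, x1, x2 */
/* PROC PANEL expects the data sorted by cross section, then by time */
proc sort data=mypanel;
	by person_id period;
	run;

proc panel data=mypanel;
	id person_id period;      /* cross-section ID first, then time ID */
	/* FIXONE: one-way (individual) fixed effects                      */
	/* HCCME=0 with CLUSTER: robust covariance clustered by the        */
	/* cross-section ID (assumed to be available in your release)      */
	model y = x1 x2 / fixone hccme=0 cluster;
	run;
```

Because FIXONE handles the individual effects internally, the ID never has to appear in a CLASS statement, which is the step that ran out of memory with PROC SURVEYREG.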
