How to input automatically the thresholds of Cook Distance in proc reg

I am a newcomer to SAS and I have what I think is a pretty basic question for the support community.

I am running some regression analyses and I would like to compute the Cook's distance threshold (4/n) automatically, instead of first running the regression and then entering the number of observations used by hand.

Please see the code below.

proc reg data=myData;
   model Y = X1 X2 t;
   where cd < 4/184;
run;
quit;

Furthermore, some diagnostic measures, such as leverage, use the number of parameters in their thresholds. How can I get that number automatically as well?

I am looking forward to your prompt reply.

Many thanks in advance

Georgesssss

SAS Super FREQ
Posts: 3,408

Re: How to input automatically the thresholds of  Cook Distance  in proc reg

I think the cutoff is only used for displaying a graph of the Cook's D statistic for each observation, so there is no built-in option to set the cutoff value.

You can use the OUTPUT statement to create an output data set that contains the Cook's D statistic (the COOKD= keyword). You can then use PROC SGPLOT to plot whatever cutoff reference line you want, or use a DATA step to flag or exclude observations with a large Cook's D statistic, like this:

proc reg data=myData noprint;   /* analyze full data */
   model Y = X1 X2 t / influence;
   output out=RegOut CookD=D;   /* save Cook's D in the variable D */
quit;

data FilteredData;   /* exclude obs with large Cook's D stat */
   set RegOut;
   where D < %sysevalf(4/184);   /* cutoff 4/n with n = 184 */
run;

/* analyze filtered data */
proc reg data=FilteredData;
   model Y = X1 X2 t;
quit;
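
The 184 in the cutoff above is typed in by hand. Here is a minimal sketch of one way to compute n automatically. It assumes the RegOut data set and the Cook's D variable D created above, and it relies on PROC REG leaving Cook's D missing for observations that were not used in the fit. The macro variable names nUsed and CookCut are illustrative and do not come from the thread.

```sas
/* Count the observations used in the fit:            */
/* COUNT(D) counts only nonmissing values of Cook's D */
proc sql noprint;
   select count(D) into :nUsed trimmed
   from RegOut;
quit;

/* Cutoff 4/n, computed instead of typed by hand */
%let CookCut = %sysevalf(4 / &nUsed);
%put NOTE: n=&nUsed cutoff=&CookCut;

data FilteredData;
   set RegOut;
   /* rows with missing D also pass this test, but they */
   /* are not used when the model is refit anyway       */
   where D < &CookCut;
run;
```

The second PROC REG step can then be run on FilteredData exactly as before.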

However, I don't recommend this approach. I think using the ROBUSTREG procedure is a more statistically sound way to carry out a robust regression when you have outliers or high-leverage points.
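
As a rough illustration of the robust alternative, a PROC ROBUSTREG call for the same model might look like the sketch below. The choice of METHOD=MM and the output names RobOut, IsOutlier, and IsLeverage are assumptions made for the example, not something specified in the thread.

```sas
/* Robust regression on the full data (no manual filtering).       */
/* DIAGNOSTICS requests the outlier diagnostics, and LEVERAGE      */
/* requests the leverage-point diagnostics based on robust         */
/* distances.                                                      */
proc robustreg data=myData method=mm;
   model Y = X1 X2 t / diagnostics leverage;
   /* 0/1 indicators for outliers and leverage points */
   output out=RobOut outlier=IsOutlier leverage=IsLeverage;
run;
```

With this approach the procedure downweights unusual observations itself, so no cutoff has to be supplied by hand.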

SAS Super FREQ
Posts: 3,408

Re: How to input automatically the thresholds of  Cook Distance  in proc reg

Oh yeah, you can use the OUTEST= option on the PROC REG statement to get an output data set. That data set will contain the number of parameters in the model (the _P_ variable) if you use the EDF option on the MODEL statement. See

SAS/STAT(R) 14.1 User's Guide
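
Putting the two replies together, here is a hedged sketch of how both cutoffs could be derived without typing any numbers. It assumes a full-rank model, so that the number of observations used equals _P_ + _EDF_ (parameters including the intercept plus error degrees of freedom), and it uses 2p/n as the leverage cutoff, which is a common rule of thumb but is not stated in the thread. The names Est, Lev, CookCut, LevCut, and Flagged are illustrative.

```sas
/* Fit once; EDF adds _P_ (parameters incl. intercept) and */
/* _EDF_ (error degrees of freedom) to the OUTEST= data set */
proc reg data=myData outest=Est noprint;
   model Y = X1 X2 t / edf;
   output out=RegOut cookd=D h=Lev;   /* Cook's D and leverage */
quit;

/* Derive n and p, then store both cutoffs in macro variables */
data _null_;
   set Est;
   n = _P_ + _EDF_;                     /* observations used (full-rank model) */
   call symputx('CookCut', 4 / n);      /* Cook's D cutoff: 4/n */
   call symputx('LevCut', 2 * _P_ / n); /* leverage cutoff: 2p/n (assumed rule) */
run;

/* Flag rather than delete: 0/1 indicators for each diagnostic */
data Flagged;
   set RegOut;
   HighCookD = (D > &CookCut);
   HighLev   = (Lev > &LevCut);
run;
```

Flagging keeps every observation in the data set, so the influential points can be inspected before deciding whether to exclude anything.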
