- PROC MODEL: How can I get the data for the Cook's D graph in an output...

Posted 07-04-2017 11:26 AM
(935 views)

First a small disclaimer: I am asking this question as a statistical noob on behalf of a customer of ours.

We are running SAS 9.4 M3 (SAS/ETS 14.1)

The customer runs various fit analyses using PROC MODEL, and one of the graphs that is generated is the Cook's D graph.

*The question of the customer is: can I get the data that is being used to generate the Cook's D graph in an output dataset and link this back to my source data so I can filter outliers?*

By using the ODS TRACE statement I have been able to derive - I think - the base data that is being used.

The code I have used (hopefully as a representative example) is:

```
ods trace on;
proc model data=sashelp.citimon;
lhur = 1/(a * ip + b) + c;
fit lhur;
id date;
run;
ods trace off;
```

Attached is a sample of part of the output that is generated, specifically Panel 1, which contains the Cook's D graph.

Comparing the ODS TRACE output in the log against the generated output, my assumption is that the base data for the graphs can be obtained by adding the following statement just before the PROC MODEL statement:

`ods output DiagnosticsPanel=work.panel;`
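For reference, a minimal sketch of the complete run with that statement in place. It assumes ODS Graphics has to be active for the graph's data object to be produced, and it uses PROC CONTENTS only to list the variables that end up in `work.panel`; the object name `DiagnosticsPanel` is the one reported by ODS TRACE above.

```sas
ods graphics on;                         /* assumption: needed for graph data objects */
ods output DiagnosticsPanel=work.panel;  /* object name taken from the ODS TRACE log  */

proc model data=sashelp.citimon;
    lhur = 1/(a * ip + b) + c;
    fit lhur;
    id date;
run;
quit;

/* List the variables in creation order to see what the panel data contains */
proc contents data=work.panel varnum;
run;
```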

However, it seems as if the dataset contains all the data (the dataset contains 36 variables) that is needed to create the graphs in Panel 1 (and 2?) and I could find no way to relate the data back to the source data. I was hoping that adding the option PLOT(ONLY)=COOKSD would reduce the number of variables in the output dataset but it - as maybe could be expected - only reduced the number of graphs.

Hopefully someone can shed some light on this and help me help our customer.

TIA

1 ACCEPTED SOLUTION


The dataset includes two variables to help you match back to the original:

ID - count of obs from 1 to NObs

Actual - the observed value

_____RESIDUAL_RESIDUAL____MSE_WE - I think this is the Cook's D variable.
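A rough sketch of how the match-back could look, under several assumptions: `ID` is numeric and equals the row number of the source data (this breaks if PROC MODEL skipped observations, for example because of missing values), the long variable really is Cook's D, and the 4/n cutoff is just a common rule of thumb. The names `obsnum`, `cooksd`, `outlier`, `mismatch`, `work.source`, `work.cooks` and `work.flagged` are made up for illustration.

```sas
/* Number the source rows so they can be matched to ID */
data work.source;
    set sashelp.citimon;
    obsnum = _n_;
run;

/* Keep only the key, the observed value and the presumed Cook's D column */
proc sort data=work.panel(keep=ID Actual _____RESIDUAL_RESIDUAL____MSE_WE
                          where=(not missing(ID)))
          out=work.cooks(rename=(ID=obsnum
                                 _____RESIDUAL_RESIDUAL____MSE_WE=cooksd));
    by ID;
run;

/* Number of observations that have a Cook's D value */
proc sql noprint;
    select count(*) into :n trimmed from work.cooks;
quit;

data work.flagged;
    merge work.source work.cooks(in=inpanel);
    by obsnum;
    if inpanel;
    mismatch = (abs(Actual - lhur) > 1e-8);  /* sanity check on the row match */
    outlier  = (cooksd > 4 / &n);            /* assumed rule-of-thumb cutoff  */
run;
```

If `mismatch` is 1 for any row, the assumption that `ID` lines up with the source row number does not hold and the join key needs another look.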


2 REPLIES 2


Hi @Reeza,

Thank you for your quick reply.

Based on your information (created a needle plot using the variable you indicated) it seems you have this correct.
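For anyone wanting to repeat the check, a needle plot like this can be sketched with PROC SGPLOT. It assumes the `work.panel` dataset from the ODS OUTPUT statement and the variable names mentioned in the reply above.

```sas
/* Needle plot of the presumed Cook's D variable against the observation counter */
proc sgplot data=work.panel;
    needle x=ID y=_____RESIDUAL_RESIDUAL____MSE_WE;
    yaxis label="Cook's D (presumed)";
run;
```

If the shape matches the Cook's D graph in Panel 1, that supports the guess about which column holds the statistic.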

I will confirm with the customer tomorrow and, if your information is correct, I will mark your reply as the solution.

It would have been nice if they had given the column a somewhat more meaningful name, though.

Will keep you posted.

--Resa
