Proc Score Intermediate Matrix

05-09-2016 11:48 AM

Proc Score scores a dataset using parameters from Proc Reg and provides a final estimate of the dependent variable. Is there a way to get the intermediate results? That is, can you get an X*Beta matrix, with BY statement functionality, without hand-coding the multiplication? My real data is large and confidential, so the following fake data is provided as an example. Consider a model estimating a bank's return on investment (ROI) as a function of the time the account has been open, the balance in the account, and the average number of times the customer visits a branch each month:

DATA WORK.TESTFILE;
    INFILE DATALINES DLM=',';
    INPUT MBR_NO ACCT_TYPE $ ROI YEARS_OPEN BALANCE VISIT_PER_MO;
    DATALINES;
01,C,0.515,5,11885,1
02,C,0.964,6,6023,0
03,S,0.735,2,9078,3
04,S,0.504,1,181,4
05,C,0.653,2,805,10
06,C,0.698,7,8321,4
07,S,0.108,6,3634,0
08,C,0.952,5,6500,7
09,C,0.309,1,7680,2
10,C,0.221,2,6969,2
11,C,0.352,13,8218,6
12,C,0.440,2,19995,1
13,S,0.782,4,1301,5
14,S,0.004,11,12871,3
15,C,0.525,8,12168,0
16,C,0.119,11,17187,4
17,S,0.169,7,6931,11
18,C,0.261,1,21894,7
19,C,0.406,2,6236,2
20,C,0.403,14,13017,1
;
RUN;

PROC SORT DATA=WORK.TESTFILE;
    BY ACCT_TYPE;
RUN;

PROC REG DATA=WORK.TESTFILE OUTEST=WORK.TESTPARMS;
    BY ACCT_TYPE;
    MODEL ROI = YEARS_OPEN BALANCE VISIT_PER_MO;
RUN;

PROC SCORE DATA=WORK.TESTFILE SCORE=WORK.TESTPARMS TYPE=PARMS;
    BY ACCT_TYPE;
    VAR ROI YEARS_OPEN BALANCE VISIT_PER_MO;
RUN;

The parameter for Visit_Per_Mo where Acct_Type = 'C' is -0.00242. I'm seeking a matrix where Visit_Per_Mo = 0 for Mbr_No = 15, -0.00242 for Mbr_No = 20, -0.00484 for Mbr_No = 9, etc. I'm not seeing anything in the documentation that will allow me to achieve this with Proc Score. Is there another procedure that will achieve the same thing? My actual dataset has a large number of observations and variables, so hand-coding the multiplication would be time-consuming.

05-09-2016 11:52 AM

Can you post your expected output?

I'm not seeing what an intermediate matrix would look like.

Also, the multiplication should be very straightforward - using two arrays gets you there easily...if I understand what you're looking for.

05-09-2016 12:16 PM

You can get that from the scored data set by adding the OUT= option (here, OUT=WANT):

PROC SCORE DATA=WORK.TESTFILE SCORE=WORK.TESTPARMS OUT=WANT TYPE=PARMS;
    BY ACCT_TYPE;
    VAR ROI YEARS_OPEN BALANCE VISIT_PER_MO;
RUN;

proc print data=want;
run;

The MODEL1 variable in the WANT data set will give you the desired values.

05-09-2016 12:28 PM

The Out= option does not provide the desired output. It does give a final sumproduct, but it does not provide the intermediate products prior to summation.
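
To pin the distinction down in notation (the symbols below are introduced here for illustration and do not come from the thread): for observation i with predictors x_i1, ..., x_ip and BY-group coefficients b_0, ..., b_p, the fitted value is a single sum, whereas the requested matrix holds the individual terms before they are summed.

```latex
\hat{y}_i = b_0 + \sum_{j=1}^{p} x_{ij}\, b_j,
\qquad
m_{ij} = x_{ij}\, b_j \quad (j = 1, \dots, p)
```

For example, Mbr_No 9 has Visit_Per_Mo = 2, so with the Acct_Type 'C' coefficient of -0.00242 its Visit_Per_Mo entry is 2 × (-0.00242) = -0.00484.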

05-09-2016 12:26 PM

Given that the parameters from the regression look like this:

| ACCT_TYPE | _MODEL_ | _TYPE_ | _DEPVAR_ | _RMSE_ | Intercept | YEARS_OPEN | BALANCE | VISIT_PER_MO | ROI |
|---|---|---|---|---|---|---|---|---|---|
| C | MODEL1 | PARMS | ROI | 0.252807 | 0.721278 | -0.00053 | -2.13E-05 | -0.00242 | -1 |
| S | MODEL1 | PARMS | ROI | 0.307368 | 0.723341 | -0.08484 | 1.19E-05 | 0.007182 | -1 |

The desired results look like this:

| MBR_NO | ACCT_TYPE | YEARS_OPEN | BALANCE | VISIT_PER_MO |
|---|---|---|---|---|
| 1 | C | -0.00267 | -0.25276 | -0.00242 |
| 2 | C | -0.0032 | -0.12809 | 0 |
| 5 | C | -0.00107 | -0.01712 | -0.01693 |
| 6 | C | -0.00374 | -0.17696 | -0.00484 |
| 8 | C | -0.00267 | -0.13824 | -0.01451 |
| 9 | C | -0.00053 | -0.16333 | -0.00242 |
| 10 | C | -0.00107 | -0.14821 | 0 |
| 11 | C | -0.00694 | -0.17477 | -0.00967 |
| 12 | C | -0.00107 | -0.42523 | -0.01693 |
| 15 | C | -0.00427 | -0.25878 | -0.00725 |
| 16 | C | -0.00587 | -0.36552 | -0.00967 |
| 18 | C | -0.00053 | -0.46562 | -0.01209 |
| 19 | C | -0.00107 | -0.13262 | -0.00725 |
| 20 | C | -0.00747 | -0.27683 | -0.0266 |
| 3 | S | -0.16969 | 0.108255 | 0.07182 |
| 4 | S | -0.08484 | 0.002158 | 0.028728 |
| 7 | S | -0.50907 | 0.043335 | 0.014364 |
| 13 | S | -0.33938 | 0.015514 | 0.014364 |
| 14 | S | -0.93329 | 0.153487 | 0.007182 |
| 17 | S | -0.59391 | 0.082652 | 0 |

05-09-2016 12:36 PM

You'll need to do the math.

It should be relatively straightforward. Do you have a naming convention for your variables? That would help.
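
One way "doing the math" might look for the fake data is a BY-merge followed by a loop over parallel arrays. This is only a sketch under assumptions: the output data set WORK.XBETA, the B_ and XB_ prefixes, and the PREDICTED variable are hypothetical names chosen for illustration, and it assumes WORK.TESTPARMS holds exactly one row per ACCT_TYPE (as it does with a single MODEL statement).

```sas
/* Attach each account type's coefficients to its rows, then multiply.  */
/* Coefficient columns are renamed (B_ prefix, hypothetical) so they do */
/* not overwrite the predictors of the same name in WORK.TESTFILE.      */
data work.xbeta;
    merge work.testfile
          work.testparms(keep=acct_type intercept years_open balance visit_per_mo
                         rename=(years_open=b_years_open
                                 balance=b_balance
                                 visit_per_mo=b_visit_per_mo));
    by acct_type;   /* both inputs are already sorted by ACCT_TYPE */

    array x{*}  years_open balance visit_per_mo;            /* predictors   */
    array b{*}  b_years_open b_balance b_visit_per_mo;      /* coefficients */
    array xb{*} xb_years_open xb_balance xb_visit_per_mo;   /* products     */

    do j = 1 to dim(x);
        xb{j} = x{j} * b{j};    /* intermediate product, before summation */
    end;

    /* Optional check: intercept plus the products gives the fitted value. */
    predicted = intercept + sum(of xb{*});

    drop j b_:;
run;
```

The XB_ columns are the per-variable products; PREDICTED is included only so the row sums can be compared against the PROC SCORE output.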

05-09-2016 12:47 PM

Crud. I was afraid you'd say that. There is a naming convention, but, due to the confidentiality of the actual data, I cannot release it. Thanks for taking the time to look at this, regardless.

05-09-2016 12:49 PM

Release fake names similar to what you did for your fake data. I won't have time to code anything now, someone else may, but it's a very straightforward calculation regardless of the number of values.
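
For a wide table, the variable lists can be generated rather than typed. The sketch below is one possible approach, not something posted in the thread: it assumes every numeric column in WORK.TESTFILE other than MBR_NO and ROI is a predictor, and the macro variables XVARS, BVARS, XBVARS, and RENAMES, the B_ and XB_ prefixes, and the data set WORK.XBETA_AUTO are all hypothetical names. Prefixed names must still fit within the 32-character limit for SAS variable names.

```sas
/* Build the predictor, coefficient, product, and rename lists from */
/* the column metadata instead of typing them by hand.              */
proc sql noprint;
    select name,
           cats('B_', name),
           cats('XB_', name),
           cats(name, '=B_', name)
      into :xvars   separated by ' ',
           :bvars   separated by ' ',
           :xbvars  separated by ' ',
           :renames separated by ' '
      from dictionary.columns
     where libname = 'WORK'
       and memname = 'TESTFILE'
       and type = 'num'
       and upcase(name) not in ('MBR_NO', 'ROI');  /* ID and dependent variable */
quit;

data work.xbeta_auto;
    merge work.testfile
          work.testparms(keep=acct_type intercept &xvars
                         rename=(&renames));
    by acct_type;

    array x{*}  &xvars;     /* predictors                */
    array b{*}  &bvars;     /* matching coefficients     */
    array xb{*} &xbvars;    /* products x{j} * b{j}      */

    do j = 1 to dim(x);
        xb{j} = x{j} * b{j};
    end;

    drop j &bvars;
run;
```

Because all four lists come from the same query, the arrays line up position by position regardless of how many predictors there are.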

05-09-2016 02:21 PM - edited 05-09-2016 02:22 PM

It sounds like you want to use the CODE statement to generate DATA step code. You can read about "Techniques for scoring a regression model in SAS." The last section in the article describes the CODE statement and shows how to use it to score a model. You can examine the scoring code, and it automatically encodes the matrix multiplication so that you don't have to code it yourself.
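
A minimal sketch of that suggestion, under assumptions: it fits a single account type (whether the CODE statement can be combined with a BY statement is not confirmed in this thread), the file name score_c.sas is hypothetical and is written to the SAS session's current directory, and the predicted value is expected to appear under a name such as P_ROI.

```sas
/* Fit one account type and ask PROC REG to write DATA step scoring code. */
proc reg data=work.testfile;
    where acct_type = 'C';
    model roi = years_open balance visit_per_mo;
    code file='score_c.sas';    /* hypothetical file name */
run;
quit;

/* Apply the generated code to the same subset of rows. */
data work.scored_c;
    set work.testfile;
    where acct_type = 'C';
    %include 'score_c.sas';
run;
```

Opening score_c.sas in an editor shows how the generated code builds the linear predictor, which is the place to check whether the individual products are exposed as separate variables or only accumulated into the final prediction.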