01-19-2017 05:33 PM

Hello everyone,

I have a question about how to perform a computation across two datasets.

Say I have a dataset **coefficient** like this:

| variable | coef |
|----------|------|
| age | 15 |
| gender | 3 |
| hight | 6 |

and another dataset **student**, which has the variables age, gender, hight, and others (all numeric):

| ID | age | gender | hight | grade | weight |
|------|-----|--------|-------|-------|--------|
| 1015 | 5 | 0 | 146 | 10 | 37 |

.........

I want to compute a score for each row in dataset **student** by multiplying the value of each variable in **student** by the corresponding coef in dataset **coefficient** and then summing the products, i.e.: age*15 + gender*3 + hight*6.

Can someone help me do this?

Thank you very much!!!

Best wishes.
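A minimal DATA step sketch of the computation described above, assuming the two datasets look exactly like the samples in the question. The names `coef_wide`, `student_scored`, `score`, and the `c_` prefix are made up for illustration, and only the single student row shown in the question is included.

```sas
/* Sample data copied from the question. */
data coefficient;
   input variable $ coef;
   datalines;
age 15
gender 3
hight 6
;
run;

data student;
   input ID age gender hight grade weight;
   datalines;
1015 5 0 146 10 37
;
run;

/* Turn the long coefficient table into one row with columns
   c_age, c_gender, c_hight (the c_ prefix is an illustrative choice). */
proc transpose data=coefficient out=coef_wide(drop=_name_) prefix=c_;
   id variable;
   var coef;
run;

/* Read the single coefficient row once; variables read by SET are
   retained, so they are available for every student row. */
data student_scored;
   if _n_ = 1 then set coef_wide;
   set student;
   score = age*c_age + gender*c_gender + hight*c_hight;
   drop c_:;
run;
```

For the sample row, the formula works out to 5*15 + 0*3 + 146*6 = 951. The variable names are hard-coded in the `score` assignment, so this version suits a short, fixed list of variables.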

01-19-2017 05:38 PM

Where do the coefficients come from?

Can you use PROC SCORE to implement this somehow?
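A rough sketch of how PROC SCORE could be applied to the example in the question, assuming the coefficients are plain linear weights with no intercept. PROC SCORE reads one row per scoring equation, with one column per variable plus `_TYPE_` and `_NAME_` columns, so the long **coefficient** table is transposed first. The names `coef_wide`, `student_scored`, and `score` are hypothetical.

```sas
/* Sample data copied from the question. */
data coefficient;
   input variable $ coef;
   datalines;
age 15
gender 3
hight 6
;
run;

data student;
   input ID age gender hight grade weight;
   datalines;
1015 5 0 146 10 37
;
run;

/* One row, with columns age, gender, hight holding the coefficients. */
proc transpose data=coefficient out=coef_wide(drop=_name_);
   id variable;
   var coef;
run;

/* Add the identifying columns PROC SCORE looks for:
   _TYPE_ must match the TYPE= option, _NAME_ names the new variable. */
data coef_wide;
   length _TYPE_ $8 _NAME_ $32;
   set coef_wide;
   _TYPE_ = 'PARMS';
   _NAME_ = 'score';
run;

/* Multiply each VAR variable by its coefficient and sum the products. */
proc score data=student score=coef_wide type=parms out=student_scored;
   var age gender hight;
run;
```

By the formula in the question, the sample row should come out as 5*15 + 0*3 + 146*6 = 951 in the new `score` variable.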

01-19-2017 06:31 PM

Hi Reeza,

Yes, the coefficients come from SAS output.

I tried PROC SCORE as you suggested, but I have a categorical variable, say rank, which has 5 levels. When I score it, the coefficients become rank1 rank2 rank3 rank4 rank5...

How can I resolve this problem?

Thank you very much!!!

01-19-2017 08:42 PM

PROC SCORE will handle the categorical variables as long as the data used for scoring is identical, in format, to the input data.

01-20-2017 01:24 PM

Dear Reeza,

This is the code I'm using:

proc logistic data=modeldata descending outest=modelname;

model outcome = &continue_vars &class_vars;

output out=modeldata predicted=prob;

run;

proc score data=dataname score=modelname out=dataname type=parms;

var &continue_vars &class_vars;

run;

&class_vars contains the categorical variables. In PROC SCORE, it just keeps giving me error messages saying that those variables (&class_vars) were not found.

Thank you!!!

01-21-2017 12:09 AM

You need to score a logistic model using PROC LOGISTIC, not PROC SCORE. However, that requires a stored model rather than the parameter data set, so it won't work with what you have.
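A hedged sketch of what storing and reusing the model might look like. It assumes the datasets `modeldata` and `dataname` and the macro variables `&continue_vars` and `&class_vars` from the code above already exist. The CLASS statement is an assumption, since the posted code does not show one, and `stored_model` and `scored` are hypothetical names.

```sas
/* Fit the model once and save it with OUTMODEL=.
   The CLASS statement (assumed here) lets PROC LOGISTIC build the
   design columns for the categorical variables itself. */
proc logistic data=modeldata descending outmodel=stored_model;
   class &class_vars;
   model outcome = &continue_vars &class_vars;
run;

/* Score new data from the stored model without refitting.
   The scoring data must contain the same predictor variables. */
proc logistic inmodel=stored_model;
   score data=dataname out=scored;
run;
```

If the new data are available when the model is fitted, a SCORE statement inside the first PROC LOGISTIC step is another option and avoids storing the model at all.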

01-19-2017 09:15 PM