proc score with no intercept

01-19-2017 09:56 PM

Hello all,

I have a question about PROC SCORE. I want to score a new dataset but leave the intercept out. How do I do that?

That is, for observations in the new dataset, I want to calculate:

beta1*x1+beta2*x2+ ... + beta_p*x_p

instead of:

**intercept** + beta1*x1+beta2*x2+ ... + beta_p*x_p

I need to do this for many models, so manually subtracting the intercept from the result is not ideal.

Thank you in advance!!

Best wishes.

All Replies

01-19-2017 10:40 PM

Why fit a model with intercept in the first place?

01-19-2017 10:42 PM

You haven't posted any code, so it's hard to say; different procs have different options.

Post what you're currently doing.

01-20-2017 01:12 PM

Dear Reeza,

It's hard to explain, but I have to fit the model with an intercept using the original data, then score another dataset with the fitted model, leaving out the intercept.

I'm now trying to do this with PROC SCORE, as you suggested in my other post. Here is the code I'm using:

proc logistic data=modeldata descending outest=modelname;

model outcome= &continue_vars &class_vars;

output out=modeldata PREDICTED=prob;

run;

proc score data=dataname score=modelname out=dataname type=parms;

var &selected_variables;

run;

In the second part, PROC SCORE, I want to remove the intercept.

Thank you very much!!!!

01-20-2017 01:16 PM

Examine the dataset called MODELNAME.

Solution

01-20-2017
02:09 PM

01-20-2017 01:24 PM

Post-fitting analyses such as scoring use a model that is computed and stored by a regression procedure. You can't expect to score a model that you have not fit.

To fit a regression model without an intercept, use the NOINT option on the MODEL statement, like this:

class &class_vars;

model outcome= &continue_vars &class_vars / NOINT;

If, for some strange reason, you ABSOLUTELY need to do what you are asking, you can use a DATA step view to recenter the response variable in the scoring data. That is, if you have fit the model

Y = Int + b1*x1 + b2*x2 + ...

and you define a new variable

YNew = Y + Int

then PROC SCORE (or PROC PLM) will score the data for

YNew = Int + b1*x1 + b2*x2 + ...

which is of course equivalent to

Y = b1*x1 + b2*x2 + ...
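As a rough illustration of the NOINT suggestion, here is a minimal sketch using the `sashelp.heart` data that appears later in this thread. The output dataset name `estimates_noint` is hypothetical. Keep in mind that this refits the model, so the slope estimates will generally differ from those of the model fitted with an intercept; it is not the same as dropping the intercept from an already fitted model.

```sas
/* Sketch: fit a logistic model with no intercept term.        */
/* estimates_noint is a hypothetical name for the OUTEST= data */
proc logistic data=sashelp.heart outest=estimates_noint;
  model status = ageatstart height weight / noint;
run;
```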

01-20-2017 01:55 PM

Dear Rick,

So you are saying that I cannot force PROC SCORE to omit any variable that is in the original model, right?

Yeah, I know it's weird, but I really need to do what I asked......

So basically I need to subtract the intercept manually, right?

Then my question is: how do I put a value from a dataset into a macro variable? I want to put the intercept into a macro variable so that I can subtract it from the PROC SCORE result without typing in the numbers manually.

Thank you!!!

01-20-2017 02:08 PM

never mind, I got it..

Thank you!!

01-20-2017 02:20 PM

> So you are saying that I cannot force PROC SCORE to omit any variable that is in the original model, right?

I did not say that. For linear models, if you set the data for a variable to zero, then that is equivalent to having a zero estimate, which omits that term.

> Yeah, I know it's weird, but I really need to do what I asked......

So you say, but most analysts agree that "if you can't explain it then you shouldn't do it."

> So basically I need to subtract the intercept manually, right?

I recommend that you use the NOINT option in the model statement. My explanation was for a linear model. The example you give is a logistic model, which has a binary response and a (logit) link function. The link function makes the math more complicated.

> Then my question is, how to put a value from a dataset into macro variable?

You won't need it if you use NOINT. But the answer is to read the doc for the SYMPUTX subroutine.
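For readers who do want the macro variable, a minimal sketch with CALL SYMPUTX might look like the following. It assumes the OUTEST= dataset is called `modelname` (as in the code posted earlier) and that it stores the estimate in a variable named `Intercept`; the macro variable name `intercept` is just an illustrative choice.

```sas
/* Sketch: copy the intercept estimate into a macro variable.       */
/* Assumes modelname is the OUTEST= data set from PROC LOGISTIC and */
/* that it contains a variable named Intercept.                     */
data _null_;
  set modelname(obs=1);
  call symputx('intercept', Intercept);
run;

/* Write the value to the log to confirm it was captured */
%put NOTE: intercept = &intercept;
```

CALL SYMPUTX strips leading and trailing blanks from the value, so `&intercept` can be used directly in later arithmetic.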

01-20-2017 03:05 PM

>> So you are saying that I cannot force PROC SCORE to omit any variable that is in the original model, right?

> I did not say that. For linear models, if you set the data for a variable to zero, then that is equivalent to having a zero estimate, which omits that term.

Oh, does that mean I have to modify the dataset to be scored, and cannot achieve it by adding options to PROC SCORE?

> I recommend that you use the NOINT option in the model statement.

Do you mean adding NOINT in PROC LOGISTIC or in PROC SCORE? Is there a MODEL statement in PROC SCORE?

> So you say, but most analysts agree that "if you can't explain it then you shouldn't do it."

To make it short, the reason I am doing this is to adjust the baseline (intercept) for the new dataset.

Thank you again!!

01-20-2017 03:29 PM

I don't think a straight subtraction works for a logistic regression model, although it would for linear regression. You have to be careful with how and when you do the conversion to probability.

01-20-2017 04:21 PM

Here's an example of how to 'remove' the intercept from a scored dataset for logistic regression. Note that to score a logistic regression you should use the SCORE statement within PROC LOGISTIC. You can also use the CODE statement to generate your scoring program and then manually remove/reverse the prediction.

Note that the predicted probability is:

P = 1/(1+exp(-1*(intercept + B1*X1 + B2*X2 + ... + BN*XN)));

or, writing X for the linear predictor,

P = 1/(1+exp(-1*X));

You first need to recover X from the predicted probability, then subtract the intercept, and then re-apply the function.
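Spelled out as equations, the three steps are roughly as follows. This is a sketch that assumes the usual logit link; the symbols $\eta$ (linear predictor), $\beta_0$ (intercept) and $p_0$ (probability without the intercept) are notation introduced here, not part of the SAS output.

```latex
\begin{align*}
\eta   &= \log\frac{p}{1-p} = -\log\!\left(\frac{1}{p} - 1\right)
        = \beta_0 + \sum_{j=1}^{N} \beta_j x_j
        && \text{(recover the linear predictor)} \\
\eta_0 &= \eta - \beta_0 = \sum_{j=1}^{N} \beta_j x_j
        && \text{(subtract the intercept)} \\
p_0    &= \frac{1}{1 + e^{-\eta_0}}
        && \text{(re-apply the inverse logit)}
\end{align*}
```

The subtraction happens on the logit scale, not on the probability scale, which is why simply computing $p - \beta_0$ does not work.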

*Generate a model to test;

proc logistic data=sashelp.heart outest=estimates;
model status = ageatStart height weight;
output out=withIntercept p=pred;
code file='c:\_localdata\score_ex.sas';
run;

*create macro variable with intercept;
proc sql noprint;
select intercept into :intercept
from estimates;
quit;

*Remove the intercept and calculate the probability without the intercept;
data modify;
set withIntercept;
rescaled = -1*log((pred**-1) -1);
pred_no_intercept = 1/(1+exp(-1*(rescaled - &intercept)));
diff = pred-pred_no_intercept;
run;

*Display the results;
proc print data=modify (obs=20);
var pred rescaled pred_no_intercept diff ;
run;

Below are the results from the CODE statement. Note that I manually changed the intercept to 0 here. You could probably do this programmatically, but it seems like too much work to me.

```
*****************************************;
** SAS Scoring Code for PROC Logistic;
*****************************************;
data woIntercept_check;
set sashelp.heart;
length I_Status $ 5;
label I_Status = 'Into: Status' ;
length U_Status $ 5;
label U_Status = 'Unnormalized Into: Status' ;
label P_StatusAlive = 'Predicted: Status=Alive' ;
label P_StatusDead = 'Predicted: Status=Dead' ;
drop _LMR_BAD;
_LMR_BAD=0;
*** Check interval variables for missing values;
if nmiss(AgeAtStart,Height,Weight) then do;
_LMR_BAD=1;
goto _SKIP_000;
end;
*** Compute Linear Predictors;
drop _LP0;
_LP0 = 0;
*** Effect: AgeAtStart;
_LP0 = _LP0 + (-0.1204973720723) * AgeAtStart;
*** Effect: Height;
_LP0 = _LP0 + (-0.04255625097121) * Height;
*** Effect: Weight;
_LP0 = _LP0 + (-0.00633992910933) * Weight;
*** Predicted values;
drop _MAXP _IY _P0 _P1;
******************************************************************************;
*Change intercept here to 0 from calculated value;
*_TEMP = 9.63617021104433 + _LP0;
*New Code;
_TEMP = 0 + _LP0;
******************************************************************************;
if (_TEMP < 0) then do;
_TEMP = exp(_TEMP);
_P0 = _TEMP / (1 + _TEMP);
end;
else _P0 = 1 / (1 + exp(-_TEMP));
_P1 = 1.0 - _P0;
P_StatusAlive = _P0;
_MAXP = _P0;
_IY = 1;
P_StatusDead = _P1;
if (_P1 > _MAXP + 1E-8) then do;
_MAXP = _P1;
_IY = 2;
end;
select( _IY );
when (1) do;
I_Status = 'Alive' ;
U_Status = 'Alive' ;
end;
when (2) do;
I_Status = 'Dead' ;
U_Status = 'Dead ' ;
end;
otherwise do;
I_Status = '';
U_Status = '';
end;
end;
_SKIP_000:
if _LMR_BAD = 1 then do;
I_Status = '';
U_Status = '';
P_StatusAlive = .;
P_StatusDead = .;
end;
drop _TEMP;
run;
```

In this case there's a significant difference when you remove the intercept.

01-20-2017 05:09 PM

Dear Reeza,

Thank you for the code!!!

I need to remove the intercept in the scoring dataset, not in the model-building dataset, so I think it's fine.

I wrote code similar to yours to subtract the intercept from the PROC SCORE result and calculated the predicted probability.

Thanks very much for your help!

Best wishes.

01-20-2017 05:45 PM - edited 01-20-2017 05:46 PM

A quick check is to make sure your probabilities/results are between 0 and 1. If they are not, the calculation is not correct.

EDIT: FYI- This did remove it in the scored dataset, I just used the same dataset to score/model because I didn't want to make up more data.
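The quick check above could be run along these lines. This is only a sketch: it assumes the scored dataset is named `modify` and the adjusted probability is `pred_no_intercept`, as in the earlier example; substitute your own names.

```sas
/* Sketch: count adjusted probabilities that fall outside [0, 1]. */
/* Missing values are excluded explicitly because SAS treats a    */
/* missing numeric value as smaller than any number.              */
proc sql;
  select count(*) as n_out_of_range
  from modify
  where pred_no_intercept is not missing
    and (pred_no_intercept < 0 or pred_no_intercept > 1);
quit;
```

A count of zero is necessary but not sufficient. If the intercept was subtracted on the logit scale and the inverse logit re-applied, the result always lies between 0 and 1, so this check mainly catches the mistake of subtracting the intercept directly from the probability.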