Lahrie
Calcite | Level 5

Hi,

 

I fitted a model with PROC LOGISTIC using the OFFSET= option because I'm dealing with a rare target. Is it possible to score a new dataset without using the INMODEL= option? If so, how?


The new dataset will be scored using another program, so I can't use any SAS function to score it.

 

Tks

7 REPLIES
PaigeMiller
Diamond | Level 26

The new dataset will be scored using another program, so I can't use any SAS function to score it.

 

Explain this. Why are you asking a SAS question and then saying you can't use SAS to do the scoring?

 

 

--
Paige Miller
StatDave
SAS Super FREQ

See this note. As shown there, predicted probabilities can be produced by removing the offset from the x*beta computation and then applying the inverse link function: 1/(1+exp(-xbeta)). The inverse link function is implemented as the LOGISTIC function in the DATA step, which is used in the note.
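For scoring outside SAS, the same two steps can be sketched in Python (an assumption, since the thread does not say which program will do the scoring). The function names and the numeric values below are hypothetical and only illustrate subtracting the offset and then applying the inverse logit.

```python
import math

def inverse_logit(eta: float) -> float:
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def score_without_offset(xbeta_with_offset: float, offset: float) -> float:
    """Remove the offset from the linear predictor, then apply the inverse link."""
    return inverse_logit(xbeta_with_offset - offset)

# Hypothetical numbers, for illustration only (not taken from the thread).
eta = -2.0   # linear predictor that still includes the offset
off = 3.0    # offset value that was used when fitting

p_adjusted = score_without_offset(eta, off)   # offset removed
p_unadjusted = inverse_logit(eta)             # offset left in
print(p_adjusted, p_unadjusted)
```

With a positive offset, the adjusted probability is smaller than the unadjusted one, which is the intended effect of removing the offset.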

Reeza
Super User
Use the CODE statement within PROC LOGISTIC to generate the scoring code; check the documentation for details. You'll still have to do some 'translation', but it's much easier. If you're using SAS Enterprise Miner, it can give you code in several languages.
Lahrie
Calcite | Level 5
Hi. I used the CODE statement and saw that the generated code requires the offset value to be supplied. My new data have the real proportion of goods and bads. How can I get the code without using the offset value?

Tks
StatDave
SAS Super FREQ

If you were going to do the scoring in SAS, I would suggest that you just write a DATA step to compute x*beta omitting the offset and then apply the inverse link as I mentioned. You can see some example code in section 4 of this note. But since you say you are not going to use SAS to do the scoring, then you will have to do the equivalent of that in whatever you intend to use. Still, you might find it helpful to first do it in SAS and satisfy yourself that you get the correct scores before translating that into whatever you use to score.
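The check suggested above, comparing scores from a SAS DATA step with scores from the other program, might look like the following Python sketch. The helper names, the tolerance, and both score lists are hypothetical placeholders; in practice the reference list would be exported from SAS and the candidate list would come from the external scorer.

```python
def max_abs_difference(reference, candidate):
    """Largest absolute gap between two equal-length lists of probabilities."""
    if len(reference) != len(candidate):
        raise ValueError("score lists must have the same length")
    return max(abs(r - c) for r, c in zip(reference, candidate))

def scores_match(reference, candidate, tol=1e-6):
    """True when every pair of scores agrees to within tol."""
    return max_abs_difference(reference, candidate) <= tol

# Hypothetical values: 'sas_scores' stands in for probabilities exported
# from SAS, 'external_scores' for the same rows scored elsewhere.
sas_scores = [0.0120, 0.0450, 0.2310]
external_scores = [0.0120, 0.0450, 0.2310004]

print(scores_match(sas_scores, external_scores))
```

A small tolerance is used rather than exact equality because the two programs may carry the parameter estimates at different precision.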

Lahrie
Calcite | Level 5


Hi @StatDave ,

I've tested your suggestion, but I couldn't get the correct probabilities. Let me describe my steps:

 

proc logistic data=datax;
model conc(event="0")= SU FIN gp5
/selection=stepwise offset=off;
run;

 

This PROC LOGISTIC step gives me the following estimates:

Parameter   Estimate
Intercept   -7.212
SU           0.7096
FIN          1.5471
GP5          0.9827
off          1.000


So, I tried to score a new dataset:

 

data sample;
input SU FIN GP5;
datalines;
0 0 0
1 1 0
1 0 0
1 1 1
0 0 1
;

data z;
set sample;

xbeta= - 7.2120 +
    SU * 0.7096 +
   FIN * 1.5471 +
   GP5 * 0.9827 ; /* I've omitted the offset estimate */

p = 1 / (1 + exp(-xbeta));
run;

 

Results:

SU  FIN  GP5  xbeta    p
0   0    0    -7.212   0.0007371365
1   1    0    -4.9553  0.0069966679
1   0    0    -6.5024  0.0014975891
1   1    1    -3.9726  0.0184766143
0   0    1    -6.2293  0.0019669545

 

The values of p are far too small and very different from the scores produced by the PROC LOGISTIC step. Could you help me?

 

Tks

 

StatDave
SAS Super FREQ

Using the example in the first note I referred to (22601: Adjusting for oversampling the event level in a binary logistic model), these statements compute (in POFF2) essentially the same values as those computed using the logistic function after removing the offset. They differ slightly because the POFF2 computation does not use the full precision of the parameter estimates.

 

      proc logistic data=sub;
        model y(event="1")=x / offset=off;
        output out=out xbeta=xboff;
        title2 "Offset-adjusted Model";
        run;
      data out;
        set out;
        poff=logistic(xboff-off);
        poff2=1/(1+exp(-( -3.4596 + 2.3149*x )));
        run;
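As a cross-check outside SAS, the Python sketch below (Python is an assumption, since the scoring program is not named in the thread) applies the same two routes to the estimates and five sample rows posted earlier: computing x*beta without the offset directly, and subtracting the offset from a linear predictor that includes it. The offset value is not given in the thread, so HYPOTHETICAL_OFF is a made-up number used only to show that the two routes agree.

```python
import math

# Estimates as posted in the thread (rounded as shown there).
COEF = {"Intercept": -7.2120, "SU": 0.7096, "FIN": 1.5471, "GP5": 0.9827}

# Assumption: the real offset is not stated in the thread.
HYPOTHETICAL_OFF = 2.5

def logistic(eta):
    """Inverse logit, the counterpart of the DATA step LOGISTIC function."""
    return 1.0 / (1.0 + math.exp(-eta))

def xbeta_no_offset(su, fin, gp5):
    """Linear predictor from the posted estimates, offset term left out."""
    return (COEF["Intercept"] + COEF["SU"] * su
            + COEF["FIN"] * fin + COEF["GP5"] * gp5)

rows = [(0, 0, 0), (1, 1, 0), (1, 0, 0), (1, 1, 1), (0, 0, 1)]

for su, fin, gp5 in rows:
    xb = xbeta_no_offset(su, fin, gp5)
    xb_full = xb + HYPOTHETICAL_OFF                 # predictor including the offset
    p_direct = logistic(xb)                         # route 1: offset never added
    p_subtract = logistic(xb_full - HYPOTHETICAL_OFF)  # route 2: offset removed
    p_with_offset = logistic(xb_full)               # offset left in, for contrast
    print(su, fin, gp5, round(xb, 4), p_direct, p_subtract, p_with_offset)
```

Both routes give the same probability for each row, while leaving a positive offset in the linear predictor gives a larger one.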
