Pin013
Fluorite | Level 6

Greetings all.

I fit a logistic regression model whose variables include a binary treatment and several continuous explanatory variables.
Now I want to edit the treatment estimate in that model and recalculate the scores for an ROC analysis.
I saved the model information to a data set with the OUTMODEL= option, and I'm trying to build a new data set ("model2") that replaces the estimate and the standard error in the previous model information ("model1"), then use it to recalculate the scores.
But the following error is issued and the step terminates:
" ERROR: Computations are terminated because the INMODEL=WORK.model2 data set has lost information. "


My code is:

/*logistic model*/
proc logistic data=data1 descending outmodel=model1;
  class y t(ref="0")/param = ref;
  model y(event='1')=t c1 c2 u;
run;

 

/*output predict probability*/
proc logistic inmodel=model1;
  score data=data1 out=probpred1;
  format P_1 10.7;
run;

 

/*new model (have error?)*/
data model2(type=logismod);
  set model1;
  if (_TYPE_='E') AND (_NAME_='EFFECT') AND (_CATEGORY_='E') AND (_NAMEIDX_=0) AND (_CATIDX_=0) then _MISC_=0.7802293905;   /*replace treatment parameter estimate*/
  if (_TYPE_='E') AND (_NAME_='EFFECT') AND (_CATEGORY_='V') AND (_NAMEIDX_=.) AND (_CATIDX_=0) then _MISC_=0.123159491;    /*replace standard error*/
run;

 

/*recalculate score (have error!!)*/
proc logistic inmodel=model2(type=logismod);
  score data = data1 out=probpred2;
run;


Any help would be appreciated! Thanks!

Accepted Solution
Reeza
Super User

Here's an example of what you were trying to do and you can run this code.

 

https://communities.sas.com/t5/Statistical-Procedures/How-to-determine-logistic-regression-formula-f...

 

proc logistic data=Neuralgia2 outest=sample;
   class sex (ref='0') / param=ref ;
   model Pain (event='1') = age sex ;
   output out=pred p=phat
        predprob=(individual crossvalidate) ;
run ;


/* Recompute the predicted probability by hand from the OUTEST= estimates
   (the 'myformula' variable) and compare it with the value from PROC LOGISTIC
                                                                            */
data logformula (keep= age sex pain ip_1 myformula difference);
    set pred;
    /* read the single row of estimates once; the values are retained for every observation */
    if _n_=1 then set sample (keep = intercept age sex1 rename = (age=age_estimate sex1=sex_estimate));
    length sex_estimate age_estimate intercept myformula 8. ;
   
    myformula = 1/(1+exp(-1*(intercept+(sex_estimate*sex) + (age_estimate*age)))); * inverse logit of the linear predictor;
    difference=phat-myformula; * should be near zero if the formula reproduces the fitted probabilities;
   
    format difference 12.8;
run;


proc print data=logformula (obs=25) ;
    var sex age pain ip_1 myformula ; * compare ip_1 with the hand-computed value;
run ;

You could probably generalize this, if it's something you do often. 
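
For the model in the question, the same idea might look like the sketch below. It is only an illustration under several assumptions: t is numeric and coded 0/1, the OUTEST= data set names its columns Intercept, c1, c2, and u, and data1 has no variable called Intercept. The names est1, b_c1, b_c2, b_u, b_t, xbeta, and p_new are made up for this example. The standard error is not needed to compute the scores, so only the estimate is replaced.

```sas
/* Fit the original model and save the parameter estimates */
proc logistic data=data1 outest=est1;
  class t(ref="0") / param=ref;
  model y(event='1') = t c1 c2 u;
run;

/* Score by hand, substituting the external treatment estimate */
data probpred2;
  set data1;
  /* read the fitted estimates once; they are retained for all rows */
  if _n_=1 then set est1(keep=intercept c1 c2 u
                         rename=(c1=b_c1 c2=b_c2 u=b_u));
  b_t   = 0.7802293905;                 /* replacement treatment estimate */
  xbeta = intercept + b_t*(t=1) + b_c1*c1 + b_c2*c2 + b_u*u;
  p_new = 1/(1+exp(-xbeta));            /* predicted probability of y=1 */
run;

/* ROC analysis of the hand-computed scores */
proc logistic data=probpred2;
  model y(event='1') = / nofit;
  roc 'Replaced treatment estimate' pred=p_new;
run;
```

Because the ROC curve is built from a supplied predictor (PRED=) with NOFIT, no model is refit in the last step; the other coefficients stay exactly as estimated from data1.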

 



4 REPLIES 4
Rick_SAS
SAS Super FREQ

It sounds like you want to score a data set by using the model that fits the training data, then score the same data set by using a different (but perhaps similar) model. The LOGISTIC procedure supports the CODE statement, which generates DATA step code that you can use to score new observations. I suggest you:

  1. Use the CODE statement to generate scoring code for the fitted model. Write the DATA step code to OrigModel.sas.
  2. Use a text editor to edit OrigModel.sas. Make whatever changes you want to create the second model. Save the file as NewModel.sas.
  3. You can now create two DATA steps to score the data. The first uses %INCLUDE OrigModel and the second uses %INCLUDE NewModel.
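
The three steps above could be sketched roughly as follows. This assumes the data set and variables from the original post (data1, y, t, c1, c2, u); the output data set names score_orig and score_new are hypothetical, and the relative file names resolve against the SAS session's working directory, so a full path may be needed.

```sas
/* Step 1: fit the model and write DATA step scoring code to a file */
proc logistic data=data1;
  class t(ref="0") / param=ref;
  model y(event='1') = t c1 c2 u;
  code file="OrigModel.sas";
run;

/* Step 2 happens outside SAS: copy OrigModel.sas to NewModel.sas
   and edit the coefficient(s) you want to change in a text editor */

/* Step 3: score the same data with each version of the model */
data score_orig;
  set data1;
  %include "OrigModel.sas";
run;

data score_new;
  set data1;
  %include "NewModel.sas";
run;
```

The generated code adds the predicted-probability variables to each output data set; check the generated file for their exact names before comparing the two sets of scores.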
Pin013
Fluorite | Level 6

I appreciate your message, @Rick_SAS.
I'm sorry I didn't explain it clearly enough.
The treatment estimate does not come from the fitted model; it summarizes information by combining analysis results from other samples. I only want to replace the original treatment estimate with the new calibration estimate, while keeping the estimates for the other explanatory variables.

The problem has been solved. Thank you for the additional information about computing predicted values from the fitted model. 🙂


Pin013
Fluorite | Level 6

Thank you for your help @Reeza ! It works perfectly. 🙂

Best wishes.
