Programming the statistical procedures from SAS

Logistic regression with Score

Occasional Contributor
Posts: 8
Logistic regression with Score

Hi

I am new to SAS and to this forum, but I would like some help with a query.

I have the following code, which works for me:

data Data1;
   input purchase income age zipcode;
   datalines;
0 79.015 50 1
0 79.897 50 1
0 82.818 47 1
0 82.210 49 1
0 87.656 42 1
0 86.013 47 1
0 85.032 49 1
0 82.914 55 1
0 86.110 51 1
0 62.224 52 2
0 86.864 52 1
0 91.353 55 1
1 71.891 49 2
0 98.812 54 1
1 98.217 55 1
0 99.678 53 1
0 86.922 44 2
0 80.151 54 2
0 107.135 53 1
0 106.304 56 1
1 85.361 57 2
0 93.090 51 2
0 93.830 53 2
1 120.087 51 1
1 95.087 55 2
1 117.836 59 1
1 101.788 47 2
1 106.250 54 2
0 105.401 56 2
1 109.552 51 2
1 108.887 58 2
1 120.134 53 2
0 129.462 46 1
1 138.012 41 2
1 133.963 48 1
1 136.386 43 2
;

proc logistic data=Data1;
   model purchase= income age zipcode;
   output out =prob predicted=phat;
run;

proc sort data=prob;
   by phat;
run;

proc print data = prob;
   var income age zipcode phat;
run;

What I would now like to do is score new data with the fitted model, which I believe I can do with the SCORE statement, but even with the user guide I'm struggling to get it to work with my example.

Using the datalines above, I am doing this:

proc logistic data=Data1 outmodel=buymodel1;
   model purchase= income age zipcode;
   score data=data1 output out =prob predicted=phat ;
run;

data new;
   input income age zipcode;
   datalines;
85.21 44 1
250.45 28 2
130.54 36 2
;

proc logistic inmodel=sasuser.buymodel;
   score data=new   out=newscore predicted=phat;
run;

proc print data=newscore;
   var income age zipcode phat;
run;

I am getting the following errors in the log:

WARNING: Data set WORK.BUYMODEL1 was not replaced because this step was stopped.
1427     model purchase= income age zipcode;
1428     score data=data1 output out =prob predicted=phat ;
                          ------ ---       ---------
                          22     202       22
                                           76
ERROR 22-322: Syntax error, expecting one of the following: ;, (, ALPHA, CLM, DATA, FITSTAT, OUT,
              OUTROC, PRIOR, PRIOREVENT, ROCEPS.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
ERROR 76-322: Syntax error, statement will be ignored.
1429     run;
ERROR 22-322: Syntax error, expecting one of the following: ;, (, ALPHA, CLM, DATA, FITSTAT, OUT,
              OUTROC, PRIOR, PRIOREVENT, ROCEPS.
ERROR 76-322: Syntax error, statement will be ignored.
1439     run;
ERROR: Variable PHAT not found.

If anyone could help, that would be great.


All Replies
Grand Advisor
Posts: 16,889

Re: Logistic regression with Score

It looks like your proc logistic code is incorrect for the first model, but it's hard to tell with your code formatted like that.

Get your first proc logistic working properly, then add a single line of code to your current model:

Score data=data_to_be_scored;

Message was edited by: Reeza

You missed a semicolon, I think.
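
As a minimal sketch of that suggestion, assuming the Data1 and new data sets from the original post have already been created: the ERROR 22-322 messages point at the OUTPUT keyword and the PREDICTED= option inside the SCORE statement. Those belong to the separate OUTPUT statement, while SCORE takes DATA= and OUT=. The INMODEL= name also has to match the name given to OUTMODEL= (the original post stores buymodel1 but then reads sasuser.buymodel). The P_0 and P_1 names in the last step follow the P_<level> naming that SCORE uses for predicted probabilities, which would explain why a VAR statement asking for phat fails on the scored data.

```sas
/* Fit the model and store it. OUTPUT and SCORE are separate statements. */
proc logistic data=Data1 outmodel=buymodel1;
   model purchase = income age zipcode;
   output out=prob predicted=phat;   /* predictions for the training rows */
run;

/* Score the new observations with the stored model.          */
/* The INMODEL= name matches the OUTMODEL= name used above.   */
proc logistic inmodel=buymodel1;
   score data=new out=newscore;
run;

/* SCORE writes predicted probabilities as P_<level>, here P_0 and P_1. */
proc print data=newscore;
   var income age zipcode P_0 P_1;
run;
```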

Occasional Contributor
Posts: 8

Re: Logistic regression with Score

Hi Reeza, sorry about the format.

I seem to have come up with some code that gives me the correct result, but could you check it?

proc logistic data=Data1 outmodel=sasuser.model2;
   model purchase= income age zipcode;
   output out =prob predicted=phat;
run;

data new;
   input income age zipcode;
   datalines;
85.21 44 1
250.45 28 2
130.54 36 2
;

proc logistic inmodel=sasuser.model2;
   score data=new out=newscore;
run;

proc print data=newscore;
run;

Grand Advisor
Posts: 16,889

Re: Logistic regression with Score

That looks fine to me. But I believe (not sure) that you can score new data in the same step if you already have the data.

proc logistic data=Data1 outmodel=sasuser.model2;
   model purchase= income age zipcode;
   output out =prob predicted=phat;
   score data=new out=newscore;
run;
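
One detail worth checking in either version, offered as a hedged sketch rather than a required change: for a 0/1 response, PROC LOGISTIC by default models the probability of the lower ordered value, so phat here would be the probability that purchase=0 (the procedure output states which level is modeled). If the probability of purchase=1 is what is wanted, the EVENT= response option can be added to the MODEL statement. The data set new is assumed to exist before this step runs.

```sas
/* Assumes Data1 and new have both been created earlier in the session. */
proc logistic data=Data1 outmodel=sasuser.model2;
   /* event='1' asks for the probability of purchase=1 to be modeled */
   model purchase(event='1') = income age zipcode;
   output out=prob predicted=phat;   /* phat is now P(purchase=1) */
   score data=new out=newscore;      /* newscore still holds both P_0 and P_1 */
run;
```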

Occasional Contributor
Posts: 8

Re: Logistic regression with Score

Hi Reeza

When I run your code I get the following message:

ERROR: File WORK.NEW.DATA does not exist.

Not sure why this would be?

Regards

Grand Advisor
Posts: 16,889

Re: Logistic regression with Score

Post the code you used. I don't have SAS to test it right now, but as I mentioned, from what I have read it should work, though I'm not sure.

I'm assuming that your code runs fine now as well?

Solution
‎04-06-2013 02:39 PM
Occasional Contributor
Posts: 8

Re: Logistic regression with Score

Hi

I managed to get your code to run by doing this:

data Data1;
   input purchase income age zipcode;
   datalines;
0 79.015 50 1
0 79.897 50 1
0 82.818 47 1
0 82.210 49 1
0 87.656 42 1
0 86.013 47 1
0 85.032 49 1
0 82.914 55 1
0 86.110 51 1
0 62.224 52 2
0 86.864 52 1
0 91.353 55 1
1 71.891 49 2
0 98.812 54 1
1 98.217 55 1
0 99.678 53 1
0 86.922 44 2
0 80.151 54 2
0 107.135 53 1
0 106.304 56 1
1 85.361 57 2
0 93.090 51 2
0 93.830 53 2
1 120.087 51 1
1 95.087 55 2
1 117.836 59 1
1 101.788 47 2
1 106.250 54 2
0 105.401 56 2
1 109.552 51 2
1 108.887 58 2
1 120.134 53 2
0 129.462 46 1
1 138.012 41 2
1 133.963 48 1
1 136.386 43 2
;

data new;
   input income age zipcode;
   datalines;
85.21 44 1
250.45 28 2
130.54 36 2
;

proc logistic data=Data1 outmodel=sasuser.model2;
   model purchase= income age zipcode;
   output out =prob predicted=phat;
   score data=new out=newscore;
run;

proc print data=newscore;
run;

The results were the same as with my code, which is:

data Data1;
   input purchase income age zipcode;
   datalines;
0 79.015 50 1
0 79.897 50 1
0 82.818 47 1
0 82.210 49 1
0 87.656 42 1
0 86.013 47 1
0 85.032 49 1
0 82.914 55 1
0 86.110 51 1
0 62.224 52 2
0 86.864 52 1
0 91.353 55 1
1 71.891 49 2
0 98.812 54 1
1 98.217 55 1
0 99.678 53 1
0 86.922 44 2
0 80.151 54 2
0 107.135 53 1
0 106.304 56 1
1 85.361 57 2
0 93.090 51 2
0 93.830 53 2
1 120.087 51 1
1 95.087 55 2
1 117.836 59 1
1 101.788 47 2
1 106.250 54 2
0 105.401 56 2
1 109.552 51 2
1 108.887 58 2
1 120.134 53 2
0 129.462 46 1
1 138.012 41 2
1 133.963 48 1
1 136.386 43 2
;

proc logistic data=Data1 outmodel=sasuser.model2;
   model purchase= income age zipcode;
   output out =prob predicted=phat;
run;

data new;
   input income age zipcode;
   datalines;
85.21 44 1
250.45 28 2
130.54 36 2
;

proc logistic inmodel=sasuser.model2;
   score data=new out=newscore;
run;

proc print data=newscore;
run;

For my scenario I would use data1 as the test/validation set and then any new data would be fed in later.
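
A rough sketch of that later step, assuming the model stored as sasuser.model2 above is still available. The data set names later_batch and later_scores and the two data rows are hypothetical, made up purely for illustration. No MODEL statement is needed because the stored model is read through INMODEL=.

```sas
/* Hypothetical later batch; names and values are illustrative only. */
data later_batch;
   input income age zipcode;
   datalines;
92.5 48 1
115.0 52 2
;

/* Reuse the stored model without refitting it. */
proc logistic inmodel=sasuser.model2;
   score data=later_batch out=later_scores;
run;

/* I_purchase is the predicted level; P_0 and P_1 are the probabilities. */
proc print data=later_scores;
   var income age zipcode I_purchase P_0 P_1;
run;
```

Keeping the fit and the scoring in separate steps this way means new batches can be scored at any time, as long as they carry the same predictor variables (income, age, zipcode) as the data the model was fitted on.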

Unless I have missed something, the question has now been answered.

Thanks for your help.

☑ This topic is SOLVED.
