aivoryuk
Calcite | Level 5

Hi

I am new to SAS and to this forum, and I would like some help with a query.

I have the following code, which works for me:

data Data1;

   input purchase income age zipcode;

   datalines;

0 79.015 50 1

0 79.897 50 1

0 82.818 47 1

0 82.210 49 1

0 87.656 42 1

0 86.013 47 1

0 85.032 49 1

0 82.914 55 1

0 86.110 51 1

0 62.224 52 2

0 86.864 52 1

0 91.353 55 1

1 71.891 49 2

0 98.812 54 1

1 98.217 55 1

0 99.678 53 1

0 86.922 44 2

0 80.151 54 2

0 107.135 53 1

0 106.304 56 1

1 85.361 57 2

0 93.090 51 2

0 93.830 53 2

1 120.087 51 1

1 95.087 55 2

1 117.836 59 1

1 101.788 47 2

1 106.250 54 2

0 105.401 56 2

1 109.552 51 2

1 108.887 58 2

1 120.134 53 2

0 129.462 46 1

1 138.012 41 2

1 133.963 48 1

1 136.386 43 2

;

proc logistic data=Data1;

   model purchase= income age zipcode;

   output out =prob predicted=phat;

 

run;

proc sort data=prob;

by phat;

run;

proc print data = prob;

var income age zipcode phat;

run;

What I would now like to do is score new data with the fitted model. I believe I can do this with the SCORE statement, but even with the user guide I'm struggling to get it to work for my example.

Using the above datalines, I am doing this:

proc logistic data=Data1 outmodel=buymodel1;

   model purchase= income age zipcode;

   score data=data1 output out =prob predicted=phat ;

   run;

   data new;

    input income age zipcode;

   datalines;

85.21 44 1

250.45 28 2

130.54 36 2

;

proc logistic inmodel=sasuser.buymodel;

      score data=new   out=newscore predicted=phat;

   run;

proc print data=newscore;

var income age zipcode phat;

run;

I am getting the following errors in the log

WARNING: Data set WORK.BUYMODEL1 was not replaced because this step was stopped.

1427     model purchase= income age zipcode;

1428     score data=data1 output out =prob predicted=phat ;

                          ------ ---       ---------

                          22     202       22

                                           76

ERROR 22-322: Syntax error, expecting one of the following: ;, (, ALPHA, CLM, DATA, FITSTAT, OUT,

              OUTROC, PRIOR, PRIOREVENT, ROCEPS.

ERROR 202-322: The option or parameter is not recognized and will be ignored.

ERROR 76-322: Syntax error, statement will be ignored.

1429     run;

ERROR 22-322: Syntax error, expecting one of the following: ;, (, ALPHA, CLM, DATA, FITSTAT, OUT,

              OUTROC, PRIOR, PRIOREVENT, ROCEPS.

ERROR 76-322: Syntax error, statement will be ignored.

1439     run;

ERROR: Variable PHAT not found.

If anyone could help, that would be great.

ACCEPTED SOLUTION
aivoryuk
Calcite | Level 5

Hi

I managed to get your code to run by doing this.

data Data1;

   input purchase income age zipcode;

   datalines;

0 79.015 50 1

0 79.897 50 1

0 82.818 47 1

0 82.210 49 1

0 87.656 42 1

0 86.013 47 1

0 85.032 49 1

0 82.914 55 1

0 86.110 51 1

0 62.224 52 2

0 86.864 52 1

0 91.353 55 1

1 71.891 49 2

0 98.812 54 1

1 98.217 55 1

0 99.678 53 1

0 86.922 44 2

0 80.151 54 2

0 107.135 53 1

0 106.304 56 1

1 85.361 57 2

0 93.090 51 2

0 93.830 53 2

1 120.087 51 1

1 95.087 55 2

1 117.836 59 1

1 101.788 47 2

1 106.250 54 2

0 105.401 56 2

1 109.552 51 2

1 108.887 58 2

1 120.134 53 2

0 129.462 46 1

1 138.012 41 2

1 133.963 48 1

1 136.386 43 2

;

data new;

    input income age zipcode;

   datalines;

85.21 44 1

250.45 28 2

130.54 36 2

;

proc logistic data=Data1 outmodel=sasuser.model2;

   model purchase= income age zipcode;

   output out =prob predicted=phat;

score data=new out=newscore;

   run;

   proc print data=newscore;

   run;

The results were the same as those from my own code, which is:

data Data1;

   input purchase income age zipcode;

   datalines;

0 79.015 50 1

0 79.897 50 1

0 82.818 47 1

0 82.210 49 1

0 87.656 42 1

0 86.013 47 1

0 85.032 49 1

0 82.914 55 1

0 86.110 51 1

0 62.224 52 2

0 86.864 52 1

0 91.353 55 1

1 71.891 49 2

0 98.812 54 1

1 98.217 55 1

0 99.678 53 1

0 86.922 44 2

0 80.151 54 2

0 107.135 53 1

0 106.304 56 1

1 85.361 57 2

0 93.090 51 2

0 93.830 53 2

1 120.087 51 1

1 95.087 55 2

1 117.836 59 1

1 101.788 47 2

1 106.250 54 2

0 105.401 56 2

1 109.552 51 2

1 108.887 58 2

1 120.134 53 2

0 129.462 46 1

1 138.012 41 2

1 133.963 48 1

1 136.386 43 2

;

proc logistic data=Data1 outmodel=sasuser.model2;

   model purchase= income age zipcode;

   output out =prob predicted=phat;

   run;

 

data new;

    input income age zipcode;

   datalines;

85.21 44 1

250.45 28 2

130.54 36 2

;

proc logistic inmodel=sasuser.model2;

      score data=new out=newscore;

   run;

proc print data=newscore;

   run;

For my scenario I would use data1 as the test/validation set and then any new data would be fed in later.

Unless I have missed something the question has now been answered.

Thanks for your help.
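A side note on the `ERROR: Variable PHAT not found` message from the original question: the list of accepted SCORE options in the log does not include `predicted=`, so the SCORE output data set has no `phat` column. As far as I know, it instead holds one `P_<level>` probability column per response level plus an `I_purchase` column with the predicted level. A minimal sketch for printing them, assuming `newscore` was created by one of the programs above and that the response levels are 0 and 1, so the columns would be named `P_0` and `P_1`:

```sas
/* Sketch only: assumes WORK.newscore exists from the SCORE statement above. */
/* P_0, P_1 and I_purchase are the names SCORE is expected to generate      */
/* for a response variable called purchase with levels 0 and 1.             */
proc print data=newscore;
   var income age zipcode I_purchase P_0 P_1;
run;
```

One thing worth checking in the log: with the default response ordering, PROC LOGISTIC models the probability of the lower level (purchase=0), so `phat` from the OUTPUT statement would line up with `P_0` rather than `P_1`. Writing `model purchase(event='1') = income age zipcode;` should switch the modeled event to purchase=1.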


6 REPLIES
Reeza
Super User

It looks like your proc logistic code is incorrect for the first model, but it's hard to tell with your code formatted like that.

Get your first proc logistic working properly, then add a single line of code to your current model:

Score data=data_to_be_scored;

Message was edited by: Reeza. You missed a semicolon, I think 🙂
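To illustrate the missing semicolon: in the failing step, `score data=data1 output out =prob predicted=phat ;` runs a SCORE statement and an OUTPUT statement together, which is why the log flags `output` and `predicted` as unrecognized SCORE options. A minimal sketch of the two statements separated, assuming the `Data1` data set from the question already exists in WORK (the `scored` data set name is hypothetical, chosen only for illustration):

```sas
/* Sketch only: assumes WORK.Data1 from the question already exists. */
/* "scored" is a hypothetical data set name used for illustration.   */
proc logistic data=Data1 outmodel=buymodel1;
   model purchase = income age zipcode;
   score data=Data1 out=scored;       /* SCORE ends at its own semicolon */
   output out=prob predicted=phat;    /* OUTPUT is a separate statement  */
run;
```

Note also that the second step in the question reads `inmodel=sasuser.buymodel`, while the model was saved as `buymodel1` in WORK, so those two names would need to match as well.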

aivoryuk
Calcite | Level 5

Hi Reeza, sorry about the formatting.

I seem to have come up with some code that gives me the correct result, but could you check it?

proc logistic data=Data1 outmodel=sasuser.model2;

   model purchase= income age zipcode;

   output out =prob predicted=phat;

   run;

   data new;

    input income age zipcode;

   datalines;

85.21 44 1

250.45 28 2

130.54 36 2

;

proc logistic inmodel=sasuser.model2;

      score data=new out=newscore;

   run;

proc print data=newscore;

   run;

Reeza
Super User

That looks fine to me. But I believe (though I'm not sure) that you can score new data in the same step if you already have the data.

proc logistic data=Data1 outmodel=sasuser.model2;

   model purchase= income age zipcode;

   output out =prob predicted=phat;

score data=new out=newscore;

   run;

aivoryuk
Calcite | Level 5

Hi Reeza

When I run your code I get the following message:

ERROR: File WORK.NEW.DATA does not exist.

Not sure why this would be?

Regards
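Going by the message alone, one likely reason is that the SCORE statement looks for WORK.NEW when the step runs, so the DATA step that builds `new` has to be submitted earlier in the same session. A small sketch of a guard that checks for the data set first; the macro name `check_new` is hypothetical, and the check relies on the `EXIST` function called through `%sysfunc`:

```sas
/* Sketch only: reports whether WORK.NEW exists before PROC LOGISTIC runs. */
/* check_new is a hypothetical macro name used for illustration.           */
%macro check_new;
   %if %sysfunc(exist(work.new)) %then
      %put NOTE: WORK.NEW exists and can be scored.;
   %else
      %put WARNING: WORK.NEW not found. Run the DATA step that creates it first.;
%mend check_new;

%check_new
```

If the warning appears, submitting the `data new;` step from the question before the PROC LOGISTIC step should resolve it.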

Reeza
Super User

Post the code you used. I don't have SAS to test it right now, but like I mentioned, from what I read it should work but not sure.

I'm assuming that your code runs fine now as well?
