trekvana
Calcite | Level 5

I created a simple data set and an ordinal logistic model to illustrate the issue I am having. I fit the model to my data (xxx) and then create a new data set (predict) that I want to score to get estimated probabilities. Everything works (I do get the estimated probabilities in the scores data set), but I keep getting warning messages that I don't know how to fix. See the log excerpts below.

data xxx;
    input y x1 x2;
    datalines;
1 1 1
1 2 1
1 2 2
2 1 1
3 2 2
2 1 2
3 1 1
;
run;

proc format;
    value y 1='satisfied' 2='neither' 3='dissatisfied';
    value xone 1='no' 2='yes';
    value xtwo 1='cat' 2='dog';
run;

proc sort data=xxx; by y; run;

proc logistic data=xxx order=data outmodel=estimates;
    format y y. x1 xone. x2 xtwo.;
    class x1(ref='no') x2(ref='cat') / param=ref;
    model y = x1 x2;
run;

data predict;
    do x1=1 to 2 by 1;
        do x2=1 to 2 by 1;
            output;
        end;
    end;
run;

proc logistic inmodel=estimates;
    score data=predict out=scores;
    format x1 xone. x2 xtwo.;
run;

SAS keeps telling me that x1 and x2 are not in the estimates data set, but if you look into the data set you can clearly see the two variables are in there.

7052  proc logistic inmodel=estimates;
7053  score data=predict out=scores;
7054  format x1 xone. x2 xtwo.;
WARNING: Variable X1 not found in data set WORK.ESTIMATES.
WARNING: Variable X2 not found in data set WORK.ESTIMATES.
7055  run;
NOTE: The data set WORK.SCORES has 4 observations and 6 variables.

Nevertheless, I still get the predicted probabilities in the scores data set. However, when I reverse the SCORE and FORMAT statements, I get this instead:

7056  proc logistic inmodel=estimates;
7057  format x1 xone. x2 xtwo.;score data=predict out=scores;
WARNING: Variable X1 not found in data set WORK.ESTIMATES.
WARNING: Variable X2 not found in data set WORK.ESTIMATES.
7058
7059  run;
NOTE: The scored data contains missing values due to a class level which is not in the training data set.
NOTE: The data set WORK.SCORES has 4 observations and 6 variables.

In that case the predicted probabilities do not get calculated.

If I leave out the FORMAT statement entirely, the predicted probabilities still do not get calculated:

7060  proc logistic inmodel=estimates;
7061  score data=predict out=scores;
7062  run;
NOTE: The scored data contains missing values due to a class level which is not in the training data set.
NOTE: The data set WORK.SCORES has 4 observations and 6 variables.

I am thinking it has something to do with the FORMAT statement. It's driving me nuts. Any help?

Reeza
Super User

Where does that estimates dataset come from? I see the data set XXX and predict have x1 and x2, but the estimates dataset kinda pops up outta nowhere, or is that just a typo for the demo here?

You're also missing a semicolon after the RUN statement (the one with the PROC SORT).

trekvana
Calcite | Level 5

Reeza, oops, my mistake. In the original PROC LOGISTIC it should be outmodel=estimates instead of outmodel=scores. I have corrected this.

Edit: OK, corrected the semicolon. I am still getting those warnings. Any suggestions?

Reeza
Super User

From the SAS docs:

FORMAT statements are not allowed when the INMODEL= data set is specified; variables in the DATA= and PRIOR= data sets in the SCORE statement should be formatted within the data sets.
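
One way to read "formatted within the data sets" is to store the format assignment in the scoring data set's own metadata, so the PROC LOGISTIC INMODEL= step needs no FORMAT statement. A minimal sketch, assuming the predict data set and the xone. and xtwo. formats from the original post already exist in the WORK library of the current session:

```sas
/* Permanently associate the formats with x1 and x2 in WORK.PREDICT. */
/* Assumes PROC FORMAT has already defined xone. and xtwo.           */
proc datasets library=work nolist;
    modify predict;
    format x1 xone. x2 xtwo.;
quit;
```

PROC DATASETS changes only the descriptor portion of the data set, so the data are not rewritten. Putting a FORMAT statement in the DATA step that builds predict should have the same effect.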


trekvana
Calcite | Level 5

Would you be able to tell me how I could format the variables within the data sets?

trekvana
Calcite | Level 5

Haha, the answer was so simple:

data predict;
    do x1=1 to 2 by 1;
        do x2=1 to 2 by 1;
            output;
        end;
    end;
    /* Attach the formats here, inside the data set being scored. */
    format x1 xone. x2 xtwo.;
run;

And now everything works OK. Thanks, Reeza, for the eye-opener.
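
For completeness, the scoring step that goes with this fix would presumably be the INMODEL= call with the FORMAT statement dropped. A minimal sketch, assuming estimates (from OUTMODEL=), the reformatted predict data set, and the xone. and xtwo. formats all exist in the current session; the PROC PRINT step is an addition here, included only to inspect the result:

```sas
/* Score the new data with the stored model; no FORMAT statement, */
/* because the formats now travel with WORK.PREDICT.              */
proc logistic inmodel=estimates;
    score data=predict out=scores;
run;

/* Inspect the scored observations (illustrative addition). */
proc print data=scores;
run;
```

The most likely reason this matters is that CLASS levels are matched on formatted values: the model was fit on levels such as 'no' and 'yes', so unformatted values of 1 and 2 in predict are treated as levels missing from the training data, which matches the NOTE in the logs above.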
