I created a simple data set and an ordinal logistic model to illustrate the issue I am having. I model my data (xxx) and then create a new data set (predict) that I want to score to get estimated probabilities. Everything works out fine (I am able to get the estimated probabilities in the scores data set), but I keep getting warning messages that I don't know how to fix. See below for the warning messages.
data xxx;
input y x1 x2;
datalines;
1 1 1
1 2 1
1 2 2
2 1 1
3 2 2
2 1 2
3 1 1
;
run;
proc format;
value y 1='satisfied' 2='neither' 3='dissatisfied';
value xone 1='no' 2='yes';
value xtwo 1='cat' 2='dog';
run;
proc sort data=xxx; by y; run;
proc logistic data=xxx order=data outmodel=estimates;
format y y. x1 xone. x2 xtwo.;
class x1(ref='no') x2(ref='cat') / param=ref;
model y = x1 x2;
run;
data predict;
  do x1=1 to 2 by 1;
    do x2=1 to 2 by 1;
      output;
    end;
  end;
run;
proc logistic inmodel=estimates;
score data=predict out=scores;
format x1 xone. x2 xtwo.;
run;
SAS keeps telling me that x1 and x2 are not in the estimates data set, but if you look into the data set you can clearly see the two variables are in there.
7052 proc logistic inmodel=estimates;
7053 score data=predict out=scores;
7054 format x1 xone. x2 xtwo.;
WARNING: Variable X1 not found in data set WORK.ESTIMATES.
WARNING: Variable X2 not found in data set WORK.ESTIMATES.
7055 run;
NOTE: The data set WORK.SCORES has 4 observations and 6 variables.
Nevertheless, I still get the predicted probabilities in the scores data set. However, when I reverse the score and format statements, I get this:
7056 proc logistic inmodel=estimates;
7057 format x1 xone. x2 xtwo.;score data=predict out=scores;
WARNING: Variable X1 not found in data set WORK.ESTIMATES.
WARNING: Variable X2 not found in data set WORK.ESTIMATES.
7058
7059 run;
NOTE: The scored data contains missing values due to a class level which is not in the training data set.
NOTE: The data set WORK.SCORES has 4 observations and 6 variables.
The predicted probabilities do not get calculated.
If I leave out the format statement, the predicted probabilities still do not get calculated:
7060 proc logistic inmodel=estimates;
7061 score data=predict out=scores;
7062 run;
NOTE: The scored data contains missing values due to a class level which is not in the training data set.
NOTE: The data set WORK.SCORES has 4 observations and 6 variables.
I am thinking it has something to do with the format statement. It's driving me nuts. Any help?
Where does that estimates dataset come from? I see the data set XXX and predict have x1 and x2, but the estimates dataset kinda pops up outta nowhere, or is that just a typo for the demo here?
You're also missing a semicolon after the run statement (with the proc sort).
Reeza, oops, my mistake. In the original proc logistic it should be outmodel=estimates instead of outmodel=scores. I have corrected this.
Edit: OK, corrected the semicolon. I am still getting those warnings. Any suggestions?
Would you be able to tell me how I could format the variables within the data sets?
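Haha, the answer was so simple: put the format statement in the data step that creates predict, so the formats are stored with the data set itself.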
data predict;
  do x1=1 to 2 by 1;
    do x2=1 to 2 by 1;
      output;
    end;
  end;
  format x1 xone. x2 xtwo.;
run;
And now everything works OK. Thanks, Reeza, for the eye opener.
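For anyone who already has a scoring data set and would rather not rebuild it, here is a minimal sketch of one alternative: attaching the formats in place with proc datasets. It assumes the xone. and xtwo. formats from the proc format step above are still available in the session, and that predict and estimates live in WORK as in the example. I have not run this against the exact data in this thread, so treat it as a starting point.

```sas
/* Sketch only: attach the formats to the existing WORK.PREDICT data set
   without rewriting its observations. Assumes xone. and xtwo. were
   already created by the proc format step shown earlier. */
proc datasets lib=work nolist;
  modify predict;
  format x1 xone. x2 xtwo.;
quit;

/* Score with the stored model. No format statement is used here,
   because the formats now travel with the predict data set. */
proc logistic inmodel=estimates;
  score data=predict out=scores;
run;
```

The idea is the same as the data step fix: the class levels in the stored model were built from formatted values ('no'/'yes', 'cat'/'dog'), so the scoring data needs to carry the same formats for its levels to match.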