RobertWF1
Quartz | Level 8

For a work project I'm building a zero-inflated negative binomial regression model with proc genmod and then attempting to score the model on a test dataset using proc plm.

For example, using the Trajan dataset provided here:

proc genmod data=Trajan;
   class bap photoperiod;
   model roots = bap photoperiod / dist=zinb;
   zeromodel;
   output out=zinb predicted=pred pzero=pzero;
   ods output ParameterEstimates=zinbparms;
   ods output Modelfit=fit;
   store out=zinb_model;
run;

data test;
input roots bap photoperiod;
cards;
1 10 20
2 30 40
;
run;

proc plm source=zinb_model;
   score data=test out=preds_zinb pred=pred;
run;

The code runs fine, no errors in the log, but the output dataset preds_zinb contains missing values (".") in the pred column.

Is proc plm not compatible with proc genmod? It should work - there's a data scoring example on the SAS website here: https://support.sas.com/kb/33/307.html.

Accepted Solutions
StatDave
SAS Super FREQ

PROC PLM can score new data for this model, as discussed and illustrated in this note. It returns missing values for your TEST data set because both predictors are in the CLASS statement. For a CLASS predictor, the model contains a parameter only for each level observed in the data used to fit the model. The bap and photoperiod values in your TEST data set do not occur in that data, so there are no parameters from which to compute a prediction. If the variables were not in the CLASS statement, and therefore treated as continuous rather than categorical, you would get predicted values.
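
To illustrate the point about continuous predictors, here is a minimal sketch. It assumes the Trajan and test data sets from the question already exist in the WORK library. The names zinb_model_cont and preds_zinb_cont are made up for this example. Keep in mind that dropping the CLASS statement fits a different model (a single slope for each of bap and photoperiod), so this is only appropriate if treating them as continuous makes sense for your data.

```sas
/* Refit without the CLASS statement so bap and photoperiod   */
/* enter the model as continuous predictors.                   */
/* zinb_model_cont is a hypothetical item store name.          */
proc genmod data=Trajan;
   model roots = bap photoperiod / dist=zinb;
   zeromodel;
   store out=zinb_model_cont;
run;

/* Score the same test data as in the question. With continuous */
/* predictors, values not seen in the training data can still   */
/* be scored. preds_zinb_cont is a hypothetical output name.    */
proc plm restore=zinb_model_cont;
   score data=test out=preds_zinb_cont pred=pred;
run;
```

With this version the pred column in preds_zinb_cont should no longer be missing for bap and photoperiod values that were absent from the fitting data.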

See this note (particularly point 3) on why missing values occur when scoring new data. 
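
If you want to keep the predictors in the CLASS statement, one way to check for this situation is to list the levels that were present when the model was fit and score only those. This is a sketch that assumes the Trajan data set and the zinb_model item store from the question. The names test_observed and preds_observed are made up for illustration.

```sas
/* Show which combinations of the CLASS variables occur in the */
/* data used to fit the model.                                  */
proc freq data=Trajan;
   tables bap*photoperiod / list nocum nopercent;
run;

/* Build a scoring data set that contains only level            */
/* combinations observed in the training data.                  */
/* test_observed is a hypothetical data set name.               */
proc sql;
   create table test_observed as
   select distinct bap, photoperiod
   from Trajan;
quit;

/* Score with the original CLASS-based model from the question. */
proc plm restore=zinb_model;
   score data=test_observed out=preds_observed pred=pred;
run;
```

Any row in a scoring data set whose bap or photoperiod value does not appear in the PROC FREQ listing would be expected to get a missing prediction from the CLASS-based model.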

RobertWF1
Quartz | Level 8
Beautiful! Thank you - I missed that detail about class variables.
