ShenBose
Calcite | Level 5

I fit an FMM model to my data, which has a continuous outcome variable (ranging from 0 to 1000, with almost 93% of the values being 0), using the following statements:

proc fmm data=training;
model y = x1 x2 x3 x4 / dist=WEIBULL k=2;
probmodel x1 x2 x3 x4;
output out=modelone residual pred;
run;

 

I am wondering whether I can use the beta estimates from the above procedure to calculate predicted scores in the validation data. I have used this approach to score data with estimates from other regression models. The equation would be:

 

data stats;

set validation;

log_y = exp(intercept+b1*x1+b2*x2+b3*x3+b4*x4);
y = exp(log_y); 

run;

 

Is this a correct method to create predicted scores here? I am new to the FMM procedure, and after reading a lot of articles, it seems to be an appropriate method for zero-inflated data. However, I am not sure how to use the fitted model to create predicted scores and then compare them against the observed (actual) outcome.

 

Appreciate your help.

 

Thanks

Shen 

3 REPLIES
Rick_SAS
SAS Super FREQ

I would use the "missing value trick" and let PROC FMM generate the predicted values itself.

The code would look something like this (NOT TESTED):

 

/* 1. Concatenate the original data with the score data */
data C;
set training validation(in=v rename=(y=OrigY));
if v then do;
   y = .;             /* y=. for all obs in validation data */
   type = "Validation";
end;
else
   type = "Training  ";
run;
 
/* 2. Run a regression. The model is fit to the original data. */ 
proc fmm data=C;
model y = ...;
output out=Pred residual pred;
run;

The scored validation data set is the one WHERE type="Validation";
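
To compare the predictions with the actual outcome, one option is to subset the output data set and compute an error measure against the saved response. This is only a sketch (also NOT TESTED). It assumes the PRED keyword wrote a variable named Pred (check the actual name in the output data set), and the names ValScored and AbsErr are made up for illustration:

```sas
/* Keep only the scored validation rows */
data ValScored;
   set Pred;
   where type = "Validation";
   /* OrigY holds the observed response that was renamed before fitting. */
   /* Pred is ASSUMED to be the name of the predicted-value variable.    */
   AbsErr = abs(OrigY - Pred);
run;

/* Summarize the absolute prediction error on the validation data */
proc means data=ValScored n mean;
   var AbsErr;
run;
```

The residual written by the OUTPUT statement is missing for these rows because y was set to missing, which is why the comparison uses OrigY.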

ShenBose
Calcite | Level 5

Thanks so much! That works.

 

Just wondering what the equation is to calculate the predicted scores. 

Rick_SAS
SAS Super FREQ

I don't have time right now, but now that you know the correct predicted values in a data set, you can try various equations until you get the same predictions.
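
As a starting point for that experiment: a k=2 fit produces two sets of regression coefficients (one per component) plus a set of coefficients for the mixing probability, so a single equation with one intercept and one set of betas cannot reproduce the predictions. A rough sketch of the general form of a two-component mixture mean is below. The symbols are my own notation, and the logit form of the mixing probability is an assumption to check against the PROBMODEL documentation:

```latex
\hat{y}_i = \pi_i\,\mu_{1i} + (1-\pi_i)\,\mu_{2i},
\qquad
\pi_i = \frac{1}{1+\exp\left(-\mathbf{z}_i^{\top}\boldsymbol{\gamma}\right)}
```

Here z_i holds the PROBMODEL effects with coefficients gamma, and mu_{ji} is the mean of component j, computed from that component's own coefficients. With a log link the linear predictor enters through exp(x_i' beta_j), and for the Weibull distribution the component mean may also depend on the estimated scale parameter, so look up the parameterization in the PROC FMM documentation. Also confirm which component the modeled probability refers to by comparing against the predicted values in the output data set.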
