I am close to achieving my Master's degree in Data Analysis and I am working on a Logistic analysis problem. I have noticed that Proc Logistic supports the store command to save the analysis. However, it is unclear how to use the saved binary to score another data set similar to doing so with Proc PLM. Proc PLM will use the stored data but it apparently doesn't score the same way as Proc Logistic as the predicted probabilities are significantly different. I assume it is still scoring like Proc GLM.
proc plm restore=cap.stored;
score data=cap.fnma_sf2017_prepared out=cap.fnma_st2017_scored;
run;
> I will read through them later when I have time.
It's your choice, but you might discover that KSharp's suggestion will save you time. To focus your attention, the article
"4 reasons to use PROC PLM for linear regression models in SAS"
seems relevant. The examples demonstrate that you should use the ILINK option if you want the predicted probabilities. Otherwise, as you say, you are getting the linear predictions before applying the inverse logit transformation.
The article "Predicted values in generalized linear models: The ILINK option in SAS" provides additional discussion and a logistic example that demonstrates the difference between using and not using the ILINK option when scoring new data.
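Applied to the item store from the question, the ILINK suggestion might look like the sketch below. The libref and data set names are taken from the original post; the output data set name and the variable name p_hat are assumptions made for illustration.

```sas
/* Sketch: score new data with the stored model on the probability scale. */
proc plm restore=cap.stored;
   score data=cap.fnma_sf2017_prepared
         out=cap.fnma_sf2017_scored   /* output name is an assumption */
         predicted=p_hat / ilink;     /* ILINK applies the inverse link (inverse logit here) */
run;
```

Without the ILINK option, the values in p_hat would be on the logit (log-odds) scale, which is why they can be negative or greater than 1.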
Yes, you are right. By default, PROC PLM scores like PROC REG, returning the linear predictor:
predict_Y=a*x1+b*x2+..........
BTW, @Rick_SAS has written many blog posts about this topic.
Thank you for your reply and your suggestion to check out Rick's posts. I will read through them later when I have time. For now, if there is a way to use proc logistic to utilize saved binary parameters from the store statement, I would like to see an example of the syntax. If logistic scoring of new data using this saved data is currently missing functionality, that's fine. I would just like to know one way or the other before I finish my final report so I can be confident I know what I am talking about when I am demonstrating SAS abilities. I come from a background of Software Development / Engineering (mostly in C/C++) and dealing with the data involved, so the main thing I am adding to my skill set is the use of the advanced statistics. If you can answer whether it is possible / practical to use the saved parameters for logistic analysis on new data sets with an example of the syntax (if so), then I am satisfied for the moment.
In case you are curious, my project involves 2-3 yearly data sets, each with over 2 million records, where the target of my hypothesis is available in all 3. There are actually several more years of data available, but I am limiting myself to those. I am splitting the first year for training and validation, and testing with a different year by applying the scoring. Since I can always compare the prediction to the actual value of the target, I used a few SAS procedures (DATA step and PROC MEANS / PROC FREQ) to see how well it did. In the process I also computed a cutoff value to consider for the predicted values. It appeared I could get PROC PLM results somewhat close to the PROC LOGISTIC results if I used a different cutoff for each. However, the predicted values from PROC PLM could go negative, while the PROC LOGISTIC results were nicely bounded between 0 and 1, so it was obvious PROC PLM doesn't score on the probability scale unless there is an option I don't know about yet. If you want to see the steps, I will have to switch laptops; I am not running SAS University Edition on this one.
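The kind of cutoff check described above could be sketched roughly as follows. All names here are hypothetical: work.scored stands for a scored data set, Predicted for the scored value, target for the actual 0/1 outcome, and 0.5 is only a placeholder cutoff.

```sas
/* Hypothetical names: work.scored, Predicted, target, and the 0.5 cutoff are assumptions. */
data work.classified;
   set work.scored;
   /* 1 if the scored value reaches the cutoff, otherwise 0 */
   predicted_class = (Predicted >= 0.5);
run;

/* Cross-tabulate actual versus predicted class (a simple confusion matrix). */
proc freq data=work.classified;
   tables target*predicted_class / nopercent norow nocol;
run;
```

A cutoff on the probability scale and a cutoff on the linear predictor scale are not interchangeable, which may explain why a different cutoff was needed for each procedure.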
" If you can answer whether it is possible / practical to use the saved parameters for logistic analysis on new data sets with an example of the syntax (if so), then I am satisfied for the moment."
You can use the OUTEST= option to save the parameter estimates and then use them to score a new or test data set.
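proc logistic data=sashelp.class outest=est;
model sex=age weight height;
run;
data test;
set sashelp.class;
run;
data score_test;
/* read the single row of estimates once; its values are retained for every row of test */
if _n_=1 then
set est(keep=intercept age weight height rename=(age=a weight=w height=h));
set test;
/* linear predictor (log odds) */
Y=intercept+a*age+w*weight+h*height;
/* as Rick said, if you want the probability, apply the inverse logit to Y */
/* (by default PROC LOGISTIC models the first ordered level, here sex='F') */
prob=1/(1+exp(-Y));
run;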
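If the goal is to stay entirely within PROC LOGISTIC, another possibility is its OUTMODEL= and INMODEL= options together with the SCORE statement. The sketch below reuses sashelp.class from the example above; the data set names work.class_model and work.class_scored are assumptions. Note that INMODEL= reads a data set written by OUTMODEL=, not the item store created by the STORE statement.

```sas
/* Fit once and save the model information (data set name is an assumption). */
proc logistic data=sashelp.class outmodel=work.class_model;
   model sex = age weight height;
run;

/* Later: restore the saved model and score new data without refitting. */
proc logistic inmodel=work.class_model;
   score data=sashelp.class out=work.class_scored;
run;
```

The scored data set should contain predicted probabilities for each response level (for example, P_F and P_M here), already on the 0 to 1 scale.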