Demographer
Pyrite | Level 9

Hi,

I have the parameters of a logit model predicting labor force participation, saved in the file lfp.csv. The file was produced with the OUTEST= option of proc logit. The model is stratified by sex (0 1) and has some interaction variables (edu*grage). I want to use those parameters to predict the probability of the event for individuals in another dataset, pop_2010 (which is not the dataset used to estimate the parameters).

I read that proc score only works with parameters from linear regression, not from a logit model. The store statement within proc logit cannot be used either, since I don’t want to predict on the same dataset (and the prediction should be done without having access to the dataset used to estimate parameters).

Is there a quick and simple way to do this?

9 REPLIES
Reeza
Super User

Check the CODE statement in PROC LOGISTIC (LOGIT isn't a proc as far as I know, so I'm assuming you mean LOGISTIC). I think GLM has a similar statement.
https://documentation.sas.com/?docsetId=statug&docsetVersion=15.1&docsetTarget=statug_logistic_synta...
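
A rough sketch of the CODE statement route, in case it helps. Everything named here is a placeholder rather than a detail from this thread: the libref LAB, the data set lab.train, the response lfp coded 0/1, the CLASS treatment of edu and grage, and the file name lfp_score.sas. The sketch also ignores the stratification by sex. The first step runs where the estimation data live and writes DATA step scoring code to a text file; the second step needs only that file and the new data.

```sas
/* Step 1: run where the estimation data are available.       */
/* Assumes libref LAB is assigned; all names are hypothetical. */
proc logistic data=lab.train;
   class edu grage / param=ref;                 /* assumption: categorical */
   model lfp(event='1') = edu grage edu*grage;
   code file='lfp_score.sas';                   /* write DATA step scoring code */
run;

/* Step 2: run anywhere. Only lfp_score.sas and the new data are needed. */
data pop_2010_scored;
   set pop_2010;
   %include 'lfp_score.sas';                    /* adds predicted probabilities */
run;
```

The generated file is plain SAS code, so it can be inspected before it leaves the machine where the model was fit.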


Or use PROC PLM, and make sure to specify the ILINK option so that the predictions come back as probabilities rather than on the logit scale.

https://blogs.sas.com/content/iml/2019/02/11/proc-plm-regression-models-sas.html
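
A minimal sketch of the PROC PLM route, under stated assumptions: the librefs LAB and MYLIB, the data set lab.train, the response lfp coded 0/1, the CLASS treatment of edu and grage, and the output variable name p_lfp are all hypothetical. It also assumes the item store can be copied to the scoring machine and read by the SAS installation there, and it leaves out the stratification by sex.

```sas
/* Fit the model and save it to an item store.                  */
/* Assumes libref LAB is assigned; all names are hypothetical.  */
proc logistic data=lab.train;
   class edu grage / param=glm;                 /* assumption: categorical */
   model lfp(event='1') = edu grage edu*grage;
   store out=lab.lfp_model;                     /* item store holding the fitted model */
run;

/* Score new data from the item store alone.                        */
/* Assumes libref MYLIB points at the copied item store.            */
proc plm restore=mylib.lfp_model;
   score data=pop_2010 out=pop_2010_scored predicted=p_lfp / ilink;
run;
```

Without ILINK, p_lfp would hold the linear predictor (the log odds) instead of a probability.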

 

And a few more worked examples of scoring data for logistic regression:

https://documentation.sas.com/?docsetId=statug&docsetTarget=statug_logistic_examples20.htm&docsetVer...

Demographer
Pyrite | Level 9

Is it possible to use a csv file of parameters in the RESTORE= option of proc PLM?

Reeza
Super User
No idea, but if you have a CSV file, why not just write a basic data step? The formula for logistic regression is pretty straightforward. If you can re-run the code that develops the model, the CODE statement will create the data step code for you.

Here's an example of how you can replicate it from scratch if you have to.

https://communities.sas.com/t5/Statistical-Procedures/How-to-determine-logistic-regression-formula-f...

Rather than driving it from the PROC LOGISTIC output, you can drive it from the imported CSV file, though you may need to restructure the file first.
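
To make the data step idea concrete, here is a hedged sketch. The CSV layout is an assumption, not the real structure of lfp.csv: it supposes a single row with columns Intercept, b_edu, b_grage and b_edu_grage, and it treats edu and grage as numeric predictors. A real OUTEST= file names its columns after the model effects, so the xbeta line would have to be adapted to whatever names are actually in the file.

```sas
/* Hypothetical layout: one row with the columns                */
/* Intercept, b_edu, b_grage, b_edu_grage.                      */
proc import datafile='lfp.csv' out=work.parms dbms=csv replace;
   getnames=yes;
run;

data pop_2010_scored;
   /* Read the single row of coefficients once; variables read  */
   /* by SET are retained, so they are available on every row.  */
   if _n_ = 1 then set work.parms;
   set pop_2010;

   /* Linear predictor, including the edu*grage interaction */
   xbeta = Intercept + b_edu*edu + b_grage*grage + b_edu_grage*edu*grage;

   /* Inverse logit: converts the log odds to a probability */
   p_lfp = 1 / (1 + exp(-xbeta));

   drop Intercept b_edu b_grage b_edu_grage;
run;
```

If the parameter file has one row per sex, sorting both data sets and merging them BY sex would take the place of the `if _n_ = 1` step.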
Reeza
Super User
FYI - this isn't correct:

"The store statement within proc logit cannot be used either, since I don’t want to predict on the same dataset (and the prediction should be done without having access to the dataset used to estimate parameters)."

You can use the STORE statement to score the model with a new data set.
Demographer
Pyrite | Level 9

Thanks. But the prediction should be done without having access to the dataset used to estimate parameters... so I don't see how I can use proc logistic or glm. I'll read about the PLM procedure.

StatDave
SAS Super FREQ

If you are able to use SAS both to fit the model and to score the new data, then you can absolutely use PROC LOGISTIC for both. See the example titled "Scoring Data Sets" in the PROC LOGISTIC documentation. As shown there, you can fit the model in one PROC LOGISTIC step and use the OUTMODEL= option to save a special data set containing the model. You can then predict (score) new data with that model in a later PROC LOGISTIC step (even in a different SAS session) by specifying the saved model data set in the INMODEL= option and using the SCORE statement. The example also shows another way to do exactly the same thing: the STORE statement (instead of OUTMODEL=) followed by PROC PLM (instead of the second PROC LOGISTIC step) to do the scoring. You can do either.
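
The OUTMODEL=/INMODEL= pattern described above might look roughly like this. The librefs LAB and MYLIB, the data set lab.train, the response lfp coded 0/1, and the CLASS treatment of edu and grage are placeholders, not details from this thread, and the stratification by sex is left out.

```sas
/* Session 1: fit the model and save it as a model data set.    */
/* Assumes libref LAB is assigned; all names are hypothetical.  */
proc logistic data=lab.train outmodel=lab.lfp_model;
   class edu grage;                             /* assumption: categorical */
   model lfp(event='1') = edu grage edu*grage;
run;

/* Session 2: score new data. The estimation data are not needed, */
/* only the saved model data set (here reached through MYLIB).    */
proc logistic inmodel=mylib.lfp_model;
   score data=pop_2010 out=pop_2010_scored;
run;
```

The scored data set holds the predicted probabilities in variables whose names start with P_, one per response level.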

Demographer
Pyrite | Level 9

I can't. The model is estimated in a secured lab, while the dataset for the prediction is on my laptop.

StatDave
SAS Super FREQ

... and the "dataset for the prediction" was generated by the OUTEST= option in PROC LOGISTIC? If so, can the lab rerun the model using either the OUTMODEL= option or the STORE statement instead? If they can provide the resulting file from either of those, then you can use PROC LOGISTIC to score new data as in the example I referred to.

Reeza
Super User
You should be allowed to extract your code though, including the scoring code generated from a CODE statement? That's not any different than taking the parameter estimates out.
