crobinson58
Calcite | Level 5

Hello,

I'm relatively new to SAS and am trying to figure out how to fit a linear probability model to my data. I keep seeing different answers about what the correct code is, so I would really appreciate any help! I can provide more information if needed.

Thank you

4 REPLIES
StatDave
SAS Super FREQ

A linear probability model directly models binomial probabilities. An example can be seen in this note using PROC GENMOD by specifying DIST=BINOMIAL and LINK=IDENTITY in the MODEL statement. However, it is rarely a good idea unless the observed probabilities are all in the mid-range (such as 0.25 to 0.75). Otherwise, the model typically does not fit well and very often runs into estimation problems, because it allows predicted values to fall outside the valid [0,1] range. That is why a link function such as LOGIT or PROBIT is typically used instead: both restrict the predictions to the valid range. If your reason for wanting a linear probability model is to predict probabilities or to estimate differences on the probability scale, you can still do that easily with a logit model, which generally fits better. The above note shows how that can be done using the Margins macro or using PROC LOGISTIC followed by the NLMeans macro.
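
As a rough sketch of the GENMOD approach described above, the following fits a binomial model with an identity link. The data set sashelp.heart and the variables Status, AgeAtStart, and Sex are stand-ins chosen only because they ship with SAS; they are not from the original question, so substitute your own data set, binary response, and predictors. The output names work.lpm_pred and p_hat are hypothetical.

```sas
/* Linear probability model: binomial response with an identity link.  */
/* sashelp.heart, Status, AgeAtStart, and Sex are stand-ins only;       */
/* replace them with your own data set and variables.                   */
proc genmod data=sashelp.heart;
   class Sex;
   /* EVENT= names the response level whose probability is modeled */
   model Status(event='Dead') = AgeAtStart Sex
         / dist=binomial link=identity;
   /* Save the fitted probabilities (p_hat is a hypothetical name) */
   output out=work.lpm_pred pred=p_hat;
run;

/* Inspect the range of the fitted probabilities */
proc means data=work.lpm_pred min max;
   var p_hat;
run;
```

With an identity link, the fit may stop with errors or fail to converge on some data, which is the estimation problem mentioned above. In that case, switching to LINK=LOGIT is the usual fallback.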

crobinson58
Calcite | Level 5

I'm looking to fit a model that predicts whether someone is having a heart stroke, where the dependent variable is just yes or no. Would the probit link function be the best choice for this scenario?

crobinson58
Calcite | Level 5
Not necessarily to predict it, but to give me the probability of it.
StatDave
SAS Super FREQ
That sort of thing is most typically done using a logistic (or logit-linked) model. This is best done using PROC LOGISTIC. See the discussion and examples in the LOGISTIC documentation. The probability of the response event can be produced in various ways such as with the PRED= option in the OUTPUT statement or using the LSMEANS statement with the ILINK option. The LSMEANS statement is also shown in the note I referred to.
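
A minimal sketch of the two options mentioned above, again using sashelp.heart with Status, AgeAtStart, and Sex purely as stand-ins for your own data set, yes/no response, and predictors. The output names work.logit_pred and p_hat are hypothetical.

```sas
/* Logistic model with predicted event probabilities.                 */
/* sashelp.heart, Status, AgeAtStart, and Sex are stand-ins only.      */
proc logistic data=sashelp.heart;
   /* PARAM=GLM is required for the LSMEANS statement in PROC LOGISTIC */
   class Sex / param=glm;
   model Status(event='Dead') = AgeAtStart Sex;
   /* ILINK reports the least squares means on the probability scale;  */
   /* CL adds confidence limits                                         */
   lsmeans Sex / ilink cl;
   /* One predicted probability per observation                        */
   output out=work.logit_pred pred=p_hat;
run;
```

The OUTPUT statement gives a probability for each observation in the input data, while LSMEANS with ILINK gives an estimated probability for each level of the CLASS variable, with the other predictors held at their means.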
