Re: Probability of binary event proc logistic

tuni_jt · Posted 09-08-2021 11:16 AM

Hi,

I have a dataset with information on buildings and structural failure risk. I am a bit new to SAS and statistical modelling, so apologies if this is a poor question.

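data have;
   /* list input defaults character variables to length 8, which would truncate longer values */
   length Buildtype $ 12 RoofType $ 8 Failure $ 3;
   input Buildtype $ RoofType $ Failure $;
   datalines;
Detached Mansard  No
Detached Flat Yes
Semidetached Pitched  No
Apartment   Flat No
;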

I am interested in the probability of failure of different building types and roofs, and the uncertainty estimates of this probability. So far, I have been using proc logistic:

proc logistic data=have;
  class Buildtype RoofType / param=ref ;
  model Failure (event='Yes') = Buildtype RoofType;
  Output out=want lower=lower upper=upper predicted=predicted;
run;

This outputs:

1. Printed odds ratios and maximum likelihood estimates relative to a reference value. This is useful, but I would like to show a risk of failure relative to the average for all dwellings, if that makes sense.
2. A dataset want that is the same as the have dataset, with predicted failures and confidence intervals added as columns at the end. This is what I want, but is it possible just to get the list of independent variables and their predictions, instead of the entire dataset? The original dataset is quite large...

PaigeMiller · Posted 09-08-2021 11:26 AM

Printed odds ratios and maximum likelihood estimates relative to a reference value. This is useful, but I would like to show a risk of failure relative to the average for all dwellings, if that makes sense.

An average for all dwellings generally would not make sense in this case unless the data were completely balanced, with each buildtype*rooftype combination occurring an equal number of times. Usually, the comparison is between the levels of buildtype and between the levels of rooftype (and, if desired, between the interaction levels as well).

A dataset want that is the same as the have dataset, with predicted failures and confidence intervals added as columns at the end. This is what I want, but is it possible just to get the list of independent variables and their predictions, instead of the entire dataset? The original dataset is quite large...

I think what you want is the output from the LSMEANS statement with the ILINK option.

--
Paige Miller

tuni_jt · Posted 09-14-2021 07:56 AM

Thanks to everyone for their really quick replies - they are incredibly helpful.

I think odds ratios are probably a good option here, so let's go for that.

I was hoping to output a table with just the independent variables and their odds ratios, since I find it a bit easier to customise the charts using sgplot, for example. I can output the odds ratios by using this statement:

ods output OddsRatiosWald= ORPlot;

But I am not sure how to do the same if I want to use PL instead of Wald?

Thanks,

Jon
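
On the PL question, a minimal sketch, assuming the Wald table above came from an ODDSRATIO statement: requesting CL=PL on ODDSRATIO produces an ODS table named OddsRatiosPL, which can be captured the same way. The dataset name ORPlotPL is made up for illustration.

```sas
proc logistic data=have;
  class Buildtype RoofType / param=ref;
  model Failure(event='Yes') = Buildtype RoofType;
  /* CL=PL requests profile-likelihood confidence limits for the odds ratios */
  oddsratio Buildtype / cl=pl;
  oddsratio RoofType  / cl=pl;
  /* ORPlotPL is a hypothetical dataset name */
  ods output OddsRatiosPL=ORPlotPL;
run;
```

If the odds ratios are requested through the MODEL statement instead, the CLODDS=PL option should write the corresponding table under the name CLOddsPL.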

Reeza · Posted 09-08-2021 11:27 AM

is it possible just to get the list of independent variables and their predictions

Take a look at EFFECTPLOT or SLICE, but I think the odds ratio for each variable is essentially telling you what you want to know.

Have you walked through this example in the tutorials?

https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/statug/statug_logistic_examples02.htm

sbxkoenk · Posted 09-08-2021 11:27 AM

Hello,

You can use a KEEP= data set option on the WANT dataset to specify the columns you want to keep:

Like:

output out=want(keep=Failure Buildtype RoofType lower upper predicted) 
lower=lower upper=upper predicted=predicted;

Koen
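
Note that KEEP= only drops columns; the output dataset still has one row per observation in have. To get one row per Buildtype and RoofType combination instead, one possible sketch is to build a small dataset of the distinct combinations and score it with the fitted model. The dataset names combos and pred_by_combo are made up for illustration.

```sas
/* One row per observed Buildtype/RoofType combination */
proc sql;
  create table combos as
  select distinct Buildtype, RoofType
  from have;
quit;

proc logistic data=have;
  class Buildtype RoofType / param=ref;
  model Failure(event='Yes') = Buildtype RoofType;
  /* SCORE applies the fitted model to combos; CLM adds confidence
     limits for the predicted probabilities */
  score data=combos out=pred_by_combo clm;
run;
```

Because the model here has main effects only, every observation in a given combination gets the same prediction, so nothing is lost by scoring the distinct combinations.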

StatDave · Posted 09-08-2021 01:02 PM

There are two basic statistics for this: LS-means (as mentioned earlier) and predictive margins. LS-means can be obtained for any categorical predictor that is specified in the CLASS statement. Note that you need to specify the PARAM=GLM option in the CLASS statement in order to use the LSMEANS statement. The following provides estimates of the event probabilities and confidence intervals for each level of each predictor while holding the other predictor constant. The ILINK option gives the estimates on the mean (probability) scale, the CL option gives the confidence limits, and the E option shows the coefficients on the parameters that define each LS-mean, which lets you see how the other predictor(s) are fixed.

proc logistic data=have;
  class Buildtype RoofType / param=glm;
  model Failure (event='Yes') = Buildtype RoofType;
  lsmeans Buildtype RoofType / ilink cl e;
run;
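
If the LS-means are wanted as a dataset rather than printed output (for example, to chart with sgplot), a minimal sketch is to capture the LSMeans ODS table. The dataset name lsm_prob is made up for illustration, and the exact column names that ILINK and CL add are not assumed here; the PROC CONTENTS step is there to check them.

```sas
proc logistic data=have;
  class Buildtype RoofType / param=glm;
  model Failure(event='Yes') = Buildtype RoofType;
  lsmeans Buildtype RoofType / ilink cl;
  /* lsm_prob is a hypothetical dataset name */
  ods output LSMeans=lsm_prob;
run;

/* List the captured columns before building a plot on them */
proc contents data=lsm_prob;
run;
```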

Margins for a predictor do not hold the other predictor(s) constant but instead average the predicted values. Predictive margins can be obtained for categorical predictors and marginal effects for continuous predictors. These are provided by the Margins macro. See the discussion and examples in its documentation.
