Fluorite | Level 6

## How to do multivariate analysis in SAS (proc logistic)

I've been reading about multivariate analysis and proc logistic, and although there are some online descriptions of multivariate analysis, few describe how to do it in SAS. I need something that takes me step by step through the output to determine what adjustments I need to make (i.e., when to exclude a given independent variable).

From what I've read and been told, my interpretation is that if the p-value of any independent variable is above .25, I should exclude the variable with the highest p-value, refit, and repeat until all p-values are below .25. Is that a standard and accepted approach?

Any help is greatly appreciated.

Thanks.
Quartz | Level 8

## Re: How to do multivariate analysis in SAS (proc logistic)

Hard to answer any of this without a more detailed description of what your predictors are and what your dependent variables are, and what you hope to learn from this analysis.

Also, based on my understanding of the word "multivariate", PROC LOGISTIC does not do multivariate analyses. To me, multivariate means multiple response variables, analyzed with respect to their joint (correlated) distributions. Maybe you are using this word to mean something other than what I think it means?

Message was edited by: Paige
Fluorite | Level 6

## Re: How to do multivariate analysis in SAS (proc logistic)

I'm probably using the word multivariate incorrectly.

This is the code I wrote to test the relationship of some binary (1=Yes, 2=No) independent variables on the dependent variable BreastFeeding (binary as well).

proc logistic data=nbscrBirthVars;
  class NoCollege (ref="1") cesarean (ref="1") PreTerm (ref="1") LBW (ref="1") NICU (ref="1") TenStep (ref="1") / param=ref;
  model BreastFeeding (event="2") = NoCollege cesarean PreTerm LBW NICU TenStep;
run;

The output is below. My understanding is that I would remove Macrosomia from the model because its Pr > ChiSq in the Type 3 analysis (0.6956) is greater than 0.25. Is that the standard way of determining what to remove?

Thanks.

The LOGISTIC Procedure

Model Information

Data Set WORK.NBSCRBIRTHVARS
Response Variable FormulaSupp
Number of Response Levels 2
Model binary logit
Optimization Technique Fisher's scoring

Number of Observations Used 99826

Response Profile

Ordered Value    FormulaSupp    Total Frequency

            1              1              18503
            2              2              81323

Probability modeled is FormulaSupp=2.

NOTE: 6875 observations were deleted due to missing values for the response or explanatory variables.

Class Level Information

Class         Value    Design Variables

NoCollege     1        0
              2        1

cesarean      1        0
              2        1

PreTerm       1        0
              2        1

LBW           1        0
              2        1

NICU          1        0
              2        1

Macrosomia    1        0
              2        1

TenStep       1        0
              2        1

Model Convergence Status

Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics

Criterion    Intercept Only    Intercept and Covariates

AIC               95717.853                   93430.154
SC                95727.364                   93506.243
-2 Log L          95715.853                   93414.154

Testing Global Null Hypothesis: BETA=0

Test Chi-Square DF Pr > ChiSq

Likelihood Ratio 2301.6993 7 <.0001
Score 2338.5007 7 <.0001
Wald 2265.3540 7 <.0001

Type 3 Analysis of Effects

Effect        DF    Wald Chi-Square    Pr > ChiSq

NoCollege      1           462.7169        <.0001
cesarean       1            47.0002        <.0001
PreTerm        1            13.8791        0.0002
LBW            1             3.6452        0.0562
NICU           1           229.8353        <.0001
Macrosomia     1             0.1531        0.6956
TenStep        1          1166.5014        <.0001

Analysis of Maximum Likelihood Estimates

Parameter       DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq

Intercept        1      0.7874            0.0964            66.7686        <.0001
NoCollege  2     1      0.3688            0.0171           462.7169        <.0001
cesarean   2     1      0.1185            0.0173            47.0002        <.0001
PreTerm    2     1      0.1106            0.0297            13.8791        0.0002
LBW        2     1      0.0706            0.0370             3.6452        0.0562
NICU       2     1      0.5175            0.0341           229.8353        <.0001
Macrosomia 2     1      0.0348            0.0890             0.1531        0.6956
TenStep    2     1     -0.5757            0.0169          1166.5014        <.0001

Odds Ratio Estimates

Effect                Point Estimate    95% Wald Confidence Limits

NoCollege  2 vs 1              1.446         1.398    1.495
cesarean   2 vs 1              1.126         1.088    1.165
PreTerm    2 vs 1              1.117         1.054    1.184
LBW        2 vs 1              1.073         0.998    1.154
NICU       2 vs 1              1.678         1.569    1.794
Macrosomia 2 vs 1              1.035         0.870    1.233
TenStep    2 vs 1              0.562         0.544    0.581

Association of Predicted Probabilities and Observed Responses

Percent Concordant 56.6 Somers' D 0.232
Percent Discordant 33.3 Gamma 0.259
Percent Tied 10.1 Tau-a 0.070
Pairs 1504719469 c 0.616
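
As a reading aid for the output above, the odds ratio table can be reproduced from the maximum likelihood estimates. The worked example below uses the Macrosomia row; the multiplier 1.96 is the usual normal quantile for 95% Wald limits and is an assumption about how the limits were computed, although it matches the printed values to three decimals.

$$
\widehat{\mathrm{OR}} = e^{\hat\beta} = e^{0.0348} \approx 1.035,
\qquad
e^{\hat\beta \pm 1.96\,\mathrm{SE}(\hat\beta)} = e^{0.0348 \pm 1.96 \times 0.0890} \approx (0.870,\ 1.233)
$$

The interval contains 1, which is consistent with the large p-value (0.6956) reported for Macrosomia in the Type 3 table.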
Quartz | Level 8

## Re: How to do multivariate analysis in SAS (proc logistic)

While I am not familiar with the advice to use 0.25 as your cutoff, I would use 0.05 as the cutoff. In any event, it seems reasonable to remove Macrosomia from the model.
Rhodochrosite | Level 12

## Re: How to do multivariate analysis in SAS (proc logistic)

There are many stepwise variable-selection options in proc logistic. Check out the documentation for the MODEL statement. But note: one should be cautious with all of these methods. Use them as an exploratory guide, not as a final model-selection method. Model selection (i.e., variable selection in a model) is a complex endeavor.
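
For illustration only, here is a minimal sketch of what backward elimination might look like using the MODEL statement's SELECTION= and SLSTAY= options. It reuses the data set and variable names from the code posted earlier in the thread. Two things are assumptions: Macrosomia is added because it appears in the posted output but not in the posted code, and SLSTAY=0.25 simply mirrors the threshold the original poster mentioned rather than a recommended value. The posted output also names the response FormulaSupp while the posted code uses BreastFeeding; the sketch keeps the name from the code.

```sas
/* Sketch: backward elimination in PROC LOGISTIC.                     */
/* Assumptions: Macrosomia is in the data set (it is in the posted    */
/* output but not the posted code); SLSTAY=0.25 is the poster's       */
/* threshold, not a recommendation.                                   */
proc logistic data=nbscrBirthVars;
  class NoCollege (ref="1") cesarean (ref="1") PreTerm (ref="1")
        LBW (ref="1") NICU (ref="1") Macrosomia (ref="1")
        TenStep (ref="1") / param=ref;
  /* Start from the full model; at each step, drop the least          */
  /* significant effect until every remaining effect has p <= 0.25.   */
  model BreastFeeding (event="2") =
        NoCollege cesarean PreTerm LBW NICU Macrosomia TenStep
        / selection=backward slstay=0.25;
run;
```

The output lists each removal step along with a summary of the effects removed. In line with the caution above, the result is better treated as an exploratory guide than as a final model.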