Programming the statistical procedures from SAS

Interaction effects in proc logistics

Frequent Contributor
Posts: 137

Hi,

 

I am using proc logistic to make predictions.

 

I have found that two input variables are highly correlated (Spearman = 0.78).

These two variables are important in the model, and the target variable depends on both.

How can I manage the correlation effect?

 

- Drop one of them from the model: the less important one, based on the Wald chi-square

- Use both in the model and add an interaction effect between them (var1*var2)

 

Are these good solutions? Any advice will be greatly appreciated.

 

Thanks

Respected Advisor
Posts: 3,000

Re: Interaction effects in proc logistics

Posted in reply to juanvg1972

@juanvg1972 wrote:

Hi,

 

I am using proc logistic to make predictions.

 

These two variables are important in the model, and the target variable depends on both.

How can I manage the correlation effect?



I'm not sure this is a meaningful question. What do you mean by "manage"?

 

A better question would be: what is the best-fitting model? You could fit a model with the interaction and see whether it is significant; if it is, you probably ought to leave the interaction in the model. You could also compare the single-predictor models with the two-predictor model to see whether adding the second term improves the fit noticeably.
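
The comparison described above could look roughly like the following in PROC LOGISTIC. This is only a minimal sketch, not code from the thread: the dataset name `mydata`, the variable names `target`, `var1` and `var2`, and a numeric 0/1 target whose event level is 1 are all assumptions.

```sas
/* Hypothetical names: mydata, target (0/1), var1, var2 */

/* Model A: main effects only */
proc logistic data=mydata;
   model target(event='1') = var1 var2;
run;

/* Model B: main effects plus the interaction term */
proc logistic data=mydata;
   model target(event='1') = var1 var2 var1*var2;
run;
```

In Model B, the Wald test for `var1*var2` in the "Analysis of Maximum Likelihood Estimates" table indicates whether the interaction is significant. The "Model Fit Statistics" tables of the two runs (AIC, SC, -2 Log L) can be compared directly; for two continuous predictors, the difference in -2 Log L is a likelihood ratio test of the interaction with 1 degree of freedom.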

--
Paige Miller
Frequent Contributor
Posts: 137

Re: Interaction effects in proc logistics

Posted in reply to PaigeMiller

I have sometimes heard that correlation between input variables can cause problems in a model, so I am asking whether there is any way to deal with this problem.

Respected Advisor
Posts: 3,000

Re: Interaction effects in proc logistics

Posted in reply to juanvg1972

Yes, correlation between the inputs can cause problems; specifically, the variability of your regression estimates and of your predicted values can be inflated. Sometimes the problem is so severe that your regression coefficients can have the wrong sign. If you have only two variables in the model, the fit is good, and the signs of the regression coefficients are in the right direction, then I think that's all the checking you have to do. You could also check the Variance Inflation Factor from PROC REG on your data; large values indicate problems.
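
A minimal sketch of that VIF check, assuming a dataset `mydata` with a numeric 0/1 `target` and predictors `var1` and `var2` (all hypothetical names, not taken from the thread):

```sas
/* Hypothetical names: mydata, target (numeric 0/1), var1, var2 */
proc reg data=mydata;
   /* VIF and TOL depend only on the predictors; the response is
      listed just so that PROC REG has a model to fit. */
   model target = var1 var2 / vif tol;
run;
quit;
```

With exactly two predictors, both have the same VIF, equal to 1 / (1 - r^2), where r is their Pearson correlation. If the Pearson correlation were close to the reported Spearman value of 0.78 (the two can differ), that would give 1 / (1 - 0.6084), or about 2.6, well below the cutoff of 10 that is often used as a rule of thumb.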

 

However, you still have to be careful using such a model. If you have a new data point and you want to predict its value, and its inputs are not in the same region as the data you used to create the model, then you are extrapolating and shouldn't trust the prediction.
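
One way to see whether a new point lies in the same region is to overlay it on a scatter plot of the modeling data. The sketch below assumes two hypothetical datasets, `train` (the data used to fit the model) and `newdata` (the points to be predicted), both containing `var1` and `var2`; none of these names come from the thread.

```sas
/* Hypothetical names: train (modeling data), newdata (points to score) */
data both;
   set train(in=intrain) newdata;
   length source $5;
   if intrain then source = 'train';
   else source = 'new';
run;

proc sgplot data=both;
   /* A new point that falls outside the cloud of training points is an
      extrapolation, even if each input is inside its own min-max range. */
   scatter x=var1 y=var2 / group=source;
run;
```

With two strongly correlated inputs, the training points lie along a narrow band, so a plot like this is more informative than checking the range of each variable separately.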

 

 

--
Paige Miller
SAS Employee
Posts: 386

Re: Interaction effects in proc logistics

Posted in reply to juanvg1972

Correlation between two variables is not necessarily a problem at all. As discussed in this note, problems with instability of the model occur when there is strong collinearity among the weighted predictors. As shown there, this can be checked using a weighted regression in PROC REG. 
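
The note itself is not reproduced in this thread, so the following is only a sketch of one common way to run such a weighted check, not necessarily the note's exact code. It assumes the usual logistic regression weight p(1 - p) computed from the fitted probabilities, and hypothetical names `mydata`, `target` (numeric 0/1), `var1` and `var2`.

```sas
/* Hypothetical names: mydata, target (numeric 0/1), var1, var2 */

/* Step 1: fit the logistic model and save the predicted probabilities */
proc logistic data=mydata;
   model target(event='1') = var1 var2;
   output out=pred p=phat;
run;

/* Step 2: compute the weight p*(1-p) for each observation */
data pred;
   set pred;
   w = phat * (1 - phat);
run;

/* Step 3: collinearity diagnostics from a weighted linear regression */
proc reg data=pred;
   weight w;
   model target = var1 var2 / vif tol collin;
run;
quit;
```

Only the diagnostics requested by `vif`, `tol` and `collin` are of interest here; the parameter estimates from PROC REG are not the logistic model's estimates and should be ignored.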
