dr2014
Quartz | Level 8

Does anyone have any resources on which estimates to look at, or what process to follow, when checking for evidence of confounding in PROC LOGISTIC? I know this is a stats question, but any leads, suggestions, or resources would be helpful for interpreting the PROC LOGISTIC output for confounders.

PhilC
Rhodochrosite | Level 12

PROC CORR.

dr2014
Quartz | Level 8

I have categorical variables. PROC CORR will tell me about the linear relationship between two numeric variables. I have already sorted out measures of association using chi-square tests.

PhilC
Rhodochrosite | Level 12

Right... Logistic regression does not use categorical variables directly either, so the categorical variables are converted to dummy variables. I don't know of anything that makes this easy, so I must defer to another community member. It would be great if you could make a dataset that contains the dummy variables created by PROC LOGISTIC.
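One possible way to get such a dataset is the OUTDESIGN= option on the PROC LOGISTIC statement, which writes the design (dummy) columns to an output data set; adding OUTDESIGNONLY skips the model fit. This is only a sketch: the data set name mydata, the response y, and the predictors race, sex, and age are hypothetical placeholders, not names from this thread.

```sas
/* Hypothetical names: mydata, y, race, sex, age are placeholders. */
proc logistic data=mydata outdesign=design outdesignonly;
  class race sex / param=ref;   /* reference-cell (0/1 dummy) coding */
  model y = race sex age;
run;

/* Inspect the generated column names before using them elsewhere. */
proc contents data=design;
run;
```

The design data set can then be passed to another procedure. PARAM=REF is chosen here so the columns are ordinary 0/1 dummies rather than the default effect coding.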

dr2014
Quartz | Level 8

Yes, you are right. I did convert the categorical variables to dummy variables. Let me see if I can get the dataset with the dummy variables...

dcruik
Lapis Lazuli | Level 10

You could convert your character variables to numeric dummy variables, then take that data set and run PROC REG with the VIF option, which gives you multicollinearity diagnostics in the form of variance inflation factors (VIFs). Any predictor with a high VIF tells you that you may have some confounding issues. Not sure if that's exactly what you're looking for, but it's one idea.
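For reference, the standard definition of the variance inflation factor for predictor j, together with the closely related tolerance, can be written as follows. Here R_j^2 is the R-squared obtained by regressing predictor j on all of the other predictors in the model.

```latex
\[
\mathrm{VIF}_j = \frac{1}{1 - R_j^2},
\qquad
\mathrm{Tolerance}_j = 1 - R_j^2 = \frac{1}{\mathrm{VIF}_j}
\]
```

A VIF of 1 means predictor j is uncorrelated with the other predictors, and larger values mean more overlap. Because the formula involves only the predictors, the VIFs do not depend on the response, which is why a linear procedure can be used for this check even when the outcome is binary.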

dr2014
Quartz | Level 8

Thanks for your reply @dcruik. I see what you mean @philc. Yes, that is part of what I am looking for. I did come across your idea on the internet but the process wasn't clear. So just to understand it better...this is my model for proc logistic regression...

proc logistic data=lr;
  class.....etc
  model z= a b c d e f g h;
run;

g and h are control variables. Also, all the variables in the model are dummy variables, so each one is still a numeric variable with discrete values, correct? So do I run a PROC REG on only the control variables, i.e.:

proc reg data=lr;
  model z= g h /vif;
run;

The reason I am asking is because I have already run chisq to look for significance of association for the other predictor variables.

dcruik
Lapis Lazuli | Level 10

I would recommend checking out this book by Paul Allison on logistic regression using SAS. In Chapter 3.5, he discusses multicollinearity and how to check the diagnostics using PROC REG with the VIF option. He works through an example that could help clarify the purpose of what you're trying to do and how to apply it to your data. Hope this is a better reference.

https://books.google.com/books?hl=en&lr=&id=NF9kwF1lOF4C&oi=fnd&pg=PR3&dq=paul+allison+multicollinea...

PhilC
Rhodochrosite | Level 12

Looking at a correlation matrix is also advised by Gareth James et al. in "An Introduction to Statistical Learning". Collinearity is discussed starting around page 113.

If you are going to use VIF or correlation matrices, you want to consider all of your independent variables, not just a subset. The word "independent" is meaningful here: this kind of confounding typically arises because the independent variables are not truly independent of each other, yet a regression can only separate their effects cleanly when they are not strongly correlated with one another.
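As a rough sketch of that advice, using the data set lr and the variables z and a through h from the model posted earlier in this thread (and assuming all eight predictors are already numeric dummy variables, as stated there), both checks could look like this:

```sas
/* Pairwise correlations among all eight predictors. */
proc corr data=lr;
  var a b c d e f g h;
run;

/* VIF and tolerance for every predictor in the model.           */
/* The response z only needs to be present; the VIFs are         */
/* computed from the predictors alone.                           */
proc reg data=lr;
  model z = a b c d e f g h / vif tol;
run;
quit;
```

The parameter estimates from PROC REG are not the point here; only the VIF and tolerance columns are of interest, and the model itself would still be fit with PROC LOGISTIC.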

dr2014
Quartz | Level 8

Thanks @dcruik and @PhilC. I did realize I have to include all the independent variables for the VIF. I just couldn't get back yesterday to add a comment. The explanations make sense, @PhilC. I wanted something very precise to help me in my decision. This helped a lot. I will refer to the books suggested.

Best,

D R.
