<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Logistic Regression Collinearity in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Logistic-Regression-Collinearity/m-p/423533#M104168</link>
    <description>&lt;P&gt;I have a large number of observations, 200,000 weighted, so there should be no issue with the 20 variables from that standpoint.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am also just trying to find associations between the independent variables and the dependent variable, and am not interested in building a powerful model. However, when I add or remove some of the variables, a few of the other variables change significance drastically, sometimes becoming significant only after another variable is added to the model. I don't want to report an association that may differ from what someone else finds if they look for the same associations (for example, if they have a slightly different selection of variables and show a difference in significance from what I have shown, that would make my study seem inaccurate).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;</description>
    <pubDate>Tue, 26 Dec 2017 01:13:20 GMT</pubDate>
    <dc:creator>sasnewbie12</dc:creator>
    <dc:date>2017-12-26T01:13:20Z</dc:date>
    <item>
      <title>Logistic Regression Collinearity</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Logistic-Regression-Collinearity/m-p/423491#M104160</link>
      <description>&lt;P&gt;I am trying to run a model with logistic regression containing about 20&amp;nbsp;independent variables, both categorical and continuous.&lt;/P&gt;&lt;P&gt;However, I am finding that the significance varies depending on which variables I include and exclude, and I believe that there is association and collinearity among the variables.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;As I am a new SAS user, is there any simple way to check for association among the variables in logistic regression?&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank You&lt;/P&gt;</description>
      <pubDate>Sun, 24 Dec 2017 22:05:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Logistic-Regression-Collinearity/m-p/423491#M104160</guid>
      <dc:creator>sasnewbie12</dc:creator>
      <dc:date>2017-12-24T22:05:35Z</dc:date>
    </item>
    <item>
      <title>Re: Logistic Regression Collinearity</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Logistic-Regression-Collinearity/m-p/423493#M104161</link>
      <description>&lt;P&gt;Not my area of expertise, but the following might help:&amp;nbsp;&lt;A href="http://support.sas.com/kb/32/471.html" target="_blank"&gt;http://support.sas.com/kb/32/471.html&lt;/A&gt;&lt;/P&gt;
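&lt;P&gt;To give a rough idea of what a collinearity check measures, here is a minimal sketch of one widely used diagnostic, the variance inflation factor (VIF). It is written in plain Python only to show the arithmetic; it is not SAS code and is not taken from the linked note. The helper names (pearson, invert, vif) and the toy columns x1, x2, x3 are made up for illustration. The VIF of a predictor is 1/(1 - R^2), where R^2 comes from regressing that predictor on the others; this equals the matching diagonal entry of the inverse of the predictors' correlation matrix.&lt;/P&gt;
&lt;PRE&gt;
```python
# Illustrative sketch only: hypothetical helper names, toy data.
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        # Swap in the row with the largest absolute entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF per predictor: the diagonal of the inverse correlation matrix."""
    k = len(columns)
    corr = [[pearson(columns[i], columns[j]) for j in range(k)] for i in range(k)]
    inv = invert(corr)
    return [inv[i][i] for i in range(k)]


# Toy data: x2 is nearly a copy of x1; x3 is a balanced -1/+1 indicator
# constructed to be uncorrelated with both.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [1.1, 2.1, 2.9, 3.9, 4.9, 5.9, 7.1, 8.1]
x3 = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0]

for name, value in zip(["x1", "x2", "x3"], vif([x1, x2, x3])):
    print(name, round(value, 1))
```
&lt;/PRE&gt;
&lt;P&gt;In this toy data x1 and x2 should get very large VIFs while x3 stays at about 1. A VIF above roughly 10 is a commonly quoted warning sign, though that cutoff is only a rule of thumb. The sketch handles numeric columns only; categorical predictors would need dummy coding first, and exactly collinear columns make the correlation matrix singular, so the inversion breaks down.&lt;/P&gt;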
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Art, CEO, AnalystFinder.com&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 24 Dec 2017 23:29:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Logistic-Regression-Collinearity/m-p/423493#M104161</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2017-12-24T23:29:07Z</dc:date>
    </item>
    <item>
      <title>Re: Logistic Regression Collinearity</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Logistic-Regression-Collinearity/m-p/423496#M104162</link>
      <description>&lt;P&gt;Without knowing much about it, e.g. how many obs you have, 20 variables sounds like a lot and could be affecting things. There are some rules of thumb out there, e.g. in survival analysis I think they call it 'failures per variable' (FPV), and 10 is considered sufficient. There would be something analogous for logistic regression, I guess. Regarding associations among the variables, normally this would be based on an understanding of the data, i.e. it would be anticipated a priori rather than data-dependent. But if you want to examine correlations among the variables, then that could be done, even if the variables are of different types, e.g. PROC CORR will give the biserial correlation, I think, or there's a macro for it: &lt;A href="http://support.sas.com/kb/24/991.html" target="_blank"&gt;http://support.sas.com/kb/24/991.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 25 Dec 2017 06:17:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Logistic-Regression-Collinearity/m-p/423496#M104162</guid>
      <dc:creator>pau13rown</dc:creator>
      <dc:date>2017-12-25T06:17:17Z</dc:date>
    </item>
    <item>
      <title>Re: Logistic Regression Collinearity</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Logistic-Regression-Collinearity/m-p/423515#M104163</link>
      <description>&lt;P&gt;PROC LOGISTIC fits the model by maximum likelihood (MLE), unlike PROC REG, which uses OLS.&lt;/P&gt;
&lt;P&gt;Usually SAS would do it for you automatically. Check PROC HPGENSELECT; it offers many variable selection methods, like CV, LASSO ....&lt;/P&gt;</description>
      <pubDate>Mon, 25 Dec 2017 15:02:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Logistic-Regression-Collinearity/m-p/423515#M104163</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2017-12-25T15:02:52Z</dc:date>
    </item>
    <item>
      <title>Re: Logistic Regression Collinearity</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Logistic-Regression-Collinearity/m-p/423533#M104168</link>
      <description>&lt;P&gt;I have a large number of observations, 200,000 weighted, so there should be no issue with the 20 variables from that standpoint.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am also just trying to find associations between the independent variables and the dependent variable, and am not interested in building a powerful model. However, when I add or remove some of the variables, a few of the other variables change significance drastically, sometimes becoming significant only after another variable is added to the model. I don't want to report an association that may differ from what someone else finds if they look for the same associations (for example, if they have a slightly different selection of variables and show a difference in significance from what I have shown, that would make my study seem inaccurate).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;</description>
      <pubDate>Tue, 26 Dec 2017 01:13:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Logistic-Regression-Collinearity/m-p/423533#M104168</guid>
      <dc:creator>sasnewbie12</dc:creator>
      <dc:date>2017-12-26T01:13:20Z</dc:date>
    </item>
    <item>
      <title>Re: Logistic Regression Collinearity</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Logistic-Regression-Collinearity/m-p/423541#M104169</link>
      <description>&lt;P&gt;In that case, the first thing I'd do (maybe you have already) is write a macro that fits the model for a single independent variable, and then run this macro for each of the 20 variables (some call these 'univariate models'), just to get a sense of things and to see which are the strongest predictors on their own. You could stop here because you are "not interested in building a powerful model". But if you want to see whether any variables are superfluous, you could then attempt a 'multivariate model' (a misnomer, but this is how some people describe it) using only those variables that looked good in the univariate models. Although with 200,000 obs maybe every variable shows a small p-value; this approach is common in medical research, but it really depends on what you're doing. E.g., see the 6 steps described in the methods section of this article: &lt;A href="https://www.nature.com/articles/7211492" target="_blank"&gt;https://www.nature.com/articles/7211492&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Edit: regarding whether others can reproduce your results, as long as you lay out your approach as they do in that article, I'd say it's fine.&lt;/P&gt;</description>
      <pubDate>Tue, 26 Dec 2017 04:51:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Logistic-Regression-Collinearity/m-p/423541#M104169</guid>
      <dc:creator>pau13rown</dc:creator>
      <dc:date>2017-12-26T04:51:48Z</dc:date>
    </item>
    <item>
      <title>Re: Logistic Regression Collinearity</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Logistic-Regression-Collinearity/m-p/423581#M104181</link>
      <description>&lt;P&gt;"&lt;SPAN&gt;&amp;nbsp;fits the model for a single independent variable,&amp;nbsp;"&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;That is called a perfect model. That is not right according to statistical theory.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;I suggest using PROC HPGENSELECT to let SAS select variables for you.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Don't use selection=stepwise/forward/backward; try CV/LASSO/elastic net .... For more info, check the documentation for PROC HPGENSELECT.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 26 Dec 2017 13:36:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Logistic-Regression-Collinearity/m-p/423581#M104181</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2017-12-26T13:36:19Z</dc:date>
    </item>
    <item>
      <title>Re: Logistic Regression Collinearity</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Logistic-Regression-Collinearity/m-p/423719#M104235</link>
      <description>&lt;P&gt;A follow-up question: say that&amp;nbsp;an independent variable&amp;nbsp;has a significant association in the "univariate" analysis and a non-significant one in the "multivariate" analysis. Will I be able to make any use of the adjusted odds ratio for that variable if the p-value is non-significant?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have seen studies where they list the adjusted odds ratio without a p-value, so I am wondering if it holds any importance when it is non-significant?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank You&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 27 Dec 2017 13:33:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Logistic-Regression-Collinearity/m-p/423719#M104235</guid>
      <dc:creator>sasnewbie12</dc:creator>
      <dc:date>2017-12-27T13:33:38Z</dc:date>
    </item>
    <item>
      <title>Re: Logistic Regression Collinearity</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Logistic-Regression-Collinearity/m-p/423721#M104237</link>
      <description>&lt;P&gt;Another question: if I find that a categorical variable has a&amp;nbsp;non-significant&amp;nbsp;association in the multivariate analysis under "Analysis of Maximum Likelihood Estimates", but the "Type 3 Analysis of Effects" shows that it is significant, what does that mean and how can it be interpreted?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 27 Dec 2017 13:41:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Logistic-Regression-Collinearity/m-p/423721#M104237</guid>
      <dc:creator>sasnewbie12</dc:creator>
      <dc:date>2017-12-27T13:41:14Z</dc:date>
    </item>
    <item>
      <title>Re: Logistic Regression Collinearity</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Logistic-Regression-Collinearity/m-p/423734#M104243</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/169765"&gt;@sasnewbie12&lt;/a&gt; wrote:&lt;BR /&gt;
&lt;P&gt;I am trying to run a model with logistic regression containing about 20&amp;nbsp;independent variables, both categorical and continuous.&lt;/P&gt;
&lt;P&gt;However, I am finding that the significance varies depending on which variables I include and exclude, and I believe that there is association and collinearity among the variables.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As I am a new SAS user, is there any simple way to check for association among the variables in logistic regression?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank You&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;You keep asking the same questions over and over, and my answers don't change, just because you ask again three weeks later. I repeat my answer given here: &lt;A href="https://communities.sas.com/t5/SAS-Statistical-Procedures/multivariate-logistic-regression-variable-troubleshooting/m-p/419337#M22061" target="_blank"&gt;https://communities.sas.com/t5/SAS-Statistical-Procedures/multivariate-logistic-regression-variable-troubleshooting/m-p/419337#M22061&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Variable selection is fundamentally a poor approach when you have many correlated variables. It doesn't matter if you are new to SAS or experienced in SAS or using R or Python or Minitab. It is not the software that makes it a poor approach.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;At that link, I reference a method of performing Logistic Partial Least Squares regression, fundamentally a superior approach. There is R code to do this, but I am not aware of SAS code to do this. However, since you can &lt;A href="http://support.sas.com/documentation/cdl/en/imlug/63541/HTML/default/viewer.htm#imlug_r_sect012.htm" target="_self"&gt;run R code through SAS PROC IML&lt;/A&gt;, that seems to be the approach I would take.&lt;/P&gt;</description>
      <pubDate>Wed, 27 Dec 2017 16:54:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Logistic-Regression-Collinearity/m-p/423734#M104243</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2017-12-27T16:54:59Z</dc:date>
    </item>
  </channel>
</rss>

