<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How can I perform principal component analysis for logistic regression via SAS? in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/862255#M42624</link>
    <description>&lt;P&gt;Thank you for your reply!&lt;/P&gt;
&lt;P&gt;First of all, I would like to answer my own question that was raised a few days ago (so that other users of the Community can save time looking up information): diagnostics of collinearity should&amp;nbsp;&lt;STRONG&gt;precede&lt;/STRONG&gt; the variable selection process. I looked up a book on multivariate statistics and referred to the part on collinearity (in multivariate linear regression), which said (original text not in English): "Multicollinearity (another name for collinearity) is the distortion of model estimation, or the inability to estimate accurately, caused by exact or strong correlation&amp;nbsp;among the covariates (which can be read as "independent variables" in this setting) in the linear regression model. Therefore,&amp;nbsp;&lt;U&gt;&lt;STRONG&gt;prior to regression&lt;/STRONG&gt;&lt;/U&gt;, knowing the relationship among the covariates is of great importance". The text I translated clearly points out that diagnostics of collinearity should precede the variable selection process. I also consulted one of my teachers responsible for teaching us SAS. She also stated that diagnostics of collinearity should precede the variable selection process.&lt;/P&gt;
&lt;P&gt;As for the reason underlying my failure to diminish collinearity, I searched for an answer myself after raising my question. You mentioned that an inappropriate setting of dummy variables may be one of the underlying causes of the problem. I am grateful to you for pointing out that issue (so I will never make such a mistake in my upcoming data analysis), but unfortunately this was not the case in my problem.&lt;/P&gt;
&lt;P&gt;I read the note you had mentioned again and noticed that the underlying cause of collinearity between an independent variable and the intercept is a disproportionally small standard deviation of the variable that exhibits collinearity with the intercept. I reviewed my model and found that I had put one continuous independent variable alongside dummy variables in the model (I did so because, compared with the other variables, the range of the continuous variable is relatively small, so in order to retain more information from my data, I put the continuous variable directly in the model) and that the means of the other independent variables (dummy variables) are close to 0.5, with their standard deviations ranging from 0.5 to 0.6. However, the continuous variable had a mean of around 6 and a standard deviation of around 1. The standard deviation of the continuous variable was&amp;nbsp;&lt;STRONG&gt;smaller&lt;/STRONG&gt; than its mean while&amp;nbsp;the standard deviations of the discrete (dummy) variables were&amp;nbsp;&lt;STRONG&gt;larger&lt;/STRONG&gt; than their means.&amp;nbsp;In other words, relative to its mean, the standard deviation of the continuous variable was disproportionally small compared with the other independent variables.&lt;/P&gt;
&lt;P&gt;At first, I tackled the problem by standardizing all the variables to a standard deviation of 1 (just as in the note you referred to). However, the largest condition index computed from the weighted information matrix was 11 &lt;STRONG&gt;prior to&lt;/STRONG&gt; variable standardization (the second largest was 8, so there is no need to be concerned about that one);&amp;nbsp;the largest condition index computed from the weighted information matrix was 12 &lt;STRONG&gt;after&lt;/STRONG&gt; variable standardization, with the very same variable still exhibiting collinearity with the intercept. No collinearity was observed when the intercept was removed from the analysis. In other words, using 1 as the standard deviation in PROC STANDARD did not help to reduce collinearity at all.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I decided to try several different standard deviations in PROC STANDARD. First, I tried 0.5, producing even more severe collinearity (the largest condition index computed from the weighted information matrix was 29). Then, I tried 3. This time, collinearity disappeared.&lt;/P&gt;
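&lt;P&gt;A rough way to see why the target standard deviation can matter: the sketch below assumes an unweighted two-column design (an intercept plus one predictor), assumes the predictor's mean stays at about 6 while only its standard deviation is changed, scales each column to unit length, and reports the largest condition index. It is plain Python rather than SAS, the function name is made up for illustration, and it ignores the Hessian weights used in the actual analysis, so it only indicates the direction of the effect.&lt;/P&gt;

```python
import math

def intercept_condition_index(mean, std):
    """Largest condition index for the columns [1, x] after scaling
    each column to unit length (unweighted, population std)."""
    # Cosine of the angle between the intercept column and x.
    c = abs(mean / math.sqrt(mean ** 2 + std ** 2))
    # The scaled cross-product matrix [[1, c], [c, 1]]
    # has eigenvalues 1 + c and 1 - c.
    return math.sqrt((1 + c) / (1 - c))

# Same mean, three different standard deviations.
for std in (0.5, 1.0, 3.0):
    print(std, round(intercept_condition_index(6.0, std), 2))

# Centering (mean 0) removes the collinearity with the intercept.
print(intercept_condition_index(0.0, 1.0))
```

&lt;P&gt;Under these assumptions the index depends only on the ratio of the mean to the standard deviation, so shrinking the standard deviation while keeping the mean makes the column look more like the intercept, and centering the variable makes the two columns orthogonal.&lt;/P&gt;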
&lt;P&gt;So, in conclusion, &lt;U&gt;&lt;STRONG&gt;choosing the right standard deviation in PROC STANDARD is of vital importance in dealing with collinearity with the variable standardization method&lt;/STRONG&gt;&lt;/U&gt;. One should not choose the standard deviation in PROC STANDARD arbitrarily, i.e. without examining the actual values of the means and standard deviations involved.&lt;/P&gt;</description>
    <pubDate>Sat, 04 Mar 2023 05:23:47 GMT</pubDate>
    <dc:creator>Season</dc:creator>
    <dc:date>2023-03-04T05:23:47Z</dc:date>
    <item>
      <title>How can I perform principal component analysis for logistic regression via SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860792#M42531</link>
      <description>&lt;P&gt;I am currently building a logistic regression model whose dependent variable follows a binomial distribution. Based upon my professional knowledge, I assume that collinearity exists among the independent variables. Therefore, I wish to perform principal component analysis to detect possible&amp;nbsp;collinearities and to lower the dimension of the independent variables. How can I do this via SAS? Thanks!&lt;/P&gt;</description>
      <pubDate>Sat, 25 Feb 2023 02:22:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860792#M42531</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2023-02-25T02:22:06Z</dc:date>
    </item>
    <item>
      <title>Re: How can I perform principal component analysis for logistic regression via SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860831#M42534</link>
      <description>&lt;P&gt;PROC PRINCOMP will do this.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It will find reduced dimensions you can use, but CAUTION: some of those reduced dimensions may not be good predictors.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A better procedure, in my mind, is Logistic Partial Least Squares regression, which will find reduced dimensions that are good predictors (as good as the data will allow). While (non-logistic) Partial Least Squares regression is available in PROC PLS, &lt;A href="https://cedric.cnam.fr/fichiers/RC906.pdf" target="_self"&gt;Logistic Partial Least Squares&lt;/A&gt; is not available in SAS but is available as a package in R.&lt;/P&gt;</description>
      <pubDate>Sat, 25 Feb 2023 11:22:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860831#M42534</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2023-02-25T11:22:44Z</dc:date>
    </item>
    <item>
      <title>Re: How can I perform principal component analysis for logistic regression via SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860835#M42535</link>
      <description>&lt;P&gt;Thank you very, very much, Paige, for your kind help! Actually, I have not been that familiar with principal component analysis as well as PROC PRINCOMP. Therefore, I previously thought that PROC PRINCOMP only supports principal component analysis for models whose independent variable is a continuous one.&lt;/P&gt;
&lt;P&gt;One issue that bothers me much is the lack of information on how to perform&amp;nbsp;principal component analysis for logistic regression via SAS. Since SAS Help has not provided an example on how to perform&amp;nbsp;principal component analysis for logistic regression and I retrieved no results for my question after browsing SAS Community Library, could you please provide some hint on the detailed procedure of doing so? Or perhaps a tutorial written by someone else?&lt;/P&gt;
&lt;P&gt;Many thanks!&lt;/P&gt;</description>
      <pubDate>Sat, 25 Feb 2023 12:32:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860835#M42535</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2023-02-25T12:32:08Z</dc:date>
    </item>
    <item>
      <title>Re: How can I perform principal component analysis for logistic regression via SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860836#M42536</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/437457"&gt;@Season&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Thank you very, very much, Paige, for your kind help! Actually, I have not been that familiar with principal component analysis as well as PROC PRINCOMP. Therefore, I previously thought that PROC PRINCOMP only supports principal component analysis for models whose independent variable is a continuous one.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Principal components does not use a Y-variable. Therefore, you can use it on the X-variables with either continuous Y-variables or categorical Y-variables, it doesn't matter.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;One issue that bothers me much is the lack of information on how to perform&amp;nbsp;principal component analysis for logistic regression via SAS. Since SAS Help has not provided an example on how to perform&amp;nbsp;principal component analysis for logistic regression and I retrieved no results for my question after browsing SAS Community Library, could you please provide some hint on the detailed procedure of doing so? Or perhaps a tutorial written by someone else?&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It is no different than performing Principal Components for continuous Y. The Y-variable(s) are simply not used by PCA. As I stated above, (some of) the dimensions it finds may not be good predictors of Y.&lt;/P&gt;</description>
      <pubDate>Sat, 25 Feb 2023 12:52:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860836#M42536</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2023-02-25T12:52:26Z</dc:date>
    </item>
    <item>
      <title>Re: How can I perform principal component analysis for logistic regression via SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860838#M42537</link>
      <description>&lt;P&gt;Ok, thank you very much for your help. Actually, the current model I have built works reasonably well. Still, some of the parameters that have been shown by professional knowledge to be associated with the independent variables turned out to be statistically insignificant in my analysis. Therefore, to improve my model, I have come to seek help in examining whether the&amp;nbsp;lack of significance was caused by collinearity, by too small a sample, or by other issues (e.g. outliers).&lt;/P&gt;
&lt;P&gt;You have repeatedly reminded me that, in the circumstance I am asking about, principal component analysis may not be the best choice. Thank you for your reminder. Actually, I have systematically studied statistics and the mathematics it is based on for only one year. Therefore, I can only use SAS right now. I will try the Logistic Partial Least Squares method if principal component analysis fails to tackle this problem.&lt;/P&gt;
&lt;P&gt;Thank you very much again!&lt;/P&gt;</description>
      <pubDate>Sat, 25 Feb 2023 13:16:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860838#M42537</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2023-02-25T13:16:57Z</dc:date>
    </item>
    <item>
      <title>Re: How can I perform principal component analysis for logistic regression via SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860854#M42539</link>
      <description>&lt;P&gt;Another thing to consider is a penalty-based model selection process such as LASSO which is available in PROC HPGENSELECT and selects a subset of the candidate predictors rather than combine them all into a small number of functions. Also, note that if the concern is more about collinearity causing ill-conditioning of the information matrix used in the model-fitting process than dimension reduction of your predictors, then that can be addressed as discussed in &lt;A href="https://support.sas.com/kb/32/471.html" target="_self"&gt;this note&lt;/A&gt;.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 25 Feb 2023 16:23:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860854#M42539</guid>
      <dc:creator>StatDave</dc:creator>
      <dc:date>2023-02-25T16:23:02Z</dc:date>
    </item>
    <item>
      <title>Re: How can I perform principal component analysis for logistic regression via SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860887#M42547</link>
      <description>&lt;P&gt;Thank you for the help you have offered! Currently, my most important concern is the diagnostics of collinearity. I will take a look at LASSO and the note you have mentioned.&lt;/P&gt;</description>
      <pubDate>Sun, 26 Feb 2023 03:28:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860887#M42547</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2023-02-26T03:28:33Z</dc:date>
    </item>
    <item>
      <title>Re: How can I perform principal component analysis for logistic regression via SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860888#M42548</link>
      <description>&lt;P&gt;Oh, by the way, I have another problem concerning the diagnostics (discovery) of collinearities in logistic regression. In linear regression models, tolerance, variance inflation factor (VIF), as well as condition index (computed from eigenvalues) can serve as indicators of collinearities among the independent variables in the model. The aforementioned three statistics can be computed in PROC REG upon request. However, they are not available in the modules that build logistic regression models (i.e. PROC LOGISTIC, PROC GENMOD, PROC HPLOGISTIC, etc.). Therefore, diagnostics of collinearity in logistic regression is not that easy.&lt;/P&gt;
&lt;P&gt;I tried PROC PRINCOMP on my data today and found that PROC PRINCOMP does not compute the three statistics either. Instead, it produces a correlation matrix of the variables I wish to analyze. It is no surprise that "strong" correlations exist among the variables I put in the logistic regression model, with some of the correlation coefficients reaching 0.6154. I guess that collinearities must exist in this situation.&lt;/P&gt;
&lt;P&gt;So here are my questions: when it comes to diagnostics of collinearity, can correlation coefficients serve as surrogate statistics for tolerance, VIF and condition index in logistic regression? If not, what statistic(s) can do this job? Also, how can I compute&amp;nbsp;tolerance, VIF and condition index in logistic regression?&lt;/P&gt;
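&lt;P&gt;As a small worked illustration of how a pairwise correlation relates to tolerance and VIF, the sketch below assumes the simplest unweighted case of a model with exactly two predictors, where tolerance is 1 - r^2 and VIF is its reciprocal. It is plain Python rather than SAS and the function name is made up for illustration. With more than two predictors, VIF depends on the multiple R-squared of each predictor on all the others, so pairwise correlations alone can understate it.&lt;/P&gt;

```python
def tolerance_and_vif(r):
    """Tolerance and VIF for a model with exactly two predictors
    whose pairwise correlation is r (unweighted case)."""
    tolerance = 1.0 - r ** 2
    return tolerance, 1.0 / tolerance

# The largest pairwise correlation mentioned in the post.
tol, vif = tolerance_and_vif(0.6154)
print(round(tol, 4), round(vif, 4))
```

&lt;P&gt;In this two-predictor setting a correlation of 0.6154 corresponds to a VIF of about 1.6, which is why a single pairwise correlation of that size does not by itself establish a collinearity problem.&lt;/P&gt;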
&lt;P&gt;Could&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;,&amp;nbsp;&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13633"&gt;@StatDave&lt;/a&gt;&amp;nbsp;or someone else kindly give me a hand?&lt;/P&gt;
&lt;P&gt;Thank you all very much!&lt;/P&gt;</description>
      <pubDate>Sun, 26 Feb 2023 03:50:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860888#M42548</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2023-02-26T03:50:38Z</dc:date>
    </item>
    <item>
      <title>Re: How can I perform principal component analysis for logistic regression via SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860891#M42550</link>
      <description>The note I referred you to in my last post specifically discusses and shows how to get collinearity diagnostics for a logistic (or other generalized linear model). I suggest you read the collinearity section of that note and use the method shown. As noted there, correlation among your predictors by themselves is not necessarily a problem. But as I also mentioned, you might not even need to bother with the diagnostics if you use the penalty-based LASSO selection method to just pick out the important predictors.</description>
      <pubDate>Sun, 26 Feb 2023 04:01:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860891#M42550</guid>
      <dc:creator>StatDave</dc:creator>
      <dc:date>2023-02-26T04:01:37Z</dc:date>
    </item>
    <item>
      <title>Re: How can I perform principal component analysis for logistic regression via SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860894#M42551</link>
      <description>&lt;P&gt;OK, thank you very much for your help! I will read the note you have mentioned carefully and try LASSO as well to compare the two methods. It's too bad that SAS Community only supports accepting one reply as the solution. I think that your replies and the replies given by&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp;are all very helpful, not only for me but also for everyone troubled by collinearity in logistic regression. After all, I found almost no articles discussing solutions to the collinearity problem when I searched the Internet. Instead of discussing mathematical or statistical theory at length before providing a solution (as most articles do), your replies get straight to the point and answer the problem directly. I regard your replies as wonderful "concise textbooks" on the problem. I am sure that your replies can benefit other researchers who are struggling to find a solution to that problem and are spending much of their time searching for information instead of analyzing data.&lt;/P&gt;
&lt;P&gt;By the way, I major in medicine and am familiar with a few search engines that specialize in medical articles (e.g. PubMed). Could you please recommend the search engines statisticians frequently use (aside from Google Scholar) or a few prestigious statistics journals?&lt;/P&gt;
&lt;P&gt;Thank you both for your kind help again!&lt;/P&gt;</description>
      <pubDate>Sun, 26 Feb 2023 04:39:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860894#M42551</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2023-02-26T04:39:24Z</dc:date>
    </item>
    <item>
      <title>Re: How can I perform principal component analysis for logistic regression via SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860920#M42553</link>
      <description>&lt;P&gt;To use the VIF in PROC REG, you create a made up variable that is a continuous Y and use your X-variables. The VIF does not depend on the Y variable.&lt;/P&gt;</description>
      <pubDate>Sun, 26 Feb 2023 11:45:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860920#M42553</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2023-02-26T11:45:24Z</dc:date>
    </item>
    <item>
      <title>Re: How can I perform principal component analysis for logistic regression via SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860922#M42554</link>
      <description>&lt;P&gt;OK, I see. Computing VIF in PROC REG when the dependent variable is a continuous one is easy. Yet the question I raised earlier is the computation of VIF in a logistic regression model. Can SAS do that? Thanks!&lt;/P&gt;</description>
      <pubDate>Sun, 26 Feb 2023 12:10:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860922#M42554</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2023-02-26T12:10:04Z</dc:date>
    </item>
    <item>
      <title>Re: How can I perform principal component analysis for logistic regression via SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860945#M42555</link>
      <description>If you read through the note I referred to then you should have learned that collinearity diagnostics (like VIF) for a logistic model (or any generalized linear model) requires using appropriate weights in PROC REG. As is specifically illustrated for a logistic model in that note, the weights can be obtained by first fitting the model in PROC GENMOD and saving the HESSWGT= values. When you then fit the model (to any response values) in PROC REG using those weights, you get the appropriate collinearity diagnostics for assessing your logistic model.</description>
      <pubDate>Sun, 26 Feb 2023 16:09:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860945#M42555</guid>
      <dc:creator>StatDave</dc:creator>
      <dc:date>2023-02-26T16:09:05Z</dc:date>
    </item>
    <item>
      <title>Re: How can I perform principal component analysis for logistic regression via SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860975#M42556</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/437457"&gt;@Season&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;OK, I see. Computing VIF in PROC REG when the dependent variable is a continuous one is easy. Yet the question I raised earlier is the computation of VIF in a logistic regression model. Can SAS do that? Thanks!&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;No you don't see. I said: "To use the VIF in PROC REG, you create a made up variable that is a continuous Y and use your X-variables. &lt;STRONG&gt;The VIF does not depend on the Y variable.&lt;/STRONG&gt;"&lt;/P&gt;
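&lt;P&gt;To make the quoted point concrete, here is a minimal sketch of an unweighted VIF for two predictors, written in plain Python rather than SAS. The data values and function names are invented for illustration. The function never receives a response variable, and shifting or rescaling a predictor leaves the result unchanged. For a logistic model, the weighted diagnostics described by @StatDave would replace this unweighted calculation.&lt;/P&gt;

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model.
    Note that no response variable is passed in."""
    return 1.0 / (1.0 - pearson(x1, x2) ** 2)

# Made-up predictor values, for illustration only.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.1, 1.9, 3.5, 3.9, 5.2, 5.8]
print(vif_two_predictors(x1, x2))

# Rescaling or shifting a predictor leaves the VIF unchanged.
print(vif_two_predictors([10 * v + 3 for v in x1], x2))
```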
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And of course, there's also the point made by &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13633"&gt;@StatDave&lt;/a&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 26 Feb 2023 23:15:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/860975#M42556</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2023-02-26T23:15:12Z</dc:date>
    </item>
    <item>
      <title>Re: How can I perform principal component analysis for logistic regression via SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/861114#M42564</link>
      <description>&lt;P&gt;Thank you for your kind and repeated reminders. In fact, I had just begun reading the note you mentioned when I was replying to &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp;yesterday. I now fully understand that weights should be applied when diagnosing collinearity in generalized linear models.&lt;/P&gt;
&lt;P&gt;Still, I have some questions:&lt;/P&gt;
&lt;P&gt;(1) I noticed that the var argument of PROC STANDARD standardizes all of the independent variables in the logistic model (li, temp and cell). Given that collinearity exists only between variable temp and the intercept, do all of the independent variables have to be standardized?&lt;/P&gt;
&lt;P&gt;(2) The means of eliminating (or at least reducing) collinearity in a logistic regression model demonstrated here is variable standardization. In a complete model building process, what follows the PROC STANDARD step is using these standardized variables to perform logistic regression modeling. Eventually, the user may wish to transform the standardized variables back into unstandardized ones. When I was a student studying statistics, my teacher demonstrated an example of using SAS to perform principal component analysis for multivariate linear regression. She completed the final step (i.e. transforming the standardized variables back to the unstandardized ones after the entire model building process) by writing down the equation by hand and performing the&amp;nbsp;arithmetic calculations on her own.&lt;/P&gt;
&lt;P&gt;Is there an automatic way of doing that final transformation process by SAS?&lt;/P&gt;
&lt;P&gt;(3) The circumstance illustrated in the note you provided was one where one independent variable is collinear with the intercept. What if the independent variables are collinear with each other? Aside from deviating from the original model (i.e. switching to a penalty-based model selection process like LASSO or other methods like Logistic Partial Least Squares Regression, etc.) and simply deleting one or more variables involved in collinearity, is variable standardization still a solution to that problem? If so, should the researcher standardize all the independent variables, as is the case in the note you provided, or just the independent variables that are involved in collinearity?&lt;/P&gt;
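&lt;P&gt;For reference, the back-transformation asked about in question (2) is simple arithmetic that can be scripted. The sketch below is plain Python rather than SAS, and it assumes each predictor was standardized as z = (x - mean) / std (that is, to mean 0 and standard deviation 1); a different standardization would need a different formula. The function name and the fitted values are hypothetical.&lt;/P&gt;

```python
def unstandardize(intercept, betas, means, stds):
    """Convert coefficients fitted on z = (x - mean) / std back to
    the original scale of x. Returns (intercept, list of slopes)."""
    slopes = [b / s for b, s in zip(betas, stds)]
    shift = sum(b * m / s for b, m, s in zip(betas, means, stds))
    return intercept - shift, slopes

# Hypothetical fitted values, for illustration only.
b0, slopes = unstandardize(0.4, [1.2, -0.8], [6.0, 0.5], [1.0, 0.5])
print(b0, slopes)
```

&lt;P&gt;The linear predictor is the same on both scales, so predicted probabilities do not change; only the way the coefficients are expressed does.&lt;/P&gt;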
&lt;P&gt;Many thanks!&lt;/P&gt;</description>
      <pubDate>Mon, 27 Feb 2023 15:23:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/861114#M42564</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2023-02-27T15:23:04Z</dc:date>
    </item>
    <item>
      <title>Re: How can I perform principal component analysis for logistic regression via SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/861116#M42565</link>
      <description>&lt;P&gt;Oh, yes, you were totally correct (laugh oh laugh), I surely did not see yesterday.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/437457"&gt;@Season&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;OK, I see. Computing VIF in PROC REG when the dependent variable is a continuous one is easy. Yet the question I raised earlier is the computation of VIF in a logistic regression model. Can SAS do that? Thanks!&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;No you don't see. I said: "To use the VIF in PROC REG, you create a made up variable that is a continuous Y and use your X-variables. &lt;STRONG&gt;The VIF does not depend on the Y variable.&lt;/STRONG&gt;"&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I did not completely understand what you meant yesterday and was merely focusing on the word "continuous" before the letter "Y". I thought you must have misunderstood me, since I made a lengthy reply yesterday, with my questions hidden between the lines. As a result, you may have just skimmed through my reply, without noticing that I was modeling a discrete independent variable. So that's the underlying reason for my further reply you quoted.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;And of course, there's also the point made by &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13633"&gt;@StatDave&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Now I think that I have truly seen what you meant. But other questions emerged when I was reading the note&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13633"&gt;@StatDave&lt;/a&gt;&amp;nbsp;cited. I have already raised my questions in my latest reply to&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13633"&gt;@StatDave&lt;/a&gt;. I wonder if you could offer your suggestion to the three questions, if you don't mind.&lt;/P&gt;
&lt;P&gt;Thank you very much for your patience and your time spent on my questions again!&lt;/P&gt;</description>
      <pubDate>Mon, 27 Feb 2023 15:12:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/861116#M42565</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2023-02-27T15:12:02Z</dc:date>
    </item>
    <item>
      <title>Re: How can I perform principal component analysis for logistic regression via SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/861123#M42567</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/437457"&gt;@Season&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Thank you for your kind and repeated reminders. In fact, I had just begun reading the note you mentioned when I was replying to &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp;yesterday. I now fully understand that weights should be applied when diagnosing collinearity in generalized linear models.&lt;/P&gt;
&lt;P&gt;Still, I have some questions:&lt;/P&gt;
&lt;P&gt;(1) I noticed that the var argument of PROC STANDARD standardizes all of the independent variables in the logistic model (li, temp and cell). Given that collinearity exists only between variable temp and the intercept, do all of the independent variables have to be standardized?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;The link talks about a case where one of the predictor variables is highly correlated with the Intercept. It says: "&lt;SPAN&gt;The variation proportions associated with this large condition index suggest that TEMP is collinear with the intercept." and it also says&amp;nbsp;&lt;/SPAN&gt;"&lt;SPAN&gt;By rescaling the predictors, the collinearity with the intercept can be removed."&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;So standardizing the variables will remove the effect of high correlations &lt;STRONG&gt;only&lt;/STRONG&gt; when the large correlations are with the intercept. If one predictor variable&amp;nbsp;(not the intercept) is highly correlated with another predictor variable (not the intercept), then standardizing the variables does not remove this high correlation. (So the link is correct for the specific case of correlation with the intercept, but it does not apply in general to high correlations between predictor variables.)&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 27 Feb 2023 15:53:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/861123#M42567</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2023-02-27T15:53:50Z</dc:date>
    </item>
    <item>
      <title>Re: How can I perform principal component analysis for logistic regression via SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/861149#M42568</link>
      <description>&lt;P&gt;I don't think that standardization is a general solution to collinearity and its effect on the model.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I took another look at that cancer remission example in my note on &lt;A href="https://support.sas.com/kb/60/240.html" target="_self"&gt;penalty-based selection methods&lt;/A&gt; (LASSO, ridging, and elastic net). See the introductory text and, in particular, Example 3, which shows how the problem found in the first note on collinearity can be addressed using penalty-based selection. The LASSO method is shown using NLMIXED, which shows how the penalty is applied to the binomial likelihood when fitting the model, and using HPGENSELECT, which simplifies its use. The example also shows how the dual-penalty (LASSO + ridging) elastic net method can be applied using NLMIXED. The point is that collinearity, whether it involves the intercept or other predictors, can be avoided by using these shrinkage methods to select the effects that stay in the model.&lt;/P&gt;</description>
      <pubDate>Mon, 27 Feb 2023 16:52:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/861149#M42568</guid>
      <dc:creator>StatDave</dc:creator>
      <dc:date>2023-02-27T16:52:22Z</dc:date>
    </item>
    <item>
      <title>Re: How can I perform principal component analysis for logistic regression via SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/861429#M42579</link>
      <description>&lt;P&gt;Thank you, Paige, for your reply! Another issue that concerns me is the nomenclature of the statistics generated in the process we have been discussing over the past few days.&lt;/P&gt;
&lt;P&gt;&lt;A title="Another user" href="https://communities.sas.com/t5/Statistical-Procedures/How-to-compute-GVIF-in-SAS/m-p/856156#M42324" target="_blank" rel="noopener"&gt;Another user&lt;/A&gt;&amp;nbsp;of the community asked how to compute GVIF in a logistic regression model. I wonder whether a VIF computed from the weighted information matrix can still be called "VIF". Should a VIF calculated from the weighted information matrix in generalized linear models be called "GVIF" instead?&lt;/P&gt;</description>
      <pubDate>Tue, 28 Feb 2023 15:21:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/861429#M42579</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2023-02-28T15:21:52Z</dc:date>
    </item>
    <item>
      <title>Re: How can I perform principal component analysis for logistic regression via SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/861449#M42580</link>
      <description>&lt;P&gt;Thank you for your reply! With help from you and &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;, I think that I have grasped the workflow for dealing with collinearity in generalized linear models. Thank you both for your patience, kindness and expertise!&lt;/P&gt;
&lt;P&gt;I have kept asking how to deal with collinearity within the "ordinary" logistic regression model for two reasons: (1) I hardly know anything about ridge regression and LASSO other than their names and the most basic facts about them (their estimators are biased, and they are suitable for dealing with collinearity). (2) It has been&amp;nbsp;&lt;A title="https://www.bmj.com/content/368/bmj.m441.long" href="https://www.bmj.com/content/368/bmj.m441.long" target="_blank" rel="noopener"&gt;reported&lt;/A&gt;&amp;nbsp;that a larger sample is required for modern penalization methods like LASSO and elastic net when building clinical prediction models, but the author of the paper I cited simply stated that "further research on sample size requirement on methods like LASSO is required", which possibly implies that no conclusion has been reached on that issue. In that case, sample size and power calculation may be a tough issue, at least for researchers building a logistic regression model for clinical prediction.&lt;/P&gt;</description>
      <pubDate>Tue, 28 Feb 2023 15:45:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-perform-principal-component-analysis-for-logistic/m-p/861449#M42580</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2023-02-28T15:45:34Z</dc:date>
    </item>
  </channel>
</rss>

