<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: PROC CORR: Calculating the Spearman Partial Correlation Coefficient in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/PROC-CORR-Calculating-the-Spearman-Partial-Correlation/m-p/178195#M9246</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Eric,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I think I am missing something.&amp;nbsp; Doesn't calculating the Pearson correlation on ranks give the same result as the Spearman correlation?&amp;nbsp; If that is the case, then Method A is certainly more robust to outliers and possibly to distributional assumptions.&amp;nbsp; However, I hesitate to say which will result in a lower MSE.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Steve Denham&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Fri, 30 May 2014 14:03:45 GMT</pubDate>
    <dc:creator>SteveDenham</dc:creator>
    <dc:date>2014-05-30T14:03:45Z</dc:date>
    <item>
      <title>PROC CORR: Calculating the Spearman Partial Correlation Coefficient</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/PROC-CORR-Calculating-the-Spearman-Partial-Correlation/m-p/178194#M9245</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;&lt;SPAN style="color: #333333; font-family: Helvetica, Arial, sans-serif; background-color: #ffffff;"&gt;Dear Community,&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="color: #333333; font-family: Helvetica, Arial, sans-serif; background-color: #ffffff;"&gt;Given 3 continuous variables, X, Y, and Z, the partial correlation between X and Y while controlling for Z can be calculated in the following steps:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="color: #333333; font-family: Helvetica, Arial, sans-serif; background-color: #ffffff;"&gt;1) Perform linear regression with X as the response and Z as the predictor. Denote the residuals from this regression as Rx.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="color: #333333; font-family: Helvetica, Arial, sans-serif; background-color: #ffffff;"&gt;2) Perform linear regression with Y as the response and Z as the predictor. Denote the residuals from this regression as Ry.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="color: #333333; font-family: Helvetica, Arial, sans-serif; background-color: #ffffff;"&gt;3) Calculate the correlation between Rx and Ry. This is the partial correlation between X and Y while controlling for Z.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="color: #333333; font-family: Helvetica, Arial, sans-serif; background-color: #ffffff;"&gt;The usual way of doing Step #3 is to use the Pearson correlation coefficient. 
My question DOES NOT concern this usual way, because I am interested in calculating partial correlation for data with outliers or for non-normal Rx/Ry.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="color: #333333; font-family: Helvetica, Arial, sans-serif; background-color: #ffffff;"&gt;There are 2 other ways to calculate partial correlation that can overcome outliers or non-normal residuals, and I'm trying to determine which of these is better.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="color: #333333; font-family: Helvetica, Arial, sans-serif; background-color: #ffffff;"&gt;Method A: &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="color: #333333; font-family: Helvetica, Arial, sans-serif; background-color: #ffffff;"&gt;- Perform Steps #1-2 (i.e. the regression) with the ranks of the data rather than the data themselves. &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="color: #333333; font-family: Helvetica, Arial, sans-serif; background-color: #ffffff;"&gt;- Then, perform Step #3 using the Pearson correlation coefficient.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="color: #333333; font-family: Helvetica, Arial, sans-serif; background-color: #ffffff;"&gt;Method B:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="color: #333333; font-family: Helvetica, Arial, sans-serif; background-color: #ffffff;"&gt;- Perform Steps #1-2 (i.e. 
the regression) in the usual way with the data.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="color: #333333; font-family: Helvetica, Arial, sans-serif; background-color: #ffffff;"&gt;- Then, perform Step #3 using the Spearman correlation coefficient.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="color: #333333; font-family: Helvetica, Arial, sans-serif; background-color: #ffffff;"&gt;My question to you: Which is better - Method A or Method B?&amp;nbsp; &lt;A href="http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_corr_sect017.htm"&gt;PROC CORR uses Method A&lt;/A&gt;.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="color: #333333; font-family: Helvetica, Arial, sans-serif; background-color: #ffffff;"&gt;Perhaps a more specific way to phrase my question is: Which achieves a lower mean-squared error (MSE) - Method A or Method B? Recall that the MSE of a point estimator, theta-hat, is&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="color: #333333; font-family: Helvetica, Arial, sans-serif; background-color: #ffffff;"&gt;MSE(theta-hat) = [Bias(theta-hat)]^2 + Variance(theta-hat)&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="color: #333333; font-family: Helvetica, Arial, sans-serif; background-color: #ffffff;"&gt;Thanks,&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="color: #333333; font-family: Helvetica, Arial, sans-serif; background-color: #ffffff;"&gt;Eric&lt;/SPAN&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
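A minimal Python sketch of Methods A and B as described in the question, using only the standard library. The function names (average_ranks, residuals, pearson, spearman, method_a, method_b) are illustrative assumptions and are not SAS or PROC CORR code; the sketch also assumes a single covariate Z and inputs that are not constant.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start != len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 != len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def residuals(response, predictor):
    """Residuals from a least-squares line (with intercept) of response on predictor."""
    mr, mp = mean(response), mean(predictor)
    sxy = sum((p - mp) * (r - mr) for p, r in zip(predictor, response))
    sxx = sum((p - mp) ** 2 for p in predictor)
    slope = sxy / sxx
    intercept = mr - slope * mp
    return [r - (intercept + slope * p) for p, r in zip(predictor, response)]


def pearson(a, b):
    """Pearson product-moment correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    num = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    den = (sum((u - ma) ** 2 for u in a) * sum((v - mb) ** 2 for v in b)) ** 0.5
    return num / den


def spearman(a, b):
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


def method_a(x, y, z):
    """Method A: rank X, Y, Z first, regress on ranked Z, then Pearson on residuals."""
    rx, ry, rz = average_ranks(x), average_ranks(y), average_ranks(z)
    return pearson(residuals(rx, rz), residuals(ry, rz))


def method_b(x, y, z):
    """Method B: regress the raw X and Y on raw Z, then Spearman on the residuals."""
    return spearman(residuals(x, z), residuals(y, z))
```

Calling method_a(x, y, z) and method_b(x, y, z) on the same three equal-length lists gives the two estimates side by side. Comparing their MSE would additionally need a simulation with a known true partial correlation, which this sketch does not attempt.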
      <pubDate>Fri, 30 May 2014 04:54:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/PROC-CORR-Calculating-the-Spearman-Partial-Correlation/m-p/178194#M9245</guid>
      <dc:creator>EricCai</dc:creator>
      <dc:date>2014-05-30T04:54:00Z</dc:date>
    </item>
    <item>
      <title>Re: PROC CORR: Calculating the Spearman Partial Correlation Coefficient</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/PROC-CORR-Calculating-the-Spearman-Partial-Correlation/m-p/178195#M9246</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Eric,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I think I am missing something.&amp;nbsp; Doesn't calculating the Pearson correlation on ranks give the same result as the Spearman correlation?&amp;nbsp; If that is the case, then Method A is certainly more robust to outliers and possibly to distributional assumptions.&amp;nbsp; However, I hesitate to say which will result in a lower MSE.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Steve Denham&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
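The equivalence Steve asks about can be checked numerically. The Python sketch below compares the Pearson correlation of the ranks with the classic rank-difference formula for Spearman's rho, which applies only when there are no ties. The data and the function names are hypothetical, chosen purely for illustration.

```python
from statistics import mean


def ranks_no_ties(values):
    """Return 1-based ranks; assumes all values are distinct."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def pearson(a, b):
    """Pearson product-moment correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    num = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    den = (sum((u - ma) ** 2 for u in a) * sum((v - mb) ** 2 for v in b)) ** 0.5
    return num / den


def spearman_from_differences(a, b):
    """Spearman's rho as 1 - 6*sum(d^2) / (n*(n^2 - 1)); valid only without ties."""
    ra, rb = ranks_no_ties(a), ranks_no_ties(b)
    n = len(a)
    d2 = sum((p - q) ** 2 for p, q in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical data, not taken from the thread.
x = [10.5, 3.2, 7.7, 1.1, 9.0]
y = [2.0, 8.5, 1.5, 4.0, 6.0]

rho_pearson_on_ranks = pearson(ranks_no_ties(x), ranks_no_ties(y))
rho_classic = spearman_from_differences(x, y)
print(rho_pearson_on_ranks, rho_classic)  # the two values agree up to rounding
```

This confirms only that Step #3 on ranks is a Spearman correlation. It does not make Methods A and B identical, because Method B ranks the residuals after a regression on the raw data, whereas Method A ranks the data before the regression.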
      <pubDate>Fri, 30 May 2014 14:03:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/PROC-CORR-Calculating-the-Spearman-Partial-Correlation/m-p/178195#M9246</guid>
      <dc:creator>SteveDenham</dc:creator>
      <dc:date>2014-05-30T14:03:45Z</dc:date>
    </item>
    <item>
      <title>Re: PROC CORR: Calculating the Spearman Partial Correlation Coefficient</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/PROC-CORR-Calculating-the-Spearman-Partial-Correlation/m-p/178196#M9247</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I would bet on Method A, since its residuals are also rank-based and so retain the power of the Spearman rank correlation.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 30 May 2014 14:31:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/PROC-CORR-Calculating-the-Spearman-Partial-Correlation/m-p/178196#M9247</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2014-05-30T14:31:58Z</dc:date>
    </item>
    <item>
      <title>Re: PROC CORR: Calculating the Spearman Partial Correlation Coefficient</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/PROC-CORR-Calculating-the-Spearman-Partial-Correlation/m-p/178197#M9248</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks, Ksharp and Steve. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Just to add my thoughts, I don't like Method A because it reduces information from the data into ranks BEFORE the regression is done.&amp;nbsp; Method B uses the full data to perform the regression, so more information is retained. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;However, I'm still stuck on my original question: Which method is better?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 30 May 2014 17:19:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/PROC-CORR-Calculating-the-Spearman-Partial-Correlation/m-p/178197#M9248</guid>
      <dc:creator>EricCai</dc:creator>
      <dc:date>2014-05-30T17:19:48Z</dc:date>
    </item>
    <item>
      <title>Re: PROC CORR: Calculating the Spearman Partial Correlation Coefficient</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/PROC-CORR-Calculating-the-Spearman-Partial-Correlation/m-p/178198#M9249</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I'll agree that B retains more information.&amp;nbsp; However, it is much more sensitive to outliers and, in smaller datasets especially, can lead to completely spurious results.&amp;nbsp; Consider the following:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data whass;&lt;/P&gt;&lt;P&gt;input x y z;&lt;/P&gt;&lt;P&gt;datalines;&lt;/P&gt;&lt;P&gt;1 4 3&lt;/P&gt;&lt;P&gt;2 3 4&lt;/P&gt;&lt;P&gt;3 2.2 5.6&lt;/P&gt;&lt;P&gt;4 1 7&lt;/P&gt;&lt;P&gt;;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Note that Ry is negative.&amp;nbsp; Now suppose a data entry error was made, and that the last line was 4 1000 7000 (somebody dropped a decimal point).&amp;nbsp; Now Ry is positive and a very strong correlation is found.&amp;nbsp; However, if you transform to ranks before calculating Ry, it is still positive, but everything moves closer to zero, which you have to admit is closer to the true situation than what was found with the outlier values included.&amp;nbsp; The regression coefficient is amazingly dependent on extreme values, whether as influential or high leverage points.&amp;nbsp; If your data is moderately contaminated, or from a highly skewed distribution, these points can easily produce counterintuitive results.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Steve Denham&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
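The effect of the mistyped row can be reproduced with a short Python sketch. It uses the four rows from the post and assumes that "Ry is negative/positive" refers to the sign of the relationship between y and z; that reading is an assumption, as are the function names. It compares the Pearson correlation on the raw values with the Pearson correlation on the ranks, before and after replacing the last row with 4 1000 7000.

```python
from statistics import mean


def ranks_no_ties(values):
    """Return 1-based ranks; assumes all values are distinct."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def pearson(a, b):
    """Pearson product-moment correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    num = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    den = (sum((u - ma) ** 2 for u in a) * sum((v - mb) ** 2 for v in b)) ** 0.5
    return num / den


# y and z columns from the data step in the post.
y_clean = [4, 3, 2.2, 1]
z_clean = [3, 4, 5.6, 7]

# Same columns with the last row mistyped as "4 1000 7000".
y_bad = [4, 3, 2.2, 1000]
z_bad = [3, 4, 5.6, 7000]

raw_clean = pearson(y_clean, z_clean)
raw_bad = pearson(y_bad, z_bad)
rank_clean = pearson(ranks_no_ties(y_clean), ranks_no_ties(z_clean))
rank_bad = pearson(ranks_no_ties(y_bad), ranks_no_ties(z_bad))

print("raw values:", raw_clean, "then", raw_bad)
print("ranks:     ", rank_clean, "then", rank_bad)
```

On the raw values the single bad row flips the sign and pushes the correlation close to +1, while on the ranks the value stays positive but much closer to zero. In this four-row example the ranks of x and z coincide, so a rank regression of x on z leaves all-zero residuals and the full Method A partial correlation is undefined here; the sketch therefore looks only at the relationship between y and z.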
      <pubDate>Fri, 30 May 2014 17:55:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/PROC-CORR-Calculating-the-Spearman-Partial-Correlation/m-p/178198#M9249</guid>
      <dc:creator>SteveDenham</dc:creator>
      <dc:date>2014-05-30T17:55:00Z</dc:date>
    </item>
    <item>
      <title>Re: PROC CORR: Calculating the Spearman Partial Correlation Coefficient</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/PROC-CORR-Calculating-the-Spearman-Partial-Correlation/m-p/178199#M9250</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Agree with Doc Steve. If there are no outliers, I would definitely choose B.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Xia Keshan&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sat, 31 May 2014 05:19:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/PROC-CORR-Calculating-the-Spearman-Partial-Correlation/m-p/178199#M9250</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2014-05-31T05:19:41Z</dc:date>
    </item>
  </channel>
</rss>

