proc corr two sided

11-30-2015 09:48 AM - edited 11-30-2015 09:51 AM

I am looking to replicate the following SPSS code in SAS:

*Correlation Matrix.

CORRELATIONS

/VARIABLES=A B C D E F G

/PRINT=TWOTAIL NOSIG

/MISSING=PAIRWISE.

```
ODS OUTPUT PearsonCorr=TEST1 /*;
PROC CORR DATA=TEST OUTP=CORRS;
VAR A B C D E F G;
RUN;
ODS LISTING;
```

When I run the above code in SPSS and SAS, the correlations are the same but the significance levels differ. How do you specify the two-tailed method in SAS?

Accepted Solutions

Solution

12-01-2015
06:13 AM

Posted in reply to timkill1982

12-01-2015 06:00 AM

Thanks, @timkill1982, for providing the SPSS results.

The explanation for the different p-values is: SPSS uses the sum of the weights, 0.3+0.8+1.1+1.6+2.89 = **6.69**, minus 2 as the (fractional) number of degrees of freedom in the calculation of the p-value (based on the t distribution), whereas SAS uses the number of observations, i.e. **5**, minus 2, as shown below:

```
data _null_;
r = 0.08795657670103; /* Pearson correlation coefficient */
n_sas = 5; /* number of observations */
n_spss = 6.69; /* sum of weights */
p_sas = 2*(1-probt(sqrt((n_sas -2)*r**2/(1-r**2)),n_sas -2));
p_spss = 2*(1-probt(sqrt((n_spss-2)*r**2/(1-r**2)),n_spss-2));
put 'p_sas = ' p_sas;
put 'p_spss = ' p_spss;
run;
```

The formulas can be found in the respective documentation: SPSS, SAS.

Interestingly, SPSS says N=7 in the correlations table and it is not clear from the example whether this comes from rounding 6.69 or from adding rounded weights (0+1+1+2+3=7).

In fact, neither the WEIGHT statement nor the FREQ statement of PROC CORR can replicate the SPSS result, because the WEIGHT statement does not alter the *n* and the FREQ statement would truncate the fractional weights (hence use *n*=0+0+1+1+2=4 in our example).

But if you need to compute the p-value based on the t distribution with, e.g., 6.69 - 2 = 4.69 degrees of freedom, you can simply add up the weights and apply the code suggested above using the PROBT function (or the CDF function if you like).
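As a rough sketch of that workaround, the steps could be strung together as follows. This assumes the weighted example data set `ttt` (variables `y1`, `y2`, weight `w`) from this thread, with no missing values and strictly positive weights. The names `sumw`, `pc`, and `p_spss_style` are made up for illustration, and the column names read from the `PearsonCorr` table (`Variable`, `y2`) follow the SAS output posted in the thread.

```
/* Sum of the weights (6.69 for the example data) */
proc sql noprint;
  select sum(w) into :sumw from ttt;
quit;

/* Weighted Pearson correlation; capture the ODS table */
ods output PearsonCorr=pc;
proc corr data=ttt;
  weight w;
  var y1 y2;
run;

/* Recompute the two-sided p-value with df = sum of weights - 2 */
data p_spss_style;
  set pc;
  where Variable = 'y1';
  r  = y2;                               /* correlation of y1 with y2 */
  df = &sumw - 2;                        /* fractional degrees of freedom */
  t  = sqrt(df * r**2 / (1 - r**2));
  p  = 2 * (1 - probt(t, df));
  keep r df t p;
run;

proc print data=p_spss_style;
run;
```

For more than two variables, the same formula would have to be applied to each pair, and with missing values the sum of weights would need to be taken pairwise to mirror `/MISSING=PAIRWISE`.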

All Replies

Posted in reply to timkill1982

11-30-2015 10:18 AM

As far as I know, the significance tests in PROC CORR are always two-sided.

--

Paige Miller

Posted in reply to timkill1982

11-30-2015 10:18 AM

I believe the default is two-sided p-values.

Is it possible that SPSS is using a different statistic to calculate the p-values? For example, it might be applying Fisher's Z transformation with a bias adjustment. If so, try using the FISHER option on the PROC CORR statement. The doc shows how to control the Fisher transformation parameters.

By default, PROC CORR uses a t statistic to compute the p-values for Pearson's correlation.
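A minimal sketch of what trying the FISHER option might look like, assuming the data set `TEST` and variables `A` through `G` from the original post. The `BIASADJ=` suboption shown here is only one of the settings the doc describes; whether any combination reproduces the SPSS p-values is not established here.

```
/* Request Fisher's z statistics in addition to the default t-based p-values */
proc corr data=test fisher(biasadj=no);
  var a b c d e f g;
run;
```

The Fisher-based results appear in a separate table, so they can be compared side by side with the default Pearson p-values.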

Posted in reply to timkill1982

11-30-2015 10:31 AM

I don't have SPSS, but I found an example on the web: davidmlane.com/SPSS/correlation.html. Exactly the same SPSS syntax is used there as in your code.

I analyzed the data for Y1 and Y2 with PROC CORR and obtained not only the same correlations but also the same p-values, up to rounding in both cases: 0.11607 vs. .116 and 0.8526 vs. .853.

Do you get different results with either SPSS or SAS on that simple dataset? Or can you provide sample data and p-values from your example?

```
data ttt;
input y1 y2;
cards;
5 8.5
6 9.5
3 7
7 6.5
5 5.75
;
ods output PearsonCorr=test1;
proc corr data=ttt outp=corrs;
var y1 y2;
run;
```

Posted in reply to FreelanceReinhard

11-30-2015 10:40 AM

Using that simple example, the outputs match. I had better take a closer look to see what's happening with my data.

Posted in reply to timkill1982

11-30-2015 11:23 AM - edited 11-30-2015 11:27 AM

The SPSS code applies a weight.

*Weight data.

WEIGHT BY testweight.

*Correlation Matrix.

CORRELATIONS

/VARIABLES=A B C D E F G

/PRINT=TWOTAIL NOSIG

/MISSING=PAIRWISE.

So if I add the WEIGHT statement in SAS, I would expect the same answer as SPSS, but the result stays the same as before!

```
ODS OUTPUT PearsonCorr=TEST1 /*;
PROC CORR DATA=TEST OUTP=CORRS;
weight testweight;
VAR A B C D E F G;
RUN;
ODS LISTING;
```

Posted in reply to timkill1982

11-30-2015 11:37 AM - edited 11-30-2015 11:38 AM

I had the same problem with TTEST last week: the WEIGHT/FREQ statement doesn't seem to work correctly. Most of the test weights range from 0.3 to 2.89.

Posted in reply to timkill1982

11-30-2015 01:07 PM

Hello,

FREQ statement in PROC CORR; read this:

WEIGHT statement in PROC CORR; read this:

both excerpts coming from:

Base SAS(R) 9.4 Procedures Guide: Statistical Procedures, Fourth Edition

The CORR Procedure

Nothing wrong with the weight statement in PROC CORR!

I see that your code has unbalanced comment marks (a trailing '/*'). Could that be the reason nothing changes?
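For reference, a version of the posted code with the stray `/*` removed might look like the sketch below. It assumes the data set `TEST` and the weight variable `testweight` from the earlier posts; whether this alone changes the p-values is a separate question.

```
/* Same steps as posted, with the ODS OUTPUT statement properly terminated */
ods output PearsonCorr=test1;
proc corr data=test outp=corrs;
  weight testweight;
  var a b c d e f g;
run;
```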

Posted in reply to timkill1982

11-30-2015 04:01 PM

The SPSS documentation about the WEIGHT statement says that "*... some procedures, such as Frequencies, Crosstabs, and Custom Tables, will use fractional weight values. However, most procedures treat the weighting variable as a replication weight and will simply round fractional weights to the nearest integer. Some procedures ignore the weighting variable completely ...*"

Regardless of whether SPSS CORRELATIONS uses rounded or fractional weights, I'm sure we'll be able to clarify this if you provide us with the SPSS result (correlation coefficient, p-value and whatever else it may report) for the following example data (the same data as earlier today, with a weight variable W added):

```
data ttt;
input y1 y2 w;
cards;
5 8.5 0.3
6 9.5 0.8
3 7 1.1
7 6.5 1.6
5 5.75 2.89
;
```

Posted in reply to FreelanceReinhard

12-01-2015 04:44 AM

Correlations

| | | y1 | y2 |
|---|---|---|---|
| y1 | Pearson Correlation | 1 | .08795657670102520 |
| | Sig. (2-tailed) | | .85635055753627600 |
| | N | 7 | 7 |
| y2 | Pearson Correlation | .08795657670103 | 1 |
| | Sig. (2-tailed) | .856350557536276 | |
| | N | 7 | 7 |

SPSS output

| Variable | y1 | y2 | Py1 | Py2 |
|---|---|---|---|---|
| y1 | 1 | 0.08796 | _ | 0.8882 |
| y2 | 0.08796 | 1 | 0.8882 | _ |

SAS output

The correlations are the same but the p-values differ!

Posted in reply to FreelanceReinhard

12-01-2015 06:14 AM

Great, thanks so much. I can use this workaround.