Statistical Procedures

timkill1982 · Posted 11-30-2015 09:48 AM

I am looking to replicate SPSS code in SAS.

*Correlation Matrix.
CORRELATIONS
/VARIABLES=A B C D E F G
/PRINT=TWOTAIL NOSIG
/MISSING=PAIRWISE.

ODS OUTPUT PearsonCorr=TEST1 /*;
PROC CORR DATA=TEST  OUTP=CORRS;
 VAR A B C D E F G;
RUN;
ODS LISTING;

When I run the above code in SPSS and in SAS, the correlations are the same but the significance levels differ. How do you specify the two-tailed method in SAS?

PaigeMiller · Posted 11-30-2015 10:18 AM

As far as I know, the significance tests in PROC CORR are always two-sided.

--
Paige Miller

Rick_SAS · Posted 11-30-2015 10:18 AM

I believe the default is two-sided p-values.

Is it possible that SPSS is using a different statistic to calculate the p-values? For example, it might be applying Fisher's z transformation with a bias adjustment. If so, try using the FISHER option on the PROC CORR statement. The doc shows how to control the Fisher transformation parameters.

By default, PROC CORR uses a t statistic to compute the p-values for Pearson's correlation.
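
A minimal sketch of the FISHER option, in case it helps with the comparison. The data set name ttt and its five rows are only illustrative, and the suboptions shown (BIASADJ=, TYPE=) are spelled out for clarity even though they are, as far as I know, the defaults:

```sas
/* Illustrative data only; substitute your own data set and variables. */
data ttt;
  input y1 y2;
  cards;
5 8.5
6 9.5
3 7
7 6.5
5 5.75
;

/* FISHER adds a table based on Fisher's z transformation:      */
/* confidence limits and a p-value for H0: rho = 0.              */
/* BIASADJ=YES applies the bias adjustment; TYPE=TWOSIDED asks   */
/* for two-sided limits and test.                                */
proc corr data=ttt fisher(biasadj=yes type=twosided);
  var y1 y2;
run;
```

The Fisher-based p-values appear in a separate table, so the default t-based p-values in the Pearson correlation table are still available for comparison.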

FreelanceReinh · Posted 11-30-2015 10:31 AM

I don't have SPSS, but I found an example on the web: davidmlane.com/SPSS/correlation.html. Exactly the same SPSS syntax is used there as in your code.

I analyzed the data for Y1 and Y2 with PROC CORR and obtained not only the same correlations but also the same p-values, up to rounding in both cases: 0.11607 vs. .116 and 0.8526 vs. .853.

Do you get different results with either SPSS or SAS on that simple dataset? Or can you provide sample data and p-values from your example?

data ttt;
input y1 y2;
cards;
5 8.5
6 9.5
3 7
7 6.5
5 5.75
;

ods output PearsonCorr=test1;
proc corr data=ttt outp=corrs;
var y1 y2;
run;

timkill1982 · Posted 11-30-2015 10:40 AM

Using that simple example, the outputs match. I had better take a closer look to see what's happening with my data.

timkill1982 · Posted 11-30-2015 11:23 AM

The SPSS code applies a weight.

*Weight data.
WEIGHT BY testweight.

*Correlation Matrix.
CORRELATIONS
/VARIABLES=A B C D E F G
/PRINT=TWOTAIL NOSIG
/MISSING=PAIRWISE.

So if I add the WEIGHT statement in SAS, I would expect the same answer as SPSS, but the p-values stay the same as before!

ODS OUTPUT PearsonCorr=TEST1 /*;
PROC CORR DATA=TEST  OUTP=CORRS;
weight testweight;
 VAR A B C D E F G;
RUN;
ODS LISTING;

timkill1982 · Posted 11-30-2015 11:37 AM

I had the same problem with PROC TTEST last week: the WEIGHT/FREQ statement did not seem to work correctly. Most of the testweight values range from 0.3 to 2.89.

sbxkoenk · Posted 11-30-2015 01:07 PM

Hello,

FREQ statement in PROC CORR; read this:

http://support.sas.com/documentation/cdl/en/procstat/68142/HTML/default/viewer.htm#procstat_corr_syn...

WEIGHT statement in PROC CORR; read this:

http://support.sas.com/documentation/cdl/en/procstat/68142/HTML/default/viewer.htm#procstat_corr_syn...

both excerpts coming from:

Base SAS(R) 9.4 Procedures Guide: Statistical Procedures, Fourth Edition

The CORR Procedure

Nothing wrong with the weight statement in PROC CORR!

I see that your code has unbalanced comment marks (a trailing '/*'). Could that be the reason nothing changes?
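
For reference, a sketch of the weighted call with the stray '/*' removed. The data set and variable names are taken from the question; I have not run this against the original data:

```sas
/* Capture the Pearson correlation table (with p-values) in TEST1. */
ods output PearsonCorr=test1;

proc corr data=test outp=corrs;
  weight testweight;      /* weights enter the correlation itself */
  var a b c d e f g;
run;
```

If the p-values still do not change with the comment mark fixed, the cause lies elsewhere than the syntax.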

FreelanceReinh · Posted 11-30-2015 04:01 PM

SPSS documentation about the WEIGHT statement says that "... some procedures, such as Frequencies, Crosstabs, and Custom Tables, will use fractional weight values. However, most procedures treat the weighting variable as a replication weight and will simply round fractional weights to the nearest integer. Some procedures ignore the weighting variable completely ..."

Regardless of whether SPSS CORRELATIONS uses rounded or fractional weights, I'm sure we'll be able to clarify this if you provide us with the SPSS result (correlation coefficient, p-value and whatever else it may report) for the following example data (the same data as earlier today, with a weight variable W added):

data ttt;
input y1 y2 w;
cards;
5 8.5  0.3
6 9.5  0.8
3 7    1.1
7 6.5  1.6
5 5.75 2.89 
;

timkill1982 · Posted 12-01-2015 04:44 AM

SPSS output:

Correlations
                           y1                   y2
y1  Pearson Correlation    1                    .08795657670102520
    Sig. (2-tailed)                             .85635055753627600
    N                      7                    7
y2  Pearson Correlation    .08795657670103      1
    Sig. (2-tailed)        .856350557536276
    N                      7                    7

SAS output:

Variable   y1        y2        Py1      Py2
y1         1         0.08796   _        0.8882
y2         0.08796   1         0.8882   _

The correlations are the same but the P values differ!

FreelanceReinh · Posted 12-01-2015 06:00 AM

Thanks, @timkill1982, for providing the SPSS results.

The explanation for the different p-values is: SPSS uses the sum of the weights, 0.3+0.8+1.1+1.6+2.89=6.69, minus 2 as the (fractional) number of degrees of freedom in the calculation of the p-value (based on the t distribution), whereas SAS uses the number of observations, i.e. 5, minus 2, as shown below:

data _null_;
r = 0.08795657670103; /* Pearson correlation coefficient */
n_sas  = 5;     /* number of observations */
n_spss = 6.69;  /* sum of weights */
p_sas  = 2*(1-probt(sqrt((n_sas -2)*r**2/(1-r**2)),n_sas -2));
p_spss = 2*(1-probt(sqrt((n_spss-2)*r**2/(1-r**2)),n_spss-2));
put 'p_sas  = ' p_sas;
put 'p_spss = ' p_spss;
run;

The formulas can be found in the respective documentation: SPSS, SAS.

Interestingly, SPSS says N=7 in the correlations table and it is not clear from the example whether this comes from rounding 6.69 or from adding rounded weights (0+1+1+2+3=7).

In fact, neither the WEIGHT statement nor the FREQ statement of PROC CORR can replicate the SPSS result, because the WEIGHT statement does not alter the n and the FREQ statement would truncate the fractional weights (hence use n=0+0+1+1+2=4 in our example).
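
To see the two behaviours side by side, something like the following sketch can be run on the example data. The titles are mine; the point is only to compare the N and the p-values that each run reports:

```sas
data ttt;
  input y1 y2 w;
  cards;
5 8.5  0.3
6 9.5  0.8
3 7    1.1
7 6.5  1.6
5 5.75 2.89
;

title 'WEIGHT w: fractional weights are used, n is still the row count';
proc corr data=ttt;
  weight w;
  var y1 y2;
run;

title 'FREQ w: weights are truncated to integers, rows with w < 1 drop out';
proc corr data=ttt;
  freq w;
  var y1 y2;
run;
title;
```

Note that the FREQ run changes the correlation coefficient as well, since truncating the weights changes the effective data.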

But if you need to compute the p-value based on the t distribution with, e.g., 6.69 - 2 = 4.69 degrees of freedom, you can simply add the weights and apply the code suggested above using the PROBT function (or the CDF function if you like).
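
A sketch of how that workaround could be automated for the example data, assuming no missing values (with pairwise deletion the weight sum would have to be computed per pair). The names corrs, pvals, sumw and p_spss_style are just placeholders:

```sas
data ttt;
  input y1 y2 w;
  cards;
5 8.5  0.3
6 9.5  0.8
3 7    1.1
7 6.5  1.6
5 5.75 2.89
;

/* Weighted Pearson correlation, saved to CORRS without printed output. */
proc corr data=ttt noprint outp=corrs;
  weight w;
  var y1 y2;
run;

/* Sum of the weights, used in place of the number of observations. */
proc sql noprint;
  select sum(w) into :sumw trimmed from ttt;
quit;

/* Two-sided p-value from the t distribution with (sum of weights - 2) df. */
data pvals;
  set corrs;
  where _type_ = 'CORR' and upcase(_name_) = 'Y1';
  r  = y2;                          /* correlation of y1 with y2 */
  df = &sumw - 2;                   /* fractional df, here 4.69  */
  t  = sqrt(df * r**2 / (1 - r**2));
  p_spss_style = 2 * (1 - probt(t, df));
  /* equivalent: 2 * (1 - cdf('T', t, df)) */
  keep r df t p_spss_style;
run;

proc print data=pvals;
run;
```

If the explanation above is right, p_spss_style should agree with the Sig. (2-tailed) value that SPSS reported for this example.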

timkill1982 · Posted 12-01-2015 06:14 AM

Great, thanks so much. I can use this workaround.
