Correlation between interval variables and binary variables

Contributor
Posts: 65

Hi,

Could you please help me with a problem?

Which of the following correlation methods listed in the Table Analysis node is the right one to test the correlation between a continuous variable and a binary variable?

Continuity Adj. Chi-Square
Mantel-Haenszel Chi-Square
Phi Coefficient
Contingency Coefficient
Cramer's V

Accepted Solutions
Solution
‎07-07-2017 01:25 PM
SAS Employee
Posts: 121

Re: Correlation between interval variables and binary variables

The type of correlation you are describing is often referred to as a biserial correlation.  There are three different types of biserial correlation: biserial, point biserial, and rank biserial.  Each of these three types is described in SAS Note 22925.  I suspect you need to compute either the biserial or the point biserial correlation.  The difference between these two, as described in the aforementioned SAS Note, depends on the binary variable.  If the binary variable has an underlying continuous distribution, but is measured as binary, then you should compute a "biserial correlation."  If the binary variable is truly dichotomous, then a "point biserial correlation" should be used.  (The "rank biserial correlation" measures the relationship between a binary variable and a ranked, i.e. ordinal, variable.)

 

If your binary variables are truly dichotomous (as opposed to discretized continuous variables), then you can compute the point biserial correlations directly in PROC CORR.  The point biserial correlation is equivalent to the Pearson product moment correlation between the two variables when the dichotomous variable is coded with any two distinct numeric values.  This information is also mentioned in our FASTats link under Correlation > Point Biserial.  PROC CORR prints the Pearson product moment correlation by default, so no additional options are required.
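
To illustrate the equivalence described above outside of SAS, here is a minimal sketch in plain Python (not SAS code, and not what PROC CORR runs internally). It compares the textbook point biserial formula with an ordinary Pearson correlation. The data and the names `income` and `bought` are made up for the example.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def point_biserial(x, group):
    """Textbook point biserial formula; `group` must hold 0/1 codes."""
    x1 = [a for a, g in zip(x, group) if g == 1]
    x0 = [a for a, g in zip(x, group) if g == 0]
    n = len(x)
    p = len(x1) / n  # proportion of observations coded 1
    q = len(x0) / n  # proportion of observations coded 0
    mean_all = sum(x) / n
    # Standard deviation of x with divisor n (not n - 1)
    sd_n = math.sqrt(sum((a - mean_all) ** 2 for a in x) / n)
    mean_diff = sum(x1) / len(x1) - sum(x0) / len(x0)
    return mean_diff / sd_n * math.sqrt(p * q)


# Hypothetical data: a continuous variable and a true dichotomy
income = [21.0, 35.5, 28.0, 40.2, 19.5, 33.1, 45.0, 25.4]
bought = [0, 1, 0, 1, 0, 1, 1, 0]

r_pb = point_biserial(income, bought)
r_01 = pearson(income, bought)
# Recoding 0/1 as 2/5 keeps the order of the two codes, so r is unchanged
r_recoded = pearson(income, [5 if b == 1 else 2 for b in bought])
# Swapping the two codes only flips the sign of r
r_flipped = pearson(income, [1 - b for b in bought])

print(r_pb, r_01, r_recoded, r_flipped)
```

The first three values agree up to floating-point rounding. The last has the same magnitude with the opposite sign, so the choice of the two codes affects only the direction of the correlation.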

 

If your binary variables are dichotomized continuous variables, then you will need to compute biserial correlations between each of these binary variables and your continuous variable.  These correlations are only available through our %BISERIAL macro.  SAS Note 24991 describes this macro and includes the source code for the macro in the Downloads tab. 
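
For intuition only, here is a minimal Python sketch of the standard textbook relationship between the two coefficients: the biserial correlation equals the point biserial correlation multiplied by sqrt(p*q) / y, where p and q are the proportions in the two groups and y is the standard normal density at the cut point that splits the distribution into p and q. This is not the source of the %BISERIAL macro, and the function name `biserial_from_point_biserial` is hypothetical. It assumes the latent variable behind the dichotomy is normally distributed.

```python
import math
from statistics import NormalDist


def biserial_from_point_biserial(r_pb, p):
    """Convert a point biserial r to a biserial r.

    r_pb: point biserial (Pearson) correlation
    p:    proportion of observations in one of the two groups, 0 < p < 1
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    q = 1.0 - p
    std_normal = NormalDist()        # mean 0, standard deviation 1
    z_cut = std_normal.inv_cdf(p)    # cut point splitting the area into p and q
    y = std_normal.pdf(z_cut)        # normal density (ordinate) at the cut
    return r_pb * math.sqrt(p * q) / y


# Hypothetical inputs: r_pb = 0.40 with an even 50/50 split
r_b = biserial_from_point_biserial(0.40, 0.5)
print(r_b)
```

Under this relationship the biserial value is always larger in magnitude than the point biserial value, which is one reason the choice between the two depends on whether the binary variable really is a dichotomized continuous measure.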

All Replies
Super Contributor
Posts: 336

Re: Correlation between interval variables and binary variables

Hey Omer,

Here is a great resource that summarizes statistical tests and how to code them in SAS:

Choosing the Correct Statistical Test in SAS, Stata and SPSS

I hope it helps,

Miguel

Contributor
Posts: 65

Re: Correlation between interval variables and  binary variables

Miguel, this is truly a perfect guide for my problem.

Thank you very much for sharing the page with me.
