Hello,
As a relative newcomer to measures of dependence, I apologize if my question lacks sophistication. I was wondering whether there are any procedures or programming methods to compute the distance correlation and distance covariance for bivariate data (each dataset comprising a maximum of ~150 datapoints). I see that R has scripts that were suggested by the authors of the distance correlation method (Szekely...); is there anything equivalent in SAS?
There are many ways to compute distances between observations in SAS.
The Mahalanobis distance is a correlation/covariance-based distance. You can also use PROC DISTANCE to compute various distances in conjunction with PROC PRINCOMP. You can also use PROC PLS to compute the Mahalanobis distance (it is listed as the TSQUARE option for Hotelling's T2 statistic). You can also compute robust distances by using PROC ROBUSTREG or the MCD function in SAS/IML. If you have spatial data, you can use PROC VARIOGRAM and PROC KRIGE2D to compute various distance-based analyses.
Not within SAS itself, at least that I'm aware of, but take a look at: www.lexjansen.com/wuss/2016/19_Final_Paper_PDF.pdf
Art, CEO, AnalystFinder.com
@Rick_SAS
Thanks for the comprehensive answer! While it may take me a while to figure out what fits my needs best, I have to thank you for the guidance, because I now have a better idea of which way I should be headed.
In short, I seek to assess correlation and cross-correlation between several variables, two variables at a time. The datapoints are temporal in nature. What has been observed is that the residuals of a few such variables, after compensation for trends and autocorrelation effects, possibly have non-linear correlation/cross-correlation.
I only seek a measure of association that can detect such non-linear associations; Pearson's falls short because it is sensitive only to linear associations, whereas Spearman's is not necessarily sensitive to associations that are not monotonic (e.g. quadratic). In my research (I admit, not as deep as it should be), I found distance correlation to be a method that fits my requirements and has shown promising results. This matters especially because I cannot visually check residuals on a case-by-case basis, since I need to run the test across several datasets.
The following research encapsulates in essence what I seek to achieve.
"Measuring and Testing Dependence by Correlation of Distances" by Gabor J. Szekely, Maria L. Rizzo, and Nail K. Bakirov
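For anyone who wants to see the arithmetic behind the statistic, here is a minimal sketch in plain Python (standard library only) of the sample distance covariance and distance correlation for two univariate series, using the double-centered distance-matrix form described in that paper. It is not the authors' R code and not a SAS procedure; the function names (`distance_covariance`, `distance_correlation`, `pearson`) and the toy quadratic data are my own assumptions for illustration. At ~150 points per dataset the O(n^2) distance matrices are small, and the same steps should carry over to matrix operations in a language such as SAS/IML.

```python
import math

def _centered_distances(values):
    """Pairwise |v_i - v_j| matrix with row, column and grand means removed."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # d is symmetric, so the column means equal the row means.
    return [[d[i][j] - row_means[i] - row_means[j] + grand_mean
             for j in range(n)] for i in range(n)]

def distance_covariance(x, y):
    """Sample distance covariance: sqrt of the mean of A_ij * B_ij."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    v2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)
    return math.sqrt(max(v2, 0.0))  # guard against tiny negative round-off

def distance_correlation(x, y):
    """Sample distance correlation in [0, 1]; 0.0 if either series is constant."""
    dvar_x = distance_covariance(x, x)
    dvar_y = distance_covariance(y, y)
    if dvar_x == 0.0 or dvar_y == 0.0:
        return 0.0
    return distance_covariance(x, y) / math.sqrt(dvar_x * dvar_y)

def pearson(x, y):
    """Ordinary Pearson correlation, included only for comparison."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy data: a symmetric quadratic relation, y = x^2.
x = [float(v) for v in range(-5, 6)]
y = [v * v for v in x]
print("Pearson:", pearson(x, y))
print("Distance correlation:", distance_correlation(x, y))
```

On this toy example Pearson's coefficient vanishes by symmetry while the distance correlation stays clearly above zero, which is the behaviour I am after for non-monotonic residual relationships. For lagged cross-dependence, the same function could be applied to a series and a shifted copy of the other, keeping in mind that any significance test would still need a permutation or similar scheme that is not shown here.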