yac431
Calcite | Level 5

Hi Data Experts, 

 

I am writing to ask: when working with a data set, measures are aggregated as Sum by default. What should we consider if we want to change the aggregation to Average, and why? I am creating a correlation matrix for all independent variables in a housing data set. The independent variables are characteristics of the houses, for example number of bathrooms, plot size (sq ft), bedrooms above grade, etc. My next task is to run a cluster analysis on these variables and then match the cluster IDs with the dependent variable, i.e. the sale prices of the houses. The main purpose is to use the cluster analysis to give the client insight into the characteristics of the different types of houses sold.

 

Should I be taking the Average of these measures to make a correlation matrix? 

2 REPLIES
PaigeMiller
Diamond | Level 26

Correlations are not done on sums; correlations are not done on averages. Correlations are computed from individual (un-summed, un-averaged) data. So, I don't understand the question.

--
Paige Miller
Stu_SAS
SAS Employee

Hey @yac431, @PaigeMiller is correct. Aggregations do not affect this particular object. The correlations are computed from un-summed data using the Pearson product-moment correlation coefficient. The formula is:

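$$r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$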
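To make the point about un-summed data concrete, here is a minimal sketch in plain Python (not SAS Visual Analytics itself) that computes the Pearson coefficient from row-level values and builds a small correlation matrix. The `houses` data, the column names, and the `pearson` helper are made up for illustration; they are not from the original data set.

```python
from math import sqrt

# Hypothetical row-level housing data: one entry per house (values are made up).
houses = {
    "bathrooms":      [1, 2, 2, 3, 3],
    "lot_sqft":       [4000, 5500, 5000, 8000, 7500],
    "bedrooms_above": [2, 3, 3, 4, 5],
}

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means (numerator).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (under the square root in the denominator).
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Correlation matrix: every pair of variables, computed from individual rows.
corr_matrix = {
    a: {b: pearson(houses[a], houses[b]) for b in houses}
    for a in houses
}

for name, row in corr_matrix.items():
    print(name, {k: round(v, 3) for k, v in row.items()})
```

Note that the formula needs every individual (x, y) pair. A Sum or Average collapses each variable to a single number, which leaves nothing to correlate, so the aggregation setting is irrelevant for this calculation.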

 
