yac431
Calcite | Level 5

Hi Data Experts, 

 

I am writing to ask: when working with a data set, the measures are always aggregated as Sum by default. What should we consider if we want to change the aggregation to Average, and why? I am creating a correlation matrix for all the independent variables in a housing data set. The independent variables are characteristics of the houses, for example number of bathrooms, plot size (sq ft), bedrooms above grade, etc. My next task is to run a cluster analysis on these variables and then match the cluster ID with the dependent variable, i.e. the sale price of the houses. The main purpose is to use the cluster analysis to give the client insight into the characteristics of the different types of houses sold.

 

Should I be taking the Average of these measures to make a correlation matrix? 

2 REPLIES
PaigeMiller
Diamond | Level 26

Correlations are not done on sums, and correlations are not done on averages. Correlations are computed from individual (un-summed, un-averaged) data. So, I don't understand the question.

--
Paige Miller
Stu_SAS
SAS Employee

Hey @yac431, @PaigeMiller is correct. For this particular object, aggregations do not affect it. The correlations are derived from un-summed data using the Pearson product-moment correlation coefficient. This formula is:

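$$
r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
$$

Here x_i and y_i are the values of the two measures on row i, and \bar{x} and \bar{y} are their means over all n rows.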
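As a rough illustration of the point made in both replies, here is a minimal Python sketch of the Pearson product-moment coefficient computed directly from row-level values, one row per house. It is not how SAS Visual Analytics implements the object. The column names (`bathrooms`, `bedrooms`, `lot_sqft`) and the five sample rows are made up for the example, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation computed from individual (un-aggregated) values."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means (numerator).
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (denominator terms).
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical row-level data: one dict per house, no Sum or Average applied.
houses = [
    {"bathrooms": 1, "bedrooms": 2, "lot_sqft": 4000},
    {"bathrooms": 2, "bedrooms": 3, "lot_sqft": 6500},
    {"bathrooms": 2, "bedrooms": 4, "lot_sqft": 7200},
    {"bathrooms": 3, "bedrooms": 4, "lot_sqft": 9000},
    {"bathrooms": 1, "bedrooms": 3, "lot_sqft": 5000},
]
columns = ["bathrooms", "bedrooms", "lot_sqft"]

# Correlation matrix: every pair of variables, using every row.
matrix = {
    a: {
        b: pearson_r([h[a] for h in houses], [h[b] for h in houses])
        for b in columns
    }
    for a in columns
}

for a in columns:
    print(a, [round(matrix[a][b], 3) for b in columns])
```

Every row contributes its own pair of values to the sums, which is why a Sum or Average setting on the measures has nothing to act on here: collapsing each column to a single total or mean would leave one data point per variable, and a correlation cannot be computed from that.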

 

