BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
joebacon
Pyrite | Level 9

Hi all!

 

I am trying to create a correlation matrix where I get the pearson value for each of the ExpenseType's against the others using the "Value" variable. However, I only want these for quarters (months 1-4, 5-8, etc..). A snap shot of my data is below:

 

 input STUDYNO:32. PID:BEST. Day_1826:DATE9. Day_InitialIW:DATE9. TRICHOT_1YR:BEST12. Personal_consumption_expenditure:BEST12. Month:32. Interval:$5. TimeFrame:$20. ExpenseType:$20. Value:32.;
  format PID BEST. Day_1826 DATE9. Day_InitialIW DATE9. TRICHOT_1YR BEST12. Personal_consumption_expenditure BEST12.;
  label STUDYNO="Sample size: 55,144,41,185,191, Total:616" PID="SUBJID#" Day_1826="Day1826" Day_InitialIW="IWInitial" TRICHOT_1YR="RA 273, UR 140, RNA 80, Missing 123";
datalines;
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr House 3164.5029543
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Consume 820.0784735
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Durable .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Health 332.71026714
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Insure .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Loan 0
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Tax .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Educa .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Gift 16262.542769
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Enter 209.70802449
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Recre 0
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr FinPlan .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Legala 0
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr IncomeTL 61368.08592
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Trans 3334.881462
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr AlcSpend 72.87460203
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PostResYr House 3075.6491125
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PostResYr Consume 808.7741655
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PostResYr Durable .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PostResYr Health 316.74626874

The code that I have been tinkering with looks like this: 

 

proc corr data= Sub.Tall_M_IndEXP_Better_JB_011119;
var value;
by ExpenseType;
where Month between 1 and 4;
run;

 However, this gives me the 15 separate correlations when I want a matrix.

 

Any suggestions? Am i going to have to transpose these expense types into variables to able to get a correlation between them?

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
Am i going to have to transpose these expense types into variables to able to get a correlation between them?

Unfortunately yes.

View solution in original post

5 REPLIES 5
Reeza
Super User
Am i going to have to transpose these expense types into variables to able to get a correlation between them?

Unfortunately yes.
ballardw
Super User

@joebacon wrote:

Hi all!

 

I am trying to create a correlation matrix where I get the pearson value for each of the ExpenseType's against the others using the "Value" variable. However, I only want these for quarters (months 1-4, 5-8, etc..). A snap shot of my data is below:

 

 


Quarters when referring to dates generally are 3-month long intervals. You are indicating thirds, groups of 4-month long periods. Which is correct quarter with months 1-3, 4-6, 7-9 and 10-12 or thirds?

Next, months of which date variable, Day_1826 or Day_Initial or that Month variable that has not apparent connection to any date?

 

A matrix, if I understand your desire, likely means that you need to have multiple variables on either the VAR or WITH statements, or possibly both. If you want a correlation between House and Consume expenditures then you need one variable for each on the same record. Which is why @Reeza says you'll need to transpose your data somewhat.

 

You may get a more concrete solution if you provide data for only 3 of the expense types and multiple months so there is more data to work with.

joebacon
Pyrite | Level 9
Oops. You are absolutely correct in that I was indicating thirds instead of quarters.

I had the other Day_ variables in there from another analysis. I would need the month variables (i.e. 1-3, 3-6, etc.).

I would want multiple variables and wanted to see if I could essentially turn one of my variables into several by utilizing by group processing.

However, @Reeza has informed me of my need to transpose. Now to research how to do that.

Thanks all. You both were very helpful. 🙂
Reeza
Super User
PROC TRANSPOSE.
joebacon
Pyrite | Level 9
Haha! Shortly after a google search, I figured it out. Not too complicated.

I am currently trying to color code the correlation matrix.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 1026 views
  • 2 likes
  • 3 in conversation