BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
joebacon
Pyrite | Level 9

Hi all!

 

I am trying to create a correlation matrix where I get the pearson value for each of the ExpenseType's against the others using the "Value" variable. However, I only want these for quarters (months 1-4, 5-8, etc..). A snap shot of my data is below:

 

 input STUDYNO:32. PID:BEST. Day_1826:DATE9. Day_InitialIW:DATE9. TRICHOT_1YR:BEST12. Personal_consumption_expenditure:BEST12. Month:32. Interval:$5. TimeFrame:$20. ExpenseType:$20. Value:32.;
  format PID BEST. Day_1826 DATE9. Day_InitialIW DATE9. TRICHOT_1YR BEST12. Personal_consumption_expenditure BEST12.;
  label STUDYNO="Sample size: 55,144,41,185,191, Total:616" PID="SUBJID#" Day_1826="Day1826" Day_InitialIW="IWInitial" TRICHOT_1YR="RA 273, UR 140, RNA 80, Missing 123";
datalines;
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr House 3164.5029543
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Consume 820.0784735
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Durable .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Health 332.71026714
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Insure .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Loan 0
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Tax .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Educa .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Gift 16262.542769
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Enter 209.70802449
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Recre 0
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr FinPlan .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Legala 0
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr IncomeTL 61368.08592
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Trans 3334.881462
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr AlcSpend 72.87460203
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PostResYr House 3075.6491125
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PostResYr Consume 808.7741655
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PostResYr Durable .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PostResYr Health 316.74626874

The code that I have been tinkering with looks like this: 

 

proc corr data= Sub.Tall_M_IndEXP_Better_JB_011119;
var value;
by ExpenseType;
where Month between 1 and 4;
run;

 However, this gives me the 15 separate correlations when I want a matrix.

 

Any suggestions? Am i going to have to transpose these expense types into variables to able to get a correlation between them?

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
Am i going to have to transpose these expense types into variables to able to get a correlation between them?

Unfortunately yes.

View solution in original post

5 REPLIES 5
Reeza
Super User
Am i going to have to transpose these expense types into variables to able to get a correlation between them?

Unfortunately yes.
ballardw
Super User

@joebacon wrote:

Hi all!

 

I am trying to create a correlation matrix where I get the pearson value for each of the ExpenseType's against the others using the "Value" variable. However, I only want these for quarters (months 1-4, 5-8, etc..). A snap shot of my data is below:

 

 


Quarters when referring to dates generally are 3-month long intervals. You are indicating thirds, groups of 4-month long periods. Which is correct quarter with months 1-3, 4-6, 7-9 and 10-12 or thirds?

Next, months of which date variable, Day_1826 or Day_Initial or that Month variable that has not apparent connection to any date?

 

A matrix, if I understand your desire, likely means that you need to have multiple variables on either the VAR or WITH statements, or possibly both. If you want a correlation between House and Consume expenditures then you need one variable for each on the same record. Which is why @Reeza says you'll need to transpose your data somewhat.

 

You may get a more concrete solution if you provide data for only 3 of the expense types and multiple months so there is more data to work with.

joebacon
Pyrite | Level 9
Oops. You are absolutely correct in that I was indicating thirds instead of quarters.

I had the other Day_ variables in there from another analysis. I would need the month variables (i.e. 1-3, 3-6, etc.).

I would want multiple variables and wanted to see if I could essentially turn one of my variables into several by utilizing by group processing.

However, @Reeza has informed me of my need to transpose. Now to research how to do that.

Thanks all. You both were very helpful. 🙂
Reeza
Super User
PROC TRANSPOSE.
joebacon
Pyrite | Level 9
Haha! Shortly after a google search, I figured it out. Not too complicated.

I am currently trying to color code the correlation matrix.