Hi all!
I am trying to create a correlation matrix where I get the pearson value for each of the ExpenseType's against the others using the "Value" variable. However, I only want these for quarters (months 1-4, 5-8, etc..). A snap shot of my data is below:
input STUDYNO:32. PID:BEST. Day_1826:DATE9. Day_InitialIW:DATE9. TRICHOT_1YR:BEST12. Personal_consumption_expenditure:BEST12. Month:32. Interval:$5. TimeFrame:$20. ExpenseType:$20. Value:32.;
format PID BEST. Day_1826 DATE9. Day_InitialIW DATE9. TRICHOT_1YR BEST12. Personal_consumption_expenditure BEST12.;
label STUDYNO="Sample size: 55,144,41,185,191, Total:616" PID="SUBJID#" Day_1826="Day1826" Day_InitialIW="IWInitial" TRICHOT_1YR="RA 273, UR 140, RNA 80, Missing 123";
datalines;
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr House 3164.5029543
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Consume 820.0784735
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Durable .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Health 332.71026714
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Insure .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Loan 0
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Tax .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Educa .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Gift 16262.542769
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Enter 209.70802449
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Recre 0
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr FinPlan .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Legala 0
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr IncomeTL 61368.08592
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Trans 3334.881462
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr AlcSpend 72.87460203
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PostResYr House 3075.6491125
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PostResYr Consume 808.7741655
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PostResYr Durable .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PostResYr Health 316.74626874
The code that I have been tinkering with looks like this:
proc corr data= Sub.Tall_M_IndEXP_Better_JB_011119;
var value;
by ExpenseType;
where Month between 1 and 4;
run;
However, this gives me the 15 separate correlations when I want a matrix.
Any suggestions? Am i going to have to transpose these expense types into variables to able to get a correlation between them?
@joebacon wrote:
Hi all!
I am trying to create a correlation matrix where I get the pearson value for each of the ExpenseType's against the others using the "Value" variable. However, I only want these for quarters (months 1-4, 5-8, etc..). A snap shot of my data is below:
Quarters when referring to dates generally are 3-month long intervals. You are indicating thirds, groups of 4-month long periods. Which is correct quarter with months 1-3, 4-6, 7-9 and 10-12 or thirds?
Next, months of which date variable, Day_1826 or Day_Initial or that Month variable that has not apparent connection to any date?
A matrix, if I understand your desire, likely means that you need to have multiple variables on either the VAR or WITH statements, or possibly both. If you want a correlation between House and Consume expenditures then you need one variable for each on the same record. Which is why @Reeza says you'll need to transpose your data somewhat.
You may get a more concrete solution if you provide data for only 3 of the expense types and multiple months so there is more data to work with.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.