Hi all!
I am trying to create a correlation matrix where I get the pearson value for each of the ExpenseType's against the others using the "Value" variable. However, I only want these for quarters (months 1-4, 5-8, etc..). A snap shot of my data is below:
input STUDYNO:32. PID:BEST. Day_1826:DATE9. Day_InitialIW:DATE9. TRICHOT_1YR:BEST12. Personal_consumption_expenditure:BEST12. Month:32. Interval:$5. TimeFrame:$20. ExpenseType:$20. Value:32.;
format PID BEST. Day_1826 DATE9. Day_InitialIW DATE9. TRICHOT_1YR BEST12. Personal_consumption_expenditure BEST12.;
label STUDYNO="Sample size: 55,144,41,185,191, Total:616" PID="SUBJID#" Day_1826="Day1826" Day_InitialIW="IWInitial" TRICHOT_1YR="RA 273, UR 140, RNA 80, Missing 123";
datalines;
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr House 3164.5029543
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Consume 820.0784735
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Durable .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Health 332.71026714
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Insure .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Loan 0
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Tax .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Educa .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Gift 16262.542769
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Enter 209.70802449
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Recre 0
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr FinPlan .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Legala 0
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr IncomeTL 61368.08592
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr Trans 3334.881462
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PreResYr AlcSpend 72.87460203
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PostResYr House 3075.6491125
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PostResYr Consume 808.7741655
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PostResYr Durable .
2 486 30DEC1992 05OCT1993 1 68.892 1 Month PostResYr Health 316.74626874The code that I have been tinkering with looks like this:
proc corr data= Sub.Tall_M_IndEXP_Better_JB_011119;
var value;
by ExpenseType;
where Month between 1 and 4;
run;However, this gives me the 15 separate correlations when I want a matrix.
Any suggestions? Am i going to have to transpose these expense types into variables to able to get a correlation between them?
@joebacon wrote:
Hi all!
I am trying to create a correlation matrix where I get the pearson value for each of the ExpenseType's against the others using the "Value" variable. However, I only want these for quarters (months 1-4, 5-8, etc..). A snap shot of my data is below:
Quarters when referring to dates generally are 3-month long intervals. You are indicating thirds, groups of 4-month long periods. Which is correct quarter with months 1-3, 4-6, 7-9 and 10-12 or thirds?
Next, months of which date variable, Day_1826 or Day_Initial or that Month variable that has not apparent connection to any date?
A matrix, if I understand your desire, likely means that you need to have multiple variables on either the VAR or WITH statements, or possibly both. If you want a correlation between House and Consume expenditures then you need one variable for each on the same record. Which is why @Reeza says you'll need to transpose your data somewhat.
You may get a more concrete solution if you provide data for only 3 of the expense types and multiple months so there is more data to work with.
Nearly 200 sessions are now available on demand with the SAS Innovate Digital Pass.
Explore Now →SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.