Hi all,
I am following up on my previous question on creating macro variables for the change variables (last_3month, prev_3month, last_6month, prev_6month, last_12month, prev_12month). Below is a summary of what I wanted to do before:
Given a macro variable &latest_date in the format MONYY (e.g. SEP14), I would like a way to make 6 macro variables:
Last_3month JUL14-SEP14
Prev_3month APR14-JUN14
Last_6month APR14-SEP14
Prev_6month OCT14-MAR14
Last_12month OCT13-SEP14
Prev_12month OCT12-SEP13
Now I have something a bit challenging. Instead of having a fixed &latest_date for each observation, now I have a new date variable called EVAL_BEFORE_DATE-ddmmmyyyy11. format (along with other numeric variables JAN14--DEC14,JAN13-DEC13), based on which I would like create these change variables for each observation. For example,
For the first observation,
if month(01-01-2013)=JAN, then
Last_3month NOV12-JAN13
Prev_3month AUG12-OCT12
Last_6month AUG12-JAN13
Prev_6month FEB11-JUL12
Last_12month FEB12-JAN13
Prev_12month FEB11-JAN12
For the second observation,
if month(01-09-2014)=SEP, then
Last_3month JUL14-SEP14
Prev_3month APR14-JUN14
Last_6month APR14-SEP14
Prev_6month OCT14-MAR14
Last_12month OCT13-SEP14
Prev_12month OCT12-SEP13
Automating the process of creating those change variables would be the same as before, but I am a bit confused how to automate the getting the month part. Any suggestions would be greatly appreciated. Thanks very much.
Here's an example that uses VvalueX and the INTNX function to generate the months. You can set up separate arrays for each lag period and recalculated as needed.
*Generate sample data;
data have;
array Y2(12) Jan14 Feb14 Mar14 Apr14 May14 Jun14 Jul14 Aug14 Sep14 Oct14 Nov14 Dec14;
format last_date date9.;
do j=1 to 2;
do i=1 to 12;
y2(i)=rand('normal', 52, 5);
end;
if j=1 then last_date='01May2014'd;
else last_date='01Sep2014'd;
output;
end;
drop j i;
run;
*Sample output for previous 3 months;
data want;
set have;
array v_3(3) v1 v2 v3;
do i=0 to 2;
var=put(intnx('month', last_date, -1*i), monyy5.);
v_3(i+1)=vvaluex(var);
end;
month3_mean=mean(of v_3(*));
month3_std=std(of v_3(*));
run;
Depending on what you're doing in your future process I'm going to generally recommend against this method. Macro variables are good but this might be over usage.
Have you considered an array with the vname and/or vnext functions?
Thanks for your reply Reeza. I see. An array might a good option too. But do you mean creating an array of SAS variables (JAN14--DEC14) or the change variables (e.g., last previous months) ? Can you please provide a bit more hints on using arrays? Thanks,
can you provide sample data and output? I have no idea of what you're using it for so can only provide vague suggestions.
Hi Reeza,
Below are the sample data and output.
sample data:
id JAN11--DEC11 JAN12--DEC12 JAN13 --DEC13 JAN14--DEC14 EVAL_BEFORE_DATE
1 23 45 34 23 41 27 51 65 01-01-2013
2 24 43 21 32 34 56 76 21 01-09-2014
Sample output
id JAN11--DEC11 JAN12--DEC12 JAN13 --DEC13 JAN14--DEC14 EVAL_BEFORE_DATE last_3month prev_3month last_6moth prev_6month
1 23 45 34 23 41 27 51 65 01-01-2013 NOV12-JAN13 AUG12-OCT12 AUG12-JAN13 FEB11-JUL12
2 24 43 21 32 34 56 76 21 01-09-2014 JUL14-SEP14 APR14-JUN14 APR14-SEP14 OCT14-MAR14
last_12month prev_12month
FEB12-JAN13 FEB11-JAN12
OCT13-SEP14 OCT12-SEP13
where last_3month--prev_12month are obtained based on the EVAL_BEFORE_DATE. For the sake of clarification, I put those change variables last_3month--prev_12month in the output data. But I would do some summary statistics like calculating the mean on those change variables instead. Thank you.
What are you using the last_12month etc variables for? My theory is that you don't actually need them.
Hi Reeza,
I want to obtain some summary statistics on those change variables (e.g.,last_12month). So those added columns on the output dataset should be those summary statistics (mean or standard deviations). I put those values (e.g.,NOV12-JAN13) just for the sake of illustration.
Change your variable name structure to V201201 -- V201212 then you can use the method suggested here:
SAS Sum specific column that the name are stored in other column - Stack Overflow
The issue/annoyance to me would be the manual standard deviation calculation.
I'd consider flipping the data and then using proc means instead myself.
Here's an example that uses VvalueX and the INTNX function to generate the months. You can set up separate arrays for each lag period and recalculated as needed.
*Generate sample data;
data have;
array Y2(12) Jan14 Feb14 Mar14 Apr14 May14 Jun14 Jul14 Aug14 Sep14 Oct14 Nov14 Dec14;
format last_date date9.;
do j=1 to 2;
do i=1 to 12;
y2(i)=rand('normal', 52, 5);
end;
if j=1 then last_date='01May2014'd;
else last_date='01Sep2014'd;
output;
end;
drop j i;
run;
*Sample output for previous 3 months;
data want;
set have;
array v_3(3) v1 v2 v3;
do i=0 to 2;
var=put(intnx('month', last_date, -1*i), monyy5.);
v_3(i+1)=vvaluex(var);
end;
month3_mean=mean(of v_3(*));
month3_std=std(of v_3(*));
run;
As an alternative, try normalizing your data. Merely changing the structure from wide to long sometimes helps greatly.
ID DATE VAL
1 01JAN11 23
1 01DEC11 45
1 01JAN12 34
...
Then its a simple case of using some SQL queries to roll the data up.
Hi Reeza,
Thanks a lot for pointing out this method. It worked out for me. Below are the code I extended for my purposes.
*Sample output for change variables months;
data wantFinal (drop=v:);
set have;
/*last_3month*/
array l_3(3) v1 v2 v3;
do i=-3, -2, -1;
var=put(intnx('month', last_date, i), monyy5.);
l_3(i+4)=vvaluex(var);
end;
month3l_mean=mean(of l_3(*));
month3l_std=std(of l_3(*));
*previous 3 month;
array p_3(3) v4 v5 v6;
do i=-6,-5,-4;
var=put(intnx('month', last_date, i), monyy5.);
p_3(i+7)=vvaluex(var);
end;
month3p_mean=mean(of p_3(*));
month3p_std=std(of p_3(*));
*last 6 month;
array l_6(6) v61 v62 v63 v64 v65 v66;
do i=-6,-5,-4,-3,-2,-1;
var=put(intnx('month', last_date, i), monyy5.);
l_6(i+7)=vvaluex(var);
end;
month6l_mean=mean(of l_6(*));
month6l_std=std(of l_6(*));
*previous 6 month;
array p_6(6) v67 v68 v69 v610 v611 v612;
do i=-12,-11,-10,-9,-8,-7;
var=put(intnx('month', last_date, i), monyy5.);
p_6(i+13)=vvaluex(var);
end;
month6p_mean=mean(of p_6(*));
month6p_std=std(of p_6(*));
*last 12 month;
array l_12(12) v121 v122 v123 v124 v125 v126 v127 v128 v129 v1210 v1211 v1212;
do i=-12,-11,-10,-9,-8,-7,-6,-5,-4,-3,-2,-1;
var=put(intnx('month', last_date, i), monyy5.);
l_12(i+13)=vvaluex(var);
end;
month12l_mean=mean(of l_12(*));
month12l_std=std(of l_12(*));
*previous 12 month;
array p_12(12) v1210 v1220 v1230 v1240 v1250 v1260 v1270 v1280 v1290 v12100 v12110 v12120;
do i=-24,-23,-22,-21,-20,-19,-18,-17,-16,-15,-14,-13;
var=put(intnx('month', last_date, i), monyy5.);
p_12(i+25)=vvaluex(var);
end;
month12p_mean=mean(of p_12(*));
month12p_std=std(of p_12(*));
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
