Hi,
I'd like to use proc expand to generate a moving maxium number between lagged 3 and lagged windows. For example, the data I have is like this:
id | year | amt |
1 | 1991 | 1 |
1 | 1992 | 2 |
1 | 1993 | 3 |
1 | 1994 | 4 |
1 | 1995 | 5 |
1 | 1996 | 6 |
1 | 1997 | 7 |
1 | 1998 | 8 |
1 | 1999 | 9 |
I want to generate a new column maxamt, equal to max(lagged 5, lagged 4, lagged 3) like this:
id | year | amt | maxamt |
1 | 1991 | 1 | |
1 | 1992 | 2 | |
1 | 1993 | 3 | |
1 | 1994 | 4 | |
1 | 1995 | 5 | |
1 | 1996 | 6 | 3 |
1 | 1997 | 7 | 4 |
1 | 1998 | 8 | 5 |
1 | 1999 | 9 | 6 |
Can it be realized through proc expand?
Thanks!
Why not a DATA step?
data want;
set have;
by id;
if first.id
then count = 1;
else count + 1;
maxamt = max(lag3(amt),lag4(amt),lag5(amt));
if count le 5 then maxamt = .;
drop count;
run;
Untested, posted from my tablet.
Hello,
Why is maxamt empty for 1995??
Anyway, you can do this in one step by using:
With PROC EXPAND, you will need a post-processing data step (2 steps in total).
See here:
data have;
input id year amt;
yeardt=MDY(12,31,year);
format yeardt date9.;
datalines;
1 1991 1
1 1992 2
1 1993 3
1 1994 4
1 1995 5
1 1996 6
1 1997 7
1 1998 8
1 1999 9
;
run;
proc expand data=have out=want method=none;
id yeardt;
convert amt;
convert amt = amt_lag3 / transformout=(lag 3);
convert amt = amt_lag4 / transformout=(lag 4);
convert amt = amt_lag5 / transformout=(lag 4);
run;
data want;
set want;
if (amt_lag3=. or amt_lag4=. or amt_lag5=.) then maxamt=.;
else maxamt=max(amt_lag3,amt_lag4,amt_lag5);
run;
/* end of program */
Koen
OK, the data set named 'work.want_array' is what you want.
PROC TIMEDATA is also known as "the Data Step for Time Series".
data have;
input id year amt;
yeardt=MDY(12,31,year);
format yeardt date9.;
datalines;
1 1991 1
1 1992 2
1 1993 3
1 1994 4
1 1995 5
1 1996 6
1 1997 7
1 1998 8
1 1999 9
;
run;
proc timedata data=have out=want outarray=work.want_array print=(arrays);
id yeardt interval=year accumulate=average format=yymmdd.;
vars amt;
outarrays amt_lag3 amt_lag4 amt_lag5 maxamt;
do t = 1 to dim(amt);
amt_lag3[t] = amt[t-3];
amt_lag4[t] = amt[t-4];
amt_lag5[t] = amt[t-5];
if (amt_lag3[t]=. or amt_lag4[t]=. or amt_lag5[t]=.) then maxamt[t]=.;
else maxamt[t] = max(amt_lag3[t],amt_lag4[t],amt_lag5[t]);
end;
run;
/* end of program */
Koen
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.