Hi,
I'd like to use proc expand to generate a moving maxium number between lagged 3 and lagged windows. For example, the data I have is like this:
id | year | amt |
1 | 1991 | 1 |
1 | 1992 | 2 |
1 | 1993 | 3 |
1 | 1994 | 4 |
1 | 1995 | 5 |
1 | 1996 | 6 |
1 | 1997 | 7 |
1 | 1998 | 8 |
1 | 1999 | 9 |
I want to generate a new column maxamt, equal to max(lagged 5, lagged 4, lagged 3) like this:
id | year | amt | maxamt |
1 | 1991 | 1 | |
1 | 1992 | 2 | |
1 | 1993 | 3 | |
1 | 1994 | 4 | |
1 | 1995 | 5 | |
1 | 1996 | 6 | 3 |
1 | 1997 | 7 | 4 |
1 | 1998 | 8 | 5 |
1 | 1999 | 9 | 6 |
Can it be realized through proc expand?
Thanks!
Why not a DATA step?
data want;
set have;
by id;
if first.id
then count = 1;
else count + 1;
maxamt = max(lag3(amt),lag4(amt),lag5(amt));
if count le 5 then maxamt = .;
drop count;
run;
Untested, posted from my tablet.
Hello,
Why is maxamt empty for 1995??
Anyway, you can do this in one step by using:
With PROC EXPAND, you will need a post-processing data step (2 steps in total).
See here:
data have;
input id year amt;
yeardt=MDY(12,31,year);
format yeardt date9.;
datalines;
1 1991 1
1 1992 2
1 1993 3
1 1994 4
1 1995 5
1 1996 6
1 1997 7
1 1998 8
1 1999 9
;
run;
proc expand data=have out=want method=none;
id yeardt;
convert amt;
convert amt = amt_lag3 / transformout=(lag 3);
convert amt = amt_lag4 / transformout=(lag 4);
convert amt = amt_lag5 / transformout=(lag 4);
run;
data want;
set want;
if (amt_lag3=. or amt_lag4=. or amt_lag5=.) then maxamt=.;
else maxamt=max(amt_lag3,amt_lag4,amt_lag5);
run;
/* end of program */
Koen
OK, the data set named 'work.want_array' is what you want.
PROC TIMEDATA is also known as "the Data Step for Time Series".
data have;
input id year amt;
yeardt=MDY(12,31,year);
format yeardt date9.;
datalines;
1 1991 1
1 1992 2
1 1993 3
1 1994 4
1 1995 5
1 1996 6
1 1997 7
1 1998 8
1 1999 9
;
run;
proc timedata data=have out=want outarray=work.want_array print=(arrays);
id yeardt interval=year accumulate=average format=yymmdd.;
vars amt;
outarrays amt_lag3 amt_lag4 amt_lag5 maxamt;
do t = 1 to dim(amt);
amt_lag3[t] = amt[t-3];
amt_lag4[t] = amt[t-4];
amt_lag5[t] = amt[t-5];
if (amt_lag3[t]=. or amt_lag4[t]=. or amt_lag5[t]=.) then maxamt[t]=.;
else maxamt[t] = max(amt_lag3[t],amt_lag4[t],amt_lag5[t]);
end;
run;
/* end of program */
Koen
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.