I want to calculate 3 lag values of invest i.e. lag1, lag2, lag3, but the problem is some of the years in my data are missing, how could it be treated
year | CUSIP | Invest |
2003 | BN5522 | 3109000 |
2004 | BN5522 | 2997000 |
2005 | BN5522 | 3496000 |
2006 | BN5522 | 2998873 |
2009 | BN5522 | 4642000 |
2010 | BN5522 | 5006000 |
2011 | BN5522 | 4837000 |
2012 | BN5522 | 6312000 |
2003 | APC6748L | 26679 |
2004 | APC6748L | 29661 |
2005 | APC6748L | 65606 |
2007 | APC6748L | 28439 |
2008 | APC6748L | 18375 |
2011 | APC6748L | 62001 |
2012 | APC6748L | 39696 |
2013 | APC6748L | 39492 |
2003 | BM01045 | . |
2004 | BM01045 | 0 |
2005 | BM01045 | 741997 |
2008 | BM01045 | 80947 |
2009 | BM01045 | 142666 |
If you have SAS/ETS use PROC EXPAND. You may need to use PROC TIMESERIES to complete the series.
Instead of using LAG it will be easier to transpose the dataset and then
define array of yearly invests. Try:
proc transpose data=have
out=temp(drop=_name_) profix=yr;
by CUSIP;
id YEAR;
var INVEST;
run;
data temp1;
set temp;
array y_invest yr2000-yr2015;
... any other code to do the job ...
run;
I take it that you only want exact 1-year, 2-year, and 3-year lags. This program does that.
Note I use nested IFN's. That's because every alternative outcome in an IFN is evaluated, whether the if-condition is true or not (i.e. regardless of which outcome is chosen). This ensures that the fifo queues underlying the LAG functions are always properly synchronized.
data want (drop=seq);
set have;
by cusip;
seq+1;
if first.cusip then seq=1;
lgyears1= ifn(seq>1,ifn(lag1(year)=year-1,lag1(invest),.),.);
lgyears2= ifn(seq>2,ifn(lag2(year)=year-2,lag2(invest),.),.);
lgyears3= ifn(seq>3,ifn(lag3(year)=year-3,lag3(invest),.),.);
run;
Now, if you only want to keep years which have a complete set of non-missing lags, then put a subsetting IF statement
if nmiss(lgyears1,lgyears2,lgyears3)=0;
just before the RUN statement.
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.