Generate lags for missing data

I have data set A, which has missing days, and data set B, which has every trading day; merging them gives data set C. For the missing days that fall within the trading boundaries of each stock in data A (S001, S002, S00..), I want the return to equal the previous day's return (lag 1). For example, for stock S001 I would like to fill the missing days 7/4/2012 and 7/12/2012 to 7/14/2012 with .2, .21, .21 and .21 respectively, but not the days outside the boundaries of 6/26/2012 and 7/22/2012, when the stock is not listed. How can I do this for data C and extend it to n stocks?

Thank you and best regards.

A has gaps, B has every trading day, and C is B merged with A. A consists of the rows of C below that have a RETURN; B consists of the STOCK and DATE columns of every row of C. A dot marks a missing RETURN.

C

STOCK  DATE       RETURN
S001   6/21/2012  .
S001   6/22/2012  .
S001   6/23/2012  .
S001   6/24/2012  .
S001   6/25/2012  .
S001   6/26/2012  .
S001   6/27/2012  0.19
S001   6/28/2012  0.19
S001   6/29/2012  0.2
S001   6/30/2012  0.2
S001   7/1/2012   0.21
S001   7/2/2012   0.21
S001   7/3/2012   0.2
S001   7/4/2012   .
S001   7/5/2012   0.19
S001   7/6/2012   0.19
S001   7/7/2012   0.19
S001   7/8/2012   0.2
S001   7/9/2012   0.2
S001   7/10/2012  0.21
S001   7/11/2012  0.21
S001   7/12/2012  .
S001   7/13/2012  .
S001   7/14/2012  .
S001   7/15/2012  0.19
S001   7/16/2012  0.2
S001   7/17/2012  0.21
S001   7/18/2012  0.21
S001   7/19/2012  0.21
S001   7/20/2012  0.24
S001   7/21/2012  0.25
S001   7/22/2012  .
S001   7/23/2012  .
S002   6/21/2012  .
S002   6/22/2012  .
S002   6/23/2012  .
S002   6/24/2012  .
S002   6/25/2012  .
S002   6/26/2012  .
S002   6/27/2012  0.19
S002   6/28/2012  0.19
S002   6/29/2012  0.2
S002   6/30/2012  0.2
S002   7/1/2012   0.21
S002   7/2/2012   0.21
S002   7/3/2012   0.2
S002   7/4/2012   0.19
S002   7/5/2012   0.19
S002   7/6/2012   0.19
S002   7/7/2012   0.19
S002   7/8/2012   0.2
S002   7/9/2012   0.2
S002   7/10/2012  0.21
S002   7/11/2012  0.21
S002   7/12/2012  .
S002   7/13/2012  .
S002   7/14/2012  .
S002   7/15/2012  0.19
S002   7/16/2012  0.2
S002   7/17/2012  0.21
S002   7/18/2012  0.21
S002   7/19/2012  0.21
S002   7/20/2012  0.24
S002   7/21/2012  .
S002   7/22/2012  .
S002   7/23/2012  .

Accepted Solutions
Solution
‎06-30-2012 08:41 AM
Posts: 4,737

Re: Generate lags for missing data

One way to go:

/* Data set A: returns with some trading days missing */
data A;
infile datalines dsd;
input STOCK $ DATE:mmddyy. RETURN 32.;
format date date9.;
datalines;
S001,6/27/2012,0.19
S001,6/28/2012,0.19
S001,6/29/2012,0.2
S001,6/30/2012,0.2
S001,7/01/2012,0.21
S001,7/02/2012,0.21
S001,7/03/2012,0.2
S001,7/05/2012,0.19
S001,7/06/2012,0.19
S001,7/07/2012,0.19
S001,7/08/2012,0.2
S001,7/09/2012,0.2
S001,7/10/2012,0.21
S001,7/11/2012,0.21
S001,7/15/2012,0.19
S001,7/16/2012,0.2
S001,7/17/2012,0.21
S001,7/18/2012,0.21
S001,7/19/2012,0.21
S001,7/20/2012,0.24
S001,7/21/2012,0.25
S002,6/27/2012,0.19
S002,6/28/2012,0.19
S002,6/29/2012,0.2
S002,6/30/2012,0.2
S002,7/01/2012,0.21
S002,7/02/2012,0.21
S002,7/03/2012,0.2
S002,7/04/2012,0.19
S002,7/05/2012,0.19
S002,7/06/2012,0.19
S002,7/07/2012,0.19
S002,7/08/2012,0.2
S002,7/09/2012,0.2
S002,7/10/2012,0.21
S002,7/11/2012,0.21
S002,7/15/2012,0.19
S002,7/16/2012,0.2
S002,7/17/2012,0.21
S002,7/18/2012,0.21
S002,7/19/2012,0.21
S002,7/20/2012,0.24
;
run;

/* View of A with each stock's last date in A attached to every row */
proc sql;
create view V_A as
select *, max(date) as _max_date format=date9.
from A
group by Stock
order by Stock, Date
;
quit;

/* Data set B: every trading day for every stock */
data B;
infile datalines dsd;
input STOCK $ DATE:mmddyy.;
format date date9.;
datalines;
S001,6/21/2012
S001,6/22/2012
S001,6/23/2012
S001,6/24/2012
S001,6/25/2012
S001,6/26/2012
S001,6/27/2012
S001,6/28/2012
S001,6/29/2012
S001,6/30/2012
S001,7/01/2012
S001,7/02/2012
S001,7/03/2012
S001,7/04/2012
S001,7/05/2012
S001,7/06/2012
S001,7/07/2012
S001,7/08/2012
S001,7/09/2012
S001,7/10/2012
S001,7/11/2012
S001,7/12/2012
S001,7/13/2012
S001,7/14/2012
S001,7/15/2012
S001,7/16/2012
S001,7/17/2012
S001,7/18/2012
S001,7/19/2012
S001,7/20/2012
S001,7/21/2012
S001,7/22/2012
S001,7/23/2012
S002,6/21/2012
S002,6/22/2012
S002,6/23/2012
S002,6/24/2012
S002,6/25/2012
S002,6/26/2012
S002,6/27/2012
S002,6/28/2012
S002,6/29/2012
S002,6/30/2012
S002,7/01/2012
S002,7/02/2012
S002,7/03/2012
S002,7/04/2012
S002,7/05/2012
S002,7/06/2012
S002,7/07/2012
S002,7/08/2012
S002,7/09/2012
S002,7/10/2012
S002,7/11/2012
S002,7/12/2012
S002,7/13/2012
S002,7/14/2012
S002,7/15/2012
S002,7/16/2012
S002,7/17/2012
S002,7/18/2012
S002,7/19/2012
S002,7/20/2012
S002,7/21/2012
S002,7/22/2012
S002,7/23/2012
;
run;

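/* Merge A (with each stock's last date) onto the full calendar in B */
data C (drop=_:);
merge V_A (in=inA) B;
by stock date;
retain _R_max_date _R_return;
/* Reset the carried-forward values at the start of each stock */
if first.stock then call missing(_R_max_date,_R_return);
_R_max_date =coalesce(_max_date,_R_max_date);
_R_return   =coalesce(return,_R_return);
/* Fill only days absent from A that fall before the stock's last date in A */
if not inA and date < _R_max_date then return=_R_return;
run;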
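
To check the result against the values asked for in the question, it may help to print just the S001 days that were missing from A. This is only a sketch and assumes data set C was created by the step above with the variable names used there:

```sas
/* Show the S001 days that are absent from A, inside and outside its date range */
proc print data=C noobs;
  where stock = 'S001'
    and date in ('04JUL2012'd, '12JUL2012'd, '13JUL2012'd,
                 '14JUL2012'd, '22JUL2012'd, '23JUL2012'd);
  var stock date return;
run;
```

If the fill works as intended, 04JUL2012 should show 0.2, 12JUL2012 to 14JUL2012 should show 0.21, and 22JUL2012 and 23JUL2012 should still be missing because they fall after the last date for S001 in A.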

All Replies

Posts: 3,167

Re: Generate lags for missing data

If you already have C and you also have SAS/ETS, then besides a data step you could also try:

proc timeseries data=c out=want;
id date interval=day setmissing=previous;
var return;
by stock;
run;

Haikuo

The following is the data step approach:

data want;
retain _r;
set c;
by stock;
return=coalesce(return,_r);
_r=ifn(last.stock,.,return);
drop _r;
run;
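
One thing to watch: as far as I can tell, the data step above (and, I believe, setmissing=previous as well) also carries the last return forward into the days after a stock's final date in A, such as 7/22 and 7/23 for S001, which the question wanted left empty. A possible adjustment is a two-pass data step (a double DOW loop). This is only a sketch, assuming C is sorted by stock and date; the names want2, _last and _r are made up for illustration:

```sas
data want2;
  /* Pass 1: find the last date with a non-missing return for this stock */
  do until (last.stock);
    set c;
    by stock;
    if not missing(return) then _last = date;
  end;
  /* Pass 2: re-read the same stock and fill gaps only before that date */
  do until (last.stock);
    set c;
    by stock;
    if not missing(return) then _r = return;   /* remember the latest return */
    else if date < _last then return = _r;     /* fill an interior gap */
    output;
  end;
  drop _last _r;
run;
```

Because _last and _r are not retained, they start out missing for every stock, so days before the first return and days after the last one stay missing.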
