Generate lags for missing data

I have data set A, which has gaps (missing days), and data set B, which contains every trading day; merging them gives data set C. For days that are missing in A, and only within the range of trading days that A holds for each stock (S001, S002, ...), the return should equal the return of the previous day (lag 1). To be more specific, for stock S001 I would like to fill the missing days 7/4/2012 and 7/12/2012 to 7/14/2012 (with 0.2, 0.21, 0.21 and 0.21 respectively), but not the days outside the boundaries of 6/26/2012 and 7/22/2012, when the stock is not listed. How can I do this for data set C in a way that extends to n stocks?

Thank you and best regards.

A                       | B               | C
STOCK DATE       RETURN | STOCK DATE      | STOCK DATE       RETURN
S001  6/27/2012  0.19   | S001  6/21/2012 | S001  6/21/2012
S001  6/28/2012  0.19   | S001  6/22/2012 | S001  6/22/2012
S001  6/29/2012  0.2    | S001  6/23/2012 | S001  6/23/2012
S001  6/30/2012  0.2    | S001  6/24/2012 | S001  6/24/2012
S001  7/1/2012   0.21   | S001  6/25/2012 | S001  6/25/2012
S001  7/2/2012   0.21   | S001  6/26/2012 | S001  6/26/2012
S001  7/3/2012   0.2    | S001  6/27/2012 | S001  6/27/2012  0.19
S001  7/5/2012   0.19   | S001  6/28/2012 | S001  6/28/2012  0.19
S001  7/6/2012   0.19   | S001  6/29/2012 | S001  6/29/2012  0.2
S001  7/7/2012   0.19   | S001  6/30/2012 | S001  6/30/2012  0.2
S001  7/8/2012   0.2    | S001  7/1/2012  | S001  7/1/2012   0.21
S001  7/9/2012   0.2    | S001  7/2/2012  | S001  7/2/2012   0.21
S001  7/10/2012  0.21   | S001  7/3/2012  | S001  7/3/2012   0.2
S001  7/11/2012  0.21   | S001  7/4/2012  | S001  7/4/2012
S001  7/15/2012  0.19   | S001  7/5/2012  | S001  7/5/2012   0.19
S001  7/16/2012  0.2    | S001  7/6/2012  | S001  7/6/2012   0.19
S001  7/17/2012  0.21   | S001  7/7/2012  | S001  7/7/2012   0.19
S001  7/18/2012  0.21   | S001  7/8/2012  | S001  7/8/2012   0.2
S001  7/19/2012  0.21   | S001  7/9/2012  | S001  7/9/2012   0.2
S001  7/20/2012  0.24   | S001  7/10/2012 | S001  7/10/2012  0.21
S001  7/21/2012  0.25   | S001  7/11/2012 | S001  7/11/2012  0.21
S002  6/27/2012  0.19   | S001  7/12/2012 | S001  7/12/2012
S002  6/28/2012  0.19   | S001  7/13/2012 | S001  7/13/2012
S002  6/29/2012  0.2    | S001  7/14/2012 | S001  7/14/2012
S002  6/30/2012  0.2    | S001  7/15/2012 | S001  7/15/2012  0.19
S002  7/1/2012   0.21   | S001  7/16/2012 | S001  7/16/2012  0.2
S002  7/2/2012   0.21   | S001  7/17/2012 | S001  7/17/2012  0.21
S002  7/3/2012   0.2    | S001  7/18/2012 | S001  7/18/2012  0.21
S002  7/4/2012   0.19   | S001  7/19/2012 | S001  7/19/2012  0.21
S002  7/5/2012   0.19   | S001  7/20/2012 | S001  7/20/2012  0.24
S002  7/6/2012   0.19   | S001  7/21/2012 | S001  7/21/2012  0.25
S002  7/7/2012   0.19   | S001  7/22/2012 | S001  7/22/2012
S002  7/8/2012   0.2    | S001  7/23/2012 | S001  7/23/2012
S002  7/9/2012   0.2    | S002  6/21/2012 | S002  6/21/2012
S002  7/10/2012  0.21   | S002  6/22/2012 | S002  6/22/2012
S002  7/11/2012  0.21   | S002  6/23/2012 | S002  6/23/2012
S002  7/15/2012  0.19   | S002  6/24/2012 | S002  6/24/2012
S002  7/16/2012  0.2    | S002  6/25/2012 | S002  6/25/2012
S002  7/17/2012  0.21   | S002  6/26/2012 | S002  6/26/2012
S002  7/18/2012  0.21   | S002  6/27/2012 | S002  6/27/2012  0.19
S002  7/19/2012  0.21   | S002  6/28/2012 | S002  6/28/2012  0.19
S002  7/20/2012  0.24   | S002  6/29/2012 | S002  6/29/2012  0.2
                        | S002  6/30/2012 | S002  6/30/2012  0.2
                        | S002  7/1/2012  | S002  7/1/2012   0.21
                        | S002  7/2/2012  | S002  7/2/2012   0.21
                        | S002  7/3/2012  | S002  7/3/2012   0.2
                        | S002  7/4/2012  | S002  7/4/2012   0.19
                        | S002  7/5/2012  | S002  7/5/2012   0.19
                        | S002  7/6/2012  | S002  7/6/2012   0.19
                        | S002  7/7/2012  | S002  7/7/2012   0.19
                        | S002  7/8/2012  | S002  7/8/2012   0.2
                        | S002  7/9/2012  | S002  7/9/2012   0.2
                        | S002  7/10/2012 | S002  7/10/2012  0.21
                        | S002  7/11/2012 | S002  7/11/2012  0.21
                        | S002  7/12/2012 | S002  7/12/2012
                        | S002  7/13/2012 | S002  7/13/2012
                        | S002  7/14/2012 | S002  7/14/2012
                        | S002  7/15/2012 | S002  7/15/2012  0.19
                        | S002  7/16/2012 | S002  7/16/2012  0.2
                        | S002  7/17/2012 | S002  7/17/2012  0.21
                        | S002  7/18/2012 | S002  7/18/2012  0.21
                        | S002  7/19/2012 | S002  7/19/2012  0.21
                        | S002  7/20/2012 | S002  7/20/2012  0.24
                        | S002  7/21/2012 | S002  7/21/2012
                        | S002  7/22/2012 | S002  7/22/2012
                        | S002  7/23/2012 | S002  7/23/2012

Accepted Solution
06-30-2012 08:41 AM

Re: Generate lags for missing data

One way to go:

data A;
  infile datalines dsd;
  input STOCK $ DATE:mmddyy. RETURN 32.;
  format date date9.;
datalines;
S001,6/27/2012,0.19
S001,6/28/2012,0.19
S001,6/29/2012,0.2
S001,6/30/2012,0.2
S001,7/01/2012,0.21
S001,7/02/2012,0.21
S001,7/03/2012,0.2
S001,7/05/2012,0.19
S001,7/06/2012,0.19
S001,7/07/2012,0.19
S001,7/08/2012,0.2
S001,7/09/2012,0.2
S001,7/10/2012,0.21
S001,7/11/2012,0.21
S001,7/15/2012,0.19
S001,7/16/2012,0.2
S001,7/17/2012,0.21
S001,7/18/2012,0.21
S001,7/19/2012,0.21
S001,7/20/2012,0.24
S001,7/21/2012,0.25
S002,6/27/2012,0.19
S002,6/28/2012,0.19
S002,6/29/2012,0.2
S002,6/30/2012,0.2
S002,7/01/2012,0.21
S002,7/02/2012,0.21
S002,7/03/2012,0.2
S002,7/04/2012,0.19
S002,7/05/2012,0.19
S002,7/06/2012,0.19
S002,7/07/2012,0.19
S002,7/08/2012,0.2
S002,7/09/2012,0.2
S002,7/10/2012,0.21
S002,7/11/2012,0.21
S002,7/15/2012,0.19
S002,7/16/2012,0.2
S002,7/17/2012,0.21
S002,7/18/2012,0.21
S002,7/19/2012,0.21
S002,7/20/2012,0.24
;
run;

proc sql;
  create view V_A as 
  select *, max(date) as _max_date format=date9.
  from A
  group by Stock
  order by Stock, Date
  ;
quit;

data B;
  infile datalines dsd;
  input STOCK $ DATE:mmddyy.;
  format date date9.;
datalines;
S001,6/21/2012
S001,6/22/2012
S001,6/23/2012
S001,6/24/2012
S001,6/25/2012
S001,6/26/2012
S001,6/27/2012
S001,6/28/2012
S001,6/29/2012
S001,6/30/2012
S001,7/01/2012
S001,7/02/2012
S001,7/03/2012
S001,7/04/2012
S001,7/05/2012
S001,7/06/2012
S001,7/07/2012
S001,7/08/2012
S001,7/09/2012
S001,7/10/2012
S001,7/11/2012
S001,7/12/2012
S001,7/13/2012
S001,7/14/2012
S001,7/15/2012
S001,7/16/2012
S001,7/17/2012
S001,7/18/2012
S001,7/19/2012
S001,7/20/2012
S001,7/21/2012
S001,7/22/2012
S001,7/23/2012
S002,6/21/2012
S002,6/22/2012
S002,6/23/2012
S002,6/24/2012
S002,6/25/2012
S002,6/26/2012
S002,6/27/2012
S002,6/28/2012
S002,6/29/2012
S002,6/30/2012
S002,7/01/2012
S002,7/02/2012
S002,7/03/2012
S002,7/04/2012
S002,7/05/2012
S002,7/06/2012
S002,7/07/2012
S002,7/08/2012
S002,7/09/2012
S002,7/10/2012
S002,7/11/2012
S002,7/12/2012
S002,7/13/2012
S002,7/14/2012
S002,7/15/2012
S002,7/16/2012
S002,7/17/2012
S002,7/18/2012
S002,7/19/2012
S002,7/20/2012
S002,7/21/2012
S002,7/22/2012
S002,7/23/2012
;
run;


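data C (drop=_:);
  merge V_A (in=inA) B;
  by stock date;
  retain _R_max_date _R_return;
  /* reset the carried-forward values at the start of each stock */
  if first.stock then call missing(_R_max_date,_R_return);
  /* remember the stock's last date in A and its most recent return */
  _R_max_date =coalesce(_max_date,_R_max_date);
  _R_return   =coalesce(return,_R_return);
  /* fill days that exist only in B, but only before the stock's last date in A */
  if not inA and date < _R_max_date then return=_R_return;
run;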
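
To spot-check the result, a step along the following lines could print the S001 rows around the gaps. It is only a sketch and assumes the data set C and the variable names from the code above. Per the question, 7/4/2012 should show 0.2 and 7/12/2012 to 7/14/2012 should show 0.21, while 7/22/2012 and 7/23/2012 should stay missing.

```sas
/* Sketch: list S001 from 3 July 2012 onward to inspect the filled gaps */
proc print data=C noobs;
  where stock = 'S001' and date >= '03JUL2012'd;
  var stock date return;
run;
```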


All Replies

Re: Generate lags for missing data

If you already have C and you also have SAS/ETS, then besides a DATA step you could also try:

proc timeseries data=c out=want;
  id date interval=day setmissing=previous;
  var return;
  by stock;
run;

Haikuo

The following is the DATA step approach:

data want;
  retain _r;
  set c;
  by stock;
  /* use the previous row's return when the current one is missing */
  return=coalesce(return,_r);
  /* carry the return to the next row; reset at the end of each stock */
  _r=ifn(last.stock,.,return);
  drop _r;
run;
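
Note that the DATA step above resets _r only at last.stock, so it also carries the last return forward through trailing days such as 7/22/2012 and 7/23/2012 for S001. If only C is available and the fill should stop at each stock's last observed return, one possible sketch follows. The names C_bounds, _last and _r are hypothetical, and the sketch assumes that return is missing in C exactly on the days that are absent from A.

```sas
/* Sketch only: C_bounds, _last and _r are hypothetical names */
proc sql;
  create table C_bounds as
  select *,
         /* last date per stock that has a non-missing return */
         max(case when return is not missing then date end) as _last
  from C
  group by stock
  order by stock, date;
quit;

data want (drop=_last _r);
  set C_bounds;
  by stock;
  retain _r;
  if first.stock then _r = .;                 /* nothing to carry at the start of a stock */
  if not missing(return) then _r = return;    /* remember the latest observed return */
  else if date < _last then return = _r;      /* fill gaps only before the last observed day */
run;
```

Days before a stock's first observed return stay missing because _r has nothing to carry yet, and days after _last are left untouched by the date comparison.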
