I have data set A, which has missing trading days, and data set B, which has the full set of trading days; merging them gives data set C. For the days that are missing from A but fall within the trading range of each stock (S001, S002, ...) in A, I would like the return to equal the previous day's return (lag 1). More specifically, for stock S001 I would like to fill the missing days 7/4/2012 and 7/12/2012 to 7/14/2012 (with 0.2, 0.21, 0.21 and 0.21 respectively), but not the days outside the range, that is, on or before 6/26/2012 and on or after 7/22/2012, when the stock is not listed. How can I do this for data set C and extend it to n stocks?
Thank you and best regards.
| A: STOCK | A: DATE | A: RETURN | B: STOCK | B: DATE | C: STOCK | C: DATE | C: RETURN |
| S001 | 6/27/2012 | 0.19 | S001 | 6/21/2012 | S001 | 6/21/2012 | |
| S001 | 6/28/2012 | 0.19 | S001 | 6/22/2012 | S001 | 6/22/2012 | |
| S001 | 6/29/2012 | 0.2 | S001 | 6/23/2012 | S001 | 6/23/2012 | |
| S001 | 6/30/2012 | 0.2 | S001 | 6/24/2012 | S001 | 6/24/2012 | |
| S001 | 7/1/2012 | 0.21 | S001 | 6/25/2012 | S001 | 6/25/2012 | |
| S001 | 7/2/2012 | 0.21 | S001 | 6/26/2012 | S001 | 6/26/2012 | |
| S001 | 7/3/2012 | 0.2 | S001 | 6/27/2012 | S001 | 6/27/2012 | 0.19 |
| S001 | 7/5/2012 | 0.19 | S001 | 6/28/2012 | S001 | 6/28/2012 | 0.19 |
| S001 | 7/6/2012 | 0.19 | S001 | 6/29/2012 | S001 | 6/29/2012 | 0.2 |
| S001 | 7/7/2012 | 0.19 | S001 | 6/30/2012 | S001 | 6/30/2012 | 0.2 |
| S001 | 7/8/2012 | 0.2 | S001 | 7/1/2012 | S001 | 7/1/2012 | 0.21 |
| S001 | 7/9/2012 | 0.2 | S001 | 7/2/2012 | S001 | 7/2/2012 | 0.21 |
| S001 | 7/10/2012 | 0.21 | S001 | 7/3/2012 | S001 | 7/3/2012 | 0.2 |
| S001 | 7/11/2012 | 0.21 | S001 | 7/4/2012 | S001 | 7/4/2012 | |
| S001 | 7/15/2012 | 0.19 | S001 | 7/5/2012 | S001 | 7/5/2012 | 0.19 |
| S001 | 7/16/2012 | 0.2 | S001 | 7/6/2012 | S001 | 7/6/2012 | 0.19 |
| S001 | 7/17/2012 | 0.21 | S001 | 7/7/2012 | S001 | 7/7/2012 | 0.19 |
| S001 | 7/18/2012 | 0.21 | S001 | 7/8/2012 | S001 | 7/8/2012 | 0.2 |
| S001 | 7/19/2012 | 0.21 | S001 | 7/9/2012 | S001 | 7/9/2012 | 0.2 |
| S001 | 7/20/2012 | 0.24 | S001 | 7/10/2012 | S001 | 7/10/2012 | 0.21 |
| S001 | 7/21/2012 | 0.25 | S001 | 7/11/2012 | S001 | 7/11/2012 | 0.21 |
| S002 | 6/27/2012 | 0.19 | S001 | 7/12/2012 | S001 | 7/12/2012 | |
| S002 | 6/28/2012 | 0.19 | S001 | 7/13/2012 | S001 | 7/13/2012 | |
| S002 | 6/29/2012 | 0.2 | S001 | 7/14/2012 | S001 | 7/14/2012 | |
| S002 | 6/30/2012 | 0.2 | S001 | 7/15/2012 | S001 | 7/15/2012 | 0.19 |
| S002 | 7/1/2012 | 0.21 | S001 | 7/16/2012 | S001 | 7/16/2012 | 0.2 |
| S002 | 7/2/2012 | 0.21 | S001 | 7/17/2012 | S001 | 7/17/2012 | 0.21 |
| S002 | 7/3/2012 | 0.2 | S001 | 7/18/2012 | S001 | 7/18/2012 | 0.21 |
| S002 | 7/4/2012 | 0.19 | S001 | 7/19/2012 | S001 | 7/19/2012 | 0.21 |
| S002 | 7/5/2012 | 0.19 | S001 | 7/20/2012 | S001 | 7/20/2012 | 0.24 |
| S002 | 7/6/2012 | 0.19 | S001 | 7/21/2012 | S001 | 7/21/2012 | 0.25 |
| S002 | 7/7/2012 | 0.19 | S001 | 7/22/2012 | S001 | 7/22/2012 | |
| S002 | 7/8/2012 | 0.2 | S001 | 7/23/2012 | S001 | 7/23/2012 | |
| S002 | 7/9/2012 | 0.2 | S002 | 6/21/2012 | S002 | 6/21/2012 | |
| S002 | 7/10/2012 | 0.21 | S002 | 6/22/2012 | S002 | 6/22/2012 | |
| S002 | 7/11/2012 | 0.21 | S002 | 6/23/2012 | S002 | 6/23/2012 | |
| S002 | 7/15/2012 | 0.19 | S002 | 6/24/2012 | S002 | 6/24/2012 | |
| S002 | 7/16/2012 | 0.2 | S002 | 6/25/2012 | S002 | 6/25/2012 | |
| S002 | 7/17/2012 | 0.21 | S002 | 6/26/2012 | S002 | 6/26/2012 | |
| S002 | 7/18/2012 | 0.21 | S002 | 6/27/2012 | S002 | 6/27/2012 | 0.19 |
| S002 | 7/19/2012 | 0.21 | S002 | 6/28/2012 | S002 | 6/28/2012 | 0.19 |
| S002 | 7/20/2012 | 0.24 | S002 | 6/29/2012 | S002 | 6/29/2012 | 0.2 |
| | | | S002 | 6/30/2012 | S002 | 6/30/2012 | 0.2 |
| | | | S002 | 7/1/2012 | S002 | 7/1/2012 | 0.21 |
| | | | S002 | 7/2/2012 | S002 | 7/2/2012 | 0.21 |
| | | | S002 | 7/3/2012 | S002 | 7/3/2012 | 0.2 |
| | | | S002 | 7/4/2012 | S002 | 7/4/2012 | 0.19 |
| | | | S002 | 7/5/2012 | S002 | 7/5/2012 | 0.19 |
| | | | S002 | 7/6/2012 | S002 | 7/6/2012 | 0.19 |
| | | | S002 | 7/7/2012 | S002 | 7/7/2012 | 0.19 |
| | | | S002 | 7/8/2012 | S002 | 7/8/2012 | 0.2 |
| | | | S002 | 7/9/2012 | S002 | 7/9/2012 | 0.2 |
| | | | S002 | 7/10/2012 | S002 | 7/10/2012 | 0.21 |
| | | | S002 | 7/11/2012 | S002 | 7/11/2012 | 0.21 |
| | | | S002 | 7/12/2012 | S002 | 7/12/2012 | |
| | | | S002 | 7/13/2012 | S002 | 7/13/2012 | |
| | | | S002 | 7/14/2012 | S002 | 7/14/2012 | |
| | | | S002 | 7/15/2012 | S002 | 7/15/2012 | 0.19 |
| | | | S002 | 7/16/2012 | S002 | 7/16/2012 | 0.2 |
| | | | S002 | 7/17/2012 | S002 | 7/17/2012 | 0.21 |
| | | | S002 | 7/18/2012 | S002 | 7/18/2012 | 0.21 |
| | | | S002 | 7/19/2012 | S002 | 7/19/2012 | 0.21 |
| | | | S002 | 7/20/2012 | S002 | 7/20/2012 | 0.24 |
| | | | S002 | 7/21/2012 | S002 | 7/21/2012 | |
| | | | S002 | 7/22/2012 | S002 | 7/22/2012 | |
| | | | S002 | 7/23/2012 | S002 | 7/23/2012 | |
One way to go:
data A;
  infile datalines dsd;
  /* The colon modifier reads each field up to the next comma before applying the informat */
  input STOCK $ DATE :mmddyy10. RETURN :32.;
  format date date9.;
  datalines;
S001,6/27/2012,0.19
S001,6/28/2012,0.19
S001,6/29/2012,0.2
S001,6/30/2012,0.2
S001,7/01/2012,0.21
S001,7/02/2012,0.21
S001,7/03/2012,0.2
S001,7/05/2012,0.19
S001,7/06/2012,0.19
S001,7/07/2012,0.19
S001,7/08/2012,0.2
S001,7/09/2012,0.2
S001,7/10/2012,0.21
S001,7/11/2012,0.21
S001,7/15/2012,0.19
S001,7/16/2012,0.2
S001,7/17/2012,0.21
S001,7/18/2012,0.21
S001,7/19/2012,0.21
S001,7/20/2012,0.24
S001,7/21/2012,0.25
S002,6/27/2012,0.19
S002,6/28/2012,0.19
S002,6/29/2012,0.2
S002,6/30/2012,0.2
S002,7/01/2012,0.21
S002,7/02/2012,0.21
S002,7/03/2012,0.2
S002,7/04/2012,0.19
S002,7/05/2012,0.19
S002,7/06/2012,0.19
S002,7/07/2012,0.19
S002,7/08/2012,0.2
S002,7/09/2012,0.2
S002,7/10/2012,0.21
S002,7/11/2012,0.21
S002,7/15/2012,0.19
S002,7/16/2012,0.2
S002,7/17/2012,0.21
S002,7/18/2012,0.21
S002,7/19/2012,0.21
S002,7/20/2012,0.24
;
run;
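proc sql;
  /* Attach each stock's last date in A to every row of that stock
     (PROC SQL remerges the summary statistic with the detail rows) */
  create view V_A as
  select *, max(date) as _max_date format=date9.
  from A
  group by Stock
  order by Stock, Date
  ;
quit;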
data B;
infile datalines dsd;
input STOCK $ DATE:mmddyy.;
format date date9.;
datalines;
S001,6/21/2012
S001,6/22/2012
S001,6/23/2012
S001,6/24/2012
S001,6/25/2012
S001,6/26/2012
S001,6/27/2012
S001,6/28/2012
S001,6/29/2012
S001,6/30/2012
S001,7/01/2012
S001,7/02/2012
S001,7/03/2012
S001,7/04/2012
S001,7/05/2012
S001,7/06/2012
S001,7/07/2012
S001,7/08/2012
S001,7/09/2012
S001,7/10/2012
S001,7/11/2012
S001,7/12/2012
S001,7/13/2012
S001,7/14/2012
S001,7/15/2012
S001,7/16/2012
S001,7/17/2012
S001,7/18/2012
S001,7/19/2012
S001,7/20/2012
S001,7/21/2012
S001,7/22/2012
S001,7/23/2012
S002,6/21/2012
S002,6/22/2012
S002,6/23/2012
S002,6/24/2012
S002,6/25/2012
S002,6/26/2012
S002,6/27/2012
S002,6/28/2012
S002,6/29/2012
S002,6/30/2012
S002,7/01/2012
S002,7/02/2012
S002,7/03/2012
S002,7/04/2012
S002,7/05/2012
S002,7/06/2012
S002,7/07/2012
S002,7/08/2012
S002,7/09/2012
S002,7/10/2012
S002,7/11/2012
S002,7/12/2012
S002,7/13/2012
S002,7/14/2012
S002,7/15/2012
S002,7/16/2012
S002,7/17/2012
S002,7/18/2012
S002,7/19/2012
S002,7/20/2012
S002,7/21/2012
S002,7/22/2012
S002,7/23/2012
;
run;
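data C (drop=_:);
  merge V_A (in=inA) B;
  by stock date;
  retain _R_max_date _R_return;
  /* Reset the carried values at the start of each stock */
  if first.stock then call missing(_R_max_date,_R_return);
  /* Remember the stock's last date in A and its latest non-missing return */
  _R_max_date = coalesce(_max_date,_R_max_date);
  _R_return = coalesce(return,_R_return);
  /* Fill only dates that are absent from A and fall before the stock's last date in A */
  if not inA and date < _R_max_date then return=_R_return;
run;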
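As a quick sanity check on the step above, the rows the question singles out can be printed. This is only a sketch; it assumes the data set C built above and picks 7/22/2012 as one example of a date past the last return in A.

```sas
/* Print the S001 gap days from the question, plus one date after the last return in A */
proc print data=C noobs;
  where stock = 'S001'
    and date in ('04JUL2012'd, '12JUL2012'd, '13JUL2012'd, '14JUL2012'd, '22JUL2012'd);
  var stock date return;
run;
```

With the sample data, 04JUL2012 should show 0.2, the three days 12JUL2012 to 14JUL2012 should show 0.21, and 22JUL2012 should stay missing.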
If you already have C and you also have SAS/ETS, then besides a data step you could also try PROC TIMESERIES:
proc timeseries data=c out=want;
id date interval=day setmissing=previous;
var return;
by stock;
run;
Haikuo
The following is the data step approach:
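data want;
  retain _r;
  set c;
  by stock;
  /* Use the value carried from the previous row when the current return is missing */
  return=coalesce(return,_r);
  /* Carry the return forward, but reset it on the last row of each stock */
  _r=ifn(last.stock,.,return);
  drop _r;
run;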
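Note that the data step above carries the last return forward to the end of each stock, so with the sample data the rows for 7/22/2012 and 7/23/2012 of S001 would also receive 0.25, which the question asks to leave missing. A minimal sketch of one way to limit the fill to each stock's observed range, starting from C alone, might look like this. The names c_bounds, _last_date and _prev are illustrative and not part of the original posts.

```sas
/* Step 1: attach to every row the last date on which the stock has a return */
proc sql;
  create table c_bounds as
  select *,
         max(case when return is not missing then date end) as _last_date format=date9.
  from c
  group by stock
  order by stock, date;
quit;

/* Step 2: carry the previous return forward, but only before that last date */
data want (drop=_:);
  set c_bounds;
  by stock;
  retain _prev;
  if first.stock then _prev = .;
  /* Rows before the first return keep a missing value because _prev is still missing */
  if missing(return) and date < _last_date then return = _prev;
  if not missing(return) then _prev = return;
run;
```

Rows before a stock's first return and after its last return stay missing, because the fill is applied only when a previous value exists and the date is earlier than _last_date.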