I have data set A, which is missing some days, and data set B, which contains the full set of trading days; merging them gives data set C. For the missing days, but only within the trading-day boundaries of each stock (S001, S002, ...) in A, the return should equal the return of the previous day (lag 1). More specifically, for stock S001 I would like to fill the missing days 7/4/2012 and 7/12/2012 to 7/14/2012 (marked in red, to be 0.2, 0.21, 0.21 and 0.21 respectively), but not the days outside the boundaries of 6/26/2012 and 7/22/2012, when the stock is not listed. How can I do this for data set C and extend it to n stocks?
Thank you and best regards.
A (STOCK, DATE, RETURN) | B (STOCK, DATE) | C (STOCK, DATE, RETURN)
STOCK | DATE | RETURN | STOCK | DATE | STOCK | DATE | RETURN |
S001 | 6/27/2012 | 0.19 | S001 | 6/21/2012 | S001 | 6/21/2012 | |||||
S001 | 6/28/2012 | 0.19 | S001 | 6/22/2012 | S001 | 6/22/2012 | |||||
S001 | 6/29/2012 | 0.2 | S001 | 6/23/2012 | S001 | 6/23/2012 | |||||
S001 | 6/30/2012 | 0.2 | S001 | 6/24/2012 | S001 | 6/24/2012 | |||||
S001 | 7/1/2012 | 0.21 | S001 | 6/25/2012 | S001 | 6/25/2012 | |||||
S001 | 7/2/2012 | 0.21 | S001 | 6/26/2012 | S001 | 6/26/2012 | |||||
S001 | 7/3/2012 | 0.2 | S001 | 6/27/2012 | S001 | 6/27/2012 | 0.19 | ||||
S001 | 7/5/2012 | 0.19 | S001 | 6/28/2012 | S001 | 6/28/2012 | 0.19 | ||||
S001 | 7/6/2012 | 0.19 | S001 | 6/29/2012 | S001 | 6/29/2012 | 0.2 | ||||
S001 | 7/7/2012 | 0.19 | S001 | 6/30/2012 | S001 | 6/30/2012 | 0.2 | ||||
S001 | 7/8/2012 | 0.2 | S001 | 7/1/2012 | S001 | 7/1/2012 | 0.21 | ||||
S001 | 7/9/2012 | 0.2 | S001 | 7/2/2012 | S001 | 7/2/2012 | 0.21 | ||||
S001 | 7/10/2012 | 0.21 | S001 | 7/3/2012 | S001 | 7/3/2012 | 0.2 | ||||
S001 | 7/11/2012 | 0.21 | S001 | 7/4/2012 | S001 | 7/4/2012 | |||||
S001 | 7/15/2012 | 0.19 | S001 | 7/5/2012 | S001 | 7/5/2012 | 0.19 | ||||
S001 | 7/16/2012 | 0.2 | S001 | 7/6/2012 | S001 | 7/6/2012 | 0.19 | ||||
S001 | 7/17/2012 | 0.21 | S001 | 7/7/2012 | S001 | 7/7/2012 | 0.19 | ||||
S001 | 7/18/2012 | 0.21 | S001 | 7/8/2012 | S001 | 7/8/2012 | 0.2 | ||||
S001 | 7/19/2012 | 0.21 | S001 | 7/9/2012 | S001 | 7/9/2012 | 0.2 | ||||
S001 | 7/20/2012 | 0.24 | S001 | 7/10/2012 | S001 | 7/10/2012 | 0.21 | ||||
S001 | 7/21/2012 | 0.25 | S001 | 7/11/2012 | S001 | 7/11/2012 | 0.21 | ||||
S002 | 6/27/2012 | 0.19 | S001 | 7/12/2012 | S001 | 7/12/2012 | |||||
S002 | 6/28/2012 | 0.19 | S001 | 7/13/2012 | S001 | 7/13/2012 | |||||
S002 | 6/29/2012 | 0.2 | S001 | 7/14/2012 | S001 | 7/14/2012 | |||||
S002 | 6/30/2012 | 0.2 | S001 | 7/15/2012 | S001 | 7/15/2012 | 0.19 | ||||
S002 | 7/1/2012 | 0.21 | S001 | 7/16/2012 | S001 | 7/16/2012 | 0.2 | ||||
S002 | 7/2/2012 | 0.21 | S001 | 7/17/2012 | S001 | 7/17/2012 | 0.21 | ||||
S002 | 7/3/2012 | 0.2 | S001 | 7/18/2012 | S001 | 7/18/2012 | 0.21 | ||||
S002 | 7/4/2012 | 0.19 | S001 | 7/19/2012 | S001 | 7/19/2012 | 0.21 | ||||
S002 | 7/5/2012 | 0.19 | S001 | 7/20/2012 | S001 | 7/20/2012 | 0.24 | ||||
S002 | 7/6/2012 | 0.19 | S001 | 7/21/2012 | S001 | 7/21/2012 | 0.25 | ||||
S002 | 7/7/2012 | 0.19 | S001 | 7/22/2012 | S001 | 7/22/2012 | |||||
S002 | 7/8/2012 | 0.2 | S001 | 7/23/2012 | S001 | 7/23/2012 | |||||
S002 | 7/9/2012 | 0.2 | S002 | 6/21/2012 | S002 | 6/21/2012 | |||||
S002 | 7/10/2012 | 0.21 | S002 | 6/22/2012 | S002 | 6/22/2012 | |||||
S002 | 7/11/2012 | 0.21 | S002 | 6/23/2012 | S002 | 6/23/2012 | |||||
S002 | 7/15/2012 | 0.19 | S002 | 6/24/2012 | S002 | 6/24/2012 | |||||
S002 | 7/16/2012 | 0.2 | S002 | 6/25/2012 | S002 | 6/25/2012 | |||||
S002 | 7/17/2012 | 0.21 | S002 | 6/26/2012 | S002 | 6/26/2012 | |||||
S002 | 7/18/2012 | 0.21 | S002 | 6/27/2012 | S002 | 6/27/2012 | 0.19 | ||||
S002 | 7/19/2012 | 0.21 | S002 | 6/28/2012 | S002 | 6/28/2012 | 0.19 | ||||
S002 | 7/20/2012 | 0.24 | S002 | 6/29/2012 | S002 | 6/29/2012 | 0.2 | ||||
S002 | 6/30/2012 | S002 | 6/30/2012 | 0.2 | |||||||
S002 | 7/1/2012 | S002 | 7/1/2012 | 0.21 | |||||||
S002 | 7/2/2012 | S002 | 7/2/2012 | 0.21 | |||||||
S002 | 7/3/2012 | S002 | 7/3/2012 | 0.2 | |||||||
S002 | 7/4/2012 | S002 | 7/4/2012 | 0.19 | |||||||
S002 | 7/5/2012 | S002 | 7/5/2012 | 0.19 | |||||||
S002 | 7/6/2012 | S002 | 7/6/2012 | 0.19 | |||||||
S002 | 7/7/2012 | S002 | 7/7/2012 | 0.19 | |||||||
S002 | 7/8/2012 | S002 | 7/8/2012 | 0.2 | |||||||
S002 | 7/9/2012 | S002 | 7/9/2012 | 0.2 | |||||||
S002 | 7/10/2012 | S002 | 7/10/2012 | 0.21 | |||||||
S002 | 7/11/2012 | S002 | 7/11/2012 | 0.21 | |||||||
S002 | 7/12/2012 | S002 | 7/12/2012 | ||||||||
S002 | 7/13/2012 | S002 | 7/13/2012 | ||||||||
S002 | 7/14/2012 | S002 | 7/14/2012 | ||||||||
S002 | 7/15/2012 | S002 | 7/15/2012 | 0.19 | |||||||
S002 | 7/16/2012 | S002 | 7/16/2012 | 0.2 | |||||||
S002 | 7/17/2012 | S002 | 7/17/2012 | 0.21 | |||||||
S002 | 7/18/2012 | S002 | 7/18/2012 | 0.21 | |||||||
S002 | 7/19/2012 | S002 | 7/19/2012 | 0.21 | |||||||
S002 | 7/20/2012 | S002 | 7/20/2012 | 0.24 | |||||||
S002 | 7/21/2012 | S002 | 7/21/2012 | ||||||||
S002 | 7/22/2012 | S002 | 7/22/2012 | ||||||||
S002 | 7/23/2012 | S002 | 7/23/2012 |
One way to go:
data A;
infile datalines dsd;
input STOCK $ DATE:mmddyy. RETURN 32.;
format date date9.;
datalines;
S001,6/27/2012,0.19
S001,6/28/2012,0.19
S001,6/29/2012,0.2
S001,6/30/2012,0.2
S001,7/01/2012,0.21
S001,7/02/2012,0.21
S001,7/03/2012,0.2
S001,7/05/2012,0.19
S001,7/06/2012,0.19
S001,7/07/2012,0.19
S001,7/08/2012,0.2
S001,7/09/2012,0.2
S001,7/10/2012,0.21
S001,7/11/2012,0.21
S001,7/15/2012,0.19
S001,7/16/2012,0.2
S001,7/17/2012,0.21
S001,7/18/2012,0.21
S001,7/19/2012,0.21
S001,7/20/2012,0.24
S001,7/21/2012,0.25
S002,6/27/2012,0.19
S002,6/28/2012,0.19
S002,6/29/2012,0.2
S002,6/30/2012,0.2
S002,7/01/2012,0.21
S002,7/02/2012,0.21
S002,7/03/2012,0.2
S002,7/04/2012,0.19
S002,7/05/2012,0.19
S002,7/06/2012,0.19
S002,7/07/2012,0.19
S002,7/08/2012,0.2
S002,7/09/2012,0.2
S002,7/10/2012,0.21
S002,7/11/2012,0.21
S002,7/15/2012,0.19
S002,7/16/2012,0.2
S002,7/17/2012,0.21
S002,7/18/2012,0.21
S002,7/19/2012,0.21
S002,7/20/2012,0.24
;
run;
/* Attach each stock's last traded date to every row of A */
proc sql;
  create view V_A as
  select *, max(date) as _max_date format=date9.
  from A
  group by Stock
  order by Stock, Date
  ;
quit;
data B;
infile datalines dsd;
input STOCK $ DATE:mmddyy.;
format date date9.;
datalines;
S001,6/21/2012
S001,6/22/2012
S001,6/23/2012
S001,6/24/2012
S001,6/25/2012
S001,6/26/2012
S001,6/27/2012
S001,6/28/2012
S001,6/29/2012
S001,6/30/2012
S001,7/01/2012
S001,7/02/2012
S001,7/03/2012
S001,7/04/2012
S001,7/05/2012
S001,7/06/2012
S001,7/07/2012
S001,7/08/2012
S001,7/09/2012
S001,7/10/2012
S001,7/11/2012
S001,7/12/2012
S001,7/13/2012
S001,7/14/2012
S001,7/15/2012
S001,7/16/2012
S001,7/17/2012
S001,7/18/2012
S001,7/19/2012
S001,7/20/2012
S001,7/21/2012
S001,7/22/2012
S001,7/23/2012
S002,6/21/2012
S002,6/22/2012
S002,6/23/2012
S002,6/24/2012
S002,6/25/2012
S002,6/26/2012
S002,6/27/2012
S002,6/28/2012
S002,6/29/2012
S002,6/30/2012
S002,7/01/2012
S002,7/02/2012
S002,7/03/2012
S002,7/04/2012
S002,7/05/2012
S002,7/06/2012
S002,7/07/2012
S002,7/08/2012
S002,7/09/2012
S002,7/10/2012
S002,7/11/2012
S002,7/12/2012
S002,7/13/2012
S002,7/14/2012
S002,7/15/2012
S002,7/16/2012
S002,7/17/2012
S002,7/18/2012
S002,7/19/2012
S002,7/20/2012
S002,7/21/2012
S002,7/22/2012
S002,7/23/2012
;
run;
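/* Merge A onto the full calendar B and carry the last return forward,
   but only inside each stock's trading window */
data C (drop=_:);
  merge V_A (in=inA) B;
  by stock date;
  retain _R_max_date _R_return;
  /* Reset the carried values at the start of each stock */
  if first.stock then call missing(_R_max_date, _R_return);
  _R_max_date = coalesce(_max_date, _R_max_date);  /* stock's last traded date, known once the first A row is read */
  _R_return = coalesce(return, _R_return);         /* most recent non-missing return */
  /* Fill only the days that are absent from A and earlier than the last traded date */
  if not inA and date < _R_max_date then return = _R_return;
run;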
If you already have C and SAS/ETS is available, then besides the data step you could also try PROC TIMESERIES:
proc timeseries data=c out=want;
id date interval=day setmissing=previous;
var return;
by stock;
run;
Haikuo
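The following is the data step approach. Note that, as written, it carries the last return forward through the end of each stock, so the days after the last traded date (7/22 and 7/23 for S001) are filled as well:
data want;
  retain _r;                         /* last return seen within the current stock */
  set c;
  by stock;
  return = coalesce(return, _r);     /* fill a missing return with the previous one */
  _r = ifn(last.stock, ., return);   /* reset at the end of each stock */
  drop _r;
run;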
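If the fill has to stop at each stock's last traded date, as the question asks, one option is a double DoW loop over C. This is only a minimal sketch, assuming C is sorted by stock and date and contains no variables whose names begin with an underscore. The helper names `_last_date` and `_prev` are introduced here for illustration and are not part of the original data.

```sas
data want (drop=_:);
  /* Pass 1: find the last date with a non-missing return for this stock */
  do until (last.stock);
    set c;
    by stock;
    if not missing(return) then _last_date = date;
  end;

  /* Pass 2: re-read the same stock and carry the previous return forward,
     but only for days before the last traded date */
  do until (last.stock);
    set c;
    by stock;
    if missing(return) and date < _last_date then return = _prev;
    _prev = return;   /* stays missing until the first traded day */
    output;
  end;
run;
```

Because `_last_date` and `_prev` are not retained, they are reset to missing at the start of each stock. Days before the first trade therefore stay missing, and days after the last trade are left untouched. With the sample data, S001 should get 0.2 on 7/4/2012 and 0.21 on 7/12/2012 to 7/14/2012.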