Hi,
I was wondering if someone can help answer how I should be doing this.
If there is an easy way to do this.
I have a missing data question my dataset looks like the following
I have dates and codes and info that are sorted by dates and then
codes.
I need to fill each code with weekdays from its starting point to its
last date with weekdays. So for the below for example 4/6/2012 is a
weekday is missing. So I need to add that date with the previous
available data for info.
Each code can have a different start date and end date. Any help on this is greatly
appreciated! Thank you
Input
date code info
4/2/2012 ann 0.450078295
4/3/2012 ann 0.582292885
4/4/2012 ann 0.72081712
4/5/2012 ann 0.146093011
4/9/2012 ann 0.079612504
4/10/2012 ann 0.022709629
4/11/2012 ann 0.97874129
4/12/2012 ann 0.901577583
3/2/2012 joe 0.38815284
3/5/2012 joe 0.555984459
3/12/2012 joe 0.7702825
Output
date code info
4/2/2012 ann 0.450078295
4/3/2012 ann 0.582292885
4/4/2012 ann 0.72081712
4/5/2012 ann 0.146093011
4/6/2012 ann 0.146093
4/9/2012 ann 0.079612504
4/10/2012 ann 0.022709629
4/11/2012 ann 0.97874129
4/12/2012 ann 0.901577583
3/2/2012 joe 0.38815284
3/5/2012 joe 0.555984459
3/6/2012 joe 0.555984459
3/7/2012 joe 0.555984459
3/8/2012 joe 0.555984459
3/9/2012 joe 0.555984459
3/12/2012 joe 0.7702825
The key is to set the unit to 'weekday'.
data have;
input date mmddyy10. code $ info;
format date mmddyy10. info 11.9;
cards;
4/2/2012 ann 0.450078295
4/3/2012 ann 0.582292885
4/4/2012 ann 0.72081712
4/5/2012 ann 0.146093011
4/9/2012 ann 0.079612504
4/10/2012 ann 0.022709629
4/11/2012 ann 0.97874129
4/12/2012 ann 0.901577583
3/2/2012 joe 0.38815284
3/5/2012 joe 0.555984459
3/12/2012 joe 0.7702825
;
data want (drop=_:);
retain _code _info;
set have;
by code;
set have (firstobs=2 keep=date rename=date=_date)
have (obs=1 drop=_all_);
if last.code then call missing(_date);
output;
_code=code;
_info=info;
do while (intck('weekday',date,_date)>1) ;
code=_code;
info=_info;
date=intnx('weekday',date,1);
output;
end;
run;
proc print;run;
Haikuo
hdg,
It is possible that Hai.kuo's solution worked. He is typically quite thoughtful and correct. My brain is fried from trying to follow it, though. Here's another approach.
First, construct a shell data set, containing the code / dates you are hoping to get. Sorting is done in anticipation of merging later:
proc sort data=have;
by code date;
run;
proc summary data=have;
by code;
var date;
output out=shell (drop=_type_ _freq_) min=min_date max=max_date;
run;
data shell;
set shell;
by code;
do date=min_date to max_date;
if 1 < weekday(date) < 7 then output;
end;
keep code date;
run;
Then merge the shell back with the data:
data want;
merge have (rename=(info=original_info)) shell;
by code date;
if original_info > . then info = original_info;
retain info;
* After checking the result: drop original_info;
run;
I'm sure somebody will look at my approach and find it just as unusual and brain-frying. But that's how I would do it.
Good luck.
hi ... here's another idea (also uses the PROC SUMMARY plus a loop approach) ...
data have;
input date : mmddyy. code : $3. info;
format date mmddyy10. info 11.9 ;
cards;
4/2/2012 ann 0.450078295
4/3/2012 ann 0.582292885
4/4/2012 ann 0.72081712
4/5/2012 ann 0.146093011
4/9/2012 ann 0.079612504
4/10/2012 ann 0.022709629
4/11/2012 ann 0.97874129
4/12/2012 ann 0.901577583
3/2/2012 joe 0.38815284
3/5/2012 joe 0.555984459
3/12/2012 joe 0.7702825
;
proc sort data=have (index=(cd=(code date)));
by code date;
run;
proc summary data=have nway;
class code;
var date;
output out=shell (keep=code dt:) min=dt1 max=dt2;
run;
data new;
set shell;
do date=dt1 to dt2;
if weekday(date) in (1,7) then continue;
set have key=cd / unique;
if _error_ then _error_ = 0;
output;
end;
drop dt: ;
run;
Hello -
If you have access to SAS/ETS software then PROC TIMESERIES will be handy:
proc timeseries data=have out=want;
id date interval=weekday setmissing=previous;
var info;
by code;
run;
Thanks,
Udo
Many Thanks everyone!! Really appreciated
Dear all, I am revisiting this request
In the above example I only have to copy over the info column which is missing, but what if I have many other columns which I want to copy over for example price below.
date code info price
4/2/2012 ann 0.450078295 0.2
4/3/2012 ann 0.582292885 0.3
4/4/2012 ann 0.72081712 0.3
4/5/2012 ann 0.146093011 0.23
4/9/2012 ann 0.079612504 0.4
4/10/2012 ann 0.022709629 0.45
Output
date code info price
4/2/2012 ann 0.450078295 0.2
4/3/2012 ann 0.582292885 0.3
4/4/2012 ann 0.72081712 0.3
4/5/2012 ann 0.146093011 0.23
4/6/2012 ann 0.146093 0.23
and so on
Thanks again!
Post as new question please, with all relevant information and mark this one as answered.
data have; input date mmddyy10. code $ info price; format date mmddyy10. info 11.9; cards; 4/2/2012 ann 0.450078295 0.2 4/3/2012 ann 0.582292885 0.4 4/4/2012 ann 0.72081712 0.6 4/5/2012 ann 0.146093011 0.2 4/9/2012 ann 0.079612504 0.2 4/10/2012 ann 0.022709629 0.2 4/11/2012 ann 0.97874129 0.2 4/12/2012 ann 0.901577583 0.2 3/2/2012 joe 0.38815284 0.2 3/5/2012 joe 0.555984459 0.4 3/12/2012 joe 0.7702825 0.4 ; run; data want(drop=_: i found); merge have have(firstobs=2 keep=code date rename=(code=_code date=_date)); retain found 0; if code ne lag(code) then found=0; if not found then do; output; if code eq _code then do; do i=date+1 to _date-1; if weekday(date) not in (1 7) then do; date=i;output;found=1;leave; end; end; end; end; run;
Ksharp
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.