Help using Base SAS procedures

missing data

Reply
Frequent Contributor
Frequent Contributor
Posts: 79

missing data

Hi,
I was wondering if someone can help answer how I should be doing this.
If there is an easy way to do this.

I have a missing data question my dataset looks like the following
I have dates and codes and info that are sorted by dates and then
codes.

I need to fill each code with weekdays from its starting point to its
last date with weekdays. So for the below for example 4/6/2012 is a
weekday is missing. So I need to add that date with the previous
available data for info.

Each code can have a different start date and end date. Any help on this is greatly
appreciated! Thank you

Input

date    code    info
4/2/2012        ann     0.450078295
4/3/2012        ann     0.582292885
4/4/2012        ann     0.72081712
4/5/2012        ann     0.146093011
4/9/2012        ann     0.079612504
4/10/2012       ann     0.022709629
4/11/2012       ann     0.97874129
4/12/2012       ann     0.901577583
3/2/2012        joe     0.38815284
3/5/2012        joe     0.555984459
3/12/2012       joe     0.7702825

Output
     date    code    info
4/2/2012        ann     0.450078295
4/3/2012        ann     0.582292885
4/4/2012        ann     0.72081712
4/5/2012        ann     0.146093011
4/6/2012        ann     0.146093
4/9/2012        ann     0.079612504
4/10/2012       ann     0.022709629
4/11/2012       ann     0.97874129
4/12/2012       ann     0.901577583
3/2/2012        joe     0.38815284
3/5/2012        joe     0.555984459
3/6/2012        joe     0.555984459
3/7/2012        joe     0.555984459
3/8/2012        joe     0.555984459
3/9/2012        joe     0.555984459
3/12/2012       joe     0.7702825

Respected Advisor
Posts: 3,124

missing data

The key is to set the unit to 'weekday'.

data have;

input date mmddyy10.    code $    info;

format date mmddyy10. info 11.9;

cards;

4/2/2012        ann     0.450078295

4/3/2012        ann     0.582292885

4/4/2012        ann     0.72081712

4/5/2012        ann     0.146093011

4/9/2012        ann     0.079612504

4/10/2012       ann     0.022709629

4/11/2012       ann     0.97874129

4/12/2012       ann     0.901577583

3/2/2012        joe     0.38815284

3/5/2012        joe     0.555984459

3/12/2012       joe     0.7702825

;

data want (drop=_Smiley Happy;

retain  _code _info;

set have;

by code;

set have (firstobs=2 keep=date rename=date=_date)

    have (obs=1 drop=_all_);

if last.code then call missing(_date);

output;

_code=code;

_info=info;

do while (intck('weekday',date,_date)>1) ;

     code=_code;

     info=_info;

     date=intnx('weekday',date,1);

     output;

end;

run;

proc print;run;

Haikuo

Super User
Posts: 5,099

missing data

hdg,

It is possible that Hai.kuo's solution worked.  He is typically quite thoughtful and correct.  My brain is fried from trying to follow it, though.  Here's another approach.

First, construct a shell data set, containing the code / dates  you are hoping to get.  Sorting is done in anticipation of merging later:

proc sort data=have;

   by code date;

run;

proc summary data=have;

   by code;

   var date;

   output out=shell (drop=_type_ _freq_) min=min_date max=max_date;

run;

data shell;

   set shell;

   by code;

   do date=min_date to max_date;

        if 1 < weekday(date) < 7 then output;

   end;

   keep code date;

run;

Then merge the shell back with the data:

data want;

   merge have (rename=(info=original_info)) shell;

   by code date;

   if original_info > . then info = original_info;

   retain info;

   * After checking the result:  drop original_info;

run;

I'm sure somebody will look at my approach and find it just as unusual and brain-frying.  But that's how I would do it.

Good luck.

Valued Guide
Posts: 765

Re: missing data

hi ... here's another idea (also uses the PROC SUMMARY plus a loop approach) ...

data have;

input date : mmddyy. code : $3. info;

format date mmddyy10. info 11.9 ;

cards;

4/2/2012        ann     0.450078295

4/3/2012        ann     0.582292885

4/4/2012        ann     0.72081712

4/5/2012        ann     0.146093011

4/9/2012        ann     0.079612504

4/10/2012       ann     0.022709629

4/11/2012       ann     0.97874129

4/12/2012       ann     0.901577583

3/2/2012        joe     0.38815284

3/5/2012        joe     0.555984459

3/12/2012       joe     0.7702825

;

proc sort data=have (index=(cd=(code date)));

by code date;

run;

proc summary data=have nway;

class code;

var date;

output out=shell (keep=code dtSmiley Happy min=dt1 max=dt2;

run;

data new;

set shell;

do date=dt1 to dt2;

   if weekday(date) in (1,7) then continue;

   set have key=cd / unique;

   if _error_ then _error_ = 0;

   output;

end;

drop dt: ; 

run;

SAS Employee
Posts: 416

Re: missing data

Hello -

If you have access to SAS/ETS software then PROC TIMESERIES will be handy:

proc timeseries data=have out=want;

id date interval=weekday setmissing=previous;

var info;

by code;

run;

Thanks,

Udo

Frequent Contributor
Frequent Contributor
Posts: 79

Re: missing data

Many Thanks everyone!! Really appreciated

Frequent Contributor
Frequent Contributor
Posts: 79

Re: missing data

Dear all, I am revisiting this request

In the above example I only have to copy over the info column which is missing, but what if I have many other columns which I want to copy over for example price below.

date    code    info                                  price
4/2/2012        ann     0.450078295  0.2
4/3/2012        ann     0.582292885  0.3
4/4/2012        ann     0.72081712 0.3
4/5/2012        ann     0.146093011 0.23
4/9/2012        ann     0.079612504  0.4
4/10/2012       ann     0.022709629  0.45

Output
     date    code    info                         price
4/2/2012        ann     0.450078295   0.2
4/3/2012        ann     0.582292885  0.3
4/4/2012        ann     0.72081712   0.3
4/5/2012        ann     0.146093011 0.23
4/6/2012        ann     0.146093  0.23

and so on

Thanks again!

Super User
Posts: 17,963

Re: missing data

Post as new question please, with all relevant information and mark this one as answered.

Super User
Posts: 9,691

Re: missing data

data have;
input date mmddyy10.    code $    info price;
format date mmddyy10. info 11.9;
cards;
4/2/2012        ann     0.450078295 0.2
4/3/2012        ann     0.582292885 0.4
4/4/2012        ann     0.72081712  0.6
4/5/2012        ann     0.146093011 0.2
4/9/2012        ann     0.079612504 0.2
4/10/2012       ann     0.022709629 0.2
4/11/2012       ann     0.97874129 0.2
4/12/2012       ann     0.901577583 0.2
3/2/2012        joe     0.38815284 0.2
3/5/2012        joe     0.555984459 0.4
3/12/2012       joe     0.7702825 0.4
;
run;
data want(drop=_: i found);
 merge have have(firstobs=2 keep=code date rename=(code=_code date=_date));
 retain found 0;
if code ne lag(code) then found=0;
if not found then do; output;
 if code eq _code then do;
   do i=date+1 to _date-1;
    if weekday(date) not in (1 7) then do;
       date=i;output;found=1;leave;
      end;
   end;
 end;
end;
run;

Ksharp

Ask a Question
Discussion stats
  • 8 replies
  • 447 views
  • 0 likes
  • 7 in conversation