BookmarkSubscribeRSS Feed
hdg
Obsidian | Level 7 hdg
Obsidian | Level 7

Hi,
I was wondering if someone can help answer how I should be doing this.
If there is an easy way to do this.

I have a missing data question my dataset looks like the following
I have dates and codes and info that are sorted by dates and then
codes.

I need to fill each code with weekdays from its starting point to its
last date with weekdays. So for the below for example 4/6/2012 is a
weekday is missing. So I need to add that date with the previous
available data for info.

Each code can have a different start date and end date. Any help on this is greatly
appreciated! Thank you

Input

date    code    info
4/2/2012        ann     0.450078295
4/3/2012        ann     0.582292885
4/4/2012        ann     0.72081712
4/5/2012        ann     0.146093011
4/9/2012        ann     0.079612504
4/10/2012       ann     0.022709629
4/11/2012       ann     0.97874129
4/12/2012       ann     0.901577583
3/2/2012        joe     0.38815284
3/5/2012        joe     0.555984459
3/12/2012       joe     0.7702825

Output
     date    code    info
4/2/2012        ann     0.450078295
4/3/2012        ann     0.582292885
4/4/2012        ann     0.72081712
4/5/2012        ann     0.146093011
4/6/2012        ann     0.146093
4/9/2012        ann     0.079612504
4/10/2012       ann     0.022709629
4/11/2012       ann     0.97874129
4/12/2012       ann     0.901577583
3/2/2012        joe     0.38815284
3/5/2012        joe     0.555984459
3/6/2012        joe     0.555984459
3/7/2012        joe     0.555984459
3/8/2012        joe     0.555984459
3/9/2012        joe     0.555984459
3/12/2012       joe     0.7702825

8 REPLIES 8
Haikuo
Onyx | Level 15

The key is to set the unit to 'weekday'.

data have;

input date mmddyy10.    code $    info;

format date mmddyy10. info 11.9;

cards;

4/2/2012        ann     0.450078295

4/3/2012        ann     0.582292885

4/4/2012        ann     0.72081712

4/5/2012        ann     0.146093011

4/9/2012        ann     0.079612504

4/10/2012       ann     0.022709629

4/11/2012       ann     0.97874129

4/12/2012       ann     0.901577583

3/2/2012        joe     0.38815284

3/5/2012        joe     0.555984459

3/12/2012       joe     0.7702825

;

data want (drop=_:);

retain  _code _info;

set have;

by code;

set have (firstobs=2 keep=date rename=date=_date)

    have (obs=1 drop=_all_);

if last.code then call missing(_date);

output;

_code=code;

_info=info;

do while (intck('weekday',date,_date)>1) ;

     code=_code;

     info=_info;

     date=intnx('weekday',date,1);

     output;

end;

run;

proc print;run;

Haikuo

Astounding
PROC Star

hdg,

It is possible that Hai.kuo's solution worked.  He is typically quite thoughtful and correct.  My brain is fried from trying to follow it, though.  Here's another approach.

First, construct a shell data set, containing the code / dates  you are hoping to get.  Sorting is done in anticipation of merging later:

proc sort data=have;

   by code date;

run;

proc summary data=have;

   by code;

   var date;

   output out=shell (drop=_type_ _freq_) min=min_date max=max_date;

run;

data shell;

   set shell;

   by code;

   do date=min_date to max_date;

        if 1 < weekday(date) < 7 then output;

   end;

   keep code date;

run;

Then merge the shell back with the data:

data want;

   merge have (rename=(info=original_info)) shell;

   by code date;

   if original_info > . then info = original_info;

   retain info;

   * After checking the result:  drop original_info;

run;

I'm sure somebody will look at my approach and find it just as unusual and brain-frying.  But that's how I would do it.

Good luck.

MikeZdeb
Rhodochrosite | Level 12

hi ... here's another idea (also uses the PROC SUMMARY plus a loop approach) ...

data have;

input date : mmddyy. code : $3. info;

format date mmddyy10. info 11.9 ;

cards;

4/2/2012        ann     0.450078295

4/3/2012        ann     0.582292885

4/4/2012        ann     0.72081712

4/5/2012        ann     0.146093011

4/9/2012        ann     0.079612504

4/10/2012       ann     0.022709629

4/11/2012       ann     0.97874129

4/12/2012       ann     0.901577583

3/2/2012        joe     0.38815284

3/5/2012        joe     0.555984459

3/12/2012       joe     0.7702825

;

proc sort data=have (index=(cd=(code date)));

by code date;

run;

proc summary data=have nway;

class code;

var date;

output out=shell (keep=code dt:) min=dt1 max=dt2;

run;

data new;

set shell;

do date=dt1 to dt2;

   if weekday(date) in (1,7) then continue;

   set have key=cd / unique;

   if _error_ then _error_ = 0;

   output;

end;

drop dt: ; 

run;

udo_sas
SAS Employee

Hello -

If you have access to SAS/ETS software then PROC TIMESERIES will be handy:

proc timeseries data=have out=want;

id date interval=weekday setmissing=previous;

var info;

by code;

run;

Thanks,

Udo

hdg
Obsidian | Level 7 hdg
Obsidian | Level 7

Many Thanks everyone!! Really appreciated

hdg
Obsidian | Level 7 hdg
Obsidian | Level 7

Dear all, I am revisiting this request

In the above example I only have to copy over the info column which is missing, but what if I have many other columns which I want to copy over for example price below.

date    code    info                                  price
4/2/2012        ann     0.450078295  0.2
4/3/2012        ann     0.582292885  0.3
4/4/2012        ann     0.72081712 0.3
4/5/2012        ann     0.146093011 0.23
4/9/2012        ann     0.079612504  0.4
4/10/2012       ann     0.022709629  0.45

Output
     date    code    info                         price
4/2/2012        ann     0.450078295   0.2
4/3/2012        ann     0.582292885  0.3
4/4/2012        ann     0.72081712   0.3
4/5/2012        ann     0.146093011 0.23
4/6/2012        ann     0.146093  0.23

and so on

Thanks again!

Reeza
Super User

Post as new question please, with all relevant information and mark this one as answered.

Ksharp
Super User
data have;
input date mmddyy10.    code $    info price;
format date mmddyy10. info 11.9;
cards;
4/2/2012        ann     0.450078295 0.2
4/3/2012        ann     0.582292885 0.4
4/4/2012        ann     0.72081712  0.6
4/5/2012        ann     0.146093011 0.2
4/9/2012        ann     0.079612504 0.2
4/10/2012       ann     0.022709629 0.2
4/11/2012       ann     0.97874129 0.2
4/12/2012       ann     0.901577583 0.2
3/2/2012        joe     0.38815284 0.2
3/5/2012        joe     0.555984459 0.4
3/12/2012       joe     0.7702825 0.4
;
run;
data want(drop=_: i found);
 merge have have(firstobs=2 keep=code date rename=(code=_code date=_date));
 retain found 0;
if code ne lag(code) then found=0;
if not found then do; output;
 if code eq _code then do;
   do i=date+1 to _date-1;
    if weekday(date) not in (1 7) then do;
       date=i;output;found=1;leave;
      end;
   end;
 end;
end;
run;

Ksharp

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1095 views
  • 0 likes
  • 7 in conversation