How can I fill in missing values with data from the day after?

Reply
Occasional Contributor
Posts: 8

How can I fill in missing values with data from the day after?

Hi SAS Support Community!

I have a dataset with missing values that looks like this:

DATE               PRICE          VOL

01/01/2000          10               100

01/02/2000          .                    .

01/03/2000          11               200

(goes on for thousands of records)

I would like to replace these missing values with the value from the day after (date + 1 day).

Would anyone have a suitable solution for this? Thanks!

Respected Advisor
Posts: 3,124

Re: How can I fill in missing values with data from the day after?

If you only need one leading obs or any fixed number of leading obs, it is easy to do using a simple self merge in data step or a self-join in Proc SQL:

data have;

input DATE :mmddyy10.    PRICE          VOL;

cards;

01/01/2000          10               100

01/02/2000          .                    .

01/03/2000          11               200

;

data want;

  merge have have(firstobs=2 keep=price vol rename=(price=_p vol=_v));

    price=coalesce(price,_p);

  vol=coalesce(vol,_v);

  drop _:;

run;

However, if the gap in between is dynamic or unknown, then you will probably need something more complex, please note, if you have many variables to fill in, you will need to add some array().

data have;

input DATE :mmddyy10.    PRICE          VOL;

cards;

01/01/2000          10               100

01/02/2000          .                    .

01/03/2000          .                 .

01/04/2000          .                150

01/05/2000          11               200

;

data want (keep=date price vol);

  if _n_=1 then do;

    if 0 then set have (rename=(date=_d price=_p vol=_v));

  declare hash h(dataset:'have (rename=(date=_d price=_p vol=_v))', ordered:'y');

    h.definekey('_d');

  h.definedata(all:'y');

  h.definedone();

  declare hiter hi('h');

   end;

   set have;

      if missing(price) then do;

     rc=hi.setcur(key:date);

  do while (rc=0);

    if not missing(_p) then leave;

    rc=hi.next();

    end;

  end;

       price=coalesce(price,_p);

      if missing(vol) then do;

     rc=hi.setcur(key:date);

  do while (rc=0);

    if not missing(_v) then leave;

    rc=hi.next();

  end;

  end;

  vol=coalesce(vol,_v);

  run;

Haikuo

Super Contributor
Posts: 333

Re: How can I fill in missing values with data from the day after?

Here is another solution using retain in the data step. If the dataset is huge this may not be viable since it requires sorting the data, but thought I would post it out just in case you can use it.

data have;

input DATE :mmddyy10.    PRICE          VOL;

cards;

01/01/2000          10               100

01/02/2000          .                    .

01/03/2000          11               200

;

run;

proc sort data=have; by descending date ; run;

data have2;

  format date mmddyy10.;

  set have;

  by descending date ;

  if price = . then price = p;

  if vol = . then vol = v;

  p = price;

  v = vol;

  retain p v;

run;

proc sort data=have2; by date; run;

Hope it helps!

EJ

Ask a Question
Discussion stats
  • 2 replies
  • 165 views
  • 0 likes
  • 3 in conversation