SidB
Calcite | Level 5

I need to fill in the missing End values on earlier rows with the End value from the latest row for the same ID.

 

ID   Start       End

1   3/4/2022  

1   3/5/2022   3/6/2022

 

It needs to look like this:

ID   Start       End

1   3/4/2022  3/6/2022

1   3/5/2022   3/6/2022

 

Accepted Solution
PGStats
Opal | Level 21

The double DOW technique is well suited for this kind of operation:

 

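/* Sample data: the first row for ID 1 has a missing End */
data have;
input ID (Start End) (:mmddyy.);
format start end yymmdd10.;
datalines;
1   3/4/2022  .
1   3/5/2022   3/6/2022
;

data want;
/* First pass: read every row of the current ID and keep the largest non-missing End */
do until (last.id);
    set have; by id;
    if not missing(end) then latest = max(latest, end);
    end;
/* Second pass: re-read the same rows and fill any missing End with that value */
do until (last.id);
    set have; by id;
    end = coalesce(end, latest);
    output;
    end;
drop latest;
run;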
PG


SidB
Calcite | Level 5

@PGStats Thank you so much! This worked.

mkeintz
PROC Star

The double DOW, as @PGStats suggests, is well suited to your task.

 

The core logic of the double DOW is to read all obs for each ID twice, the first time to establish the latest END value, and the second time to assign that value when necessary.

 

The code below also reads each ID twice, but instead of two DO loops it uses the IN= data set option on each copy of HAVE in the SET statement to tell the first pass from the second:

 

data have;
input ID (Start End) (:mmddyy.);
format start end yymmdd10.;
datalines;
1   3/4/2022  .
1   3/5/2022   3/6/2022
run;

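data want (drop=_:);
  /* Interleave HAVE with itself: within each ID, all rows arrive once with firstpass=1, then again with secondpass=1 */
  set have (in=firstpass)  have (in=secondpass);
  by id;
  retain _last_end;
  if first.id then call missing(_last_end);   /* reset at the start of each ID */
  if firstpass then _last_end=coalesce(end,_last_end);   /* remember the most recent non-missing End */
  if secondpass;   /* output only the second-pass rows */
  end=coalesce(end,_last_end);
run;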
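
To see how the interleaved two-pass step behaves with more than one ID, here is a minimal sketch on made-up data. The data set names have2 and want2 and all the rows for IDs 2 and 3 are assumptions for illustration and are not from the original question. It assumes the input is already sorted by ID.

```sas
/* Hypothetical test data (not from the thread): three IDs with different gap patterns */
data have2;
input ID (Start End) (:mmddyy.);
format start end yymmdd10.;
datalines;
1 3/4/2022 .
1 3/5/2022 3/6/2022
2 4/1/2022 .
2 4/2/2022 .
2 4/3/2022 4/9/2022
3 5/1/2022 .
;

/* Same interleaved two-pass logic as above, pointed at have2 */
data want2 (drop=_:);
  set have2 (in=firstpass) have2 (in=secondpass);
  by id;
  retain _last_end;
  if first.id then call missing(_last_end);  /* without this, ID 3 would inherit ID 2's End */
  if firstpass then _last_end=coalesce(end,_last_end);
  if secondpass;
  end=coalesce(end,_last_end);
run;

proc print data=want2 noobs;
run;
```

Under these assumptions, both rows of ID 2 that were missing End should pick up 2022-04-09, while the single row of ID 3 stays missing because no End was ever recorded for that ID.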
