I know how to use the RETAIN statement to fill in missing values with the last non-missing value, but is there a way to do it backwards? That is, can I fill in previous missing values with the next non-missing value?
Example - I am trying to get "Filled Data" from the "Missing Values" column:
Time Point | Missing Values | Filled Data |
---|---|---|
1 | . | 7 |
2 | . | 7 |
3 | . | 7 |
4 | . | 7 |
5 | 7 | 7 |
6 | . | 6 |
7 | 6 | 6 |
8 | 5 | 5 |
9 | . | 4 |
10 | 4 | 4 |
11 | 3 | 3 |
12 | 2 | 2 |
13 | 1 | 1 |
One way would be to just presort the data in descending order and apply the same logic. e.g.,
data have;
input Time_Point Missing_Values;
cards;
1 .
2 .
3 .
4 .
5 7
6 .
7 6
8 5
9 .
10 4
11 3
12 2
13 1
;
proc sort data=have out=want;
by descending Time_Point;
run;
data want;
set want;
retain Filled_Data;
if not missing(Missing_Values) then
Filled_Data=Missing_Values;
run;
proc sort data=want;
by Time_Point;
run;
One way would be to just presort the data in descending order and apply the same logic. e.g.,
data have;
input Time_Point Missing_Values;
cards;
1 .
2 .
3 .
4 .
5 7
6 .
7 6
8 5
9 .
10 4
11 3
12 2
13 1
;
proc sort data=have out=want;
by descending Time_Point;
run;
data want;
set want;
retain Filled_Data;
if not missing(Missing_Values) then
Filled_Data=Missing_Values;
run;
proc sort data=want;
by Time_Point;
run;
This answer works great for this scenario. Once I posted my question, I thought about doing the reverse sort. However, I was not sure what to do if both the first AND last observations were missing. I guess re-sort and rerun the same data step?
If what you described make sense with your data, then you could do it with something like:
data have;
input Time_Point Missing_Values;
cards;
1 .
2 .
3 .
4 .
5 7
6 .
7 6
8 5
9 .
10 4
11 3
12 2
13 .
;
proc sort data=have out=want;
by descending Time_Point;
run;
data want;
set want;
retain Filled_Data;
if not missing(Missing_Values) then
Filled_Data=Missing_Values;
run;
proc sort data=want;
by Time_Point;
run;
data want (drop=last_:);
set want;
retain last_Filled_Data;
if not missing(Filled_Data) then
last_Filled_Data=Filled_Data;
else Filled_Data=last_Filled_Data;
run;
have you tried walking up and down your data with the POINT= option of the SET statement?
It is another method.
data have; input Time_Point Missing_Values; cards; 1 . 2 . 3 . 4 . 5 7 6 . 7 6 8 5 9 . 10 4 11 3 12 2 13 1 ; run; data want; do until(missing_values or last); set have end=last; if not missing(missing_values) then want_value=missing_values; end; do until(missing_values or _last); set have end=_last; output; end; run;
Ksharp
Hello
Do your code works when the dataset is grouped?
I am dealing with this case and so far I can not make your code to worK?
Could you please give me a hand?
Thank you
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.