I know how to use the RETAIN statement to fill in missing values with the last non-missing value, but is there a way to do it backwards? That is, can I fill in previous missing values with the next non-missing value?
Example - I am trying to get "Filled Data" from the "Missing Values" column:
| Time Point | Missing Values | Filled Data |
|---|---|---|
1 | . | 7 |
| 2 | . | 7 |
| 3 | . | 7 |
| 4 | . | 7 |
| 5 | 7 | 7 |
| 6 | . | 6 |
| 7 | 6 | 6 |
| 8 | 5 | 5 |
| 9 | . | 4 |
| 10 | 4 | 4 |
| 11 | 3 | 3 |
| 12 | 2 | 2 |
| 13 | 1 | 1 |
One way would be to just presort the data in descending order and apply the same logic. e.g.,
data have;
input Time_Point Missing_Values;
cards;
1 .
2 .
3 .
4 .
5 7
6 .
7 6
8 5
9 .
10 4
11 3
12 2
13 1
;
proc sort data=have out=want;
by descending Time_Point;
run;
data want;
set want;
retain Filled_Data;
if not missing(Missing_Values) then
Filled_Data=Missing_Values;
run;
proc sort data=want;
by Time_Point;
run;
One way would be to just presort the data in descending order and apply the same logic. e.g.,
data have;
input Time_Point Missing_Values;
cards;
1 .
2 .
3 .
4 .
5 7
6 .
7 6
8 5
9 .
10 4
11 3
12 2
13 1
;
proc sort data=have out=want;
by descending Time_Point;
run;
data want;
set want;
retain Filled_Data;
if not missing(Missing_Values) then
Filled_Data=Missing_Values;
run;
proc sort data=want;
by Time_Point;
run;
This answer works great for this scenario. Once I posted my question, I thought about doing the reverse sort. However, I was not sure what to do if both the first AND last observations were missing. I guess re-sort and rerun the same data step?
If what you described make sense with your data, then you could do it with something like:
data have;
input Time_Point Missing_Values;
cards;
1 .
2 .
3 .
4 .
5 7
6 .
7 6
8 5
9 .
10 4
11 3
12 2
13 .
;
proc sort data=have out=want;
by descending Time_Point;
run;
data want;
set want;
retain Filled_Data;
if not missing(Missing_Values) then
Filled_Data=Missing_Values;
run;
proc sort data=want;
by Time_Point;
run;
data want (drop=last_:);
set want;
retain last_Filled_Data;
if not missing(Filled_Data) then
last_Filled_Data=Filled_Data;
else Filled_Data=last_Filled_Data;
run;
have you tried walking up and down your data with the POINT= option of the SET statement?
It is another method.
data have; input Time_Point Missing_Values; cards; 1 . 2 . 3 . 4 . 5 7 6 . 7 6 8 5 9 . 10 4 11 3 12 2 13 1 ; run; data want; do until(missing_values or last); set have end=last; if not missing(missing_values) then want_value=missing_values; end; do until(missing_values or _last); set have end=_last; output; end; run;
Ksharp
Hello
Do your code works when the dataset is grouped?
I am dealing with this case and so far I can not make your code to worK?
Could you please give me a hand?
Thank you
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.