I know how to use the RETAIN statement to fill in missing values with the last non-missing value, but is there a way to do it backwards? That is, can I fill in previous missing values with the next non-missing value?
Example - I am trying to get "Filled Data" from the "Missing Values" column:
Time Point | Missing Values | Filled Data |
---|---|---|
1 | . | 7 |
2 | . | 7 |
3 | . | 7 |
4 | . | 7 |
5 | 7 | 7 |
6 | . | 6 |
7 | 6 | 6 |
8 | 5 | 5 |
9 | . | 4 |
10 | 4 | 4 |
11 | 3 | 3 |
12 | 2 | 2 |
13 | 1 | 1 |
One way would be to just presort the data in descending order and apply the same logic. e.g.,
data have;
input Time_Point Missing_Values;
cards;
1 .
2 .
3 .
4 .
5 7
6 .
7 6
8 5
9 .
10 4
11 3
12 2
13 1
;
proc sort data=have out=want;
by descending Time_Point;
run;
data want;
set want;
retain Filled_Data;
if not missing(Missing_Values) then
Filled_Data=Missing_Values;
run;
proc sort data=want;
by Time_Point;
run;
One way would be to just presort the data in descending order and apply the same logic. e.g.,
data have;
input Time_Point Missing_Values;
cards;
1 .
2 .
3 .
4 .
5 7
6 .
7 6
8 5
9 .
10 4
11 3
12 2
13 1
;
proc sort data=have out=want;
by descending Time_Point;
run;
data want;
set want;
retain Filled_Data;
if not missing(Missing_Values) then
Filled_Data=Missing_Values;
run;
proc sort data=want;
by Time_Point;
run;
This answer works great for this scenario. Once I posted my question, I thought about doing the reverse sort. However, I was not sure what to do if both the first AND last observations were missing. I guess re-sort and rerun the same data step?
If what you described make sense with your data, then you could do it with something like:
data have;
input Time_Point Missing_Values;
cards;
1 .
2 .
3 .
4 .
5 7
6 .
7 6
8 5
9 .
10 4
11 3
12 2
13 .
;
proc sort data=have out=want;
by descending Time_Point;
run;
data want;
set want;
retain Filled_Data;
if not missing(Missing_Values) then
Filled_Data=Missing_Values;
run;
proc sort data=want;
by Time_Point;
run;
data want (drop=last_:);
set want;
retain last_Filled_Data;
if not missing(Filled_Data) then
last_Filled_Data=Filled_Data;
else Filled_Data=last_Filled_Data;
run;
have you tried walking up and down your data with the POINT= option of the SET statement?
It is another method.
data have; input Time_Point Missing_Values; cards; 1 . 2 . 3 . 4 . 5 7 6 . 7 6 8 5 9 . 10 4 11 3 12 2 13 1 ; run; data want; do until(missing_values or last); set have end=last; if not missing(missing_values) then want_value=missing_values; end; do until(missing_values or _last); set have end=_last; output; end; run;
Ksharp
Hello
Do your code works when the dataset is grouped?
I am dealing with this case and so far I can not make your code to worK?
Could you please give me a hand?
Thank you
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.