I'm trying to figure out how to impute missing values with the most recent value. I found a paper (https://documentation.sas.com/?docsetId=casdspgm&docsetTarget=p0x15q4nde1191n10s3x42uy4epc.htm&docse...), but it seems to assume an upper bound for the gap between non-missing values.
Let's say we have the following dataset:
class Group
1001 A
1002 B
1003 C
1003
1003 C
1020 D
1020
1020
1020
1020
Desired output:
class Group
1001 A
1002 B
1003 C
1003 C
1003 C
1020 D
1020 D
1020 D
1020 D
1020 D
Assume the group stays consistent within each class and the first record of a class always has a filled-in group. The number of records in a class is variable. I tried using the first. variable, but I keep getting a value of 1 even though the Group variable is a character variable, so I'm not sure what's happening there.
Yes, first. and last. variables are always 0 or 1. They are flags that mark the first and last row of each BY group; their values are not copied from the values of your GROUP variable in this example. BY-group processing is an important tool and well worth learning.
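To see those flags for yourself, here is a minimal sketch. It assumes your data set is named HAVE and is already sorted by CLASS; the names FLAGS, IS_FIRST and IS_LAST are made up for illustration. The first. and last. variables are temporary and are not written to the output, so the step copies them into ordinary variables.

```sas
/* Copy the temporary BY-group flags into real variables to inspect them. */
data flags;
  set have;
  by class;
  is_first = first.class;  /* 1 on the first row of each CLASS, else 0 */
  is_last  = last.class;   /* 1 on the last row of each CLASS, else 0  */
run;

proc print data=flags noobs;
run;
```

For the three rows of class 1003, IS_FIRST should come out as 1, 0, 0 and IS_LAST as 0, 0, 1, no matter what GROUP contains.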
If there are no other variables to consider, this would do the trick:
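data want;
  /* HAVE(OBS=0) is an empty master; every row of HAVE acts as a transaction. */
  /* UPDATE ignores missing transaction values, so the last non-missing GROUP */
  /* carries forward within each CLASS.                                       */
  update have (obs=0) have;
  by class;
  output;  /* write every row, not just the last one per CLASS */
run;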
If there are other variables that you want to leave unaffected, try it this way:
data want;
  set have;
  by class;
  /* Refresh the retained value at the start of each CLASS or when GROUP is non-blank. */
  if first.class or group > ' ' then new_group = group;
  retain new_group;
  drop group;
  rename new_group = group;
run;
Can you show us the desired output from this data set?
Why does the answer have to use LAG?
OK, I edited my post. It doesn't have to use LAG (and probably shouldn't), but I assumed that would be the most straightforward. Other solutions are still appreciated!
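For what it's worth, here is a rough sketch of why a single LAG call leaves gaps when several values in a row are missing. It assumes the data set is named HAVE; LAG_TRY and PREV_GROUP are made-up names. LAG returns the value the argument had on the previous call, which is the value as read from HAVE, not the filled-in one.

```sas
/* Naive one-step fill with LAG: only bridges a gap of a single row. */
data lag_try;
  set have;
  prev_group = lag(group);   /* call LAG on every row, before changing GROUP */
  if missing(group) then group = prev_group;
  drop prev_group;
run;
```

With the sample data this fills the blank row in class 1003 and the first blank row in class 1020, but the remaining rows of 1020 stay blank because the row before each of them was also blank when it was read. Covering longer gaps this way needs one more LAG level per extra row, which is where the upper bound on the gap comes from; RETAIN or UPDATE has no such limit.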
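data want;
  retain want_group;        /* keep the value across iterations */
  infile cards missover;    /* a short line leaves GROUP missing instead of reading the next line */
  input class group $;
  /* Overwrite the retained value only when GROUP is present. */
  if not missing(group) then want_group=group;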
cards;
1001 A
1002 B
1003 C
1003
1003 C
1020 D
1020
1020
1020
1020
;
run;
Hello @toesockshoe, this is easy with the UPDATE statement:
data have;
infile cards truncover;
input class Group $;
cards;
1001 A
1002 B
1003 C
1003
1003 C
1020 D
1020
1020
1020
1020
;
data want;
update have(obs=0) have;
by class;
output;
run;
Cheers mate, works like a charm.