Now, I got following data structure.
What I want to do is to replace missing values of item2999 to item2999 of same id and same year but freq==A
so that it would be..
Anyone know how to do this without lag operator?
Many Thanks.
Below is the SAS code for making dataset.
data test;
input code year_ freq $ item2999;
cards;
1 2000 A 1934
1 2001 A 1111
1 2002 A 2425
1 2002 B .
1 2003 A 2134
1 2003 B .
1 2004 A 111
2 2000 A 2166
2 2001 A 124125
2 2002 A 34643634
2 2003 A 12321451
2 2003 B 241242
2 2004 B .
2 2004 A 484848
;
run;
And if you want to avoid sorting:-
data test;
input code year_ freq $ item2999;
cards;
1 2000 A 1934
1 2001 A 1111
1 2002 A 2425
1 2002 B .
1 2003 A 2134
1 2003 B .
1 2004 A 111
2 2000 A 2166
2 2001 A 124125
2 2002 A 34643634
2 2003 A 12321451
2 2003 B 241242
2 2004 B .
2 2004 A 484848
;
run;
data want;
if _N_ = 1 then do;
if 0 then set test;
declare hash h(dataset: "test(where=(not missing(item2999))keep=code year_ item2999");
h.defineKey('code','year_');
h.defineData('item2999');
h.defineDone();
end;
set test;
if missing(item2999) then h.find();
run;
Regards,
Naveen Srinivasan
PS if you are familiar with Hashes and know how to maintain and modify, you'll be fine otherwise just ignore. Cheers
something like this
proc sort data =test;
by code year_;
run;
data have;
set test;
by code year_;
retain item2999_new;
if item2999 ne . then item2999_new=item2999;
else item2999= item2999_new;
drop item2999_new;
run;
Here's a small detail that makes a big difference in the effectiveness of various solutions.
Sometimes the A record appears before the B record, sometimes the other way around. Is it OK to sort the data set by all three variables: CODE YEAR_ FREQ ?
OK, similar to what was already suggested:
proc sort data=have;
by code year_ freq;
run;
data want;
set have;
by code year_;
retain replacement_value;
if first.year_ then replacement_value = item2999;
else if item_2999=. then item_2999 = replacement_value;
drop replacement_value;
run;
There are safeguards that could complicate the program (not included here). For example, what if a year contains two B records, but no A records. Should the first B record supply a value to replace a missing second B record? I skipped that sort of complication.
And if you want to avoid sorting:-
data test;
input code year_ freq $ item2999;
cards;
1 2000 A 1934
1 2001 A 1111
1 2002 A 2425
1 2002 B .
1 2003 A 2134
1 2003 B .
1 2004 A 111
2 2000 A 2166
2 2001 A 124125
2 2002 A 34643634
2 2003 A 12321451
2 2003 B 241242
2 2004 B .
2 2004 A 484848
;
run;
data want;
if _N_ = 1 then do;
if 0 then set test;
declare hash h(dataset: "test(where=(not missing(item2999))keep=code year_ item2999");
h.defineKey('code','year_');
h.defineData('item2999');
h.defineDone();
end;
set test;
if missing(item2999) then h.find();
run;
Regards,
Naveen Srinivasan
PS if you are familiar with Hashes and know how to maintain and modify, you'll be fine otherwise just ignore. Cheers
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.