Now, I got following data structure.
What I want to do is to replace missing values of item2999 to item2999 of same id and same year but freq==A
so that it would be..
Anyone know how to do this without lag operator?
Many Thanks.
Below is the SAS code for making dataset.
data test;
input code year_ freq $ item2999;
cards;
1 2000 A 1934
1 2001 A 1111
1 2002 A 2425
1 2002 B .
1 2003 A 2134
1 2003 B .
1 2004 A 111
2 2000 A 2166
2 2001 A 124125
2 2002 A 34643634
2 2003 A 12321451
2 2003 B 241242
2 2004 B .
2 2004 A 484848
;
run;
And if you want to avoid sorting:-
data test;
input code year_ freq $ item2999;
cards;
1 2000 A 1934
1 2001 A 1111
1 2002 A 2425
1 2002 B .
1 2003 A 2134
1 2003 B .
1 2004 A 111
2 2000 A 2166
2 2001 A 124125
2 2002 A 34643634
2 2003 A 12321451
2 2003 B 241242
2 2004 B .
2 2004 A 484848
;
run;
data want;
if _N_ = 1 then do;
if 0 then set test;
declare hash h(dataset: "test(where=(not missing(item2999))keep=code year_ item2999");
h.defineKey('code','year_');
h.defineData('item2999');
h.defineDone();
end;
set test;
if missing(item2999) then h.find();
run;
Regards,
Naveen Srinivasan
PS if you are familiar with Hashes and know how to maintain and modify, you'll be fine otherwise just ignore. Cheers
something like this
proc sort data =test;
by code year_;
run;
data have;
set test;
by code year_;
retain item2999_new;
if item2999 ne . then item2999_new=item2999;
else item2999= item2999_new;
drop item2999_new;
run;
Here's a small detail that makes a big difference in the effectiveness of various solutions.
Sometimes the A record appears before the B record, sometimes the other way around. Is it OK to sort the data set by all three variables: CODE YEAR_ FREQ ?
OK, similar to what was already suggested:
proc sort data=have;
by code year_ freq;
run;
data want;
set have;
by code year_;
retain replacement_value;
if first.year_ then replacement_value = item2999;
else if item_2999=. then item_2999 = replacement_value;
drop replacement_value;
run;
There are safeguards that could complicate the program (not included here). For example, what if a year contains two B records, but no A records. Should the first B record supply a value to replace a missing second B record? I skipped that sort of complication.
And if you want to avoid sorting:-
data test;
input code year_ freq $ item2999;
cards;
1 2000 A 1934
1 2001 A 1111
1 2002 A 2425
1 2002 B .
1 2003 A 2134
1 2003 B .
1 2004 A 111
2 2000 A 2166
2 2001 A 124125
2 2002 A 34643634
2 2003 A 12321451
2 2003 B 241242
2 2004 B .
2 2004 A 484848
;
run;
data want;
if _N_ = 1 then do;
if 0 then set test;
declare hash h(dataset: "test(where=(not missing(item2999))keep=code year_ item2999");
h.defineKey('code','year_');
h.defineData('item2999');
h.defineDone();
end;
set test;
if missing(item2999) then h.find();
run;
Regards,
Naveen Srinivasan
PS if you are familiar with Hashes and know how to maintain and modify, you'll be fine otherwise just ignore. Cheers
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.