Now, I got following data structure.
What I want to do is to replace missing values of item2999 to item2999 of same id and same year but freq==A
so that it would be..
Anyone know how to do this without lag operator?
Many Thanks.
Below is the SAS code for making dataset.
data test;
input code year_ freq $ item2999;
cards;
1 2000 A 1934
1 2001 A 1111
1 2002 A 2425
1 2002 B .
1 2003 A 2134
1 2003 B .
1 2004 A 111
2 2000 A 2166
2 2001 A 124125
2 2002 A 34643634
2 2003 A 12321451
2 2003 B 241242
2 2004 B .
2 2004 A 484848
;
run;
And if you want to avoid sorting:-
data test;
input code year_ freq $ item2999;
cards;
1 2000 A 1934
1 2001 A 1111
1 2002 A 2425
1 2002 B .
1 2003 A 2134
1 2003 B .
1 2004 A 111
2 2000 A 2166
2 2001 A 124125
2 2002 A 34643634
2 2003 A 12321451
2 2003 B 241242
2 2004 B .
2 2004 A 484848
;
run;
data want;
if _N_ = 1 then do;
if 0 then set test;
declare hash h(dataset: "test(where=(not missing(item2999))keep=code year_ item2999");
h.defineKey('code','year_');
h.defineData('item2999');
h.defineDone();
end;
set test;
if missing(item2999) then h.find();
run;
Regards,
Naveen Srinivasan
PS if you are familiar with Hashes and know how to maintain and modify, you'll be fine otherwise just ignore. Cheers
something like this
proc sort data =test;
by code year_;
run;
data have;
set test;
by code year_;
retain item2999_new;
if item2999 ne . then item2999_new=item2999;
else item2999= item2999_new;
drop item2999_new;
run;
Here's a small detail that makes a big difference in the effectiveness of various solutions.
Sometimes the A record appears before the B record, sometimes the other way around. Is it OK to sort the data set by all three variables: CODE YEAR_ FREQ ?
OK, similar to what was already suggested:
proc sort data=have;
by code year_ freq;
run;
data want;
set have;
by code year_;
retain replacement_value;
if first.year_ then replacement_value = item2999;
else if item_2999=. then item_2999 = replacement_value;
drop replacement_value;
run;
There are safeguards that could complicate the program (not included here). For example, what if a year contains two B records, but no A records. Should the first B record supply a value to replace a missing second B record? I skipped that sort of complication.
And if you want to avoid sorting:-
data test;
input code year_ freq $ item2999;
cards;
1 2000 A 1934
1 2001 A 1111
1 2002 A 2425
1 2002 B .
1 2003 A 2134
1 2003 B .
1 2004 A 111
2 2000 A 2166
2 2001 A 124125
2 2002 A 34643634
2 2003 A 12321451
2 2003 B 241242
2 2004 B .
2 2004 A 484848
;
run;
data want;
if _N_ = 1 then do;
if 0 then set test;
declare hash h(dataset: "test(where=(not missing(item2999))keep=code year_ item2999");
h.defineKey('code','year_');
h.defineData('item2999');
h.defineDone();
end;
set test;
if missing(item2999) then h.find();
run;
Regards,
Naveen Srinivasan
PS if you are familiar with Hashes and know how to maintain and modify, you'll be fine otherwise just ignore. Cheers
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.