GVKEY | fyear | EXECID | AGE |
002538 | 2007 | 02158 | 56 |
002538 | 2008 | 02158 | |
002538 | 2009 | 02158 | |
002538 | 2010 | 02158 | |
002538 | 2011 | 42441 | 57 |
002538 | 2012 | 42441 | 58 |
002538 | 2013 | 42441 | 59 |
002538 | 2014 | 42439 | 63 |
002538 | 2015 | 30776 | 50 |
002538 | 2016 | 30776 | 51 |
002554 | 1993 | 00222 | |
002554 | 1994 | 00222 | 59 |
002554 | 1995 | 00222 | 60 |
002554 | 1996 | 00222 | |
002554 | 1997 | 00222 | 62 |
002554 | 1998 | 00222 |
I want the following result. It is filling the missing age by execid (based on previously given age)
GVKEY | fyear | EXECID | AGE |
002538 | 2007 | 02158 | 56 |
002538 | 2008 | 02158 | 57 |
002538 | 2009 | 02158 | 58 |
002538 | 2010 | 02158 | 59 |
002538 | 2011 | 42441 | 57 |
002538 | 2012 | 42441 | 58 |
002538 | 2013 | 42441 | 59 |
002538 | 2014 | 42439 | 63 |
002538 | 2015 | 30776 | 50 |
002538 | 2016 | 30776 | 51 |
002554 | 1993 | 00222 | |
002554 | 1994 | 00222 | 59 |
002554 | 1995 | 00222 | 60 |
002554 | 1996 | 00222 | 61 |
002554 | 1997 | 00222 | 62 |
002554 | 1998 | 00222 | 63 |
HI @abdulla I would recommend learning a set of doctrines called "Dorfmanisms" taught by Guru @hashman His disciples like me have the privilege of plagiarizing it 🙂
data have;
infile cards truncover;
input GVKEY $ fyear EXECID $ AGE;
cards;
002538 2007 02158 56
002538 2008 02158
002538 2009 02158
002538 2010 02158
002538 2011 42441 57
002538 2012 42441 58
002538 2013 42441 59
002538 2014 42439 63
002538 2015 30776 50
002538 2016 30776 51
002554 1993 00222
002554 1994 00222 59
002554 1995 00222 60
002554 1996 00222
002554 1997 00222 62
002554 1998 00222
;
data want;
_iorc_=.;
do until(last.execid);
set have;
by gvkey execid notsorted;
if age then _iorc_=age;
else age=_iorc_;
output;
_iorc_+1;
end;
run;
proc print noobs;run;
@novinosrin your code has tabs in the datalines, so you should use dlm='09'x.
And it does not solve the issue of a first observation of a group having a missing age.
HI Sir @Kurt_Bremser Thank you. The tab delimiter didn't quite impact the output when I ran using PC SAS however yes it helps to be cognizant when running the same in a SAS EG editor page, and so I have added EXPANDTABS
data have;
infile cards expandtabs truncover;
input GVKEY $ fyear EXECID $ AGE;
cards;
002538 2007 02158 56
002538 2008 02158
002538 2009 02158
002538 2010 02158
002538 2011 42441 57
002538 2012 42441 58
002538 2013 42441 59
002538 2014 42439 63
002538 2015 30776 50
002538 2016 30776 51
002554 1993 00222
002554 1994 00222 59
002554 1995 00222 60
002554 1996 00222
002554 1997 00222 62
002554 1998 00222
;
As far as "And it does not solve the issue of a first observation of a group having a missing age."-
I believe you are referring to
GVKEY fyear EXECID AGE
002554 1993 00222 .
OP's expected out(WANT in the question) below seems to be start with the 1st non-missing value
002554 | 1993 | 00222 | |
002554 | 1994 | 00222 | 59 |
002554 | 1995 | 00222 | 60 |
002554 | 1996 | 00222 | 61 |
002554 | 1997 | 00222 | 62 |
002554 | 1998 | 00222 | 63 |
which leads to me to believe the logic doesn't have to solve for the very first observation of a BY GROUP i.e. GVKEY EXECID combination. So, thus far, it appears my results seem to match the WANT of OP.
I guess whether the expected out WANT is actually different, I am afraid that would have to be confirmed by OP
Yes, but later the OP stated this
But sometimes first record is missing and for that I want to do it by descending fyear
Hmm in that case, I guess the logic becomes a mere sequential counter starting from the earliest non-missing age value-
So, no real charm
data have;
infile cards expandtabs truncover;
input GVKEY $ fyear EXECID $ AGE;
cards;
002538 2007 02158 56
002538 2008 02158
002538 2009 02158
002538 2010 02158
002538 2011 42441 57
002538 2012 42441 58
002538 2013 42441 59
002538 2014 42439 63
002538 2015 30776 50
002538 2016 30776 51
002554 1993 00222
002554 1994 00222 59
002554 1995 00222 60
002554 1996 00222
002554 1997 00222 62
002554 1998 00222
;
data want;
if _n_=1 then do;
dcl hash h(dataset:'have(where=(AGE))');
h.definekey('GVKEY','EXECID');
h.definedata('age');
h.definedone();
end;
do until(last.execid);
set have(drop=age);
by gvkey execid notsorted;
if first.execid then _n_=h.find();
else age=sum(age,1);
output;
end;
run;
proc print noobs;run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.