BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
abdulla
Pyrite | Level 9
GVKEYfyearEXECIDAGE
00253820070215856
002538200802158 
002538200902158 
002538201002158 
00253820114244157
00253820124244158
00253820134244159
00253820144243963
00253820153077650
00253820163077651
002554199300222 
00255419940022259
00255419950022260
002554199600222 
00255419970022262
002554199800222 

 

I want the following result. It is filling the missing age by execid (based on previously given age)

GVKEYfyearEXECIDAGE
00253820070215856
00253820080215857
00253820090215858
00253820100215859
00253820114244157
00253820124244158
00253820134244159
00253820144243963
00253820153077650
00253820163077651
002554199300222 
00255419940022259
00255419950022260
00255419960022261
00255419970022262
00255419980022263
1 ACCEPTED SOLUTION
8 REPLIES 8
Reeza
Super User
How do you know which one is correct? Is it always the assumption that the first record is correct?
abdulla
Pyrite | Level 9
Yes, first record is correct. But sometimes first record is missing and for that I want to do it by descending fyear
novinosrin
Tourmaline | Level 20

HI @abdulla  I would recommend learning a set of doctrines called "Dorfmanisms" taught by Guru @hashman  His disciples like me have the privilege of plagiarizing it 🙂  

 



data have;
infile cards truncover;
input GVKEY $	fyear	EXECID $	AGE;
cards;
002538	2007	02158	56
002538	2008	02158	 
002538	2009	02158	 
002538	2010	02158	 
002538	2011	42441	57
002538	2012	42441	58
002538	2013	42441	59
002538	2014	42439	63
002538	2015	30776	50
002538	2016	30776	51
002554	1993	00222	 
002554	1994	00222	59
002554	1995	00222	60
002554	1996	00222	 
002554	1997	00222	62
002554	1998	00222	 
;

data want;
 _iorc_=.;
 do until(last.execid);
  set have;
  by gvkey execid notsorted;
  if age then _iorc_=age;
  else age=_iorc_;
  output;
  _iorc_+1;
 end;
run;

proc print noobs;run;

 

Kurt_Bremser
Super User

@novinosrin your code has tabs in the datalines, so you should use dlm='09'x.

And it does not solve the issue of a first observation of a group having a missing age.

novinosrin
Tourmaline | Level 20

HI Sir @Kurt_Bremser  Thank you. The tab delimiter didn't quite impact the output when I ran using PC SAS however yes it helps to be cognizant when running the same in a SAS EG editor page, and so I have added EXPANDTABS

 



data have;
infile cards expandtabs truncover;
input GVKEY $	fyear	EXECID $	AGE;
cards;
002538	2007	02158	56
002538	2008	02158	 
002538	2009	02158	 
002538	2010	02158	 
002538	2011	42441	57
002538	2012	42441	58
002538	2013	42441	59
002538	2014	42439	63
002538	2015	30776	50
002538	2016	30776	51
002554	1993	00222	 
002554	1994	00222	59
002554	1995	00222	60
002554	1996	00222	 
002554	1997	00222	62
002554	1998	00222	 
;

As far as "And it does not solve the issue of a first observation of a group having a missing age."-

I believe you are referring to

GVKEY fyear EXECID AGE
002554 1993 00222 .

OP's expected out(WANT in the question) below seems to be start with the 1st non-missing value

002554 1993 00222  
002554 1994 00222 59
002554 1995 00222 60
002554 1996 00222 61
002554 1997 00222 62
002554 1998 00222 63

 

which leads to me to believe the logic doesn't have to solve for the very first observation of a BY GROUP i.e. GVKEY EXECID combination. So, thus far, it appears my results seem to match the WANT of OP. 

I guess whether the expected out WANT is actually different, I am afraid that would have to be confirmed by OP

novinosrin
Tourmaline | Level 20

Hmm in that case, I guess the logic becomes a mere sequential counter starting from the earliest non-missing age value-

So, no real charm


data have;
infile cards expandtabs truncover;
input GVKEY $	fyear	EXECID $	AGE;
cards;
002538	2007	02158	56
002538	2008	02158	 
002538	2009	02158	 
002538	2010	02158	 
002538	2011	42441	57
002538	2012	42441	58
002538	2013	42441	59
002538	2014	42439	63
002538	2015	30776	50
002538	2016	30776	51
002554	1993	00222	 
002554	1994	00222	59
002554	1995	00222	60
002554	1996	00222	 
002554	1997	00222	62
002554	1998	00222	 
;
data want;
 if _n_=1 then do;
  dcl hash h(dataset:'have(where=(AGE))');
  h.definekey('GVKEY','EXECID');
  h.definedata('age');
  h.definedone();
 end;
 do until(last.execid);
  set have(drop=age);
  by gvkey execid notsorted;
  if first.execid then _n_=h.find();
  else age=sum(age,1);
  output;
 end;
run;

proc print noobs;run;

 

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 820 views
  • 2 likes
  • 4 in conversation