BookmarkSubscribeRSS Feed
Taoran
Calcite | Level 5

Hi all,

 

I want to backfill the ExecID data if BECAMECEO<=fyear<=LEFTOFC. 

Take year 1979 as an example, the EXECID should be the EXECID in 1994 because 1979 is within the executive employment period. 

 

year in ExecuCompDate became CEOdate left CEO name EXECID
1979    
1980    
1981    
1982    
1983    
1984    
1985    
1986    
1987    
1988    
1989    
1990    
1991    
1992    
1993    
199401Jan195501OCT1996Ira A. Eichner9248
199501Jan195501OCT1996Ira A. Eichner9249
199609OCT199631MAY2018David P. Storch9249

 

How can I code this to backfill the data?

6 REPLIES 6
mkeintz
PROC Star

If you want a reliable, tested code sample, please provide your sample data in the form of a working data step.  You don't even have a column named fyear, yet your refer to it in your question.  Probably we can guess, but is it a good idea to force your advisors to guess?  Help us help you.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Taoran
Calcite | Level 5

This is the data I have:

FyearDate became CEOdate left CEO name EXECID
1979   .
1980   .
1981   .
1982   .
1983   .
1984   .
1985   .
1986   .
1987   .
1988   .
1989   .
1990   .
1991   .
1992   .
1993   .
199401Jan195501OCT1996Ira A. Eichner9248
199501Jan195501OCT1996Ira A. Eichner9249
199609OCT199631MAY2018David P. Storch9249

 

And this is the data I want:

FyearDate became CEOdate left CEO name EXECID
1979   9248
1980   9248
1981   9248
1982   9248
1983   9248
1984   9248
1985   9248
1986   9248
1987   9248
1988   9248
1989   9248
1990   9248
1991   9248
1992   9248
1993   9248
199401Jan195501OCT1996Ira A. Eichner9248
199501Jan195501OCT1996Ira A. Eichner9249
199609OCT199631MAY2018David P. Storch9249
AhmedAl_Attar
Ammonite | Level 13

@Taoran 

Try this 

/* First sort descending to give the latest values at top */
proc sort data=have out=have_srtd; 
	by descending 'year in ExecuComp'n ;
run;

/* Keep track the first non-missing Previous value and use it to replace the missing value */
data want(drop=prev_val);
	set work.have_srtd;
	retain prev_val;
	if (lag(EXECID) ge 0) then prev_val=lag(EXECID);
	put prev_val= EXECID=;
	if (missing(EXECID)=1) then EXECID=prev_val;
	put prev_val= EXECID=;
run;
/* Restore original sort order */
proc sort data=work.want;
	by 'year in ExecuComp'n ;
run;
Taoran
Calcite | Level 5

Hi AhmedA1,

 

Thanks for your reply. 

I did not quite understand your code, but I did not see the code like  'BECAMECEO<=year in execucomp<=LEFTOFC' appears, and this is a really important condition that I want. 

AhmedAl_Attar
Ammonite | Level 13
Comparing year to a date will never work!
the end-year and start-year in your sample data are overlapping!!

1995 01Jan1955 01OCT1996 Ira A. Eichner 9249
1996 09OCT1996 31MAY2018 David P. Storch 9249

When comparing 1996 to date ranges, it evaluates to true for both rows
mkeintz
PROC Star

Editte note: you still haven't provided a working data step, so the code below is untested.

 

 

This looks like Execucomp data from Compustat.  If so, then fiscal year (variable FYEAR) need not include January 1st of the same nominal year.   For instance,  fiscal year May 1993-April 1994 is identified as fiscal year 1993 (more months in calendar 1993 than calendar 1994), but does not include Jan 1, 1993.  If a person became CEO of such a company in February 1994 you would presumably miss the fact that they overlap with FYEAR=1993.   Isn't there another variable indicating which month (with possible values 1 through 12) is the last of a fiscal year  Fiscal years ending in January - through May are identified with the year in which they start.  Fiscal years ending in June through December are identified with the year containing the end month.

 

And more generally, you haven't stated what you want done when two people are CEO's in the same FYEAR.  You show one record per FYEAR.  So if there are two (or more!) CEO's in one fiscal year, what do you want in the result?  Multiple records?   One record with the CEO having has the most months?  But if so, then how will you determine which CEO has the most months?  Especially given that fiscal years need not align with calendar years.

 

Also, presumably you have this need to retrieve future values for multiple companies (multiple gvkeys?), so you would need your sample data to have gvkey, and to be sorted by gvkey fyear.

 

data want (drop=y);
  set have (in=firstpass)  have (in=secondpass);
  by gvkey ;

  array id_hist {1990:2020} _temporary_;

  if first.gvkey then call missing(of id_hist{*});
  if firstpass=1 and nmiss(begindate,enddate)=0 then do y=year(begindate) to year(enddate);
    id_hist(y)=execid;
  end;
  if secondpass;
  if missing(execid) then execid=id_hist{fyear};
run;

For any observation with a blank execid, if that fyear value overlapped if more than one date range for CEO's, this program would return the latter execid.  

 

Also note this program does not sort in descending chronological order, then carry "forward" values, then re-sort in ascending order.  Instead it reads each GVKEY twice - once to establish an ID_HIST, the second time to retrieve from that history.  

 

Finally, this program covers 1990 through 2020.  If you need more years, change the lower and/or upper bounds of the id_hist array.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1406 views
  • 0 likes
  • 3 in conversation