I have the following dataset:
DATA have;
input year person_id firm_id;
DATALINES;
2000 1 405
2001 1 405
2002 1 405
2003 1 302
2004 1 302
2005 1 302
2006 1 405
2007 1 405
2008 1 405
2009 1 543
2010 1 543
2005 2 112
2006 2 112
2007 2 112
2010 2 112
2011 2 405
2012 2 543
2013 2 112
2014 2 112
2015 2 112
2016 2 112
2017 2 112
2018 2 112
2019 2 112
2001 3 405
2004 3 405
2005 3 302
2010 3 302
2011 3 405
2014 3 405
2015 3 405
2016 3 405
2000 4 302
2005 4 405
2010 4 112
;
RUN;
I want to create a variable called tenure to produce the following:
DATA want;
input year person_id firm_id tenure;
DATALINES;
2000 1 405 0
2001 1 405 1
2002 1 405 2
2003 1 302 0
2004 1 302 1
2005 1 302 2
2006 1 405 0
2007 1 405 1
2008 1 405 2
2009 1 543 0
2010 1 543 1
2005 2 112 0
2006 2 112 1
2007 2 112 2
2010 2 112 5
2011 2 405 0
2012 2 543 0
2013 2 112 0
2014 2 112 1
2015 2 112 2
2016 2 112 3
2017 2 112 4
2018 2 112 5
2019 2 112 6
2001 3 405 0
2004 3 405 3
2005 3 302 0
2010 3 302 5
2011 3 405 0
2014 3 405 3
2015 3 405 4
2016 3 405 5
2000 4 302 0
2005 4 405 0
2010 4 112 0
;
RUN;
Basically, tenure is constructed as the number of consecutive years that the same firm_id has served the person_id. For example, firm_id=405 serves person_id=1 from 2000 to 2002, but then person_id=1 switches to firm_id=302 in 2003, so the tenure variable resets to 0. In 2006, the person switches back to firm_id=405, even though this firm has served this person before, tenure is defined as consecutive years of service, so tenure begins from 0 again in 2006.
Note that there could be gaps in years, if there are gaps, and the firm_id did not change, then we assume the same firm_id has served the person_id. E.g., person_id=3 and firm_id=405 in 2001 and 2004. Even though there is a gap, since the firm_id does not change, tenure is 3 in 2004. If the firm_id changes when there is a gap, then tenure = 0. E.g., person_id=4.
data want;
set have;
by person_id firm_id notsorted;
tenure+dif(year);
if first.firm_id then tenure=0;
run;
data want;
set have;
by person_id firm_id notsorted;
retain _year;
if first.firm_id then _year = year;
tenure = year - _year;
drop _year;
run;
Untested, posted from my tablet.
data want;
set have;
by person_id firm_id notsorted;
tenure+dif(year);
if first.firm_id then tenure=0;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.