I have the following dataset:
DATA have;
input year person_id firm_id;
DATALINES;
2000 1 405
2001 1 405
2002 1 405
2003 1 302
2004 1 302
2005 1 302
2006 1 405
2007 1 405
2008 1 405
2009 1 543
2010 1 543
2005 2 112
2006 2 112
2007 2 112
2010 2 112
2011 2 405
2012 2 543
2013 2 112
2014 2 112
2015 2 112
2016 2 112
2017 2 112
2018 2 112
2019 2 112
2001 3 405
2004 3 405
2005 3 302
2010 3 302
2011 3 405
2014 3 405
2015 3 405
2016 3 405
2000 4 302
2005 4 405
2010 4 112
;
RUN;
I want to create a variable called tenure to produce the following:
DATA want;
input year person_id firm_id tenure;
DATALINES;
2000 1 405 0
2001 1 405 1
2002 1 405 2
2003 1 302 0
2004 1 302 1
2005 1 302 2
2006 1 405 0
2007 1 405 1
2008 1 405 2
2009 1 543 0
2010 1 543 1
2005 2 112 0
2006 2 112 1
2007 2 112 2
2010 2 112 5
2011 2 405 0
2012 2 543 0
2013 2 112 0
2014 2 112 1
2015 2 112 2
2016 2 112 3
2017 2 112 4
2018 2 112 5
2019 2 112 6
2001 3 405 0
2004 3 405 3
2005 3 302 0
2010 3 302 5
2011 3 405 0
2014 3 405 3
2015 3 405 4
2016 3 405 5
2000 4 302 0
2005 4 405 0
2010 4 112 0
;
RUN;
Basically, tenure is constructed as the number of consecutive years that the same firm_id has served the person_id. For example, firm_id=405 serves person_id=1 from 2000 to 2002, but then person_id=1 switches to firm_id=302 in 2003, so the tenure variable resets to 0. In 2006, the person switches back to firm_id=405, even though this firm has served this person before, tenure is defined as consecutive years of service, so tenure begins from 0 again in 2006.
Note that there could be gaps in years, if there are gaps, and the firm_id did not change, then we assume the same firm_id has served the person_id. E.g., person_id=3 and firm_id=405 in 2001 and 2004. Even though there is a gap, since the firm_id does not change, tenure is 3 in 2004. If the firm_id changes when there is a gap, then tenure = 0. E.g., person_id=4.
data want;
set have;
by person_id firm_id notsorted;
tenure+dif(year);
if first.firm_id then tenure=0;
run;
data want;
set have;
by person_id firm_id notsorted;
retain _year;
if first.firm_id then _year = year;
tenure = year - _year;
drop _year;
run;
Untested, posted from my tablet.
data want;
set have;
by person_id firm_id notsorted;
tenure+dif(year);
if first.firm_id then tenure=0;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.