BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
elbarto
Obsidian | Level 7

I have the following dataset:



DATA have;
input year	person_id	firm_id;
DATALINES;
2000	1	405
2001	1	405
2002	1	405
2003	1	302
2004	1	302
2005	1	302
2006	1	405
2007	1	405
2008	1	405
2009	1	543
2010	1	543
2005	2	112
2006	2	112
2007	2	112
2010	2	112
2011	2	405
2012	2	543
2013	2	112
2014	2	112
2015	2	112
2016	2	112
2017	2	112
2018	2	112
2019	2	112
2001	3	405
2004	3	405
2005	3	302
2010	3	302
2011	3	405
2014	3	405
2015	3	405
2016	3	405
2000	4	302
2005	4	405
2010	4	112

;
RUN;

 

 

I want to create a variable called tenure to produce the following:

DATA want;
input year	person_id	firm_id	tenure;
DATALINES;
2000	1	405	0
2001	1	405	1
2002	1	405	2
2003	1	302	0
2004	1	302	1
2005	1	302	2
2006	1	405	0
2007	1	405	1
2008	1	405	2
2009	1	543	0
2010	1	543	1
2005	2	112	0
2006	2	112	1
2007	2	112	2
2010	2	112	5
2011	2	405	0
2012	2	543	0
2013	2	112	0
2014	2	112	1
2015	2	112	2
2016	2	112	3
2017	2	112	4
2018	2	112	5
2019	2	112	6
2001	3	405	0
2004	3	405	3
2005	3	302	0
2010	3	302	5
2011	3	405	0
2014	3	405	3
2015	3	405	4
2016	3	405	5
2000	4	302	0
2005	4	405	0
2010	4	112	0
;
RUN;

 

Basically, tenure is constructed as the number of consecutive years that the same firm_id has served the person_id. For example, firm_id=405 serves person_id=1 from 2000 to 2002, but then person_id=1 switches to firm_id=302 in 2003, so the tenure variable resets to 0. In 2006, the person switches back to firm_id=405, even though this firm has served this person before, tenure is defined as consecutive years of service, so tenure begins from 0 again in 2006.

 

 

Note that there could be gaps in years, if there are gaps, and the firm_id did not change, then we assume the same firm_id has served the person_id. E.g., person_id=3 and firm_id=405 in 2001 and 2004. Even though there is a gap, since the firm_id does not change, tenure is 3 in 2004. If the firm_id changes when there is a gap, then tenure = 0. E.g., person_id=4.

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star
data want;
  set have;
  by person_id firm_id notsorted;
  tenure+dif(year);
  if first.firm_id then tenure=0;
run;

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

2 REPLIES 2
Kurt_Bremser
Super User
data want;
set have;
by person_id firm_id notsorted;
retain _year;
if first.firm_id then _year = year;
tenure = year - _year;
drop _year;
run;

Untested, posted from my tablet.

mkeintz
PROC Star
data want;
  set have;
  by person_id firm_id notsorted;
  tenure+dif(year);
  if first.firm_id then tenure=0;
run;

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 843 views
  • 2 likes
  • 3 in conversation