BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
serrld113
Obsidian | Level 7

Hey guys, I'm struggling with building a counter variable.

 

What I have to do is count the number of years an employee has been in a certain position in a certain department, without counting the employees who were in that position previously. I only need continuous years of employment, if there is a break in employment, then I would only want to count from the most recent hire year.

 

My variables are: year, department, position, emp_id,. After running a proc sql to pull the employees I ran a proc sort and then this data step:

 

data makegroup;
set match;
countvar +1;
by year department position emp_id;
if first.year then countvar=1;
run;

This unfortunately starts a counter from 1 all the way to 40k+ ....

 

 

The next step I have

 

proc sql;
create table want as 
select year, department, position, emp_id,
		count(countvar) as total_years
from makegroup
group by year, department, position, emp_id
having year = max(year)
order by year, department, position, emp_id
;
quit;

Which is great except it gives me the employees that were previously in that position and it doesn't account for breaks in employment.

 

 

Any help would be much appreciated! Thank you

 

Edit:

 

For example:

 

in this table there is a break between 0708 and 0910, so I want to count 9 years, instead my code counts for 12.

yeardepartmentpositionemp_id
506salesassistls3891g
607salesassistls3891g
708salesassistls3891g
910salesassistls3891g
1011salesassistls3891g
1112salesassistls3891g
1213salesassistls3891g
1314salesassistls3891g
1415salesassistls3891g
1516salesassistls3891g
1617salesassistls3891g
1718salesassistls3891g
1 ACCEPTED SOLUTION

Accepted Solutions
serrld113
Obsidian | Level 7

I ended up doing a proc transpose...

 

proc transpose data = testin out = testout;
by department position emp_id;
var year;
run;

 

then a proc sort followed by this:

 

data testout;
set testout;
else if col1 = 1718 and col2 = 1617 and col3 = 1516 and col4 = 1415 and col5 = 1314 and col6 = 1213 then consec_years = 6;
else if col1 = 1718 and col2 = 1617 and col3 = 1516 and col4 = 1415 and col5 = 1314 then consec_years = 5;
else if col1 = 1718 and col2 = 1617 and col3 = 1516 and col4 = 1415 then consec_years = 4;
else if col1 = 1718 and col2 = 1617 and col3 = 1516 then consec_years = 3;
else if col1 = 1718 and col2 = 1617 then consec_years = 2;
else if col1 = 1718 then consec_years = 1;
run;

 

Hopefully this will help anybody else with the same problem as me. Thank you to everybody that helped out

View solution in original post

9 REPLIES 9
Reeza
Super User
count( distinct countvar)

 

Are you trying to count the distinct occurrences? 

Larrihoover
Obsidian | Level 7
You could easily do something like left join select distinct on position and emp_id where position = "one to get rid of". Then on the join do on a.emp_id != b.emp_id. To remove the employee IDs that already had the position. You would have to recount after removing the employees, if you do not want to recount instead of removing make a flag and then in if first. statement say if first.year and flag=1 then countvar=1;. Hard for me to do without coding the whole thing but this should give you some ideas.
Reeza
Super User


Hi ladies,

 

You need a counter for each employee, so it's first.empid not first.year

It's possible you may need to modify your BY order as well, but you can play around with that and see what you need. 

In this case if they switch department/position the counter will reset, if you don't want that remove those from the BY statement or put them after the emp_id in the BY statement.

 

data makegroup;
set match;

by year department position emp_id;

if first.emp_id r then countvar=1;

countvar +1;

run;

PS. Please only ask one question at a time, it helps you to get answers faster.

 


@serrld113 wrote:

Hey guys, I'm struggling with building a counter variable.

 

What I have to do is count the number of years an employee has been in a certain position in a certain department, without counting the employees who were in that position previously. I only need continuous years of employment, if there is a break in employment, then I would only want to count from the most recent hire year.

 

My variables are: year, department, position, emp_id,. After running a proc sql to pull the employees I ran a proc sort and then this data step:

 

data makegroup;
set match;
countvar +1;
by year department position emp_id;
if first.year then countvar=1;
run;

This unfortunately starts a counter from 1 all the way to 40k+ ....

 

 

The next step I have

 

proc sql;
create table want as 
select year, department, position, emp_id,
		count(countvar) as total_years
from makegroup
group by year, department, position, emp_id
having year = max(year)
order by year, department, position, emp_id
;
quit;

Which is great except it gives me the employees that were previously in that position and it doesn't account for breaks in employment.

 

 

Any help would be much appreciated! Thank you

 

Edit:

 

For example:

 

in this table there is a break between 0708 and 0910, so I want to count 9 years, instead my code counts for 12.

year department position emp_id
506 sales assist ls3891g
607 sales assist ls3891g
708 sales assist ls3891g
910 sales assist ls3891g
1011 sales assist ls3891g
1112 sales assist ls3891g
1213 sales assist ls3891g
1314 sales assist ls3891g
1415 sales assist ls3891g
1516 sales assist ls3891g
1617 sales assist ls3891g
1718 sales assist ls3891g

 

 

 

Larrihoover
Obsidian | Level 7
Now that I look at it again putting multiple conditions in the if statement like Reeza is saying is the best way. To fix the last part it would be really easy just to make a year4 field and do all the conditions in there if first.emp_id r and first.year4 then countvar=1; .
serrld113
Obsidian | Level 7

Even with this solution, my counter is still saying 12 instead of 9

Reeza
Super User

Post your code. 

 

What would you expect from the sample data you provided above? 

Assuming that's all of your data what would the output look like.

serrld113
Obsidian | Level 7
data test;
input year department $ position $ emp_id $;
datalines;
506 sales assist ls3891g 
607 sales assist ls3891g 
708 sales assist ls3891g 
910 sales assist ls3891g 
1011 sales assist ls3891g 
1112 sales assist ls3891g 
1213 sales assist ls3891g 
1314 sales assist ls3891g 
1415 sales assist ls3891g 
1516 sales assist ls3891g 
1617 sales assist ls3891g 
1718 sales assist ls3891g 
;
run;

data makegroup;
set test;
by year department position emp_id;
if first.emp_id then countvar=1;
countvar +1;
run;

proc sql;
create table counts2 as 
select department, position, emp_id, count(countvar) as totalyears from makegroup
group by department, position, emp_id;
quit;
departmentpositionemp_idtotalyears
salesassistls3891g12

 

I would expect it to look like this:

 

departmentpositionemp_idtotalyears
salesassistls3891g9
Reeza
Super User

You can't quite do it that easily. It sounds like you're looking for the longest continuous employment it seems for each employee or the last stretch, not sure because you only have one example so can't generalize.

You need to convert your dates to SAS dates to make this work and then check for continuity. 

There may be easier ways to do this with PROC TIMESERIES.

delete_counter.JPG

 

 

 

 

 

serrld113
Obsidian | Level 7

I ended up doing a proc transpose...

 

proc transpose data = testin out = testout;
by department position emp_id;
var year;
run;

 

then a proc sort followed by this:

 

data testout;
set testout;
else if col1 = 1718 and col2 = 1617 and col3 = 1516 and col4 = 1415 and col5 = 1314 and col6 = 1213 then consec_years = 6;
else if col1 = 1718 and col2 = 1617 and col3 = 1516 and col4 = 1415 and col5 = 1314 then consec_years = 5;
else if col1 = 1718 and col2 = 1617 and col3 = 1516 and col4 = 1415 then consec_years = 4;
else if col1 = 1718 and col2 = 1617 and col3 = 1516 then consec_years = 3;
else if col1 = 1718 and col2 = 1617 then consec_years = 2;
else if col1 = 1718 then consec_years = 1;
run;

 

Hopefully this will help anybody else with the same problem as me. Thank you to everybody that helped out

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1712 views
  • 0 likes
  • 3 in conversation