BookmarkSubscribeRSS Feed
Obsidian | Level 7

I am facing a problem,where I have an employee file;this file is a combination of employee data from different employers.So the chances are that an employee number can be shared between multiple employers.I am trying to set a flag for those employee numbers who are only actively working in the company.

I did the following:
sorted the file on emp_no and active_dt field.

Then if first.emp_no ne last.emp_no(only for employees whose active_dt >= today())) then set the multi_emp to "Y".

But I am facing the following issue is like this:
for eg:
the emp_no 11111 is active in company ABC
and emp_no 1111 is non-active in company ZZZ.

emp_NO:11111 active_dt:12/31/2099 company name:xxxx
emp_NO:11111active_dt: 12/31/2001 company name:vvvvv

as per the above logic these records are getting flagged as multi_emp = 'Y'.

I only want employees who are active and share same emp_nos;
like the following:

emp_NO:22222 active_dt:12/31/2099 company name:xxxx
emp_NO:22222 active_dt: 12/31/2020 company name:vvvvv
emp_NO:22222 active_dt:12/31/2099 company name:KKKK

Please share your thoughts.
Thanks in advance.
Quartz | Level 8 SPR
Quartz | Level 8
Hello Ren2010,

If I've understood you correctly it is necessary to add a company name to your sorting sequence, i.e. company_name emp_no active_dt.

Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Suggest you share whatever code piece you may have tried that is not working?

And, suggesting also that you try adding this code line so you can tell when you have FIRST. and/or LAST. (where "nn" can be unique and you can insert multiples at various DATA step locations to see effect):

PUTLOG '>DIAG-nn>' / _ALL_;

For your interest, I believe that BY GROUP processing, while having the proper sorted file and also having the BY statement variable optimized, then you can identify unique conditions / characteristics for your data and set a flag accordingly.

It will work-out as long as you find a suitable combination and sequence for your BY variables, then test those conditions while using the PUTLOG example.

Scott Barry
SBBWorks, Inc.

Suggested Google advanced search argument, this topic / post:

by group processing
Obsidian | Level 7
My code is below:

proc sort data=combine_mbr
by emp_no actv_dt;

data mbr_error_undup ;

set combine_hics;
by emp_no actv_dt;

if emp_no ne ' '
then do;
if first.emp_no ne last.emp_no
then Multi_mbrs='Y';

if last.emp_no then output;

Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Your FIRST. and LAST. processing logic is flawed (you may be thinking of LAG instead?) -- the condition is either a 1 or a 0, which leads back to the idea of adding the PUTLOG statements for self-diagnosis. Also, suggest reading more about BY GROUP PROCESSING in the SAS support website references and SAS-hosted DOC.

Scott Barry
SBBWorks, Inc.


Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.


Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 3 in conversation