Hi
i have a employee level transactions data which shows type of transaction and the date of the transaction that the employee did
i have one more data which has the employee job information( the table is shown below ) , i have to calculate the job code tenure of the employee
The logic to calculate the job code tenure is like this : lets say the employee did transactions on 13th and 16th july-2020
Please note that job code CA is repeated after the LA , we shouldn't consider these while calculating the tenure for 15th july.
i have also attached the below table in Excel format.
EMP_ID | Job_cd | cre_dt |
1 | CA | 15-Jul-20 |
1 | LA | 13-Jun-20 |
1 | LA | 12-Jun-20 |
1 | LA | 02-May-20 |
1 | LA | 13-Jan-19 |
1 | CA | 12-Dec-19 |
1 | CA | 11-Oct-19 |
1 | DA | 01-Jan-18 |
1 | DA | 01-Jun-18 |
1 | DA | 15-Nov-18 |
Let's see if I understand:
One table (call it T) has one record per employee/transaction. It has employee id (ID), transaction date (TDATE) and various other transaction related variables. The other table (call it J, for which you provide a sample) is a table of some sort of employee job code change. It has one record per change for each employee. It has variables ID, jobcode-change-date (CRE_DT in your example), and job code (JOB_CD).
For some reason your table J appears to be sorted by ID and descending (!) CRE_DT.
Let's assume, however that table T is sorted by ID and ascending TDATE.
Now, if J is not very big, I would first sort it by ID and ascending CRE_DT, first, as here:
proc sort data=J out=Jprime;
by id cre_dt;
run;
data want;
set Jprime (keep=id cre_dt rename=(cre_dt=tdate) in=inJ) T (in=inT);
by id tdate ;
/* if first.id then call missing (of _all_); */
if inJ then set Jprime ;
if inT;
run;
The data WANT step works this way:
This program assumes that there will always be a J record preceding the earliest T record for each ID. If not, then de-comment the call missing statement.
Now, what if your J dataset is too big to sort, and you'd rather process it in its current pathologically sorted order? Then you could first read in all the J records, and save the job codes in a cre_dt-indexed array. Then when the T records are read start with cre_dt=tdate and go backwards until a non-blank job_cd is retrieved from the array:
%let mindate=01jan2015;
%let maxdate=31dec2019;
data want;
set J (in=inJ) T (in=inT);
by id;
array jcodes {%sysevalf("&mindate"d):%sysevalf("&maxdate"d)} $2 _temporary_ ;
if first.id then call missing(of jcodes{*});
if inJ then jcodes{cre_dt}=job_cd;
if inT;
do cre_dt=tdate to "&mindate"d by -1 until(job_cd^=' ');
job_cd=jcodes{cre_dt};
end;
run;
@mkeintz Thanks for your Reply.
The current logic calculates the date correctly for the transaction of 16th july (i.e the cre_dt is 15th July)
but for the transaction of 13th july the cre_dt should be 13th jan-19 because the job code LA started on 13 jan-19
current logic calculates the cre_dt for the transaction of 13 july as 13 jun-2020
Thanks again for your help..
I'd be happy to address this. Please provide SAS data steps with relevant parts of the two tables, so that I can actually test the code.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.