Each month I will have to process raw payroll information and create an analysis dataset out of it for people within my compnay.
Here is my current thinking.
When I get the raw March 2020 data, I will break this out into smaller datasets based on the paymonth. So all January payments will be put into a SAS dataset called Jan_2020M03, all February payments into Feb_2020M03 etc. The M03 allows me to identify which month the raw data has came in on.
However, I will need to recreate my analysis dataset each month. To create the Jan_2020Analysis dataset I will need to.
First read in the Jan_2020M01 dataset. I will then need to read in Jan_2020M02 dataset. From here I will need to overwrite information in the Jan_2020M01 information with information Jan_2020M02. I can do this with a unique payslip id. However, I will need to read in the Jan_2020M03 and potentially overwrite information in the Jan_2020M01 and Jan_2020M02. This will need to keep occurring until I have read in all the Jan_2020 datasets. Once this is done, I can then create my analysis dataset.
I need some dynamic way to this. My folder structure is as follows;
\\location\2020\2020M01
So within the above location I have 12 sas datasets Jan_2020M01...Feb_2020M01....Dec_2020M01
However, I have 12 other folders such as
\\location\2020\2020M02
So within the above location I have 12 sas datasets Jan_2020M02...Feb_2020M02....Dec_2020M02
So I need a way for SAS to only pick out the datasets I want. So when I want to recreate the January file pick out all the January datasets.
I would appreciate any advise.
Since you have a unique id, you can try the following:
data jan_2020analysis;
merge
libm01.jan_2020m01
libm02.jan_2020m02
.....
libm12.jan_2020m12
;
by payslip_id;
run;
Once you can verify that this works (the datasets are read in sequence, so any newer information for a particular payslip will automatically replace the older in the PDV), you can start to macrotize it. First of all, replace jan and 2020 with macro variables. Next, wrap the step in a macro and loop from 1 to 12 to create the month sequence:
merge
%do i = 1 %to 12;
libm%sysfunc(putn(&i.,z2.)).&month._&year.m%sysfunc(putn(&i.,z2.))
%end;
;
by using month and year as macro parameters, you can then call the macro repeatedly for months and years (eg from a dataset with call execute()).
Two comments to get things rolling ...
Agree with @Astounding on naming the months. On top of that, I see no resons to split the libraries, as the dataset names are unique anyway. Just one RAW library should suffice.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.