Desktop productivity for business analysts and programmers

Compiling daily datasets into monthly

Reply
Contributor
Posts: 22

Compiling daily datasets into monthly

Hi,

 

I am trying to compile daily datasets into monthly. Below is my Macro to compile. I manage to execute the program and get the desired output. However, my question is, if my daily dataset is missing or incomplete for that certain month, my program will stop executing because of the missing daily dataset. How to ignore the missing, and just compile anything that I have for that  month. Thanks.

 

%MACRO COMBINE(MONTH);

%LOCAL LAST_MONTH DATE ;

%LET LAST_MONTH=%SYSFUNC(INTNX(MONTH,%SYSFUNC(TODAY()),-1,B));

DATA SAVE.ACCOUNT_&REPTMON;

SET

%DO DATE=&LAST_MONTH %TO %SYSFUNC(INTNX(MONTH,&LAST_MONTH,0,E));

 SAVE.ACCOUNT_%SYSFUNC(PUTN(&DATE,DDMMYYN8))

%END;

;

RUN;

%MEND;

%COMBINE;

 

Super User
Posts: 20,203

Re: Compiling daily datasets into monthly

[ Edited ]

Add a step that checks for the existence of a data set - see the SAS 9.4 Macro Appendix for some examples or the FEXIST function.

 

How big are these data sets? This is a very inefficient way of doing this. 

Contributor
Posts: 22

Re: Compiling daily datasets into monthly

The datasets is not really big. So, my code is inefficient is it? I should check out on that FEXIST function.

Super User
Posts: 20,203

Re: Compiling daily datasets into monthly

Sadly your naming convention doesn't let itself to any of the list options. If you have any control over that, it's a small change that can make your life a lot simpler.

 

For example if it was YYMMDD you can do the following to get all of one month at once. 

 

data var_month1;
set library.data_201701: ;
run;

Or 

 

data var_month1;
set data_20171201-data_201712;
run;

 

If you're appending data in a loop in that manner, is it any faster to use PROC APPEND? not sure if the data is small if that would gain any advantage. 

 

I don't know how small is small, but could you combine them all at once as well and then possibly split it back out if required. 

In general, splitting data sets into subsets isn't recommended, but sometimes that's outside of your control. 

 

 

 

 

Contributor
Posts: 22

Re: Compiling daily datasets into monthly

Reeza,

 

Thanks for the suggestion. As I already use this method, I might stick with this macro. And yes, I think append will be lot of easier.

 

%macro cre_monthly(prefix);

%local last_month date ;

%let last_month=%sysfunc(intnx(month,'01may2021'd,-1,b));

%do date=&last_month %to %sysfunc(intnx(month,&last_month,0,e));

%let dsn = save.scb_welcome_call_list_%sysfunc(putn(&date,ddmmyyn8));

%if %sysfunc(exist(&dsn)) %then %do;

proc append base=&prefix data=&dsn;

run;

%end;

%end;

%mend cre_monthly;

%cre_monthly(save.scb_welcome_call_list_&reptmon);

 

Above code is acceptable or u think still  ineffienct? Thanksss for the advice!!

Super User
Posts: 8,037

Re: Compiling daily datasets into monthly

[ Edited ]

Supposing your macro variable &reptmon contains month and year in the format MMYYYY, this works without macro logic, it only uses macro variables to hand values over from one step to the next:

data _null_;
last_month = input("01&reptmon.",ddmmyy8.);
length dsn $41 all_datasets $1301;
/* maximum length needed: 32 bytes for datasetname, 8 for library, 1 for dot */
do date = last_month to intnx('month',last_month,0,'e');
  dsn = 'save.scb_welcome_call_list_' !! putn(date,ddmmyyn8.);
  if exist(dsn) then all_datasets = catx(' ',trim(all_datasets),dsn);
end;
call symput('all_datasets',trim(all_datasets));
run;

data save.scb_welcome_call_list_&reptmon;
set &all_datasets;
run;

Instead of a series of appends, you get one data step that does the concatenation; also note that this procedure can be run repeatedly during a month, always resulting in correct values up-to-date.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Super User
Posts: 8,152

Re: Compiling daily datasets into monthly

You haven't answered the main question here however, why do you have the data in lots of datasets in the first place?  This is inefficient in both storage and coding/processing terms, and is likely implemented by someone used to using Excel.  

 

Data which is the same should be stored in one dataset, unless there are specific reasons for not doing so.  You will find your coding is far far simpler if you keep it in one dataset as all this macro code and likely the thousands of lines of other macro code you have had to program to work with such a thing, will disappear.  For instance to create one dataset form all files:

data alldata;
  length orig_file $200;
  set save.account: indsname=tmp;
  orig_file=tmp;
run;

This will create one dataset with all those datasets (assuming there are no differences) and have one column with the file the data came from, which you could process out to get an actual date column to work with.  No more messing about then all in one, and you still retain all the information, but it is now in a usable format.

Ask a Question
Discussion stats
  • 6 replies
  • 147 views
  • 0 likes
  • 4 in conversation