- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I have a series of daily files that need to be aggregated first for each daily file (using SQL) and then add out daily output files together. The file date in the file name is the data collection date.
When I use %do %to to loop date, I understood I need to such as set the date format as sas date for the %do loop to work. But
1) how do I read in the file date using the sas system date from %do loop?
2) Re the condition in SQL where value for variable RESOLVED='Yes' can be either upper case or case sensitive. I used %upcase but it didn't seem to work.
3) in terms of efficiency, is there any difference between using PROC .. APPEND or DATAT .. SET step to concatenadate data (it can be potentially over 365 small datasets).
Here is the code, with just 3 date files to be read in. I'd like to put the start date in the %macro GET, because there will be repeating runs with each in a different start date.
The want data is as follows. The filedate can be either text or date format;
File date count_cases
20201130 2
20201201 1
20201212 3
%let enddate= '16dec2020'd;
data mydata20201130; input x1 x2 resolved $; datalines;
1 2 Yes
1 5 Yes
;
data mydata20201201; input x1 x2 resolved $; datalines;
3 2 YES
4 5 no
;
data mydata20201202; input x1 x2 resolved $; datalines;
1 3 Yes
3 4 YES
;
%let enddate=today();
%macro Get;
%let startdate=%sysfunc(putn(&startdate, 8.));
%let enddate=%sysfunc(putn(&enddate, 8.));
%DO date=&startdate %TO &enddate;
%let Filedate=&date; format filedate yymmddn8.;
%put filedate;
PROC SQL;
CREATE TABLE Case&FILEDATE. AS
SELECT &Filedate, COUNT(*) AS Count_&type
FROM mydata&FILEDATE.
WHERE %upcase(RESOLVED)='NO';
QUIT;
%END;
Data ALlcases_&resident; set Case&startdt - Case&enddt; run;
%MEND;
%Get type=Cases, startdate='30Nov2020'd);
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
First, lets straighten out some very important concepts.
- SAS dates are integers, representing the number of days since 01JAN1960. For any mathematical usage, like incrementing by one day, the values must be number of days since 01JAN1960. If, instead of the number of days since 01JAN1960, you use the integer 20201130, then incrementing by 1 gives you 20201131 which is not a date.
- Macro variables should not be formatted (except when humans need to understand the date)
- To use data step functions on macro variables, you need %SYSFUNC
- Use macro functions on macro variables and data set functions on data set variables, and not macro functions on data set variables, or data set functions on macro variables except when using %SYSFUNC (example: %upcase(resolved) is wrong, it is a macro function on a data set variable)
%let enddate=%sysfunc(today()); /* Not formatted, number of days since 01JAN1960 */
%macro Get(type=,startdate=);
%let startdate=%sysevalf("&startdate"d); /* Not formatted */
%do date=&startdate %to &enddate; /* Number of days since 01JAN1960 */
%let filedate=%sysfunc(putn(&date,yymmddn8.)); /* Has to be formatted since data set names have human readable dates */
proc sql;
create table case&filedate as
select &filedate, count(*) as count_&type
from mydata&filedate.
where lowcase(resolved)='no';
quit;
%end;
data allcases_&resident; set case:; run;
%mend;
%get(type=cases, startdate=30nov2020)
I think this still fails unless &resident was previously defined. And your macro variable &type is never used.
IMPORTANT ADVICE: you need to get the code to work without macros and without macro variables on one or two input data sets. If you can't do that, then your macro will never work either. Most people we give this advice to never actually seem to go down that path (or at least, they don't show us it in the forums), and we never hear from them again. And yet this advice is critical if you are going to be writing macros. Many of the issues in your code can be resolved by getting working code without macros and without macro variables first. Ignoring this advice just will slow you down in the long run.
Additional comments regarding the logic of the procedure: after initially doing all these appends, why run this every day and do all of this data set or PROC SQL work and appending every single day. Why not save the results from the first run, and then append the next day's data? That's one append and no more looping.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks a lot PaigeMiller. It works! The code is a typo. It should have been data allcases_&type;
data allcases_&resident; set case:; run;
Just one question regarding enddate and another one about when a file is missing between start and end date.
1. If my end date is the day before today, or I want to use a user-define date. How do I change the code? The following isn't working (sorry my knowledge about SAS system functions sucks).
%let enddate=%sysfunc(today()-1);
%let enddate=%sysfunc('15dec2020'd);
2. What if I have a missing file in one of the dates between start date and end date? I think the code still works, with just error messages about the missing file, which I can ignore. So it's not a big deal. Just wondering if there is a quick fix (e.g. option in %do %to).
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Let's make this even simpler. You do not need macros at all.
Let me say that again so you know I didn't make a mistake. YOU DO NOT NEED MACROS AT ALL.
data final;
set mydata:;
where lowcase(resolved)='no';
run;
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
This statement
%let enddate=%sysfunc(today()-1);
cannot work. %SYSFUNC() only take a function call and an optional second argument that is the format to apply to the results. It does not accept arbitrary expression. You could use something like this instead.
%let enddate=%eval(%sysfunc(today())-1);
Similarly this cannot work either:
%let enddate=%sysfunc('15dec2020'd);
There is no function call there inside of the %SYSFUNC() call. The previous post actually had a way to do this:
%let enddate=%sysevalf('15dec2020'd);
The %EVAL() function only handles integer arithmetic. The %SYSEVALF() macro function can handle floating point arithmetic and date literals. But depending on how you intend to use the macro variable ENDDATE you might just use
%let enddate='15dec2020'd;
To test if a FILE exists use the FILEEXIST() function. To test if SAS dataset exists use the EXIST() function. To use those function in macro logic then embed them in %SYSFUNC() macro function. So here is any example of testing if a dataset exists before trying to reference it.
%if %sysfunc(exist(mylib.mydataset)) %then %do;
proc append data=mylib.mydataset base=summary ;
run;
%end;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I don't see any need to use macro logic here. It is easier to just generate the code from the list of files.
%let startdate='30Nov2020'd;
%let enddate= '16dec2020'd;
proc contents data=work._all_ noprint out=contents; run;
data files ;
set contents(keep=libname memname);
by libname memname;
if first.memname and memname =: 'MYDATA' ;
rundate = input(substrn(memname,7),??yymmdd8.);
format rundate yymmdd10. ;
if &startdate <= rundate <= &enddate;
run;
filename code temp;
data _null_;
set files ;
file code;
put 'insert into allcases(rundate,count) select "' rundate date9. '"d'
',count(*) from ' libname +(-1) '.' memname
'where upcase(resolved)="NO";'
;
run;
data allcases;
length rundate count 8;
format rundate yymmdd10. count comma11.;
stop;
run;
proc sql ;
%include code / source2;
quit;
proc print data=allcases;
run;
Results:
362 proc sql ; 363 %include code / source2; NOTE: %INCLUDE (level 1) file CODE is file .../#LN00046. 364 +insert into allcases(rundate,count) select "30NOV2020"d,count(*) from WORK.MYDATA20201130 where upcase(resolved)="NO"; NOTE: 1 row was inserted into WORK.ALLCASES. 365 +insert into allcases(rundate,count) select "01DEC2020"d,count(*) from WORK.MYDATA20201201 where upcase(resolved)="NO"; NOTE: 1 row was inserted into WORK.ALLCASES. 366 +insert into allcases(rundate,count) select "02DEC2020"d,count(*) from WORK.MYDATA20201202 where upcase(resolved)="NO"; NOTE: 1 row was inserted into WORK.ALLCASES. NOTE: %INCLUDE (level 1) ending. 367 quit; NOTE: PROCEDURE SQL used (Total process time): real time 0.00 seconds cpu time 0.00 seconds
Obs rundate count 1 2020-11-30 0 2 2020-12-01 1 3 2020-12-02 0
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Something like the following most likely.
%let Filedate=%sysfunc(putn(&date, yymmddn8.));
2. Try using UPPER or LOWER instead. Otherwise your data is different - possibly invisible white spaces. Running a PROC FREQ on the column will allow you to verify the data.
3. Append is more efficient than a data step for combining data. In Append it copies the data over in multiple lines because there's no expectation of doing any calculations whereas a data step may try to change data somehow so it needs to process it line by line.
This entire process is inefficient and likely doesn't need a macro. As Paige indicates you likely want to append your data and filter it dynamically for multiple dates at once and do the count as the second step. I'd create a view that did this personally.
Here's some instructions on how to use a %DO loop with dates.
https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=mcrolref&docsetTarget=n01...