Hello!
I am trying to pull in a bunch of monthly data files for departments. The way the files are structured is:
year/month/department file.xlsx
There are 43 different departments and I need to pull in the files from 16 different months for each department (then I will be appending them, but I know how to do that). First I created a macro that looks like this:
%macro datapull(year=, month=, departmentname=)
PROC IMPORT OUT=&deparmentname_&year_&month DATAFILE= "[first part of file path]/&year/&month/&departmentname"
DBMS=xlsx REPLACE;
RUN;
%mend
I really dont want to have to call the macro 43*16 times. What I am looking to do is have the macro re-run over the different gl codes and only have to specify which months I want to pull (I am not interested in automating the month part).
I tried using %do departmentname= x, y, z; but that didnt work.
Any help???
Probably would be much easier to just use a DATA step to control the looping instead of bothering to work with macro code for that.
file code temp;
data _null_;
start = '01JAN2017'd ;
end = '01MAR2018'd ;
do departmentname='A','B','C';
do i=0 to intck('month',start,end);
date=intnx('month',start,i);
year=year(date);
month=month(date);
file code; /* line added by Shmuel */
put '%macro datapull(' year= ',' month= ',' departmentname= ')';
end;
end;
run;
%include code;
Likely to be easier user CALL EXECUTE to make the calls to proc import.
For example you can build your file name and SAS dataset names using iterated loops such as:
data example;
length dept $ 15 filestr $ 200 sasset $ 40;
do dept='Hardware','Paint','Clothing';
do year=2012 to 2016;
do month=1 to 12;
filestr =catt(catx('/','somepath',dept,year,month),'.xlsx');
sasset =catt('library.',catx('_',dept,year,month));
output;
end;
end;
end;
run;
once you have the filestr and sasset variables above making the values you need then replace the OUTPUT; above with
call execute ('proc import out=');
call execute (sasset);
call execute ('datafile=');
call execute filestr;
call execute ('dbms=xlsx replace;run;');
OR write the added lines to a text file and %include it.
CALL Execute places text in a queue to execute after the data step finishes.
you may need to specify a format such as put(month,z2.) in place of the simple value of month in the CATX function calls above if you are using months such as 01.
make sure that your dept strings are spelled correctly.
You can also use a data set with values of dept, year and month and skip the loops using a SET statement to bring in the needed values. This may be more flexible when if some departments have different availability dates.
Note: unless your XLSX files are extra specially clean you my end up with values that you cannot append the resulting data sets together because each call to proc import will generate a separate set of guesses as to variable type, length and informat/format needed.
Hi,
For the departments, since there are so many, I would suggest you loop over a dataset, instead of hardcoding them in a list.
The easiest solution I could think of was do a cartesian join of the 43 depts and 16 months, and loop over that table as shown below.
/* sashelp.class is a proxy for your dept list */
data dept (rename=(name=dept));
set sashelp.class(keep=name);
run;
/* Create a list of the months */
data dates;
datstart = '01DEC2016'd ;
datend = '01MAR2018'd ;
format date date10. ;
do i=0 to intck('month',datstart,datend);
date=intnx('month',datstart,i);
output;
end;
drop i datstart datend;
run;
/* Cartesian join to give every combination of dept and month/year */
proc sql noprint ;
create table namedat as select * from dept, dates
order by dept, date;
quit;
/* format month<10 with a leading 0 */
proc format; picture mnth other='99';
run;
data want;
set namedat;
year=put(year(date),4.);
month=put(month(date),mnth.);
run;
/* The following code for looping over a dataset is based on the paper by Moore, E. */
/* http://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/a000093638.htm */
/* Performing Multiple Statements for Each Record in a SAS® Data Set by Moore, Edward */
%macro datapull;
DATA _NULL_;
IF 0 THEN SET want NOBS=n;
CALL SYMPUT('COUNT',n);
STOP;
RUN;
/* loop over each record */
%DO i=1 %TO &COUNT;
DATA _NULL_;
SET want (FIRSTOBS=&i);
CALL SYMPUT('deparmentname',trim(dept));
CALL SYMPUT('yr',year);
CALL SYMPUT('mo',left(put(month,$2.)));
STOP;
RUN;
%let OUT=&deparmentname._&yr._&mo;
%let path=/data;
%let DAT=&path./&yr./&mo./&deparmentname..xlsx;
%put &out;
%put &dat;
libname XL XLSX "&dat";
data &out;
set xl.sheet1; /* assuming your worksheet is named Sheet1 */
run;
LIBNAME XL CLEAR;
%END;
%mend datapull;
%datapull;
@Ineedsashelp123 wrote:
I tried using %do departmentname= x, y, z; but that didnt work.
Any help???
The %do interative loop does not allow a list of values. However you can get much the same effect using %sysfunc(countw()) and %scan.
%macro dummy (list=); %do i = 1 %to %sysfunc(countw(&list.)); %let temp = %scan(&list.,&i.); %put temp is &temp.; %end; %mend; %dummy(list= a b work other c);
Hi,
Maybe if you have access to a PIPE mechanism on your computer you could, instead of using macro, try do it in a datastep with call execute,
something like that:
/* for linux */
%let firstPartOfFilePath=/home/bart/SAS/test_folders; /*a base path for data */
filename f PIPE "find ""&firstPartOfFilePath."" | grep -i xlsx"; /* */
/* for windows */
/*
%let firstPartOfFilePath=E:\SAS_WORK_5400\test_folders;
filename f PIPE "dir ""&firstPartOfFilePath.\*"" /s/b";
*/
data test;
infile f dlm='0a0d'x;
input path : $ 1024.; /* read in data from PIPE line by line */
/* create required variables */
year = scan(path,-3,"/\");
month = scan(path,-2,"/\");
deparmentname = scan(path,-1,"/\");
deparmentname = substrn(deparmentname, 1, length(deparmentname)-5); /* to drop the XLSX extension */
/* use call execute to execute import */
call execute ("PROC IMPORT");
call execute (catx("_", "OUT=", deparmentname, year, month));
call execute (cats('DATAFILE= "', path, '"'));
call execute ("DBMS=xlsx REPLACE; RUN;");
run;
all the best
Bart
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.