SAS 9.4. I want to import multiple Excel files (approx 20 daily files) into SAS and merge them into 1 SAS file. I want to do this using macro. My issue arises when SAS starts to import the first file with no data because some files are empty. It converts my numeric variables into character. I have tried to modify my variables into both numeric/character with no luck. I have attached example data.
I get below error msg:
NOTE: The import data set has 3 observations and 19 variables.
NOTE: DER.20180301. data set was successfully created.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 0.14 seconds
cpu time 0.03 seconds
NOTE: Table DER.'20180301.'n has been modified, with 19 columns.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.08 seconds
cpu time 0.00 seconds
NOTE: The import data set has 0 observations and 19 variables.
NOTE: DER.20180302. data set was successfully created.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 0.18 seconds
cpu time 0.06 seconds
ERROR: Character column ID_2 requires a character format specification.
ERROR: Character column ID_2 requires a character informat specification.
NOTE: Table DER.'20180302.'n has been modified, with 19 columns.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.11 seconds
cpu time 0.00 seconds
ERROR: Value 3 on the SELECT clause does not match the data type of the corresponding column
listed after the INSERT table name.
ERROR: Value 9 on the SELECT clause does not match the data type of the corresponding column
listed after the INSERT table name.
ERROR: Value 11 on the SELECT clause does not match the data type of the corresponding column
listed after the INSERT table name.
My code is:
ods html newfile = proc;
%Let Year = %sysfunc(date(),year4.);
%Let LastMonth = %sysfunc(putn(%sysfunc(intnx(month,%sysfunc(today()),-1, same)),monname.));
/* Find and import relevant excel workbooks into SAS */
libname DER "I:\Dat\DER";
%let path = I:\Dat\3950MFI;
filename folder "&path\Control\&Year\&LastMonth";
options validmemname=extend; /* to allow non-standard dataset names */
/* Making a list of all excel workbooks into table FilesInFolder */
data FilesInFolder;
length File $300;
List = dopen('folder');
do Line = 1 to dnum(List);
File = trim(dread(List,Line));
output;
end;
run;
/* Ascending order for varibale File in table FilesInFolder */
proc sort data = FilesInFolder;
by File;
run;
/* Creating global macro variables */ /* not local */
data _NULL_;
set FilesInFolder end=final;
call symput(cats('File', _N_), trim(File));
call symput(cats('Name', _N_), trim(nliteral(substr(File,1,min(32, length(File)-4)))));
if final then call symputx(trim('Total'), _N_);
run;
%macro loop;
/* Imports all workbooks specified in variable List from table FilesInFolder and saves them as datasets in Library DER */
%do i = 1 %to &Total;
proc import datafile="&path\Control\&Year\&LastMonth\&&File&i"
out=DER.&&Name&i
DBMS=xlsx REPLACE;
sheet=DER;
getnames=yes;
run;
proc sql;
alter table DER.&&Name&i
modify COMMENT char(60) format=$60. informat=$60.
modify ID_2 char(60) format=$60. informat=$60.
modify ID_3 char(60) format=$60. informat=$60.
modify PRE_THEM_ID (60) format=$60. informat=$60.;
/* Inserts all datasets from above into the first dataset from the Library */
%if &i gt 1 %then %do;
proc sql;
insert into DER.&Name1 select * from DER.&&Name&i;
quit;
%end;
%end;
proc export data=DER.&Name1 outfile= "I:\Dat\3950MFI\OUT\DER_&LastMonth..xlsx"
DBMS=xlsx REPLACE;
sheet=DER;
run;
%mend loop;
%loop
I've decided to exclude all empty Excel workbooks from the proc import. Afterwards I've added a proc sql - create table step where I format the variables. After the proc import I add:
/* excludes all empty Excel workbooks from above proc import */
%let dsid=%sysfunc(open(DER.&&Name&i));
%let numobs=%sysfunc(attrn(&dsid,nobs));
%let rc=%sysfunc(close(&dsid));
%if &numobs. eq 0 %then %do;
%goto exit;
%end;
proc sql;
create table DER&i as select and format my variables
from DER.&&Name&i
quit;
/* Inserts all datasets from above into the first dataset from the Library */
%if &i gt 1 %then %do;
proc sql;
insert into DER1 select * from DER&i;
quit;
%end;
%exit:
%end;
This a consequence of using a file format that is totally unsuited for a reliable data transfer between applications. With the Excel format, you have to rely on guessing mechanisms that SAS had to build into proc import and the libname definitions for Excel.
Depending on content, your resulting data structures can (and will) look different with every run, causing the problems you experience.
Save to csv from Excel, and write a data step for the csv where YOU have control over the structure, and the following steps will encounter no problems anymore.
And stop using stupid dataset names ('20180301.'n). They only make work harder for no reason. Use a prefix that makes it a valid SAS name, so you don't have to type the 'some_****_in_here'n construct all the time.
Well, your choice of data medium - Excel is not the best to start with. You will get all sorts of issues as Excel is not a fixed format file, so merging etc. will be an issue. Bear this in mid as each time you run the program you will likely end up with different looking data or outcomes.
Now to import data from XLSX, it has been made a lot easier with 9.4 and the libname excel. For example to read all the sheets of xyz.xlsx into SAS then you can simply do:
libname tmp xlsx "xyz.xlsx"; proc copy in=tmp out=work; run; libname tmp clear;
This will create a dataset for all valid sheets in the Excel file, and copy them into your work library.
I will caution you again, this is not your problem, your problem is in the use of Excel as you will see each time you run this.
You might discuss with whoever is creating these files if another file creation option such as CSV (or other delimited text file format) would be practical. The process might be creating Excel files because "every one wants Excel" is a default thought pattern for some. Even if some users of this data want the spreadsheet format it may well be possible to have interchange data provided in another form for you, especially if these files are created by another program (especially a data base). An agreed upon file format for each of these files, a data step to read each and the daily requirement becomes changing input file names and output data set names (if needed).
Another option to consider if this data is being created by a different data base is that SAS provides a number of ways to connect directly to many databases. Perhaps you might be better off with a SAS/ACCESS and direct connections to the data source. If the data comes from a database it is almost certain that data has consistent data properties.
Document how much time it takes to adjust everything for ONE day of data. Extrapolate that work load over a 4 or 6 month period. Talk to your supervisor about the total cost in man hours and resources the current data process involving Excel will entail as well as delays in response to questions based on this daily data from having to "fix" things daily. You might be surprised at the amount of management support might appear when $$$$ are pointed out.
For what it may be worth: changing the informat property for a variable after it has been imported is basically lying to yourself and others. I really wouldn't want to explain why a variable has an informat of $60. when the actual length is 6 or 120 characters.
I've decided to exclude all empty Excel workbooks from the proc import. Afterwards I've added a proc sql - create table step where I format the variables. After the proc import I add:
/* excludes all empty Excel workbooks from above proc import */
%let dsid=%sysfunc(open(DER.&&Name&i));
%let numobs=%sysfunc(attrn(&dsid,nobs));
%let rc=%sysfunc(close(&dsid));
%if &numobs. eq 0 %then %do;
%goto exit;
%end;
proc sql;
create table DER&i as select and format my variables
from DER.&&Name&i
quit;
/* Inserts all datasets from above into the first dataset from the Library */
%if &i gt 1 %then %do;
proc sql;
insert into DER1 select * from DER&i;
quit;
%end;
%exit:
%end;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.