Hello,
I am trying to read all Excel-Files from a folder into SAS. This is how my data files look like:
x1 | x2 | x3 | x4 |
---|---|---|---|
text 1 | 100,99 | 10 | 1.512,36 |
text 2 | 200,99 | 20 | 5.216,85 |
text 3 | 300,99 | 30 | 8.234,25 |
This is my first trying but it doesn't work. There is something nonsense in the first column - rest is missing. AND next challenge is: how will it function if the excel-files hasn't the same columns....
%let dirname = C:\Users\eduger\Desktop\Test;
filename DIRLIST pipe "dir /B &dirname\*.xls";
data dirlist ;
length fname $256;
infile dirlist length=reclen ;
input fname $varying256. reclen ;
run;
proc print data = dirlist;
run;
data all_text (drop=fname);
length myfilename $100;
length x1 $100;
set dirlist;
filepath = "&dirname\"||fname;
infile dummy filevar = filepath length=reclen end=done missover;
do while(not done);
input x1 $ x2 x3 x4;
output;
end;
run;
proc print data=all_text;
run;
I HOPE SOMEONE CAN HELP ME...
BR, Silke
I had a similar problem once. If you modify this code (Excel Version!!) you might get a reasonable result. The aggregated file is called: XL_All.
%Let Directory=C:\Users\xxxx;
%Let XL_vers=ExcelCS;
Filename Dir_Pipe PIPE "DIR &Directory. /B" LRecL=32767;
Data Dir_List;
Infile Dir_Pipe Truncover;
Input Text $200.;
Run;
Data XL_List;
Set Dir_List ;
Where UpCase(Substr(Text,Index(Text,'.')+1,4)) in ("XLSX" "XLS");
Run;
Proc SQL NoPrint;
Select Text Into :XL_List Separated By '#' From XL_List;
Quit;
%Put **&XL_List.**;
%Macro Import_XL(XL_List=);
%Global i;
%Let i=1;
%Let Element=%Scan(&XL_List.,&i.,'#');
%Do %While ("&Element."^="");
%Put **&i.**&Element.**;
Proc Import Out=XL_&i
Datafile="&Directory.\&Element."
DBMS=&XL_vers
Replace;
*Sheet="";
Run;
%Let i=%Eval(&i.+1);
%Let Element=%Scan(&XL_List.,&i.,'#');
%End;
%Mend;
%Import_XL(XL_List=&XL_List.);
%Let i=%Eval(&i.-1);
%Put **&i.**;
%Macro Append_XL;
Data XL_ALL;
Set XL_1;
Run;
%If &i.=2 %Then %Do;
Proc Append Base=XL_All Data=XL_2;
Run;
%End;
%Else %Do;
%Do a=2 %To &i;
Proc Append Base=XL_All Data=XL_&a.;
Run;
%End;
%End;
%Mend;
%Append_XL;
Well, several questions:
Are the files XLS or XLSX?
Why not convert them to CSV - simple VBA script - as file format easier to work with.
Do each of the sheets contains *exactly* the same data, e.g. same columns/formats. Otherwise you will have problems setting them to one table.
Do you have mixed datatypes in your Excel columns, i.e. do you have columns with number and alphanumerics.
Do you have a structured data import documentation, i.e. what will you do with oddities - for example date column with partial dates, results with character results.
Once you have you plan on how each data element from each datafile should be handled then go about the process of grouping similar file structures and simple tools. Create an import program for each different file where there are significant differences, or alternatively read everything in a long character and then post-process the data.
The reason for all of the above is that Excel is not a database/data transfer/or anything else, and hence will cause you lots of headaches.
To add, I don't think you will be able to infile binary .XLS files as you have been doing, these would need proc import. Again, another good reason to save your data as CSV (basic text file with delimiters), which you can then use as you want.
I have *.xls AND *.xlsx files
Why not convert to csv: there are more information in the files then I need - I use only a certain range
Format in Columns is all the same BUT there are different columns with mixed datatypes. One column is character the rest is numeric - no date column
Well, as I added I don't think binary XLS files will work with infile. I would still recommend converting to CSV and then algorithmically removing records/columns you don't need.
You could try proc import each one and then proc append the result, and if your very likely you may end up with what you want. I would still look at the process I have given, as there are many trip ups - number stored a character, data hidden behind Excel formats, special characters, multi-line fields etc. any of which could give you an invalid end result. Try it and see:
%macro imp (fname=);
proc import datafile="...\&fname." out=tmp;
sheet="...";
range="...;
run;
proc append base=base data=tmp;
run;
%mend imp;
/* Do first one so we have base */
proc import datafile="...\..." out=base;
sheet="...";
range="...;
run;
/* Do others */
%imp (fname=...);
%imp (fname=...);
I had a similar problem once. If you modify this code (Excel Version!!) you might get a reasonable result. The aggregated file is called: XL_All.
%Let Directory=C:\Users\xxxx;
%Let XL_vers=ExcelCS;
Filename Dir_Pipe PIPE "DIR &Directory. /B" LRecL=32767;
Data Dir_List;
Infile Dir_Pipe Truncover;
Input Text $200.;
Run;
Data XL_List;
Set Dir_List ;
Where UpCase(Substr(Text,Index(Text,'.')+1,4)) in ("XLSX" "XLS");
Run;
Proc SQL NoPrint;
Select Text Into :XL_List Separated By '#' From XL_List;
Quit;
%Put **&XL_List.**;
%Macro Import_XL(XL_List=);
%Global i;
%Let i=1;
%Let Element=%Scan(&XL_List.,&i.,'#');
%Do %While ("&Element."^="");
%Put **&i.**&Element.**;
Proc Import Out=XL_&i
Datafile="&Directory.\&Element."
DBMS=&XL_vers
Replace;
*Sheet="";
Run;
%Let i=%Eval(&i.+1);
%Let Element=%Scan(&XL_List.,&i.,'#');
%End;
%Mend;
%Import_XL(XL_List=&XL_List.);
%Let i=%Eval(&i.-1);
%Put **&i.**;
%Macro Append_XL;
Data XL_ALL;
Set XL_1;
Run;
%If &i.=2 %Then %Do;
Proc Append Base=XL_All Data=XL_2;
Run;
%End;
%Else %Do;
%Do a=2 %To &i;
Proc Append Base=XL_All Data=XL_&a.;
Run;
%End;
%End;
%Mend;
%Append_XL;
Thanks! It works...
Hi!
Try for example: Range="Sheet1$A2:E12".
5444 - How to read an Excel range into SAS using PROC IMPORT
Yes! Range was a fault of mine...
BUT the files has different columns and the macro uses only the columns which are in all files.
How can I fix it, that I have also the other columns i.e. with missing as value?
You could try the force option (Proc Append Base=XL_All Data=XL_&a. FORCE;) - but this will only work if the first data set is complete.
If the files aren't too large "set" will do.
Replace
Proc Append Base=XL_All Data=XL_&a.;
Run;
With
Data XL_All;
Set XL_All XL_&a;
Run;
Thanks! FORCE option did it... I forced it, that the complete data set was first
When creating XL_List, consider replacing
Where UpCase(Substr(Text,Index(Text,'.')+1,4)) in ("XLSX" "XLS");
with
Test=Reverse(Trim(LowCase(Text)));
If "xslx." =: Test
or "slx." =: Test;
Though it may not be so elegant, this code will include files with more complex names, such as blah.blah.xls, whereas the original will ignore such files.
.. I like that a lot. But I think I would have to use Reverse('.xlsx') to avoid typos.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.