DATA Step, Macro, Functions and more

Need to Import more than 3000 xlsx files in sas

Reply
Contributor
Posts: 59

Need to Import more than 3000 xlsx files in sas

Dear All,

 

I have a more than 3000 xlsx files, I need to import those files into SAS. Is there any way other than macro? 

 

Like libname or SAS/Access Please help me out in this.

 

 

Regards,

Anil

 

Super User
Posts: 7,845

Re: Need to Import more than 3000 xlsx files in sas

Posted in reply to anilgvdbm

You need proc import for each sheet, or a libname for each xlsx and proc copy or similar to retrieve all sheets in each file.

So you need to set up a procedure that retrieves the names of all data items and then runs the import/copy process for each. This will involve macro and advanced data step programming.

 

Don't you have another way of getting the data from the original sources?

Or is that actually all data that exists purely in Excel?

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 59

Re: Need to Import more than 3000 xlsx files in sas

Posted in reply to KurtBremser

Hi  

 

 

 

 

Super User
Super User
Posts: 7,981

Re: Need to Import more than 3000 xlsx files in sas

Posted in reply to anilgvdbm

Do you have command line access?  If so;

 

filename pipe tmp 'dir "c:\test\*.xlsx" /b';

data _null_;
  infile tmp dlm="¬";
  input;
  call execute('proc import datafile="c:\test\'||strip(_infile_)||'" out=indata'||strip(put(_n_,best.))||'; run;');
run;
  

What this does (and might be typos as not tested it) is get a list of filenames from command line, assuming your xlsx files are in c:\test, and for each of the filenames read, create a proc import statement to read the file out into a dataset called indata with a numeric suffix like, indata1 indata2 etc.  You could switch the language to libname.  You will still get mostly rubbish data you need to post-process however.

 

Super User
Posts: 11,343

Re: Need to Import more than 3000 xlsx files in sas

Posted in reply to anilgvdbm

anilgvdbm wrote:

Hi  

 

 

 

 


Have you written a program to successfully bring one of the data sets into SAS? I would strongly suggest that Proc Import is not the way to go as you are almost certainly going to have issues with variable lengths and/or types that should remain constant as the proc will make guesses as to those items based on each individual file.

 

I would suggest that you use a libname approach such as

libname inex EXCEL Path="<path to your folder>\onefile.xlsx";

And see if you can read the data in excel that way into the form you need.

 

Then expand that code either with a macro or call execute in a data step using the names or rules to build the names of the xlsx files in the libname statement.

 

Frequent Contributor
Posts: 149

Re: Need to Import more than 3000 xlsx files in sas

Posted in reply to anilgvdbm
Depending on what you have to do with the data, be prepared to fix variable type and length. This can be done automatically if you have the metadata.
Super User
Super User
Posts: 7,981

Re: Need to Import more than 3000 xlsx files in sas

Posted in reply to anilgvdbm

Have to agree with the other two posters.  Do you have any sort of specification, documentation, fixed structure, or any sort of control over the source data whatsoever?  If not then you have 3000 individual problems, and if you choose to use proc import then that can be exponentially larger.  Excel is a really poor data medium as it does not enforce any structure on the data, for instance you cell could contain text, numbers, formulas, missings, picture, wordart etc., the columns may have mixed formatting or special items etc.  Proc import is a guessing procedure and it does its best to guess what the data is.  Thus combining these two would generally result in a very poor dataset with need for more processing to get it into a usable structure.  WE can show you the logic for repeating an import, but that really is 0.001% of the problem you face here.

Super User
Posts: 19,855

Re: Need to Import more than 3000 xlsx files in sas

Posted in reply to anilgvdbm

You seem averse to using macros? Why? 

 

It’s the best method to import all at once. One other option, is to convert all to CSV - via a VBS script and then to import them all. 

 

Here’s an example of how this can work, with first finding all the names of files and then importing them all. You’ll want to design a process that will ensure the structure of the data is the same across all datasets and what you want.

 

https://github.com/statgeek/SAS-Tutorials/blob/master/Import_all_files_one_type

Ask a Question
Discussion stats
  • 7 replies
  • 178 views
  • 1 like
  • 6 in conversation