SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Importing multiple Excel sheets

Accepted Solution Solved
Reply
Contributor
Posts: 28
Accepted Solution

Importing multiple Excel sheets

I'm trying to import multiple Excel sheets. The problem is that I don't know the amount of Excel files that have to be combined into a single table in SAS.

Which Excel files that are used, are filled in by my client. So what I'm looking for is some generic code that takes some array as input and imports all the Excel sheets in that input to a single table.

 

The sheet name inside the Excel files don't change (it's the same for all files) and the fields the Excel sheets are always the same.

 

Thanks


Accepted Solutions
Solution
‎09-06-2016 11:32 AM
Super User
Posts: 19,815

Re: Importing multiple Excel sheets

Posted in reply to WouterKBC

Yeah, your question is unclear. i think your mixing workbook and sheets for Excel?

 

You can modify this macro by changing extension from CSV to XLSX. 

This asssumes all excel files are in one directory. 

 

http://support.sas.com/documentation/cdl/en/mcrolref/67912/HTML/default/viewer.htm#n0ctmldxf23ixtn1k...

 

 

View solution in original post


All Replies
Super Contributor
Posts: 266

Re: Importing multiple Excel sheets

[ Edited ]
Posted in reply to WouterKBC
LIBNAME myexcel EXCEL PATH='C:\PCFData\Demo.xlsx';

proc sql;
/* create table temp as */
select strip(a.libname)||"."||strip(a.memname) into : _table separated by ' '
from sashelp.vstable a
where libname='myexcel';
quit;

data want;
set
%put &_table ;
run;

1. Assign a library to excel, if you are successful to create a library out of excel then you would see all the excel sheets as a table in the library.

 

2. Try to find out list of all tables in EXCEL lib.

 

3. Create into one data set.

 

If you are able to do step 1 then definitely rest would be easy. 

Contributor
Posts: 28

Re: Importing multiple Excel sheets

Thanks for your reply.

 

 

Maybe I wasn't clear, but I need to import several physical Excel files of unknown quantity.

I'm pretty sure I need some sort of for loop,

Solution
‎09-06-2016 11:32 AM
Super User
Posts: 19,815

Re: Importing multiple Excel sheets

Posted in reply to WouterKBC

Yeah, your question is unclear. i think your mixing workbook and sheets for Excel?

 

You can modify this macro by changing extension from CSV to XLSX. 

This asssumes all excel files are in one directory. 

 

http://support.sas.com/documentation/cdl/en/mcrolref/67912/HTML/default/viewer.htm#n0ctmldxf23ixtn1k...

 

 

Contributor
Posts: 28

Re: Importing multiple Excel sheets

I almost have it. I run this code inside a macro (in1 and inb1 are filerefs):

 

 

data _NULL_;
 array bonds{2} in1 in1b;
run;

%do i=1 %to 2;
 PROC IMPORT OUT=&datalib..src_tmp_&dataset1.
  DBMS = xlsx
  datafile = bonds{i};
  replace;
  SHEET = "&bondsSheet";
  getnames = yes;
 RUN;
 
 proc append base = &datalib..src_&dataset1. data=&datalib..src_tmp_&dataset1.;
 run;
%end;

 

 

I still get the next error:

269 PROC IMPORT OUT=&datalib..src_tmp_&dataset1. DBMS = xlsx datafile = bonds{i};

                                                                                                                                          -

                                                                                                                                          22

                                                                                                                                          200

ERROR 22-322: Syntax error, expecting one of the following: ;, DATAFILE, DATATABLE, DBMS, DEBUG,

FILE, OUT, REPLACE, TABLE, _DEBUG_.

 

 

It fails when he tries to get element i from the bonds array.

 

Super User
Posts: 19,815

Re: Importing multiple Excel sheets

Posted in reply to WouterKBC

Where's the rest of the code? How do you pass the filenames in? 

It looks like your trying to use an array from previous step but that won't work. It doesn't exist any more. 

 

Creeate macro variables with your filepaths somehow and use that instead. 

Super User
Posts: 19,815

Re: Importing multiple Excel sheets

@WouterKBC Here's an example of what, in my opinion, is a good way to deal with this issue. 

Instead create a dataset with the file names and then use call execute.

1. Create dataset with parameters for macro execution

2. Develop macro to import data and append. For append dataset make sure dataset is empty before process - manual step. Dataset does not need to exist BEFORE the proc append, it will create the dataset if required.

3. Use Call Execute to call macro. 

 

This is untested, so there may be small bugs that you can probably debug. But the idea is there.

data filenames;
datalib="Work";
file1="Card1"; bondSheet="Sheet1";output;
file1="Class2";bondSheet="Sheet1";output;
run;

%macro import_append(datalib, file_name, bondsSheet);

 PROC IMPORT OUT=&datalib..src_tmp_&file_name.
  DBMS = xlsx
  datafile = "path to file /&file_name..xlsx";
  replace;
  SHEET = "&bondsSheet";
  getnames = yes;
 RUN;
 
 proc append base = &datalib..src_&file_name. data=&datalib..src_tmp_&file_name.;
 run;

%mend;

data _null_;
set filenames;
str= catt('%import_append(', datalib, ',', file1, ',', bondSheet, ');');
call execute(str);
run;

Potential issues with this method:

1. As always when using proc import if your dataset isn't consistent in Excel the field types/lengths may not be the same so you may run into issues with your append. If you were using a text file I would suggest a data step, but unfortuantely there isn't a workaround for Excel files.

 

Contributor
Posts: 28

Re: Importing multiple Excel sheets

@Reeza

Thanks for your answer, but this doesn't solve the core of the problem. Your solution is hardcoded. You have to run an 'import_append' for each file.

The problem is that I need to deliver some code to my clients and they add files to a certain folder. I don't know how many files they will use or how they will look like (except for some words in the file name). They don't have coding skills, but they can edit a 'settings' file where they can edit some variables, because that's easy.

 

I found a workaround that is based on regular expressions.

 

It's not really that difficult.

First open de directory and put all the files that match a certain expression in a certain table.

Then run over all the elements in this table and do a proc import and append for all those files.

 

The only problem I still have is some format mismatches between the files, but I can manage to fix this.

 

/* Gathers all the files in a certain location, based on a certain regular expression and store them in the filestable_regex table. */
%macro get_filenames(location, regex, filestable);
	filename _dir_ "%bquote(&location.)";
	data &filestable._&regex. (keep=fname);
	handle=dopen( '_dir_' );
	if handle > 0 then do;
		count=dnum(handle);
		do i=1 to count;
			fname=dread(handle,i);
			patternID = prxparse('m/' || "&regex." || '/i');
			position = input(prxmatch(patternID, fname) - 1, best32.);
			fname = "&location." || fname;
			if position ne -1 then do;
				output &filestable._&regex. ;
			end;
		end;
	end;
	rc=dclose(handle);
run;
filename _dir_ clear;
%mend;

/* Import the Excel sheets that are located in the filestable_regex table with the given sheet. */
%macro import_excel(sheet, output, filestable, regex);
	%local nobs iter;
	proc sql noprint;
		select count(*) into : nobs
	from &filestable._&regex.;
	quit;

	%let iter=1;
	%do %while (&iter. <= &nobs.);
		data _NULL_;
			set filenames_&regex. (firstobs=&iter. obs=&iter.);
			call symput("fname", fname);
		run;

		PROC IMPORT	OUT=tmp_&output.
			DBMS = xlsx
			datafile = "&fname."
			replace;
			SHEET = "&sheet.";
			getnames = yes;
		RUN;
		
		proc append base = &output. data = tmp_&output.;
		run;
		%let iter=%eval(&iter.+1);
	%end;
%mend;

/* Initialization of the variables. */
data _NULL_;
	location = "location of the files";
	regex = "word that is ONLY in the excel files you want to work with";
	sheet = "the sheet in the excel files";
	output = "output table";
	call symput('location', location);
	call symput('regex', regex);
	call symput('sheet', sheet);
	call symput('output', output);
run;

%get_filenames(location=&location, regex=&regex, filestable=filenames);
%import_excel(sheet=&sheet, output=&output, filestable=filenames, regex=&regex);

/* Drop the temp tables. */
data _NULL_;
	proc sql;
		drop table filenames_&regex.;
		drop table tmp_&output.;
	quit;
run;

 

Maybe there is a better solution based on a regex directly in the PROC IMPORT step, but this is what I was able to make.

Super User
Posts: 19,815

Re: Importing multiple Excel sheets

Posted in reply to WouterKBC

@WouterKBC Compare your solution to the very first solution I linked to. Your correct in that it's not very difficult, once the problem is defined. 

 

Super Contributor
Posts: 266

Re: Importing multiple Excel sheets

Posted in reply to WouterKBC
%let input_folder=/folders/myfolders/;
data xls_files;
keep filename;
length fref $8 filename $80;
rc = filename(fref, "&input_folder");
if rc = 0 then
do;
did = dopen(fref);
rc = filename(fref);
end;
else
do;
length msg $200.;
msg = sysmsg();
put msg=;
did = .;
end;
if did <= 0
then
putlog 'ERR' 'OR: Unable to open directory.';
dnum = dnum(did);
do i = 1 to dnum;
filename = dread(did, i);
/* If this entry is a file, then output. */
fid = mopen(did, filename);
if fid > 0 and index(filename,".xls")
then
output;
end;
rc = dclose(did);
run;

%macro first_xls(file);

PROC IMPORT OUT= WORK.auto1 DATAFILE= "&input_folder.&file"
DBMS=xls REPLACE;
SHEET="Sheet1";
GETNAMES=YES;
RUN;

data want;
set auto1;
run;



%mend;

%macro read_xls(file);

PROC IMPORT OUT= WORK.auto1 DATAFILE= "&input_folder.&file"
DBMS=xls REPLACE;
SHEET="Sheet1";
GETNAMES=YES;
RUN;

proc append base=work.want data=work.auto1 force;
run;



%mend;

data _null_;
set xls_files(obs=1);
file_name=substr(filename,1,index(filename,".xls")-1);
call execute('%first_xls('||file_name||');');
run;


data _null_;
set xls_files(firstobs=2 );
file_name=substr(filename,1,index(filename,".xls")-1);
call execute('%read_xls('||file_name||');');
run;



You need to provide input_folder path in the first line of the code.

Secondly, you can try getting rid of %first_xls. 

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 2893 views
  • 0 likes
  • 3 in conversation