Solved: Re: How to import data from excel (xlsx) sheet with headers without sp...

satish78652 · Posted 01-29-2016 06:59 AM

Hi all,

i am facing a problem while importing data from excel 2010 in SAS Datasets. The data is present in different sheets and in each sheet headers are there till some rows i.e the original data starts from different rows in each sheet. i can import the data by specifying RANGE option. But without RANGE option i wanna import each sheet into a single dataset. i tried DATAROW option, FRISTOBS option but they didn't worked. Please help me in this.

Thanks

ChrisHemedinger · Posted 05-16-2017 09:56 AM

This is a popular question. I'm consolidating some of the best responses from this and other threads to provide some guidance.

First, with SAS 9.4 and later, SAS recommends using DBMS=XLSX for the most flexibility. It works on all operating systems without the need for additional components like the PC Files Server. Your Excel file must be in the Excel 2007-or-later format (XLSX). You do need a licence for SAS/ACCESS to PC Files.

If your Excel data does not begin in cell A1 (the default start point for an import process), then you can specify a RANGE= value that includes the specific cells. The easiest method is to use a Named Range in Excel to define the exact boundaries of the data.

To define a named range in Excel, highlight the range of cells to include and simply type the new name of the range in the Name Box:

Then save the Excel file.

Then to import into SAS, specify that range name in the RANGE= option:

proc import datafile="/myprojects/myfile.xlsx"
 out=mydata 
 replace;
range="myspecialrange";
run;

The original question asked for a method that didn't use RANGE=. You can use PROC IMPORT to read the entire sheet, but the result will not have the column headers and types you want. Consider a sheet like this:

This code will read it:

proc import datafile="/myprojects/middle.xlsx"
 out=mid dbms=xlsx
 replace;
run;

But the result will contain many empty cells, and the values will be read as all character types:

With additional coding, you can "fix" this result in another pass using DATA step. Or, if you're willing to add the RANGE option, you can read it properly in the first pass:

proc import datafile="/myprojects/middle.xlsx"
 out=mid dbms=xlsx
 replace;
 range="Sheet1$E7:K17" ;
run;

You can also use LIBNAME XLSX to read entire sheets from Excel, or simply as a discovery step to see what sheets the Excel file contains before you run PROC IMPORT. However, LIBNAME XLSX does not show the Excel named ranges.

On SAS for Windows systems, you can use LIBNAME EXCEL (32-bit) or LIBNAME PCFILES (64-bit) to reveal a little more information about the Excel file.

libname d pcfiles path=c:\myprojects\middle.xlsx";
proc datasets lib=d; quit;

/* always clear the libname, as it locks the file */
libname d clear;

Note that does DBMS=XLSX not support some of the options we see in the legacy DBMS=XLS (which supports only old-format XLS files), such as STARTROW and NAMEROW. DBMS=XLSX does support GETNAMES (treats the first record of the sheet or range as the variable names). See the full reference for Excel file import/export in the SAS documentation.

Learn from the Experts! Check out the huge catalog of free sessions in the Ask the Expert webinar series.

View solution in original post

DanielaNRW · Posted 01-29-2016 07:59 AM

Hi,

could you post your code please?

Did you try sheet=sheetname instead of range=range?

satish78652 · Posted 01-29-2016 08:30 AM

Hi

proc import datafile="C:\Users\data.xlsx"
DBMS=excel replace
out=Report;
mixed= yes;
getnames= yes;
DBDSOPTS= 'FIRSTOBS=9';
sheet='Compliance Report';
run;

When i used the above code i am getting the data into dataset but the variable names become F1, F2, F3..........

proc import datafile="C:\Users\data.xlsx"
DBMS=excel replace
out=Report;
mixed= yes;
getnames= yes;
datarow= 9;

sheet='Compliance Report';
run;

When i am using the above code i am getting an error stating that "ERROR 180-322: Statement is not valid or it is used out of proper order."

RW9 · Posted 01-29-2016 08:45 AM

Your problem is that you are using Excel. Excel is not a strucuted dataset as database/SAS etc. software are. SAS datasets are created as tables, so each variable - column in a dataset has a fixed length/format. Excel does not impose this, each cell is different. Also, cells could be anywhere on the spreadsheet, as you have said there are heard rows and such like.

The second point to note on this is that proc import is a guessing procedure, it will attempt to do a best guess on your data. Not ideal really.

So as always, you should use the technology best suited to the task at hand. If you are transferring data, then a data transfer format would be best suited - CSV, XML etc. I understand that sometimes this is not possible, and that is why so manmy of these issues occur. In your case, you will need to look at each sheet you want to import, pull out the data you want into a dataset type structure, i.e. if your data starts in column D, at row 6, then copy from there, put the datainto a grid of its own. If you need titles then put them into a grid themselves. You can then import that data, although as always, I recommend saving that to CSV and writing a datastep import program to read the data in as specified by you, the person who knows the data best.

An alternative, if you know VBA, is to write a macro to do this and export the data to CSV (or if your feeling really brave, head directly in the Open Office Document format).

Ksharp · Posted 01-31-2016 11:24 PM

DATAROW option is not available for dbms=excel try STARTROW=

ChrisHemedinger · Posted 05-16-2017 09:56 AM

This is a popular question. I'm consolidating some of the best responses from this and other threads to provide some guidance.

First, with SAS 9.4 and later, SAS recommends using DBMS=XLSX for the most flexibility. It works on all operating systems without the need for additional components like the PC Files Server. Your Excel file must be in the Excel 2007-or-later format (XLSX). You do need a licence for SAS/ACCESS to PC Files.

If your Excel data does not begin in cell A1 (the default start point for an import process), then you can specify a RANGE= value that includes the specific cells. The easiest method is to use a Named Range in Excel to define the exact boundaries of the data.

To define a named range in Excel, highlight the range of cells to include and simply type the new name of the range in the Name Box:

Then save the Excel file.

Then to import into SAS, specify that range name in the RANGE= option:

proc import datafile="/myprojects/myfile.xlsx"
 out=mydata 
 replace;
range="myspecialrange";
run;

The original question asked for a method that didn't use RANGE=. You can use PROC IMPORT to read the entire sheet, but the result will not have the column headers and types you want. Consider a sheet like this:

This code will read it:

proc import datafile="/myprojects/middle.xlsx"
 out=mid dbms=xlsx
 replace;
run;

But the result will contain many empty cells, and the values will be read as all character types:

With additional coding, you can "fix" this result in another pass using DATA step. Or, if you're willing to add the RANGE option, you can read it properly in the first pass:

proc import datafile="/myprojects/middle.xlsx"
 out=mid dbms=xlsx
 replace;
 range="Sheet1$E7:K17" ;
run;

You can also use LIBNAME XLSX to read entire sheets from Excel, or simply as a discovery step to see what sheets the Excel file contains before you run PROC IMPORT. However, LIBNAME XLSX does not show the Excel named ranges.

On SAS for Windows systems, you can use LIBNAME EXCEL (32-bit) or LIBNAME PCFILES (64-bit) to reveal a little more information about the Excel file.

libname d pcfiles path=c:\myprojects\middle.xlsx";
proc datasets lib=d; quit;

/* always clear the libname, as it locks the file */
libname d clear;

Note that does DBMS=XLSX not support some of the options we see in the legacy DBMS=XLS (which supports only old-format XLS files), such as STARTROW and NAMEROW. DBMS=XLSX does support GETNAMES (treats the first record of the sheet or range as the variable names). See the full reference for Excel file import/export in the SAS documentation.

Learn from the Experts! Check out the huge catalog of free sessions in the Ask the Expert webinar series.

s_manoj · Posted 12-26-2017 06:21 AM

Hi, your post is very informative, can you please give infile code to import excel file?

AbuTariq · Posted 02-04-2018 10:51 PM

Thank you Chris.

Just as what you suggest, when put DBMS = XLSX, only a few options are supported, according to SAS documentation.

Available Statements for Importing and Exporting Excel Files Using DBMS=XLS and DBMS=XLSX
DBMS= Identifier	Option	Valid Value	Default Value	PROC	PROC
				IMPORT	EXPORT
XLS	ENDCOL	Last column for data	Last column that contains data	Yes	No
	ENDNAMEROW	Last row for variable names	Same as NAMEROW	Yes	No
	ENDROW	Last row for data	Last row that contains data	Yes	No
	GETNAMES	Yes \| No	Yes	Yes	No
	NAMEROW	First row for variable names	First row that contains variable names	Yes	No
	NEWFILE	Yes \| No	No	No	Yes
	PUTNAMES	Yes \| No	Yes	No	Yes
	RANGE	name \| sheet$ul:lr	First row	Yes	No
	SHEET	Sheet name	First sheet	Yes	Yes
	STARTCOL	First column for data	Last column that contains data	Yes	No
	STARTROW	First row for data	First row that contains data	Yes	No
XLSX	GETNAMES	Yes \| No	Yes	Yes	No
	RANGE	name \| sheet$ul:lr	First row	Yes	No
	SHEET	Sheet name	First sheet	Yes	Yes

ChrisHemedinger · Posted 02-05-2018 07:56 AM

Yes, fewer (and different) options are supported with PROC IMPORT and DBMS=XLSX -- but the most important options are there. DBMS=XLS is a very old method that can work only with legacy XLS files.

Learn from the Experts! Check out the huge catalog of free sessions in the Ask the Expert webinar series.

AbuTariq · Posted 02-06-2018 05:13 AM

True. looking at the SAS code generated by SAS when importing Excel files by using the wizard, SAS put DBMS=Excel by default, with several options that resemble those of DBMS=XLS.

It works fine with me (DBMS=Excel ) until I import Excel Sheets that have more than 255 column. In fact columns beyond 255 column are ignored by the program without any warning sign.

The only choice is to use DBMS=XLSX.

How to import data from excel (xlsx) sheet with headers and footers without specifying RANGE option

Re: How to import data from excel (xlsx) sheet with headers without specifying RANGE option

Re: How to import data from excel (xlsx) sheet with headers and footers without specifying RANGE opt

Re: How to import data from excel (xlsx) sheet with headers and footers without specifying RANGE opt

Re: How to import data from excel (xlsx) sheet with headers and footers without specifying RANGE opt

Re: How to import data from excel (xlsx) sheet with headers and footers without specifying RANGE opt

Re: How to import data from excel (xlsx) sheet with headers without specifying RANGE option

Re: How to import data from excel (xlsx) sheet with headers without specifying RANGE option

Re: How to import data from excel (xlsx) sheet with headers without specifying RANGE option

Re: How to import data from excel (xlsx) sheet with headers without specifying RANGE option

Re: How to import data from excel (xlsx) sheet with headers without specifying RANGE option

SAS Innovate 2025: Register Now

SAS Training: Just a Click Away