DATA Step, Macro, Functions and more

How to import data from excel (xlsx) sheet with headers and footers without specifying RANGE option

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 8
Accepted Solution

How to import data from excel (xlsx) sheet with headers and footers without specifying RANGE option

Hi all,

i am facing a problem while importing data from excel 2010 in SAS Datasets. The data is present in different sheets and in each sheet headers are there till some rows i.e the original data starts from different rows in each sheet. i can import the data by specifying RANGE option. But without RANGE option i wanna import each sheet into a single dataset. i tried DATAROW option, FRISTOBS option but they didn't worked. Please help me in this.

Thanks


Accepted Solutions
Solution
‎05-16-2017 09:56 AM
Community Manager
Posts: 2,764

Re: How to import data from excel (xlsx) sheet with headers without specifying RANGE option

This is a popular question.  I'm consolidating some of the best responses from this and other threads to provide some guidance.

 

First, with SAS 9.4 and later, SAS recommends using DBMS=XLSX for the most flexibility.  It works on all operating systems without the need for additional components like the PC Files Server.  Your Excel file must be in the Excel 2007-or-later format (XLSX).  You do need a licence for SAS/ACCESS to PC Files.

 

If your Excel data does not begin in cell A1 (the default start point for an import process), then you can specify a RANGE= value that includes the specific cells.  The easiest method is to use a Named Range in Excel to define the exact boundaries of the data.  

 

To define a named range in Excel, highlight the range of cells to include and simply type the new name of the range in the Name Box:

 

definenamerange.png

 

Then save the Excel file.

 

Then to import into SAS, specify that range name in the RANGE= option:

 

proc import datafile="/myprojects/myfile.xlsx"
 out=mydata 
 replace;
range="myspecialrange";
run;

 

The original question asked for a method that didn't use RANGE=.  You can use PROC IMPORT to read the entire sheet, but the result will not have the column headers and types you want.  Consider a sheet like this:

 

middlexlsx.png

 

This code will read it:

 

proc import datafile="/myprojects/middle.xlsx"
 out=mid dbms=xlsx
 replace;
run;

But the result will contain many empty cells, and the values will be read as all character types:

 

middleds.png

 

With additional coding, you can "fix" this result in another pass using DATA step.  Or, if you're willing to add the RANGE option, you can read it properly in the first pass:

 

proc import datafile="/myprojects/middle.xlsx"
 out=mid dbms=xlsx
 replace;
 range="Sheet1$E7:K17" ;
run;

You can also use LIBNAME XLSX to read entire sheets from Excel, or simply as a discovery step to see what sheets the Excel file contains before you run PROC IMPORT.  However, LIBNAME XLSX does not show the Excel named ranges.

 

On SAS for Windows systems, you can use LIBNAME EXCEL (32-bit) or LIBNAME PCFILES (64-bit) to reveal a little more information about the Excel file.

 

libname d pcfiles path=c:\myprojects\middle.xlsx";
proc datasets lib=d; quit;

/* always clear the libname, as it locks the file */
libname d clear;

 

dsout.png

 

Note that does DBMS=XLSX not support some of the options we see in the legacy DBMS=XLS (which supports only old-format XLS files), such as STARTROW and NAMEROW.  DBMS=XLSX does support GETNAMES (treats the first record of the sheet or range as the variable names).  See the full reference for Excel file import/export in the SAS documentation.

 

View solution in original post


All Replies
Occasional Contributor
Posts: 13

Re: How to import data from excel (xlsx) sheet with headers and footers without specifying RANGE opt

Hi,

 

could you post your code please?

 

Did you try sheet=sheetname instead of range=range?

Occasional Contributor
Posts: 8

Re: How to import data from excel (xlsx) sheet with headers and footers without specifying RANGE opt

Hi

 

proc import datafile="C:\Users\data.xlsx"
DBMS=excel replace
out=Report;
mixed= yes;
getnames= yes;
DBDSOPTS= 'FIRSTOBS=9';
sheet='Compliance Report';
run;

 

When i used the above code i am getting the data into dataset but the variable names become F1, F2, F3..........

 

proc import datafile="C:\Users\data.xlsx"
DBMS=excel replace
out=Report;
mixed= yes;
getnames= yes;
datarow= 9;

sheet='Compliance Report';
run;

 

When i am using the above code i am getting an error stating that "ERROR 180-322: Statement is not valid or it is used out of proper order."

 

Super User
Super User
Posts: 7,407

Re: How to import data from excel (xlsx) sheet with headers and footers without specifying RANGE opt

Your problem is that you are using Excel.  Excel is not a strucuted dataset as database/SAS etc. software are.  SAS datasets are created as tables, so each variable - column in a dataset has a fixed length/format.  Excel does not impose this, each cell is different.  Also, cells could be anywhere on the spreadsheet, as you have said there are heard rows and such like.  

The second point to note on this is that proc import is a guessing procedure, it will attempt to do a best guess on your data.  Not ideal really.

 

So as always, you should use the technology best suited to the task at hand.  If you are transferring data, then a data transfer format would be best suited - CSV, XML etc.  I understand that sometimes this is not possible, and that is why so manmy of these issues occur.  In your case, you will need to look at each sheet you want to import, pull out the data you want into a dataset type structure, i.e. if your data starts in column D, at row 6, then copy from there, put the datainto a grid of its own.  If you need titles then put them into a grid themselves.  You can then import that data, although as always, I recommend saving that to CSV and writing a datastep import program to read the data in as specified by you, the person who knows the data best.  

 

An alternative, if you know VBA, is to write a macro to do this and export the data to CSV (or if your feeling really brave, head directly in the Open Office Document format).

 

 

Super User
Posts: 9,682

Re: How to import data from excel (xlsx) sheet with headers and footers without specifying RANGE opt

DATAROW option is not available for dbms=excel  try STARTROW=

Solution
‎05-16-2017 09:56 AM
Community Manager
Posts: 2,764

Re: How to import data from excel (xlsx) sheet with headers without specifying RANGE option

This is a popular question.  I'm consolidating some of the best responses from this and other threads to provide some guidance.

 

First, with SAS 9.4 and later, SAS recommends using DBMS=XLSX for the most flexibility.  It works on all operating systems without the need for additional components like the PC Files Server.  Your Excel file must be in the Excel 2007-or-later format (XLSX).  You do need a licence for SAS/ACCESS to PC Files.

 

If your Excel data does not begin in cell A1 (the default start point for an import process), then you can specify a RANGE= value that includes the specific cells.  The easiest method is to use a Named Range in Excel to define the exact boundaries of the data.  

 

To define a named range in Excel, highlight the range of cells to include and simply type the new name of the range in the Name Box:

 

definenamerange.png

 

Then save the Excel file.

 

Then to import into SAS, specify that range name in the RANGE= option:

 

proc import datafile="/myprojects/myfile.xlsx"
 out=mydata 
 replace;
range="myspecialrange";
run;

 

The original question asked for a method that didn't use RANGE=.  You can use PROC IMPORT to read the entire sheet, but the result will not have the column headers and types you want.  Consider a sheet like this:

 

middlexlsx.png

 

This code will read it:

 

proc import datafile="/myprojects/middle.xlsx"
 out=mid dbms=xlsx
 replace;
run;

But the result will contain many empty cells, and the values will be read as all character types:

 

middleds.png

 

With additional coding, you can "fix" this result in another pass using DATA step.  Or, if you're willing to add the RANGE option, you can read it properly in the first pass:

 

proc import datafile="/myprojects/middle.xlsx"
 out=mid dbms=xlsx
 replace;
 range="Sheet1$E7:K17" ;
run;

You can also use LIBNAME XLSX to read entire sheets from Excel, or simply as a discovery step to see what sheets the Excel file contains before you run PROC IMPORT.  However, LIBNAME XLSX does not show the Excel named ranges.

 

On SAS for Windows systems, you can use LIBNAME EXCEL (32-bit) or LIBNAME PCFILES (64-bit) to reveal a little more information about the Excel file.

 

libname d pcfiles path=c:\myprojects\middle.xlsx";
proc datasets lib=d; quit;

/* always clear the libname, as it locks the file */
libname d clear;

 

dsout.png

 

Note that does DBMS=XLSX not support some of the options we see in the legacy DBMS=XLS (which supports only old-format XLS files), such as STARTROW and NAMEROW.  DBMS=XLSX does support GETNAMES (treats the first record of the sheet or range as the variable names).  See the full reference for Excel file import/export in the SAS documentation.

 

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 8661 views
  • 1 like
  • 5 in conversation