Help using Base SAS procedures

How to select Excel sheets to import based on position, and import sheet name with proc import?

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 18
Accepted Solution

How to select Excel sheets to import based on position, and import sheet name with proc import?

Say I have an Excel Spreadsheet with several sheets such as "JAN2017" "FEB2017" and so on.

 

How do I import the data of the right most sheet in the document, if I don't know its name?

 

Also, I would like to add a column to the imported data with the sheet's name, so how do I store it as a character value?

 

Thanks in advance.


Accepted Solutions
Solution
‎02-27-2017 04:42 PM
Super User
Posts: 17,840

Re: How to select Excel sheets to import based on position, and import sheet name with proc import?

Ideally, fix your process to not have to deal with this. 

 

Ok, since that's probably not possible:

 

1. Assign a libname to the Excel file

2. Use PROC CONTENTS or SASHELP.VCOLUMN to retrieve the Sheet Names

3. Determine which one to import

4. Import said sheet, since you'll likely have it as a macro variable you can include it in the data step.

 

libname want xlsx 'path to excel file.xlsx';

 

data sheet_list;

set sashelp.vcolumn;

where libname='WANT';

keep memname;

date = input(memname, monyy5.); * this is probably wrong informat but you can fix that;

run;

 

proc sql noprint;

select put(max(date), monyy5.) into :latest_sheet

from sheet_list;

quit;

 

 

View solution in original post


All Replies
Solution
‎02-27-2017 04:42 PM
Super User
Posts: 17,840

Re: How to select Excel sheets to import based on position, and import sheet name with proc import?

Ideally, fix your process to not have to deal with this. 

 

Ok, since that's probably not possible:

 

1. Assign a libname to the Excel file

2. Use PROC CONTENTS or SASHELP.VCOLUMN to retrieve the Sheet Names

3. Determine which one to import

4. Import said sheet, since you'll likely have it as a macro variable you can include it in the data step.

 

libname want xlsx 'path to excel file.xlsx';

 

data sheet_list;

set sashelp.vcolumn;

where libname='WANT';

keep memname;

date = input(memname, monyy5.); * this is probably wrong informat but you can fix that;

run;

 

proc sql noprint;

select put(max(date), monyy5.) into :latest_sheet

from sheet_list;

quit;

 

 

Valued Guide
Posts: 505

Re: How to select Excel sheets to import based on position, and import sheet name with proc import?

Number of sheets and sheet names in the order they appear in an excel workbook

If you have IML interface to R you can cut and paste the code into IML.
Note proc contents gives the sheetnames in alphabetic order.

HAVE Excel workbook with two sheets (Tab1 is ZZZZ and Tab2 is AAAA)
====================================================================

SHEET(ZZZZ ist sheet) CLASS IN WORKBOOK D:/XLS/TWO_SEETS.XLSX


  +----------------------------------------------------------------+
  |     A      |    B       |     C      |    D       |    E       |
  +----------------------------------------------------------------+
1 |  NAME      |   SEX      |    AGE     |  HEIGHT    |  WEIGHT    |
  +------------+------------+------------+------------+------------+
2 | ALFRED     |    M       |    14      |    69      |  112.5     |
  +------------+------------+------------+------------+------------+
   ...
  +------------+------------+------------+------------+------------+
N | WILLIAM    |    M       |    15      |   66.5     |  112       |
  +------------+------------+------------+------------+------------+

[ZZZZ]

SHEET(AAAA 2nd sheet)

  +----------------------------------------------------------------+
  |     A      |    B       |     C      |    D       |    E       |
  +----------------------------------------------------------------+
1 |  NAME      |   SEX      |    AGE     |  HEIGHT    |  WEIGHT    |
  +------------+------------+------------+------------+------------+
2 | ALFRED     |    M       |    14      |    69      |  112.5     |
  +------------+------------+------------+------------+------------+
   ...
  +------------+------------+------------+------------+------------+
N | WILLIAM    |    M       |    15      |   66.5     |  112       |
  +------------+------------+------------+------------+------------+

[AAAA]

WANT Number of sheets in the order they appear in workbook
===========================================================

Up to 40 obs WORK.WANT total obs=1

       SHEETS_BY_    NUMBER_
Obs     POSITION      SHEETS

 1     ZZZZ AAAA        2


WORKING CODE

    R  strset<-paste0(getSheets(wb),collapse=" ");

FULL SOLUTION
=============


%utl_submit_r64('
  source("c:/Program Files/R/R-3.3.2/etc/Rprofile.site",echo=T);
  library(XLConnect);
  wb <- loadWorkbook("d:/xls/two_sheets.xlsx",create = FALSE);
  strset<-paste0(getSheets(wb),collapse=" ");
  strset;
  write(strset,"d:/txt/sheets.txt");
');

> source("c:/Program Files/R/R-3.3.2/etc/Rprofile.site",echo=T);
library(XLConnect);
wb <- loadWorkbook("d:/xls/two_sheets.xlsx",create = FALSE);
strset<-paste0(getShee
ts(wb),collapse=" ");
strset;  write(strset,"d:/txt/sheets.txt");

[1] "zzzz aaaa"

d:/txt/sheets.txt
zzzz aaaa

data want;
  infile "d:/txt/sheets.txt";
  input;
  sheets_by_position=_infile_;
  number_sheets=countw(_infile_);
run;quit;

Up to 40 obs WORK.WANT total obs=1

       SHEETS_BY_    NUMBER_
Obs     POSITION      SHEETS

 1     zzzz aaaa        2


Super User
Posts: 3,108

Re: How to select Excel sheets to import based on position, and import sheet name with proc import?

I know this is a little off topic but if your SAS application is reliant on picking up the rightmost sheet as being the "right" one to import, how do you know that is correct? Is there some other means of confirming you have the correct sheet?

 

I say that as many Excel users have a habit of arbitrarily changing their worksheets to suit themselves without giving any thought to others who are using them.

Super User
Super User
Posts: 6,500

Re: How to select Excel sheets to import based on position, and import sheet name with proc import?

[ Edited ]

If you are reading an XLSX file then perhaps you can just read the XML from within the XLSX file directly?

First set a macro variable to filename that is the XLSX file.

%let filename=myfile.xlsx ;

Also define an XML map for reading the workbook xml information out of the XLSX file.

*----------------------------------------------------------------------;
* Generate XMLMAP to read the sheetnames from xl/workbook.xml ;
*----------------------------------------------------------------------;
filename _wbmap temp;
data _null_;
  file _wbmap;
  put '<SXLEMAP version="2.1"><TABLE name="Sheets">'
    / '<TABLE-PATH>/workbook/sheets/sheet</TABLE-PATH>'
    / '<COLUMN name="Sheet"><TYPE>character</TYPE>'
    / '<DESCRIPTION>Sheet Name</DESCRIPTION>'
    / '<PATH>/workbook/sheets/sheet/@name</PATH>'
    / '<DATATYPE>string</DATATYPE><LENGTH>32</LENGTH>'
    / '</COLUMN>'
    / '<COLUMN name="State"><TYPE>character</TYPE>'
    / '<DESCRIPTION>Sheet State</DESCRIPTION>'
    / '<PATH>/workbook/sheets/sheet/@state</PATH>'
    / '<DATATYPE>string</DATATYPE><LENGTH>20</LENGTH>'
    / '</COLUMN>'
    / '</TABLE></SXLEMAP>'
  ;
run;

You then need to copy the XML file out of the XLSX file. This step is needed because the XML libname engine cannot read the XML file directly via the ZIP filename engine.

*----------------------------------------------------------------------;
* Copy XML file out of the XLSX file ;
*----------------------------------------------------------------------;
filename _wbin zip "&filename" member="xl/workbook.xml" ;
filename _wb temp;
data _null_;
  rc=fcopy('_wbin','_wb');
run;

Now assign a libref to the XML file and use it to read the sheetname out into a SAS dataset. You can add a sheet counter variable.  I also like to generate a new column that has a valid SAS member name for the XLSX sheet name.

*----------------------------------------------------------------------;
* Assign libref ;
*----------------------------------------------------------------------;
libname _wb xmlv2 xmlmap=_wbmap ;

*----------------------------------------------------------------------;
* Read sheet names from XLSX file into a SAS dataset. ;
* Create valid SAS dataset name from sheetname or sheetnumber. ;
*----------------------------------------------------------------------;
data sheets  ;
  number+1;
  set _wb.sheets;
  length Memname $32 ;
  label number='Sheet Number' memname='Mapped SAS Memname' ;
  if ^nvalid(compress(sheet),'v7') then memname = cats('Sheet',number);
  else memname = translate(trim(compbl(sheet)),'_',' ');
run;

Now that you have the list in a dataset you can remove the filerefs and libref.

*----------------------------------------------------------------------;
* Clear the libname and filenames used in reading the sheetnames. ;
*----------------------------------------------------------------------;
libname _wb clear ;
filename _wbin clear ;
filename _wb clear ;
filename _wbmap clear ;

 

Super User
Super User
Posts: 6,500

Re: How to select Excel sheets to import based on position, and import sheet name with proc import?

You could just read the member names that SAS sees and then convert the name into an actual date and take the last date.  Not really the same as taking the last one by position, but probably easier, and perhaps even more acurate, depending on how consistently they make the sheetnames.  First point a libref at the actual file.

 

libname in xlsx "myfile.xlsx";

Then get a list of the members. You could use the dictionary tables (or SASHELP.Vxxx views).  But why not just run a proc contents.

proc contents data=in._all_ noprint out=contents; run;

Now you have the memnames (and the variable names also).  So look for the memname that seems to represent the largest date.

%let last_sheet = ;
proc sql noprint ;
  select memname into :last_sheet trimmed
  from contents
  having input(memname,anydtdte.)=max(input(memname,anydtdte.))
  ;
quit;

Then use the macro variable and the libref to read the actual data.

data want ;
  set in.&last_sheet;
run;
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 225 views
  • 0 likes
  • 5 in conversation