BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Datino
Obsidian | Level 7

Say I have an Excel Spreadsheet with several sheets such as "JAN2017" "FEB2017" and so on.

 

How do I import the data of the right most sheet in the document, if I don't know its name?

 

Also, I would like to add a column to the imported data with the sheet's name, so how do I store it as a character value?

 

Thanks in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Ideally, fix your process to not have to deal with this. 

 

Ok, since that's probably not possible:

 

1. Assign a libname to the Excel file

2. Use PROC CONTENTS or SASHELP.VCOLUMN to retrieve the Sheet Names

3. Determine which one to import

4. Import said sheet, since you'll likely have it as a macro variable you can include it in the data step.

 

libname want xlsx 'path to excel file.xlsx';

 

data sheet_list;

set sashelp.vcolumn;

where libname='WANT';

keep memname;

date = input(memname, monyy5.); * this is probably wrong informat but you can fix that;

run;

 

proc sql noprint;

select put(max(date), monyy5.) into :latest_sheet

from sheet_list;

quit;

 

 

View solution in original post

5 REPLIES 5
Reeza
Super User

Ideally, fix your process to not have to deal with this. 

 

Ok, since that's probably not possible:

 

1. Assign a libname to the Excel file

2. Use PROC CONTENTS or SASHELP.VCOLUMN to retrieve the Sheet Names

3. Determine which one to import

4. Import said sheet, since you'll likely have it as a macro variable you can include it in the data step.

 

libname want xlsx 'path to excel file.xlsx';

 

data sheet_list;

set sashelp.vcolumn;

where libname='WANT';

keep memname;

date = input(memname, monyy5.); * this is probably wrong informat but you can fix that;

run;

 

proc sql noprint;

select put(max(date), monyy5.) into :latest_sheet

from sheet_list;

quit;

 

 

rogerjdeangelis
Barite | Level 11
Number of sheets and sheet names in the order they appear in an excel workbook

If you have IML interface to R you can cut and paste the code into IML.
Note proc contents gives the sheetnames in alphabetic order.

HAVE Excel workbook with two sheets (Tab1 is ZZZZ and Tab2 is AAAA)
====================================================================

SHEET(ZZZZ ist sheet) CLASS IN WORKBOOK D:/XLS/TWO_SEETS.XLSX


  +----------------------------------------------------------------+
  |     A      |    B       |     C      |    D       |    E       |
  +----------------------------------------------------------------+
1 |  NAME      |   SEX      |    AGE     |  HEIGHT    |  WEIGHT    |
  +------------+------------+------------+------------+------------+
2 | ALFRED     |    M       |    14      |    69      |  112.5     |
  +------------+------------+------------+------------+------------+
   ...
  +------------+------------+------------+------------+------------+
N | WILLIAM    |    M       |    15      |   66.5     |  112       |
  +------------+------------+------------+------------+------------+

[ZZZZ]

SHEET(AAAA 2nd sheet)

  +----------------------------------------------------------------+
  |     A      |    B       |     C      |    D       |    E       |
  +----------------------------------------------------------------+
1 |  NAME      |   SEX      |    AGE     |  HEIGHT    |  WEIGHT    |
  +------------+------------+------------+------------+------------+
2 | ALFRED     |    M       |    14      |    69      |  112.5     |
  +------------+------------+------------+------------+------------+
   ...
  +------------+------------+------------+------------+------------+
N | WILLIAM    |    M       |    15      |   66.5     |  112       |
  +------------+------------+------------+------------+------------+

[AAAA]

WANT Number of sheets in the order they appear in workbook
===========================================================

Up to 40 obs WORK.WANT total obs=1

       SHEETS_BY_    NUMBER_
Obs     POSITION      SHEETS

 1     ZZZZ AAAA        2


WORKING CODE

    R  strset<-paste0(getSheets(wb),collapse=" ");

FULL SOLUTION
=============


%utl_submit_r64('
  source("c:/Program Files/R/R-3.3.2/etc/Rprofile.site",echo=T);
  library(XLConnect);
  wb <- loadWorkbook("d:/xls/two_sheets.xlsx",create = FALSE);
  strset<-paste0(getSheets(wb),collapse=" ");
  strset;
  write(strset,"d:/txt/sheets.txt");
');

> source("c:/Program Files/R/R-3.3.2/etc/Rprofile.site",echo=T);
library(XLConnect);
wb <- loadWorkbook("d:/xls/two_sheets.xlsx",create = FALSE);
strset<-paste0(getShee
ts(wb),collapse=" ");
strset;  write(strset,"d:/txt/sheets.txt");

[1] "zzzz aaaa"

d:/txt/sheets.txt
zzzz aaaa

data want;
  infile "d:/txt/sheets.txt";
  input;
  sheets_by_position=_infile_;
  number_sheets=countw(_infile_);
run;quit;

Up to 40 obs WORK.WANT total obs=1

       SHEETS_BY_    NUMBER_
Obs     POSITION      SHEETS

 1     zzzz aaaa        2


SASKiwi
PROC Star

I know this is a little off topic but if your SAS application is reliant on picking up the rightmost sheet as being the "right" one to import, how do you know that is correct? Is there some other means of confirming you have the correct sheet?

 

I say that as many Excel users have a habit of arbitrarily changing their worksheets to suit themselves without giving any thought to others who are using them.

Tom
Super User Tom
Super User

If you are reading an XLSX file then perhaps you can just read the XML from within the XLSX file directly?

First set a macro variable to filename that is the XLSX file.

%let filename=myfile.xlsx ;

Also define an XML map for reading the workbook xml information out of the XLSX file.

*----------------------------------------------------------------------;
* Generate XMLMAP to read the sheetnames from xl/workbook.xml ;
*----------------------------------------------------------------------;
filename _wbmap temp;
data _null_;
  file _wbmap;
  put '<SXLEMAP version="2.1"><TABLE name="Sheets">'
    / '<TABLE-PATH>/workbook/sheets/sheet</TABLE-PATH>'
    / '<COLUMN name="Sheet"><TYPE>character</TYPE>'
    / '<DESCRIPTION>Sheet Name</DESCRIPTION>'
    / '<PATH>/workbook/sheets/sheet/@name</PATH>'
    / '<DATATYPE>string</DATATYPE><LENGTH>32</LENGTH>'
    / '</COLUMN>'
    / '<COLUMN name="State"><TYPE>character</TYPE>'
    / '<DESCRIPTION>Sheet State</DESCRIPTION>'
    / '<PATH>/workbook/sheets/sheet/@state</PATH>'
    / '<DATATYPE>string</DATATYPE><LENGTH>20</LENGTH>'
    / '</COLUMN>'
    / '</TABLE></SXLEMAP>'
  ;
run;

You then need to copy the XML file out of the XLSX file. This step is needed because the XML libname engine cannot read the XML file directly via the ZIP filename engine.

*----------------------------------------------------------------------;
* Copy XML file out of the XLSX file ;
*----------------------------------------------------------------------;
filename _wbin zip "&filename" member="xl/workbook.xml" ;
filename _wb temp;
data _null_;
  rc=fcopy('_wbin','_wb');
run;

Now assign a libref to the XML file and use it to read the sheetname out into a SAS dataset. You can add a sheet counter variable.  I also like to generate a new column that has a valid SAS member name for the XLSX sheet name.

*----------------------------------------------------------------------;
* Assign libref ;
*----------------------------------------------------------------------;
libname _wb xmlv2 xmlmap=_wbmap ;

*----------------------------------------------------------------------;
* Read sheet names from XLSX file into a SAS dataset. ;
* Create valid SAS dataset name from sheetname or sheetnumber. ;
*----------------------------------------------------------------------;
data sheets  ;
  number+1;
  set _wb.sheets;
  length Memname $32 ;
  label number='Sheet Number' memname='Mapped SAS Memname' ;
  if ^nvalid(compress(sheet),'v7') then memname = cats('Sheet',number);
  else memname = translate(trim(compbl(sheet)),'_',' ');
run;

Now that you have the list in a dataset you can remove the filerefs and libref.

*----------------------------------------------------------------------;
* Clear the libname and filenames used in reading the sheetnames. ;
*----------------------------------------------------------------------;
libname _wb clear ;
filename _wbin clear ;
filename _wb clear ;
filename _wbmap clear ;

 

Tom
Super User Tom
Super User

You could just read the member names that SAS sees and then convert the name into an actual date and take the last date.  Not really the same as taking the last one by position, but probably easier, and perhaps even more acurate, depending on how consistently they make the sheetnames.  First point a libref at the actual file.

 

libname in xlsx "myfile.xlsx";

Then get a list of the members. You could use the dictionary tables (or SASHELP.Vxxx views).  But why not just run a proc contents.

proc contents data=in._all_ noprint out=contents; run;

Now you have the memnames (and the variable names also).  So look for the memname that seems to represent the largest date.

%let last_sheet = ;
proc sql noprint ;
  select memname into :last_sheet trimmed
  from contents
  having input(memname,anydtdte.)=max(input(memname,anydtdte.))
  ;
quit;

Then use the macro variable and the libref to read the actual data.

data want ;
  set in.&last_sheet;
run;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 5888 views
  • 0 likes
  • 5 in conversation