A user asked me if it is possible to manipulate the data in the FIRST sheet like this:
data xlsxdata.<name>;
set xlsxdata.<name>;
...
do something
...
run;
The problem is: The name of the first sheet may vary - so what to write in <name>?
Let's say the sheets look like this - "Thematic" is the first sheet:
The first approach to get the name is via an EXCEL libname and PROC DATASETS/DICTIONARY TABLES. Unfortunately the sheet names are ordered alphabetically - you cannot decide which one is the first:
DBMS
Member Member
# Name Type Type
1 Details$ DATA TABLE
2 Summary$ DATA TABLE
3 Thematic$ DATA TABLE
Well, an xlsx file is also a zip file made up of several xml files. The "workbook.xml" in the xl folder has the wanted information:
<sheet r:id="rId1" sheetId="1" name="Thematic"/>
<sheet r:id="rId2" sheetId="2" name="Summary"/>
<sheet r:id="rId3" sheetId="3" name="Details"/>
That's great because we know that SAS is able to read an xml file - perhaps using an xml map. And the really big thing is that we can do it automatically:
/* read and create xml-file in the WORK folder */
filename _WRKBOOK ZIP "H:\test it now\test.xlsx" member='xl/workbook.xml';
filename _XMLFIL "%sysfunc(pathname(WORK))\workbook.xml";
data _null_;
infile _WRKBOOK recfm=n ;
file _XMLFIL recfm=n;
input char $char1. @@;
put char $char1. @@;
run;
/* Assign a xml LIBNAME and do an automap */
filename _SXLEMAP "%sysfunc(pathname(WORK))\xmlmap.map";
libname _XMLFIL xmlv2 automap=replace xmlmap=_SXLEMAP access=READONLY;
/* The SHEET table contains the sheet names */
data _null_;
set _xmlfil.sheet (obs=1);
call symputx("SHEET",sheet_name);
run;
/* de-assign */
filename _WRKBOOK;
filename _SXLEMAP;
filename _XMLFIL;
libname _XMLFIL;
and the happy user now has the solution:
libname xlsxdata XLSX "H:\test it now\test.xlsx";
data xlsxdata."&SHEET"n;
set xlsxdata."&SHEET"n;
<manipulations here>
run;
libname xlsxdata;
The only downside is the rather hefty intro so an idea is to wrap this up in a SAS Macro: SHEETNAMES. The macro generates macro variables SHEET1, SHEET2, etc. with the sheet names and the number of sheets, SHEETS, just in case:
%macro sheetnames(xlsxfile=);
options nonotes nosource;
%if NOT %sysfunc(fileexist(&xlsxfile)) %then %do;
%put ERROR: File %upcase(&xlsxfile) does not exist.;
options source notes;
%return;
%end;
/* read and create xml-file in the WORK folder */
filename _WRKBOOK ZIP "&xlsxfile" member='xl/workbook.xml';
filename _XMLFIL "%sysfunc(pathname(WORK))\workbook.xml";
data _null_;
infile _WRKBOOK recfm=n ;
file _XMLFIL recfm=n;
input char $char1. @@;
put char $char1. @@;
run;
/* Assign a xml LIBNAME and do a automap */
filename _SXLEMAP "%sysfunc(pathname(WORK))\xmlmap.map";
libname _XMLFIL xmlv2 automap=replace xmlmap=_SXLEMAP access=READONLY;
/* The SHEET table contains the sheet names */
data _null_;
set _xmlfil.sheet end=finish;
call symputx(cats("SHEET",_n_),sheet_name,'G');
if finish then
call symputx("SHEETS",_n_,'G');
run;
/* de-assign */
filename _WRKBOOK;
filename _SXLEMAP;
filename _XMLFIL;
libname _XMLFIL;
options notes source;
%mend;
The users code is now pretty compact:
%sheetnames(xlsxfile=H:\test it now\test.xlsx);
libname xlsxdata XLSX "H:\test it now\test.xlsx";
data xlsxdata."&SHEET1"n;
set xlsxdata."&SHEET1"n;
<manipulations here>
run;
libname xlsxdata;
What are the points:
Have a nice day 😉
Nice share,
I assume this is not a question, so I will nominate it as library contribution.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.