DATA Step, Macro, Functions and more

Names of Excel sheets

Reply
Occasional Contributor
Posts: 18

Names of Excel sheets

[ Edited ]

A user asked me if it is possible to manipulate the data in the FIRST sheet like this:

 

data xlsxdata.<name>;
    set xlsxdata.<name>;
    ...
    do something
    ...
run;

The problem is: The name of the first sheet may vary - so what to write in <name>?

 

Let's say the sheets look like this - "Thematic" is the first sheet:

 

Screen Shot 11-22-17 at 07.32 PM.PNG

 

The first approach to get the name is via an EXCEL libname and PROC DATASETS/DICTIONARY TABLES. Unfortunately the sheet names are ordered alphabetically - you cannot decide which one is the first:

 

                      DBMS
              Member  Member
#  Name       Type    Type

1  Details$   DATA    TABLE
2  Summary$   DATA    TABLE
3  Thematic$  DATA    TABLE

Well, an xlsx file is also a zip file made up of several xml files. The "workbook.xml" in the xl folder has the wanted information:

 

<sheet r:id="rId1" sheetId="1" name="Thematic"/>
<sheet r:id="rId2" sheetId="2" name="Summary"/>
<sheet r:id="rId3" sheetId="3" name="Details"/>

That's great because we know that SAS is able to read an xml file - perhaps using an xml map. And the really big thing is that we can do it automatically:

 

/* read and create xml-file in the WORK folder  */
filename _WRKBOOK ZIP "H:\test it now\test.xlsx" member='xl/workbook.xml';
filename _XMLFIL      "%sysfunc(pathname(WORK))\workbook.xml";

data _null_;
    infile _WRKBOOK recfm=n ;
    file   _XMLFIL   recfm=n;
    input char $char1. @@;
    put char $char1. @@;
run;

/* Assign a xml LIBNAME and do an automap */
filename  _SXLEMAP "%sysfunc(pathname(WORK))\xmlmap.map";
libname   _XMLFIL xmlv2 automap=replace xmlmap=_SXLEMAP access=READONLY;

/* The SHEET table contains the sheet names */
data _null_;
    set _xmlfil.sheet (obs=1);
    call symputx("SHEET",sheet_name);
run;

/* de-assign */
filename _WRKBOOK;
filename _SXLEMAP;
filename _XMLFIL;
libname  _XMLFIL;

and the happy user now has the solution:

  

libname xlsxdata XLSX "H:\test it now\test.xlsx";

data xlsxdata."&SHEET"n; set xlsxdata."&SHEET"n;
<manipulations here>
run;
libname xlsxdata;

The only downside is the rather hefty intro so an idea is to wrap this up in a SAS Macro: SHEETNAMES. The macro generates macro variables SHEET1, SHEET2, etc. with the sheet names and the number of sheets, SHEETS, just in case:

 

%macro sheetnames(xlsxfile=);
    options nonotes nosource;

    %if NOT %sysfunc(fileexist(&xlsxfile)) %then %do;
        %put ERROR: File %upcase(&xlsxfile) does not exist.;
        options source notes;
        %return;
    %end;

    /* read and create xml-file in the WORK folder  */
    filename _WRKBOOK ZIP "&xlsxfile" member='xl/workbook.xml';
    filename _XMLFIL      "%sysfunc(pathname(WORK))\workbook.xml";

    data _null_;
        infile _WRKBOOK recfm=n ;
        file   _XMLFIL   recfm=n;
        input char $char1. @@;
        put char $char1. @@;
    run;

    /* Assign a xml LIBNAME and do a automap */
    filename  _SXLEMAP "%sysfunc(pathname(WORK))\xmlmap.map";
    libname   _XMLFIL xmlv2 automap=replace xmlmap=_SXLEMAP access=READONLY;

    /* The SHEET table contains the sheet names */
    data _null_;
        set _xmlfil.sheet end=finish;
        call symputx(cats("SHEET",_n_),sheet_name,'G');
        if finish then
            call symputx("SHEETS",_n_,'G');
    run;

    /* de-assign */
    filename _WRKBOOK;
    filename _SXLEMAP;
    filename _XMLFIL;
    libname  _XMLFIL;

    options notes source;
%mend;

The users code is now pretty compact:

 

%sheetnames(xlsxfile=H:\test it now\test.xlsx);

libname xlsxdata XLSX "H:\test it now\test.xlsx";

data xlsxdata."&SHEET1"n; set xlsxdata."&SHEET1"n;
<manipulations here>
run;
libname xlsxdata;

What are the points:

 

  1. An xlsx file can be treated as a zip file - easy to read SOME information with FILENAME and ZIP engine
  2. You can do an automap in the XMLV2 LIBNAME - no need for the XML Mapper.

 Have a nice day ;-)

Super User
Posts: 5,824

Re: Names of Excel sheets

Posted in reply to PeterKellberg

Nice share,

I assume this is not a question, so I will nominate it as library contribution.

Data never sleeps
Ask a Question
Discussion stats
  • 1 reply
  • 119 views
  • 3 likes
  • 2 in conversation