BookmarkSubscribeRSS Feed
Quentin
Super User

 

I have a bunch of Excel files, where each file has data for one month (and sheet named for that month).  I was hoping to use a concatenated library that would point to all the files, and then read them in a single data step:

libname MyXL xlsx ("c:\junk\jan.xlsx" "c:\junk\feb.xlsx") ;

proc datasets library=myxl  ;
run ;

data want ;
  set
    MyXL.jan
    MyXL.feb
  ;
run ;

The LIBNAME statement works. 

 

PROC DATASETS sees both the Jan sheet and the Feb sheet:

 

           Member
#  Name    Type    Level

1  FEB     DATA      2
2  JAN     DATA      1
3  SHEET2  DATA      1
4  SHEET3  DATA      1

 

But the data step can't see them:

 

6    data want ;
7      set
8        MyXL.jan
9        MyXL.feb
10     ;
ERROR: Couldn't find range or sheet in spreadsheet
ERROR: Couldn't find range or sheet in spreadsheet
ERROR: File MYXL.feb.DATA does not exist.
11   run ;

 

I can of course macro loop over the files to generate a separate libref for each excel file, and a step to read in that file.  But was wondering if XLSX engine is expected to support concatenated libraries?  

 

I'm on PC SAS 9.4M4.

The Boston Area SAS Users Group is hosting free webinars!
Next webinar will be in March 2025. Until then, check out our archives: https://www.basug.org/videos. And be sure to subscribe to our our email list.
12 REPLIES 12
Ksharp
Super User

Maybe there are some blanks or tab character at end of JAN .

Use dictionary table to check it .

 


proc sql;
select nliteral(memname) into : list separated by ' '
 from dictionary.tables
  where libname='X';
quit;
%put &list ;
Quentin
Super User

Thanks @Ksharp, but it's not bad characters.  Your code returns what I would expect:

49   proc sql;
50   select nliteral(memname) into : list separated by ' '
51    from dictionary.tables
52     where libname='MYXL';
53   quit;

54   %put &list ;
FEB JAN SHEET2 SHEET3

And if I don't use a concatenated library, it works, i.e. below code works:

libname jan xlsx ("c:\junk\jan.xlsx") ;
libname feb xlsx ("c:\junk\feb.xlsx") ;

data want ;
  set
    jan.jan
    feb.feb
  ;
run ;
The Boston Area SAS Users Group is hosting free webinars!
Next webinar will be in March 2025. Until then, check out our archives: https://www.basug.org/videos. And be sure to subscribe to our our email list.
Ksharp
Super User

Could you try 

 

 set
        MyXL.'jan$'n
       MyXL.'feb$'n
     ;
Quentin
Super User

That errors. The XLSX enginge doesn't like the explicit $ at the end of a sheet name, unless you are specifying a range.  When using the concatenated library, the DATA step sees the sheet in the first level (jan), but not in the second level (feb).

 

 

1    libname MyXL xlsx ("c:\junk\jan.xlsx" "c:\junk\feb.xlsx") ;
NOTE: Libref MYXL was successfully assigned as follows:
      Levels:           2
      Engine(1):        XLSX
      Physical Name(1): c:\junk\jan.xlsx
      Engine(2):        XLSX
      Physical Name(2): c:\junk\feb.xlsx
2
3    data want ;
4      set
5        MyXL.'jan$'n
6        MyXL.'feb$'n
7      ;
ERROR: Couldn't find range or sheet in spreadsheet
ERROR: Couldn't find range or sheet in spreadsheet
ERROR: File MYXL.'jan$'n.DATA does not exist.
ERROR: Couldn't find range or sheet in spreadsheet
ERROR: Couldn't find range or sheet in spreadsheet
ERROR: File MYXL.'feb$'n.DATA does not exist.
8    run ;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.WANT may be incomplete.  When this step was stopped there were 0
         observations and 0 variables.
WARNING: Data set WORK.WANT was not replaced because this step was stopped.

9
10   data want ;
11     set
12       MyXL.'jan$A1:A3'n
13       MyXL.'feb$A1:A3'n
14     ;
Couldn't find sheet in spreadsheet
Requested Input File Is Invalid
Couldn't find sheet in spreadsheet
Requested Input File Is Invalid
ERROR: File MYXL.'feb$A1:A3'n.DATA does not exist.
15   run ;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.WANT may be incomplete.  When this step was stopped there were 0
         observations and 1 variables.
WARNING: Data set WORK.WANT was not replaced because this step was stopped.

16
17   data want ;
18     set
19       MyXL.'jan$A1:A3'n
20     ;
21   run ;

NOTE: The import data set has 2 observations and 1 variables.
NOTE: There were 2 observations read from the data set MYXL.'jan$A1:A3'n.
NOTE: The data set WORK.WANT has 2 observations and 1 variables.

22
23   data want ;
24     set
25       MyXL.'feb$A1:A3'n
26     ;
Couldn't find sheet in spreadsheet
Requested Input File Is Invalid
Couldn't find sheet in spreadsheet
Requested Input File Is Invalid
ERROR: File MYXL.'feb$A1:A3'n.DATA does not exist.
27   run ;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.WANT may be incomplete.  When this step was stopped there were 0
         observations and 0 variables.
WARNING: Data set WORK.WANT was not replaced because this step was stopped.

28
29   libname noconcat xlsx ("c:\junk\feb.xlsx") ;
NOTE: Libref NOCONCAT was successfully assigned as follows:
      Levels:           1
      Engine(1):        XLSX
      Physical Name(1): c:\junk\feb.xlsx
30
31   data want ;
32     set
33       noconcat.'feb$A1:A3'n
34     ;
35   run ;

NOTE: The import data set has 2 observations and 1 variables.
NOTE: There were 2 observations read from the data set NOCONCAT.'feb$A1:A3'n.
NOTE: The data set WORK.WANT has 2 observations and 1 variables.
The Boston Area SAS Users Group is hosting free webinars!
Next webinar will be in March 2025. Until then, check out our archives: https://www.basug.org/videos. And be sure to subscribe to our our email list.
Ksharp
Super User

Maybe your concatenated library only recognize the first Excel file and ignore the following Excel file ?

Quentin
Super User

Yes, that's the problem.  It seems to me like the XLSX engine does not support concatenated libraries correctly.  Note that PROC DATASETS sees both Excel files.

The Boston Area SAS Users Group is hosting free webinars!
Next webinar will be in March 2025. Until then, check out our archives: https://www.basug.org/videos. And be sure to subscribe to our our email list.
ballardw
Super User

@Quentin wrote:

Yes, that's the problem.  It seems to me like the XLSX engine does not support concatenated libraries correctly.  Note that PROC DATASETS sees both Excel files.


I suspect that may have been a decision by the SAS developers so that they didn't have to deal with someone trying to use 15 XLSX all with Sheet1 Sheet2 and Sheet3 tabs. Since each of those should be a different data set you would end up with something that would have to provide renaming and then everyone would be complaining about the "data sets don't match the sheet names".

 

And I've had XLSX with the same named 40+ tabs in each of 40 monthly workbooks.

Quentin
Super User

That seems possible, @ballardw, but if the developers didn't intend for concatenated libraries to work with the XLSX engine, I would expect the LIBNAME statement to throw an error.  As is, it succeeds, and PROC DATASETS shows that it worked (i.e. the differently named sheets from two different excel files are both seen).

 

I didn't test, but I think as it is now if someone created a concatenated library with 15 excel files, each with tabs for Sheet1 Sheet2 Sheet3 the concatenated library would work as documented.  Only the sheets from the first excel file would be seen, because the rest are duplicates (and the rule is the "dataset" is read from the first library in which it was seen).  But the data step wouldn't error.  

 

Same named tabs?  Another reason to hate excel. : )

The Boston Area SAS Users Group is hosting free webinars!
Next webinar will be in March 2025. Until then, check out our archives: https://www.basug.org/videos. And be sure to subscribe to our our email list.
ChrisHemedinger
Community Manager

@Quentin, I don't have the official word, but I'd guess this is not supported.  I will ask around though.

 

And as you/others have pointed out, even if it worked it would be very fragile, as one duplicate sheet name across workbooks would obscure any others that occurred in a later "level."

 

With a little bit of macro or DATA step/CALL EXECUTE, you can probably create something more robust.

 

Register for SAS Innovate 2025!! The premier event for SAS users, May 6-9 in Orlando FL. Sign up now for the best deals!
Quentin
Super User

Thanks @ChrisHemedinger.  I look forward to hearing what you learn.  I would think it would be better for the libname statement to work or error, but not half work. : )

 

Right now the libname statement "works" (does not error), and the proc datasets can see all the sheets (i.e. it can see all the sheets that exist in the first level, and the distinctly named sheets that exist in the second level).   The dictionary tables have the right data too:

 

 

libname MyXL xlsx ("c:\junk\jan.xlsx" "c:\junk\feb.xlsx") ;

proc sql ; 
  select libname, path, level from dictionary.libnames
  where libname='MYXL'
 ;
quit ;

/*
Library
Library Concatenation
Name Pathname Level
MYXL c:\junk\jan.xlsx 1
MYXL c:\junk\feb.xlsx 2

*/
proc sql ; select libname, memname from dictionary.tables where libname='MYXL' ; quit ;

/***
Library
Name Member Name
MYXL FEB
MYXL JAN
MYXL SHEET2
MYXL SHEET3

***/

 

It's just that the rest of SAS can't seem to use the tables in the second level.  

 

I recognize the possibility of sheet naming collisions causing confusion and calls to tech support.  But note that that's the case as it is right now.  That is, if I had 10 excel files all with sheet1 sheet2 sheet3, and use a concatenated library, SAS would happily read only from the first sheets in the first level.  So if this were fixed to let distinctly named sheets in lower levels be read, I don't think it would introduce much more risk (unless the concatenated library approach becomes more publicized).

 

Yes, macro should work fine (especially since I can pull sheet names from dictionary.tables, which is great).  I was just so excited to have a use-case for concatenated libraries. : )

 

The Boston Area SAS Users Group is hosting free webinars!
Next webinar will be in March 2025. Until then, check out our archives: https://www.basug.org/videos. And be sure to subscribe to our our email list.
ChrisHemedinger
Community Manager

Here's my guidance for using LIBNAME XLSX.  I like it for two situations:

  • Convenience -- extract data from a well-formed spreadsheet (with field names and all) using standard libname.member syntax. 
  • Discovery - quick way to find out what's in a spreadsheet (sheet names, field names) so that you can build more robust code to import it.

LIBNAME XLSX isn't a good fit for sheets that don't have field names, or that have data outside of the contiguous range beginning at A1.  For those situations, you really need to use PROC IMPORT with DBMS=XLSX.

 

LIBNAME EXCEL | EXCELCS can do more, because they use Microsoft APIs to dive into more of the spreadsheet attributes.  But those work only on Windows or via PC Files Server (using LIBNAME PCFILES).

Register for SAS Innovate 2025!! The premier event for SAS users, May 6-9 in Orlando FL. Sign up now for the best deals!
Quentin
Super User

Yeah, I've been loving XLSX engine cuz linux, and our PC FILES server never seemed to be given enough attention by our admins.  But there are definitely some limitations (like lack of dbsastype).  I'm hoping to see XLSX engine continue to grow over time.

 

You can trick linbame XLSX to read ranges with stuff like  lib."Sheet1$A5:Q"n .  Not sure if that's supported.  I think I saw a post here that it would throw an error if the specification for sheet name and range ("Sheet1$A5:Q") was longer that 32 characters, but that a future fix might allow it to work as long as the sheet name itself was less than 32 characters.

 

Will try to keep in mind that PROC IMPORT with DBMS=XLSX could offer more flexibility than the libname approach.  Thx.

The Boston Area SAS Users Group is hosting free webinars!
Next webinar will be in March 2025. Until then, check out our archives: https://www.basug.org/videos. And be sure to subscribe to our our email list.

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 12 replies
  • 1437 views
  • 1 like
  • 4 in conversation