I have a bunch of Excel files, where each file has data for one month (and sheet named for that month). I was hoping to use a concatenated library that would point to all the files, and then read them in a single data step:
libname MyXL xlsx ("c:\junk\jan.xlsx" "c:\junk\feb.xlsx") ;
proc datasets library=myxl ;
run ;
data want ;
set
MyXL.jan
MyXL.feb
;
run ;
The LIBNAME statement works.
PROC DATASETS sees both the Jan sheet and the Feb sheet:
Member # Name Type Level 1 FEB DATA 2 2 JAN DATA 1 3 SHEET2 DATA 1 4 SHEET3 DATA 1
But the data step can't see them:
6 data want ; 7 set 8 MyXL.jan 9 MyXL.feb 10 ; ERROR: Couldn't find range or sheet in spreadsheet ERROR: Couldn't find range or sheet in spreadsheet ERROR: File MYXL.feb.DATA does not exist. 11 run ;
I can of course macro loop over the files to generate a separate libref for each excel file, and a step to read in that file. But was wondering if XLSX engine is expected to support concatenated libraries?
I'm on PC SAS 9.4M4.
Maybe there are some blanks or tab character at end of JAN .
Use dictionary table to check it .
proc sql;
select nliteral(memname) into : list separated by ' '
from dictionary.tables
where libname='X';
quit;
%put &list ;
Thanks @Ksharp, but it's not bad characters. Your code returns what I would expect:
49 proc sql; 50 select nliteral(memname) into : list separated by ' ' 51 from dictionary.tables 52 where libname='MYXL'; 53 quit; 54 %put &list ; FEB JAN SHEET2 SHEET3
And if I don't use a concatenated library, it works, i.e. below code works:
libname jan xlsx ("c:\junk\jan.xlsx") ;
libname feb xlsx ("c:\junk\feb.xlsx") ;
data want ;
set
jan.jan
feb.feb
;
run ;
Could you try
set MyXL.'jan$'n MyXL.'feb$'n ;
That errors. The XLSX enginge doesn't like the explicit $ at the end of a sheet name, unless you are specifying a range. When using the concatenated library, the DATA step sees the sheet in the first level (jan), but not in the second level (feb).
1 libname MyXL xlsx ("c:\junk\jan.xlsx" "c:\junk\feb.xlsx") ; NOTE: Libref MYXL was successfully assigned as follows: Levels: 2 Engine(1): XLSX Physical Name(1): c:\junk\jan.xlsx Engine(2): XLSX Physical Name(2): c:\junk\feb.xlsx 2 3 data want ; 4 set 5 MyXL.'jan$'n 6 MyXL.'feb$'n 7 ; ERROR: Couldn't find range or sheet in spreadsheet ERROR: Couldn't find range or sheet in spreadsheet ERROR: File MYXL.'jan$'n.DATA does not exist. ERROR: Couldn't find range or sheet in spreadsheet ERROR: Couldn't find range or sheet in spreadsheet ERROR: File MYXL.'feb$'n.DATA does not exist. 8 run ; NOTE: The SAS System stopped processing this step because of errors. WARNING: The data set WORK.WANT may be incomplete. When this step was stopped there were 0 observations and 0 variables. WARNING: Data set WORK.WANT was not replaced because this step was stopped. 9 10 data want ; 11 set 12 MyXL.'jan$A1:A3'n 13 MyXL.'feb$A1:A3'n 14 ; Couldn't find sheet in spreadsheet Requested Input File Is Invalid Couldn't find sheet in spreadsheet Requested Input File Is Invalid ERROR: File MYXL.'feb$A1:A3'n.DATA does not exist. 15 run ; NOTE: The SAS System stopped processing this step because of errors. WARNING: The data set WORK.WANT may be incomplete. When this step was stopped there were 0 observations and 1 variables. WARNING: Data set WORK.WANT was not replaced because this step was stopped. 16 17 data want ; 18 set 19 MyXL.'jan$A1:A3'n 20 ; 21 run ; NOTE: The import data set has 2 observations and 1 variables. NOTE: There were 2 observations read from the data set MYXL.'jan$A1:A3'n. NOTE: The data set WORK.WANT has 2 observations and 1 variables. 22 23 data want ; 24 set 25 MyXL.'feb$A1:A3'n 26 ; Couldn't find sheet in spreadsheet Requested Input File Is Invalid Couldn't find sheet in spreadsheet Requested Input File Is Invalid ERROR: File MYXL.'feb$A1:A3'n.DATA does not exist. 27 run ; NOTE: The SAS System stopped processing this step because of errors. WARNING: The data set WORK.WANT may be incomplete. When this step was stopped there were 0 observations and 0 variables. WARNING: Data set WORK.WANT was not replaced because this step was stopped. 28 29 libname noconcat xlsx ("c:\junk\feb.xlsx") ; NOTE: Libref NOCONCAT was successfully assigned as follows: Levels: 1 Engine(1): XLSX Physical Name(1): c:\junk\feb.xlsx 30 31 data want ; 32 set 33 noconcat.'feb$A1:A3'n 34 ; 35 run ; NOTE: The import data set has 2 observations and 1 variables. NOTE: There were 2 observations read from the data set NOCONCAT.'feb$A1:A3'n. NOTE: The data set WORK.WANT has 2 observations and 1 variables.
Maybe your concatenated library only recognize the first Excel file and ignore the following Excel file ?
Yes, that's the problem. It seems to me like the XLSX engine does not support concatenated libraries correctly. Note that PROC DATASETS sees both Excel files.
@Quentin wrote:
Yes, that's the problem. It seems to me like the XLSX engine does not support concatenated libraries correctly. Note that PROC DATASETS sees both Excel files.
I suspect that may have been a decision by the SAS developers so that they didn't have to deal with someone trying to use 15 XLSX all with Sheet1 Sheet2 and Sheet3 tabs. Since each of those should be a different data set you would end up with something that would have to provide renaming and then everyone would be complaining about the "data sets don't match the sheet names".
And I've had XLSX with the same named 40+ tabs in each of 40 monthly workbooks.
That seems possible, @ballardw, but if the developers didn't intend for concatenated libraries to work with the XLSX engine, I would expect the LIBNAME statement to throw an error. As is, it succeeds, and PROC DATASETS shows that it worked (i.e. the differently named sheets from two different excel files are both seen).
I didn't test, but I think as it is now if someone created a concatenated library with 15 excel files, each with tabs for Sheet1 Sheet2 Sheet3 the concatenated library would work as documented. Only the sheets from the first excel file would be seen, because the rest are duplicates (and the rule is the "dataset" is read from the first library in which it was seen). But the data step wouldn't error.
Same named tabs? Another reason to hate excel. : )
@Quentin, I don't have the official word, but I'd guess this is not supported. I will ask around though.
And as you/others have pointed out, even if it worked it would be very fragile, as one duplicate sheet name across workbooks would obscure any others that occurred in a later "level."
With a little bit of macro or DATA step/CALL EXECUTE, you can probably create something more robust.
Thanks @ChrisHemedinger. I look forward to hearing what you learn. I would think it would be better for the libname statement to work or error, but not half work. : )
Right now the libname statement "works" (does not error), and the proc datasets can see all the sheets (i.e. it can see all the sheets that exist in the first level, and the distinctly named sheets that exist in the second level). The dictionary tables have the right data too:
libname MyXL xlsx ("c:\junk\jan.xlsx" "c:\junk\feb.xlsx") ;
proc sql ;
select libname, path, level from dictionary.libnames
where libname='MYXL'
;
quit ;
/*
Library
Library Concatenation
Name Pathname Level
MYXL c:\junk\jan.xlsx 1
MYXL c:\junk\feb.xlsx 2
*/
proc sql ;
select libname, memname from dictionary.tables
where libname='MYXL'
;
quit ;
/***
Library
Name Member Name
MYXL FEB
MYXL JAN
MYXL SHEET2
MYXL SHEET3
***/
It's just that the rest of SAS can't seem to use the tables in the second level.
I recognize the possibility of sheet naming collisions causing confusion and calls to tech support. But note that that's the case as it is right now. That is, if I had 10 excel files all with sheet1 sheet2 sheet3, and use a concatenated library, SAS would happily read only from the first sheets in the first level. So if this were fixed to let distinctly named sheets in lower levels be read, I don't think it would introduce much more risk (unless the concatenated library approach becomes more publicized).
Yes, macro should work fine (especially since I can pull sheet names from dictionary.tables, which is great). I was just so excited to have a use-case for concatenated libraries. : )
Here's my guidance for using LIBNAME XLSX. I like it for two situations:
LIBNAME XLSX isn't a good fit for sheets that don't have field names, or that have data outside of the contiguous range beginning at A1. For those situations, you really need to use PROC IMPORT with DBMS=XLSX.
LIBNAME EXCEL | EXCELCS can do more, because they use Microsoft APIs to dive into more of the spreadsheet attributes. But those work only on Windows or via PC Files Server (using LIBNAME PCFILES).
Yeah, I've been loving XLSX engine cuz linux, and our PC FILES server never seemed to be given enough attention by our admins. But there are definitely some limitations (like lack of dbsastype). I'm hoping to see XLSX engine continue to grow over time.
You can trick linbame XLSX to read ranges with stuff like lib."Sheet1$A5:Q"n . Not sure if that's supported. I think I saw a post here that it would throw an error if the specification for sheet name and range ("Sheet1$A5:Q") was longer that 32 characters, but that a future fix might allow it to work as long as the sheet name itself was less than 32 characters.
Will try to keep in mind that PROC IMPORT with DBMS=XLSX could offer more flexibility than the libname approach. Thx.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.