I'm trying to get some excel data into SAS. The workbooks contain an unpredictable number of sheets, and I am using the xlsx engine to try to access them. I'm on 9.4 ts level 1m4 on x64eso8r2 platform.
My current data has 10 sheets in the workbook, 8 of them I can accesss without a problem, but 2 are an issue.
Here is the code I am running:
libname multi clear;
libname multi XLSX 'H:\GCTT\site_tracker\multitab\Japan MDC_Study Tracker_20190820-2.xlsx';
%macro dummy;
proc sql;
select count(distinct memname) into: ntables from sashelp.vtable where libname = 'MULTI';
%let ntables = %sysfunc(compress(&ntables.));
select distinct memname from sashelp.vtable where libname = 'MULTI';
select distinct memname into :table1-:table&ntables. from sashelp.vtable where libname = 'MULTI';
run; quit;
%do i =1 %to &ntables.;
data work.t&i;
set MULTI."&&table&i"n;
run;
%end;
%mend dummy;
%dummy;
the proc sql returns 10 members in the libname multi as expected:
8 data sets are created without issue, but for the sheets Defintions_Recruitment Status and E708-C086-108 I get a the following error in the log:
MLOGIC(DUMMY): %DO loop index variable I is now 7; loop will iterate again.
SYMBOLGEN: Macro variable I resolves to 7
MPRINT(DUMMY): data work.t7;
SYMBOLGEN: && resolves to &.
SYMBOLGEN: Macro variable I resolves to 7
SYMBOLGEN: Macro variable TABLE7 resolves to E7080-C086-108
MPRINT(DUMMY): set MULTI."E7080-C086-108"n;
ERROR: Couldn't find range or sheet in spreadsheet
ERROR: File MULTI.'E7080-C086-108'n.DATA does not exist.
MPRINT(DUMMY): run;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.T7 may be incomplete. When this step was stopped there were 0
observations and 0 variables.
WARNING: Data set WORK.T7 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.05 seconds
cpu time 0.04 seconds
MLOGIC(DUMMY): %DO loop index variable I is now 8; loop will iterate again.
SYMBOLGEN: Macro variable I resolves to 8
MPRINT(DUMMY): data work.t8;
SYMBOLGEN: && resolves to &.
SYMBOLGEN: Macro variable I resolves to 8
SYMBOLGEN: Macro variable TABLE8 resolves to E7080-C086-308
MPRINT(DUMMY): set MULTI."E7080-C086-308"n;
MPRINT(DUMMY): run;
NOTE: The import data set has 24 observations and 12 variables.
NOTE: There were 24 observations read from the data set MULTI.'E7080-C086-308'n.
NOTE: The data set WORK.T8 has 24 observations and 12 variables.
NOTE: DATA statement used (Total process time):
I am stumped. Any ideas on how to either correct this or a viable work around? I've tried a proc import with the same results, different access methods, etc.
Thanks,
Elaine
You might have blanks or other invisible characters in the sheet names in the workbook.
Some things to try.
Dump the macro for now and try to get one of the problem sheets read using hardcoded names. Does it help to use the old "'sheetname'$" syntax (or whatever it was) that worked with PROC IMPORT in the past for goofy sheetnames?
Does it help to set VALIDMEMNAME option to EXTEND?
Does PROC COPY find them?
libname multi XLSX 'H:\GCTT\site_tracker\multitab\Japan MDC_Study Tracker_20190820-2.xlsx';
proc copy inlib=multi out=work;
run;
Can you get PROC IMPORT to read them?
validmemname was set to extend, so that was not the problem.
Proc copy gave the same error on the two sheets, that they could not be found, plus had issue on writing out to datasets with dashes in the dataset names. Proc contents listed all 10 datasets as member, but gave errors in the log for the two datasets that are problematic:
ERROR: Couldn't find range or sheet in spreadsheet
ERROR: File MULTI.'DEFINITIONS_RECRUITMENT STATUS'n.DATA does not exist.
ERROR: Couldn't find range or sheet in spreadsheet
ERROR: File MULTI.'E7080-C086-108'n.DATA does not exist.
Porc import failed as well:
proc import datafile = "H:\GCTT\site_tracker\multitab\Japan MDC_Study
1377! Tracker_20190820-2.xlsx"
1378 dbms = xlsx
1379 out= work.try replace ;
1380 sheet = "DEFINITIONS_RECRUITMENT STATUS"n ;
1381 run;
ERROR: Couldn't find sheet in spreadsheet
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 0.05 seconds
cpu time 0.00 seconds
I hate when excel is used as a data source, unfortunately I am not in a position to insist on a different format.
Try using the $ trick:
sheet = "'DEFINITIONS_RECRUITMENT STATUS$'"n ;
SAS/ACCESS treats an Excel workbook as a database and a worksheet or a range (subset of cells in a worksheet) as a table. A range name must be defined in the Excel file before SAS can use it. A worksheet is treated as a special range. A worksheet name appended with a $ character is treated as a range.
For example, Sheet1 is a sheet name in an Excel workbook. SAS treats Sheet1$ as a valid range name and uses it to refer to the whole worksheet. You need to use a SAS name literal when referring to the sheet name; for example, specify
SHEET="'Sheet1$'n"
, where the sheet name is enclosed in double quotation marks. The first row of data in a range is usually treated as a column heading and used to create a SAS variables name.
Other things to check is does the sheet name have a space? (is that possible in XLSX file?)
Also I see JAPAN in the path. Is it possible the sheet names are using non-7bit ASCII characters in their name? So what looks like a hyphen might actually be some type of en-dash or em-dash.
Everything's possible in Excel. Including leading and trailing blanks in sheet names, I guess.
If you cannot get anything else to work you might look to write your own way to modify the sheet names. An XLSX file is just a ZIP file that contains a lot of XML files. One of those XML files has the like of sheet names. You could try copying the XLSX file and in the copy update that xl/workbook.xml file so that the sheets have usable names.
Here is example of the part of that file that has the sheet names.
<sheets> <sheet name="Sheet1" sheetId="1" r:id="rId1"/> <sheet name="Sheet2" sheetId="2" r:id="rId2"/> <sheet name="Sheet3" sheetId="3" r:id="rId3"/> </sheets>
Problem is for sure in the tab names. If I manually rename them in excel it is smooth sailing. I'll give the xml method a go.
Thanks,
Elaine
Try this. what are you seeing .
select distinct nliteral( memname ) from sashelp.vtable where libname = 'MULTI';
I meet similar problem while using xlsx library. When there are tail blank in the sheetname, some error would occur.
For example, there is a sheet named 'test 1 ' in file 'test.xlsx', and we use xlsx library to read this file:
libname test xlsx '....../test.xlsx';
then SAS will treat the dataset as test.'test 1'n rather than test.'test 1 'n.
ERROR: Couldn't find range or sheet in spreadsheet
ERROR: File MULTI.'E7080-C086-108'n.DATA does not exist.
Assume that the sheetname of table7 is 'E7080-C086-108 ' rather than 'E7080-C086-108'.
Macro variable TABLE7 resolves to E7080-C086-108, without the tail blank.
So File MULTI.'E7080-C086-108'n.DATA does not exist, however the correct file name is MULTI.'E7080-C086-108 'n.DATA
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.