BookmarkSubscribeRSS Feed

When using the XLSX option to read Excel data, I would like to be able to force an Excel column to be read into a character variable, even if the column contains only numeric values.  With other engines, I believe this would be done with the DBSASTYPE option.

 

Sample code:

libname myfile xlsx "%sysfunc(pathname(work))\myexcel.xlsx" ;

*write sashelp.class to excel ;
data myfile.class ;
  set sashelp.class ;
run ;

*Failing pseudo code to read in the Excel file, forcing Age to be character ;
data ClassCharAge ;
  set myfile.class(dbsastype=(Age='CHAR(20)')) ;
run ;

Use case is that suppose I receive an incremental Excel file weekly, and I have a SAS job to load the data into a database.  I have a variable that is defined as character in database, but some incremental Excel files may only have numeric values for that variable.  When I import such an Excel file, it is loaded into a numeric variable in SAS even though the database and my SAS code expect it to be character.  

 

Yes I can add code to by ETL job to check if the field is numeric or character and then convert it from numeric to character myself, and yes I hate Excel and could avoid this by converting the Excel file to CSV or just telling the customer not to send me Excel in the first place.  But since many engines support the DBSASTYPE option, would be nice to see it supported by the XLSX engine.

 

Related thread:

https://communities.sas.com/t5/Base-SAS-Programming/Reading-XLSX-file-and-force-SAS-column-types/m-p...

3 Comments
gsh
Calcite | Level 5
Calcite | Level 5

I have a same issue, could you able to solve it

Peter_L
Quartz | Level 8
Yes please! I just commented on a SAS blog as follows:
 
The XLSX LIBNAME engine has a big disadvantage compared with the EXCEL LIBNAME engine: it does not support SAS data set options. This is a showstopper for many Excel files, where the type of data (character or numeric) can be anarchic.
 
The only way I have found to read Excel files reliably is to define the type of every column with the DBSASTYPE= dataset option *.
 
This prompts the question: When does SAS plan to allow data set options with the XLSX LIBNAME engine?
 
 
* (not in blog comment) The way I do this is to have a table of metadata which describes the columns I expect in the Excel sheets and the types they must have. I use DBSASTYPE= to force these and read any other columns the user created as character because they are used only for labelling. I can do this only with the EXCEL or PCFILES LIBNAME engines, which require troublesome Microsoft components. Try working in a mixed 32- / 64- bit environment with Microsoft Office updates. The Office 365 subscription model makes the problem unmanageable.
 
 
DebG
Obsidian | Level 7

I would  upvote this 1000 times if I could.  🙂  Please support the XLSX DBSASTYPE or something similar.  It would be a game changer for sure!