I am using the XLSX engine (on PC SAS 9.4M4) to read an Excel file. Even if an Excel column has all numeric values, I would like to force it to be read into a character variable. Because I will receive this file weekly, and it's possible that next week's data will have some character values in the column. Is there a good way to do this?
Similar to this old post:
I think what I want is something like the DBSASTYPE option, but it doesn't seem to be supported by the XLSX engine.
Sample code:
libname myfile xlsx "%sysfunc(pathname(work))\myexcel.xlsx" ;
*write sashelp.class to excel ;
data myfile.class ;
set sashelp.class ;
run ;
proc contents data=myfile.class ;
run ;
*Failing pseudo code to read in the Excel file, forcing Age to be character ;
data ClassCharAge ;
set myfile.class(dbsastype=(Age='CHAR(20)')) ;
run ;
data ClassCharAge ;
length Age $20 ;
set myfile.class ;
run ;
I know there are workarounds like adding a row of dummy character data to the Excel file, or exporting the excel file to CSV before importing to SAS, or writing a macro to do the conversion. And yes I hate reading from Excel just as much as everyone else. Just wondering if there is a way to import numeric columns as character.
I think this needs to be made as an enhancement request as the current driver does not support this.
If you are going to receive this file weekly, why not define max length of variables, just before the SET staement,
as you have done in your last step.
@Shmuel wrote:
If you are going to receive this file weekly, why not define max length of variables, just before the SET staement,
as you have done in your last step.
It is too late then. The read from XLSX to SAS has already made the column as numeric/character based on what is this week's file.
@Shmuel, because it doesn't work. : )
216 data ClassCharAge ; 217 length Age $20 ; 218 set myfile.class ; ERROR: Variable Age has been defined as both character and numeric. 219 run ;
I think this needs to be made as an enhancement request as the current driver does not support this.
I added a SASware Ballot Idea to have XLSX support DBSASTYPE:
Since Excel does not have any "data type" for any column any connection to Excel "guesses" what the variable type is.
Likely if you examine your column cell formatting that has numeric appearing values that you want as character you will find it is the Excel "general". If the column were "text" it might import as desired. Might.
So far the only reliable method I have discovered to import Excel files with any consistency and control is to save them as CSV and use a data step to read the CSV file. That allows controlling lengths and types of specific variables as well as variable names.
Understood @ballardw. I just wish there was a way to tell the XLSX engine that a column is character, instead of having it guess. Since the XLSX engine is developed by SAS, rather than relying on Microsoft's ACE driver, I think there may be hope that SAS could give us this option. I feel like Henry Feldman had the option in DBMS copy years ago, and he's been working for SAS for quite a while now... : )
If I were you, I'd be strongly tempted to set up a process that uses OpenOffice (or a VB script for MS Office) in batch mode to convert the Excel to CSV, and then read that into SAS. If XCMD is enabled, it all could be done from SAS code.
Using a direct read from something as unreliable as Excel in a production environment gives me the creeps.
I agree with all have been said upto now.
When excel displays a value like 25.5 - it may be 25.4999 or 25.50001 etc.
What would you like to have in a char type variable ?
Anyway, you can still have it by "hard work" done once, like in next code:
data classchar;
set myfile.class(rename=(
age=v3 height=v4 weight=v5));
length age $3 height $6 weight $8;
age = put(v3, 2.);
height = put(v4, best6.1);
weight = put(v5, best8.1);
drop v3-v5;
run;
@Shmuel asked:
When excel displays a value like 25.5 - it may be 25.4999 or 25.50001 etc.
What would you like to have in a char type variable ?
Fair question. Probably just use whatever default best. format seems reasonable, just like SAS would do for any implied automatic numeric to character conversion. Not that I'm a fan of automatic conversions, but when dealing with Excel, that would be the least of my problems. : )
I'd even accept an option which said "Forget about the values that are actually stored in Excel, just read the displayed (formatted) values into a character variable."
@Quentin wrote:
@Shmuel asked:
When excel displays a value like 25.5 - it may be 25.4999 or 25.50001 etc.
What would you like to have in a char type variable ?
Fair question. Probably just use whatever default best. format seems reasonable, just like SAS would do for any implied automatic numeric to character conversion. Not that I'm a fan of automatic conversions, but when dealing with Excel, that would be the least of my problems. : )
I'd even accept an option which said "Forget about the values that are actually stored in Excel, just read the displayed (formatted) values into a character variable."
If you apply a display type such as numeric with one decimal for a column and then do a file save as CSV the formatted value is what is saved in the CSV.
I determined that when I had problems with Excel files containing manually entered dates (gak, the things Excel can do with such things) and found that formatting a column with a single date type helped immensely prior to reading into SAS.
Can you use the PCFILES engine?
%let WORKPATH=%sysfunc(pathname(work));
* Create test data;
libname xl_out xlsx "&WORKPATH\class.xlsx";
proc copy in=sashelp out=xl_out;
select class;
run; quit;
libname xl_out clear;
* Import the data;
libname xl_in pcfiles path="&WORKPATH\class.xlsx";
data work.class;
set xl_in.class(dbsastype=(Age='char(20)'));
run;
libname xl_in clear;
Vince DelGobbo
SAS R&D
You could use OS command to run a VBA(google it) to transform it into CSV ,and then import it .
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.