- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I am using the XLSX engine (on PC SAS 9.4M4) to read an Excel file. Even if an Excel column has all numeric values, I would like to force it to be read into a character variable. Because I will receive this file weekly, and it's possible that next week's data will have some character values in the column. Is there a good way to do this?
Similar to this old post:
I think what I want is something like the DBSASTYPE option, but it doesn't seem to be supported by the XLSX engine.
Sample code:
libname myfile xlsx "%sysfunc(pathname(work))\myexcel.xlsx" ;
*write sashelp.class to excel ;
data myfile.class ;
set sashelp.class ;
run ;
proc contents data=myfile.class ;
run ;
*Failing pseudo code to read in the Excel file, forcing Age to be character ;
data ClassCharAge ;
set myfile.class(dbsastype=(Age='CHAR(20)')) ;
run ;
data ClassCharAge ;
length Age $20 ;
set myfile.class ;
run ;
I know there are workarounds like adding a row of dummy character data to the Excel file, or exporting the excel file to CSV before importing to SAS, or writing a macro to do the conversion. And yes I hate reading from Excel just as much as everyone else. Just wondering if there is a way to import numeric columns as character.
Next up: Rick Wicklin presents Ten Tips for Effective Statistical Graphics (with SAS code) on Wednesday March 26.
Register now at https://www.basug.org/events.
- Tags:
- Excel importing
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I think this needs to be made as an enhancement request as the current driver does not support this.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you are going to receive this file weekly, why not define max length of variables, just before the SET staement,
as you have done in your last step.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Shmuel wrote:
If you are going to receive this file weekly, why not define max length of variables, just before the SET staement,
as you have done in your last step.
It is too late then. The read from XLSX to SAS has already made the column as numeric/character based on what is this week's file.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Shmuel, because it doesn't work. : )
216 data ClassCharAge ; 217 length Age $20 ; 218 set myfile.class ; ERROR: Variable Age has been defined as both character and numeric. 219 run ;
Next up: Rick Wicklin presents Ten Tips for Effective Statistical Graphics (with SAS code) on Wednesday March 26.
Register now at https://www.basug.org/events.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I think this needs to be made as an enhancement request as the current driver does not support this.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I added a SASware Ballot Idea to have XLSX support DBSASTYPE:
Next up: Rick Wicklin presents Ten Tips for Effective Statistical Graphics (with SAS code) on Wednesday March 26.
Register now at https://www.basug.org/events.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Since Excel does not have any "data type" for any column any connection to Excel "guesses" what the variable type is.
Likely if you examine your column cell formatting that has numeric appearing values that you want as character you will find it is the Excel "general". If the column were "text" it might import as desired. Might.
So far the only reliable method I have discovered to import Excel files with any consistency and control is to save them as CSV and use a data step to read the CSV file. That allows controlling lengths and types of specific variables as well as variable names.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Understood @ballardw. I just wish there was a way to tell the XLSX engine that a column is character, instead of having it guess. Since the XLSX engine is developed by SAS, rather than relying on Microsoft's ACE driver, I think there may be hope that SAS could give us this option. I feel like Henry Feldman had the option in DBMS copy years ago, and he's been working for SAS for quite a while now... : )
Next up: Rick Wicklin presents Ten Tips for Effective Statistical Graphics (with SAS code) on Wednesday March 26.
Register now at https://www.basug.org/events.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If I were you, I'd be strongly tempted to set up a process that uses OpenOffice (or a VB script for MS Office) in batch mode to convert the Excel to CSV, and then read that into SAS. If XCMD is enabled, it all could be done from SAS code.
Using a direct read from something as unreliable as Excel in a production environment gives me the creeps.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I agree with all have been said upto now.
When excel displays a value like 25.5 - it may be 25.4999 or 25.50001 etc.
What would you like to have in a char type variable ?
Anyway, you can still have it by "hard work" done once, like in next code:
data classchar;
set myfile.class(rename=(
age=v3 height=v4 weight=v5));
length age $3 height $6 weight $8;
age = put(v3, 2.);
height = put(v4, best6.1);
weight = put(v5, best8.1);
drop v3-v5;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Shmuel asked:
When excel displays a value like 25.5 - it may be 25.4999 or 25.50001 etc.
What would you like to have in a char type variable ?
Fair question. Probably just use whatever default best. format seems reasonable, just like SAS would do for any implied automatic numeric to character conversion. Not that I'm a fan of automatic conversions, but when dealing with Excel, that would be the least of my problems. : )
I'd even accept an option which said "Forget about the values that are actually stored in Excel, just read the displayed (formatted) values into a character variable."
Next up: Rick Wicklin presents Ten Tips for Effective Statistical Graphics (with SAS code) on Wednesday March 26.
Register now at https://www.basug.org/events.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Quentin wrote:
@Shmuel asked:
When excel displays a value like 25.5 - it may be 25.4999 or 25.50001 etc.
What would you like to have in a char type variable ?
Fair question. Probably just use whatever default best. format seems reasonable, just like SAS would do for any implied automatic numeric to character conversion. Not that I'm a fan of automatic conversions, but when dealing with Excel, that would be the least of my problems. : )
I'd even accept an option which said "Forget about the values that are actually stored in Excel, just read the displayed (formatted) values into a character variable."
If you apply a display type such as numeric with one decimal for a column and then do a file save as CSV the formatted value is what is saved in the CSV.
I determined that when I had problems with Excel files containing manually entered dates (gak, the things Excel can do with such things) and found that formatting a column with a single date type helped immensely prior to reading into SAS.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Can you use the PCFILES engine?
%let WORKPATH=%sysfunc(pathname(work));
* Create test data;
libname xl_out xlsx "&WORKPATH\class.xlsx";
proc copy in=sashelp out=xl_out;
select class;
run; quit;
libname xl_out clear;
* Import the data;
libname xl_in pcfiles path="&WORKPATH\class.xlsx";
data work.class;
set xl_in.class(dbsastype=(Age='char(20)'));
run;
libname xl_in clear;
Vince DelGobbo
SAS R&D
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You could use OS command to run a VBA(google it) to transform it into CSV ,and then import it .