DATA Step, Macro, Functions and more

Reading XLSX file and force SAS column types

Accepted Solution Solved
Reply
PROC Star
Posts: 1,434
Accepted Solution

Reading XLSX file and force SAS column types

I am using the XLSX engine (on PC SAS 9.4M4) to read an Excel file.  Even if an Excel column has all numeric values, I would like to force it to be read into a character variable.  Because I will receive this file weekly, and it's possible that next week's data will have some character values in the column.  Is there a good way to do this?

 

Similar to this old post:

https://communities.sas.com/t5/SAS-Procedures/PROC-IMPORT-EXPORT-forcing-column-to-be-character-or-n...

 

I think what I want is something like the DBSASTYPE option, but it doesn't seem to be supported by the XLSX engine. 

 

Sample code:

libname myfile xlsx "%sysfunc(pathname(work))\myexcel.xlsx" ;

*write sashelp.class to excel ;
data myfile.class ;
  set sashelp.class ;
run ;

proc contents data=myfile.class ;
run ;

*Failing pseudo code to read in the Excel file, forcing Age to be character ;
data ClassCharAge ;
  set myfile.class(dbsastype=(Age='CHAR(20)')) ;
run ;

data ClassCharAge ;
  length Age $20 ;
  set myfile.class ;
run ;

 

I know there are workarounds like adding a row of dummy character data to the Excel file, or exporting the excel file to CSV before importing to SAS, or writing a macro to do the conversion.  And yes I hate reading from Excel just as much as everyone else.  Just wondering if there is a way to import numeric columns as character.


Accepted Solutions
Solution
‎01-31-2018 09:15 AM
Super User
Super User
Posts: 7,860

Re: Reading XLSX file and force SAS column types

I think this needs to be made as an enhancement request as the current driver does not support this.

View solution in original post


All Replies
Trusted Advisor
Posts: 1,827

Re: Reading XLSX file and force SAS column types

If you are going to receive this file weekly, why not define max length of variables, just before the SET staement,

as you have done in your last step.

 

 

Super User
Super User
Posts: 7,860

Re: Reading XLSX file and force SAS column types


Shmuel wrote:

If you are going to receive this file weekly, why not define max length of variables, just before the SET staement,

as you have done in your last step.

 

 


It is too late then.  The read from XLSX to SAS has already made the column as numeric/character based on what is this week's file.

PROC Star
Posts: 1,434

Re: Reading XLSX file and force SAS column types

@Shmuel, because it doesn't work. : )

216  data ClassCharAge ;
217    length Age $20 ;
218    set myfile.class ;
ERROR: Variable Age has been defined as both character and numeric.
219  run ;
Solution
‎01-31-2018 09:15 AM
Super User
Super User
Posts: 7,860

Re: Reading XLSX file and force SAS column types

I think this needs to be made as an enhancement request as the current driver does not support this.

PROC Star
Posts: 1,434

Re: Reading XLSX file and force SAS column types

Super User
Posts: 13,084

Re: Reading XLSX file and force SAS column types

Since Excel does not have any "data type" for any column any connection to Excel "guesses" what the variable type is.

 

Likely if you examine your column cell formatting that has numeric appearing values that you want as character you will find it is the Excel "general". If the column were "text" it might import as desired. Might.

 

So far the only reliable method I have discovered to import Excel files with any consistency and control is to save them as CSV and use a data step to read the CSV file. That allows controlling lengths and types of specific variables as well as variable names.

PROC Star
Posts: 1,434

Re: Reading XLSX file and force SAS column types

Understood @ballardw.  I just wish there was a way to tell the XLSX engine that a column is character, instead of having it guess.  Since the XLSX engine is developed by SAS, rather than relying on Microsoft's ACE driver, I think there may be hope that SAS could give us this option.  I feel like Henry Feldman had the option in DBMS copy years ago, and he's been working for SAS for quite a while now... : )

Super User
Posts: 9,611

Re: Reading XLSX file and force SAS column types

If I were you, I'd be strongly tempted to set up a process that uses OpenOffice (or a VB script for MS Office) in batch mode to convert the Excel to CSV, and then read that into SAS. If XCMD is enabled, it all could be done from SAS code.

Using a direct read from something as unreliable as Excel in a production environment gives me the creeps.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Trusted Advisor
Posts: 1,827

Re: Reading XLSX file and force SAS column types

Posted in reply to KurtBremser

I agree with all have been said upto now.

 

When excel displays a value like 25.5 - it may be 25.4999 or 25.50001 etc.

What would you like to have in a char type variable ?

 

Anyway, you can still have it by "hard work" done once, like in next code:

data classchar;
 set myfile.class(rename=(
       age=v3 height=v4 weight=v5));
     length age $3 height $6 weight $8;
     age = put(v3, 2.);
     height = put(v4, best6.1);
     weight = put(v5, best8.1);
     drop v3-v5;
run;    

 

 

Trusted Advisor
Posts: 1,827

Re: Reading XLSX file and force SAS column types

and it can be done programatically instead manually.
PROC Star
Posts: 1,434

Re: Reading XLSX file and force SAS column types

@Shmuel asked:

When excel displays a value like 25.5 - it may be 25.4999 or 25.50001 etc.

What would you like to have in a char type variable ?

Fair question.  Probably just use whatever default best. format seems reasonable, just like SAS would do for any implied automatic numeric to character conversion.  Not that I'm a fan of automatic conversions, but when dealing with Excel, that would be the least of my problems. : )

 

I'd even accept an option which said "Forget about the values that are actually stored in Excel, just read the displayed (formatted) values into a character variable."

Super User
Posts: 13,084

Re: Reading XLSX file and force SAS column types


Quentin wrote:

@Shmuel asked:

When excel displays a value like 25.5 - it may be 25.4999 or 25.50001 etc.

What would you like to have in a char type variable ?

Fair question.  Probably just use whatever default best. format seems reasonable, just like SAS would do for any implied automatic numeric to character conversion.  Not that I'm a fan of automatic conversions, but when dealing with Excel, that would be the least of my problems. : )

 

I'd even accept an option which said "Forget about the values that are actually stored in Excel, just read the displayed (formatted) values into a character variable."


If you apply a display type such as numeric with one decimal for a column and then do a file save as CSV the formatted value is what is saved in the CSV.

 

I determined that when I had problems with Excel files containing manually entered dates (gak, the things Excel can do with such things) and found that formatting a column with a single date type helped immensely prior to reading into SAS.

SAS Super FREQ
Posts: 345

Re: Reading XLSX file and force SAS column types

Can you use the PCFILES engine?

 

%let WORKPATH=%sysfunc(pathname(work));

*  Create test data;

libname xl_out xlsx "&WORKPATH\class.xlsx";

proc copy in=sashelp out=xl_out;
  select class;
run; quit;

libname xl_out clear;

*  Import the data;

libname xl_in pcfiles path="&WORKPATH\class.xlsx";

data work.class;
set xl_in.class(dbsastype=(Age='char(20)'));
run;

libname xl_in clear;

 

Vince DelGobbo

SAS R&D

 

Super User
Posts: 10,623

Re: Reading XLSX file and force SAS column types

You could use OS command to run a VBA(google it) to transform it into CSV ,and then import it .

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 14 replies
  • 262 views
  • 4 likes
  • 7 in conversation