BookmarkSubscribeRSS Feed
Yaela
Calcite | Level 5

Hi,
I need to read in SAS a Excel sheet where one column has mixed-type data numeric and text, with the text data appearing after many rows:
NUM  EXTERNAL VISIT
1                1
2                
3                3
4                
5                5
…… more rows, in a similar fashion
130            FU1
131            FU2
132            EOT

 

As probably everyone knows by now, PROC IMPORT fails here – we get in SAS a column EXTERNAL_VISIT of numeric data and the text values are lost. I know that:

  1. This is due to SAS scanning a fixed number of records to assess the type, and that this number is rather small.
  2. The number can be made larger by changing a parameter in the Windows registry, but this is not allowed.
  3. Saving the Excel file as CSV and importing that into SAS, enables using the option GUESSINGROWS to force scanning more records but my process does not allow this.


I tried using LIBNAME engine, in 2 ways. Both ran error-free, with the same output – column of numeric data and text values lost. I read several sources, including the SUGI 32 source https://support.sas.com/resources/papers/proceedings/proceedings/sugi31/024-31.pdf  “De-Mystifying the SAS® LIBNAME Engine in Microsoft Excel: A Practical Guide”(section “ACCESSING DATA FROM A SPREADSHEET” ) “Note the libname option “MIXED=YES” is used to read mixed character and numeric data as character data

 

  1. LIBNAME WrkBk EXCEL 'My Workbook.xls' MIXED=YES;
    data test;
         set WrkBk.'Sheet_name$'n;
    run;
  2. LIBNAME WrkBk EXCEL 'My Workbook.xls';
    data test;
          set WrkBk.'Sheet_name$'n (dbsastype=('EXTERNAL VISIT'=char10) ) ;
    run;

 

Does anyone know if LIBNAME engine also scans just a limited number of records? If so, can this number be set in the code? As I said, I can’t update the Windows registry.
Does anyone have any other ideas or thoughts?


I would appreciate any help,
Yael

 

6 REPLIES 6
ballardw
Super User

Try adding the SCANTEXT option to your proc import if you must use proc import.

That should examine more rows before determining whether a variable should be numeric or text. But if some files do not have any of the text values you will still have issues.

 

Reliance on Proc Import for multiple files often leads to yet other issues such as lengths and names of variables being inconsistent as well as the type. For date, time or datetime type data you can also run into inconsistent formats as well.

Yaela
Calcite | Level 5
Thank yo for the suggestion, but it doesn't work. I still get a numeric variable in SAS and lose the text values.
This is similar to the way in several tutorials they recommend using MIXED=YES but it doesn't help.

I'm not sure why you mentioned multiple files - I'm not trying to read multiple files. Just 1 files, with a column that has mixed-type data.
SASKiwi
PROC Star

AFAIK, the Windows Registry is the only place you can set GUESSINGROWS for Excel spreadsheets. A quick and dirty fix would be to add a dummy row of data at the top of the spreadsheet after the column headings to enforce the treating of the column as character or try changing the column type from General to Text.  

Yaela
Calcite | Level 5
Thank you for the suggestions.

Sadly, I can't touch the source data to add lines etc.

I did try the suggestion to change the format in Excel and that did work! So great! Nice idea.

However, that's not really a complete solution to the problem. It helped me at this point, but in general, one can never be sure what the format will be and whether someone changed it.
andreas_lds
Jade | Level 19

@Yaela wrote:

However, that's not really a complete solution to the problem. It helped me at this point, but in general, one can never be sure what the format will be and whether someone changed it.

Replacing excel-files as data-source with a file-format more reliable is the only way to get a complete solution. Any automated processing of excel-files (proc import., libname) relies on making assumptions, because there is no mechanism in excel forcing one and only one data-type per column.

Yaela
Calcite | Level 5

Hi,

 

I agree with what you say, in general, regarding the use of Excel files.

 

However, the reason I brought up this topic is that according to the documentation it's supposed to work.

 

I mean, both for PROC IMPORT and for LIBNAME engine, option MIXED=YES should work. I already know it doesn't work in PROC IMPORT. I hoped maybe it can be made to work in LIBNAME.

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 3653 views
  • 0 likes
  • 4 in conversation