BookmarkSubscribeRSS Feed
mgrasmussen
Quartz | Level 8

Dear SAS experts

 

I am trying to import a dataset using the following code:

 

proc important datafile="Datafilepath"

dbms=xlsx replace

out=newdatasetname;

run;

 

The first variable - A - contains letters and the row with the most characters (including spaces) contains 32 characters. When I import the excel file using the command above it appears that the variable is truncated at 17 characters (length: 17, /format/informat: $17) .

I was inspired from reading post online to try to include a length command, a format command and a guessrows option (one at a time) in the proc import to change this but thus far with no success. I included these above the run statement.

This variable (A) is the only character variable in the dataset and is placed as the first variable in the dataset.

 

Can anyone help me understand this issue? Am I on the right track by including a format, length or guessrows command?

 

Thank you

 

12 REPLIES 12
maguiremq
SAS Super FREQ

I would consider using the GUESSINGROWS = option in PROC IMPORT. I think that it is trying to guess the length of the column based on the first twenty observations.

 

You could do this:

 

proc import datafile = "your-data-file"
out = want
dbms = xlsx
replace;
guessingrows = 1000; /* or 50 or 100 or 5000 or MAX - depends on size and time constraints. */
run;

You could also trying using the LIBNAME statement with the XLSX engine. That may be better -- not quite sure without seeing the data. I personally loathe when data is sent to me in excel, and I would use the LIBNAME statement with the XLSX engine if data is sent to me in this form.

 

libname path2xl xlsx "your-excel-file-here";

 

LIBNAME XLSX post 

GUESSINGROWS statement 

mgrasmussen
Quartz | Level 8

Hey Maguiremq

 

Unfortunately the guessingrows option does not work. I have already tried it. 

 

The first variable does not have a variable name and is just named "VAR1" when imported. This might cause an issue?

SASKiwi
PROC Star

The most likely cause of this problem is that your longer columns are not in the first few rows of your spreadsheet and the Excel engine SAS uses to import the data only checks the first few rows of data.

 

A quick fix is to add a dummy first row of data to your spreadsheet and ensure column A has 32 characters in it. Read that into SAS  and delete this row.   

 

GUESSINGROWS is not a valid option for DBMS = XLSX.

mgrasmussen
Quartz | Level 8

Hey SASKIWI

 

You are right. If the first column in the first row there are only few letters. There are more characters in the rows lower in the first column. 

Reeza
Super User
Unfortunately most of the standard suggestions (length/informat/guessingrows) don't apply when importing Excel files. If you can, save it as CSV and importing that will allow you way more control. Obviously this is easy and straightforward if its a one time process but a pain if it's required to be an automation of some sort.
mgrasmussen
Quartz | Level 8

Hey Reeza

 

Thanks.

 

I have changed it now so I do not import an excel file but a .csv using proc import. But there is still some truncation (at 19 characters now instead of 17, for some reason).

Tom
Super User Tom
Super User

I would be very surprised if SAS truncated a character string from an XLSX file.  You could try using the XLSX libref engine instead of PROC IMPORT and see if that helps.

libname in xlsx "Datafilepath";
proc copy inlib=in outlib=work;
run;

 

Otherwise try making a small file that replicates the issue and posting it here.

Or better open a support ticket with SAS support and let them figure it out.

mgrasmussen
Quartz | Level 8

Hey Kurt

 

But using a data step and the infile statement requires that I know the names of all variables that I am importing, correct? Or can I simply tell SAS to make the most appropriate formatting? I am asking this based on the example some 3/4s down on this site:

 

https://sasexamplecode.com/3-ways-to-import-a-csv-file-into-sas-examples/

 

Thank you

mgrasmussen
Quartz | Level 8

Dear all @Kurt_Bremser @Tom @Reeza @SASKiwi @maguiremq 

It seems that issue is that I am doing all of this within a macro. I tried importing it as an .csv file instead and added the guessingrow option outside a macro and it works as intended. The same code does not work within a macro. The files are the same, but when created within a macro there is this truncation.

Tom
Super User Tom
Super User

Macro's just generate SAS code for SAS to run.  If you are running the same code it does not matter whether you hard-coded the text of the program or let a macro generate the code for you.  Turn on the MPRINT option to see what SAS code the macro is actually generating.

Tom
Super User Tom
Super User

@mgrasmussen wrote:

Hey Kurt

 

But using a data step and the infile statement requires that I know the names of all variables that I am importing, correct? Or can I simply tell SAS to make the most appropriate formatting? I am asking this based on the example some 3/4s down on this site:

 

https://sasexamplecode.com/3-ways-to-import-a-csv-file-into-sas-examples/

 

Thank you


That description of how to write a data step is a little messed up.  If users have been following that advice maybe that is why I see such convoluted confused data steps posted on this forum.   Two key things:

1) You should DEFINE the variables first using a LENGTH statement (or LENGTH= option on ATTRIB statement).  INFORMAT and FORMAT are rarely needed. SAS knows how to read and write normal numbers and text.  Don't force SAS to guess how you want the variables defined by how the variable is first used in other statements like INFORMAT, FORMAT or INPUT in the data step.  It can work, but you really need to understand the rules SAS uses to guess the type and length of the variable to be sure the variable is defined as you intended, so not the advice to be giving novices.

 

2)  You can use the older MISSOVER option on the INFILE statement, but make sure you don't slip into using formatted input instead of list mode input or it might cause text at the end of the line to be ignored.  The TRUNCOVER option is safer, especially for novices.

 

If you want a better tool for guessing how to read a delimited text file try this macro: https://github.com/sasutils/macros/blob/master/csv2ds.sas

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 7759 views
  • 7 likes
  • 6 in conversation