- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Dear SAS experts
I am trying to import a dataset using the following code:
proc important datafile="Datafilepath"
dbms=xlsx replace
out=newdatasetname;
run;
The first variable - A - contains letters and the row with the most characters (including spaces) contains 32 characters. When I import the excel file using the command above it appears that the variable is truncated at 17 characters (length: 17, /format/informat: $17) .
I was inspired from reading post online to try to include a length command, a format command and a guessrows option (one at a time) in the proc import to change this but thus far with no success. I included these above the run statement.
This variable (A) is the only character variable in the dataset and is placed as the first variable in the dataset.
Can anyone help me understand this issue? Am I on the right track by including a format, length or guessrows command?
Thank you
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I would consider using the GUESSINGROWS = option in PROC IMPORT. I think that it is trying to guess the length of the column based on the first twenty observations.
You could do this:
proc import datafile = "your-data-file"
out = want
dbms = xlsx
replace;
guessingrows = 1000; /* or 50 or 100 or 5000 or MAX - depends on size and time constraints. */
run;
You could also trying using the LIBNAME statement with the XLSX engine. That may be better -- not quite sure without seeing the data. I personally loathe when data is sent to me in excel, and I would use the LIBNAME statement with the XLSX engine if data is sent to me in this form.
libname path2xl xlsx "your-excel-file-here";
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hey Maguiremq
Unfortunately the guessingrows option does not work. I have already tried it.
The first variable does not have a variable name and is just named "VAR1" when imported. This might cause an issue?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The most likely cause of this problem is that your longer columns are not in the first few rows of your spreadsheet and the Excel engine SAS uses to import the data only checks the first few rows of data.
A quick fix is to add a dummy first row of data to your spreadsheet and ensure column A has 32 characters in it. Read that into SAS and delete this row.
GUESSINGROWS is not a valid option for DBMS = XLSX.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hey SASKIWI
You are right. If the first column in the first row there are only few letters. There are more characters in the rows lower in the first column.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hey Reeza
Thanks.
I have changed it now so I do not import an excel file but a .csv using proc import. But there is still some truncation (at 19 characters now instead of 17, for some reason).
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I would be very surprised if SAS truncated a character string from an XLSX file. You could try using the XLSX libref engine instead of PROC IMPORT and see if that helps.
libname in xlsx "Datafilepath";
proc copy inlib=in outlib=work;
run;
Otherwise try making a small file that replicates the issue and posting it here.
Or better open a support ticket with SAS support and let them figure it out.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you want control over how the data is imported, save as csv from Excel and read that with a data step.
Do not use Excel files, do not use PROC IMPORT.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hey Kurt
But using a data step and the infile statement requires that I know the names of all variables that I am importing, correct? Or can I simply tell SAS to make the most appropriate formatting? I am asking this based on the example some 3/4s down on this site:
https://sasexamplecode.com/3-ways-to-import-a-csv-file-into-sas-examples/
Thank you
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Dear all @Kurt_Bremser @Tom @Reeza @SASKiwi @maguiremq
It seems that issue is that I am doing all of this within a macro. I tried importing it as an .csv file instead and added the guessingrow option outside a macro and it works as intended. The same code does not work within a macro. The files are the same, but when created within a macro there is this truncation.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Macro's just generate SAS code for SAS to run. If you are running the same code it does not matter whether you hard-coded the text of the program or let a macro generate the code for you. Turn on the MPRINT option to see what SAS code the macro is actually generating.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@mgrasmussen wrote:
Hey Kurt
But using a data step and the infile statement requires that I know the names of all variables that I am importing, correct? Or can I simply tell SAS to make the most appropriate formatting? I am asking this based on the example some 3/4s down on this site:
https://sasexamplecode.com/3-ways-to-import-a-csv-file-into-sas-examples/
Thank you
That description of how to write a data step is a little messed up. If users have been following that advice maybe that is why I see such convoluted confused data steps posted on this forum. Two key things:
1) You should DEFINE the variables first using a LENGTH statement (or LENGTH= option on ATTRIB statement). INFORMAT and FORMAT are rarely needed. SAS knows how to read and write normal numbers and text. Don't force SAS to guess how you want the variables defined by how the variable is first used in other statements like INFORMAT, FORMAT or INPUT in the data step. It can work, but you really need to understand the rules SAS uses to guess the type and length of the variable to be sure the variable is defined as you intended, so not the advice to be giving novices.
2) You can use the older MISSOVER option on the INFILE statement, but make sure you don't slip into using formatted input instead of list mode input or it might cause text at the end of the line to be ignored. The TRUNCOVER option is safer, especially for novices.
If you want a better tool for guessing how to read a delimited text file try this macro: https://github.com/sasutils/macros/blob/master/csv2ds.sas