Hi all,
I'm using proc import to read in an .xls file, variables in the file are all numeric. However, for some reason, SAS decided to change one of the variable into character. I tired put function for the array of variables but wasn't successful as some variables are already in numeric format. Is there anyway to fix the problem?
@wylamw1 wrote:
Hi all,
I'm using proc import to read in an .xls file, variables in the file are all numeric. However, for some reason, SAS decided to change one of the variable into character. I tired put function for the array of variables but wasn't successful as some variables are already in numeric format. Is there anyway to fix the problem?
Save the file as CSV and write a data step to read the text file.
Proc Import only examines a very small number of rows of data to set properties from XLS. So if there are no values for a variable in the first 8 rows you can get character values, if your "numeric" data contains values like "null" "na" or "." (just a decimal point with no numerals) or something else that is not actually a number then you get character values.
You can write a data step, or modify one generated by Proc Import for CSV to force reading a variable as numeric.
@wylamw1 wrote:
Hi all,
I'm using proc import to read in an .xls file, variables in the file are all numeric. However, for some reason, SAS decided to change one of the variable into character. I tired put function for the array of variables but wasn't successful as some variables are already in numeric format. Is there anyway to fix the problem?
Save the file as CSV and write a data step to read the text file.
Proc Import only examines a very small number of rows of data to set properties from XLS. So if there are no values for a variable in the first 8 rows you can get character values, if your "numeric" data contains values like "null" "na" or "." (just a decimal point with no numerals) or something else that is not actually a number then you get character values.
You can write a data step, or modify one generated by Proc Import for CSV to force reading a variable as numeric.
If you import the same XLS file the result will always be the same. Most likely the problem you are having is that you are trying to import different files and expecting the structure generate to always be the same. You cannot do that with PROC IMPORT from unstructured data sources like Excel files (or text files). It will create the structure that matches the sample of data in the one file it is reading.
There are probably two ways the variable could be created as character when you expected numeric. One is that some cell in the column contains a character string. And the other is if all of the cells the column are empty. In that case PROC IMPORT will probably create a character variable of length 1.
If you have control over the source then don't use XLS files for your data transfer. Use something that contain metadata to define the variables (like a SAS dataset) or use simple text files that you can write your own data step to read so that you have control over the structure of the dataset that is generated.
@wylamw1 - If the spreadsheets contain a top row of column names followed by data in the next row that is consistent across all rows then it is possible to use them for (mostly) reliable data exchange. The problem is most users of Excel don't appreciate this fact and muck up a simple tabular layout with reporting customisations like blank lines, spanning headers and extra comments that screw up reading the data cleanly and reliably. Add to that any old typo that Excel happily accepts.
As an experiment, get rid of any blank lines and reduce a spreadsheet to just one column header / column name row followed immediately by consistent data. You'd be surprised how often a rogue and hidden space or carriage return can magically change an Excel column from numeric to character. Getting a clean spreadsheet is half the battle, and as the saying goes, garbage in garbage out - the problem is Excel accepts any old garbage while SAS is much more disciplined!
@wylamw1 wrote:
Hi all,
I'm using proc import to read in an .xls file, variables in the file are all numeric. However, for some reason, SAS decided to change one of the variable into character. I tired put function for the array of variables but wasn't successful as some variables are already in numeric format. Is there anyway to fix the problem?
With .xls use GUESSINGROW=MAX; in your Proc Import statement to ensure SAS scans all rows before defining the data type.
IF there is in any cell under a column a character then the created SAS variable will be of type character.
Is "guessingrows" available when reading xls-files? Isn't that option implemented for text-files only?
@andreas_lds wrote:
Is "guessingrows" available when reading xls-files? Isn't that option implemented for text-files only?
You might be right. I thought it works with dbms=excel but couldn't find this anywhere documented. I haven't worked with .xls since long now and all these issues just go away with .xlsx and dbmx=xlsx
If that's an adhoc job then I probably would just save your .xls as a .xlsx and go from there.
Then follow the advice given by @ballardw: convert the file into a proper file-format and write a data step to read it.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.