Hello,
I'm trying to import a number of Excel files to SAS using the IMPORT procedure. My data have an ID string in the first column and numeric data (only numbers plus a few periods for missing data) in the other columns.
Now my problem is that SAS imports some of the numeric variables as numeric and some as characters. There doesn't seem to be any pattern. I even changed the Excel cell formats to numbers to be sure everything is alright. I'm working with a lot of different tables so it's not feasible to correct the data types manually.
What am I doing wrong? What exactly is the criterion for SAS to treat variables as characters instead of numbers?
The problem is with Excel. Excel is unstrcutured/uncontrolled. In each cell you could put anything you like, including pictures and control objects etc. Most sensible data processing environments fix the structure of the dataset (be it a database or SAS or anything else) so that programming can be done on the data efficiently. When SAS reads in data from Excel, it looks through a number of observations and guesses what the type should be - this is the guessingrows registry parameter.
Now if your importing data, it is better to get the data in a proper machine readbale format (with an assciated validated and repeatable procedure). Say you save the Excel file as CSV, you can then read in the file using datastep infile, and at that point you would fix informats, formats, lengths etc. and so have full control over the data that is read in.
This will help with CSV import.
Do you have missing values/blanks in your excel files?
Look into GUESSINROWS option but it's a system registry change.
Hi Reeza,
I have periods as missing values in the data but for some reason SAS imports a couple of empty rows after the last observation and those are of course blanks.
In a foreign data source such as Excel a period is a character. If the import procedure runs into characters within the rows it examines, I believe 20 for Excel, then the column is treated as character.
You may be better off by deleting the periods in the Excel file and reimporting where you use a search and replace for contents of cell are .
The problem is with Excel. Excel is unstrcutured/uncontrolled. In each cell you could put anything you like, including pictures and control objects etc. Most sensible data processing environments fix the structure of the dataset (be it a database or SAS or anything else) so that programming can be done on the data efficiently. When SAS reads in data from Excel, it looks through a number of observations and guesses what the type should be - this is the guessingrows registry parameter.
Now if your importing data, it is better to get the data in a proper machine readbale format (with an assciated validated and repeatable procedure). Say you save the Excel file as CSV, you can then read in the file using datastep infile, and at that point you would fix informats, formats, lengths etc. and so have full control over the data that is read in.
This will help with CSV import.
Thanks a lot, it worked!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.