01-10-2018 10:18 AM - edited 01-16-2018 07:06 AM
Edit 16 Jan 2018: I cannot now reproduce the problem described, so the original question is invalid. Sorry.
I'll leave a description of my mistake as it might be helpful.
Proc import with DBMS=XLSX is importing mixed character and number data in variables correctly. It is reading all the decimal places of the numbers, so the number -0.0621009999999962 is being imported as "-6.2100999999996187E-2".
I think the problem arose when I dumped the character rows and converted the numbers to numeric. With the statement
it was not picking up the final "E-2" in the example above, so converting it to the number -6.21009
gives the correct output.
Thanks to those who replied to the original question. The data is being imported into SAS to avoid the human errors that can occur in creating CSV from XLSX, so the question meant to ask for solutions apart from altering the input. The only relevant solution in the support page http://support.sas.com/kb/46/472.html is to use DBMS=XLSX.
So in summary, as far as I can see, Proc Import with DBMS=XLSX works correctly, and gets numbers right, even when the variables are treated as character, and the numbers are not displayed to all decimal places. Care is required when converting to numeric format.
-end of edit-
ORIGINAL QUESTION: Proc import XLSX NOT reading numbers correctly
Using this code running SAS 9.4 I am importing a file with the variable names in the first row, then 8 rows of character data not needed, then 100 rows of numeric data shown to 2 decimal places.
PROC IMPORT DATAFILE=&file
DBMS=xlsx REPLACE OUT=SAMPLE ;
I want to start at row 1 to use the variable names (there are hundreds of columns so I don't want to specify each one).
I am then discarding the 8 unneeded rows and converting character to numeric to get numbers I can manipulate.
I only need the numeric data rounded to the nearest integer.
This generally works fine but the value -0.06 is really -0.0621009999999962 and is being imported as "-6.021", and 0.00 is really 0.00320000 is imported as "3.2"
From Ballardw's answer here https://communities.sas.com/t5/SAS-Procedures/Incorrect-values-is-datasets-imported-from-CSV/td-p/17...
I'm guessing it's because the full 16 digit number is either not being accurately stored or read.
SAS is determining the columns to be length 15.
From this note on character strings being truncated
it looks there is no way in SAS of controlling length with proc import of XLSX.
If I change the format in Excel to display to 16 decimal places I can get the right output as SAS makes the columns length longer.
Apart from this workaround, or otherwise altering the input, is there any way of getting SAS to read these small numbers correctly?
01-10-2018 10:28 AM - edited 01-10-2018 10:30 AM
First of all, take control of the data transfer process. This specifically excludes the use of the Excel file format, as it is unsuitable for this. See ~327543 posts about this here on the SAS communities (slightly exaggerating)
Write your data from Excel to a csv file, and import that. Either use proc import and take the resulting data step from the log, and adapt it to your needs. Or write a data step yourself, according to the specification of the Excel data. If no documentation is given, run tests, inspect the results, and read the log to see if your code works as expected/desired.
01-10-2018 11:40 AM
Did you use the information in http://support.sas.com/kb/46/472.html to change the rows examined to "guess" variable type or length?
Also if the values imported to your data set only show a few decimals you might try changing the display format to see if the SAS value is carrying more decimal places than you think.
Without seeing the actual source data I would hesitate to guess about your issue with "0.00 is really 0.00320000 is imported as "3.2" ".
And I'm not going to look at an Excel file from an unknown source.
Especially if a data set is in a repeat process I always convert Excel to CSV and have a data set to read values consistently. (I've done this twice this morning already) as @KurtBremser suggests and for the same reasons.
01-10-2018 12:35 PM
What is the question?
SAS can read the metadata of an XLSX file and will make the column match the data that is in it. If your columns have mixed types then that might be your main issue.
But if you are changing character variables in SAS to numeric values then we need to see the values in the character variables and the SAS code you used to convert them.
If changing the format in Excel fixes the problem than I assume that is the right solution.