09-19-2013 07:08 AM
I need to backfill a data mart with data from 4000 historic spreadsheets. The following code runs as part of a DI Studio job.
proc import dbms=xls datafile="Report.xls" out=&_OUTPUT replace;
sheet="Cumulative Daily Flows";
The code runs without any warnings or errors but truncates the values in a significant number of cells. For example, in the workbook we have a cell containing a formatted value of (10,037,753) with the actual cell value being -10037753.42. Following the import into SAS the corresponding unformatted value in the output dataset is -10037753.4 (losing the final decimal place).
It seems to happen when the length of the unformatted cell value, including the ‘-‘ and ‘.’ is 12 characters or more and is not an integer, and occurs for both positive and negative numbers. I’ve tried setting the relevant fields to numerics and character with a variety of informats in both the input file and output table metadata but in all cases the truncation occurs. I’ve also set guessingrows to the maximum value but, again, this makes no difference.
The only way I that I’ve been able to make this work properly so far is to paste the values from the original report into a completely new workbook. In the original workbook, the cells have a custom format applied to allow brackets to indicate negative values so I thought that this may be the cause, but resetting all cell formats to “General” (which is what they are in the new xls) in the original report again makes no difference. Now, I could write some VBA to copy the contents of 4000 spreadsheets into new documents but this would take forever to run and I really don't want to do this unless I absolutely have to. I don't have the option of using a datastep approach using the Excel libname as this doesn't work on our servers due to driver incompatibility.
Any suggestions would be greatly appreciated.
09-19-2013 10:44 AM
although your example has exactly 12 digits including the '-' and the '.', I think the problem is caused by the default informat is best12. As a first workaround I would save the excel sheet as a csv file, runnig proc import with dbms = csv (or dlm?), then in the log you will see the SAS code and you could adapt the input statement with the correct informat.
09-19-2013 02:45 PM
Unforunately if you can't use the Excel Libname and you can't convert the file to csv dynamically (i would also recommend a Excel macro that you simply call from Excel) then you are pretty limited in your options. I don't think there is a way to tell proc import to specifically apply one format to a given columns (I could very well be wrong).
Data stored in excel = my deepest and darkest nightmares
09-19-2013 03:35 PM
Lets make sure you are actually seeing what you have. Compare your imported value to the value in question using HEX16 format.
09-20-2013 07:02 AM
Thanks for taking the time to respond. As stated in my original post, due to driver incompatibilities on our server data step options are very limited and, even using the Import Wizard in EG on an individual file and using various lengths of the comma and best informats makes no difference to the truncation. Even using a character informat imports the truncated value.
I've asked one of the SAS consultants that works with us to check this on their internal database of known issues and there are numerous examples. It does appear to be caused by a combination of custom formats in the workbook cells and the interface between the SAS PC fileserver and Microsoft Jet Database engines that are involved in the import process. There appears to be no workaround other than writing a VBA script to remove the custom formatting in the relevant cells (reformatting as Number with 2 dp is sufficient does appear to resolve the problem) in all 4000 reports.
02-21-2014 12:43 PM
I am a new SAS user and I had similar problem with numbers with commas. What I did was to copy and paste the log part of the Proc Import into my program. That was basically a data step. Then, I modified the formats and informats for the variables of interest. The program worked well. Hope that helps.