BookmarkSubscribeRSS Feed
LosEndos
Calcite | Level 5

I need to backfill a data mart with data from 4000 historic spreadsheets. The following code runs as part of a DI Studio job.

proc import dbms=xls datafile="Report.xls" out=&_OUTPUT replace;

sheet="Cumulative Daily Flows";

getnames=no;

datarow=3;

guessingrows=32767;

run;

The code runs without any warnings or errors but truncates the values in a significant number of cells. For example, in the workbook we have a cell containing a formatted value of (10,037,753) with the actual cell value being -10037753.42.  Following the import into SAS the corresponding unformatted value in the output dataset is -10037753.4 (losing the final decimal place). 

It seems to happen when the length of the unformatted cell value, including the ‘-‘ and ‘.’ is 12 characters or more and is not an integer, and occurs for both positive and negative numbers. I’ve tried setting the relevant fields to numerics and character with a variety of informats in both the input file and output table metadata but in all cases the truncation occurs. I’ve also set guessingrows to the maximum value but, again, this makes no difference.

The only way I that I’ve been able to make this work properly so far is to paste the values from the original report into a completely new workbook.  In the original workbook, the cells have a custom format applied to allow brackets to indicate negative values so I thought that this may be the cause, but resetting all cell formats to “General” (which is what they are in the new xls) in the original report again makes no difference. Now, I could write some VBA to copy the contents of 4000 spreadsheets into new documents but this would take forever to run and I really don't want to do this unless I absolutely have to.  I don't have the option of using a datastep approach using the Excel libname as this doesn't work on our servers due to driver incompatibility.

Any suggestions would be greatly appreciated.

7 REPLIES 7
Amarnath7
Fluorite | Level 6

Hi,

Instead of Proc import, if you use data step this can be fixed either to read single sheet or multiple sheets


HWSteinberg
Calcite | Level 5

Hi,

although your example has exactly 12 digits including the '-' and the '.', I think the problem is caused by the default informat is best12. As a first workaround I would save the excel sheet as a csv file, runnig proc import with dbms = csv (or dlm?), then in the log you will see the SAS code and you could adapt the input statement with the correct informat.

Anotherdream
Quartz | Level 8

Unforunately if you can't use the Excel Libname and you can't convert the file to csv dynamically (i would also recommend a Excel macro that you simply call from Excel) then you are pretty limited in your options. I don't think there is a way to tell proc import to specifically apply one format to a given columns (I could very well be wrong).

Data stored in excel = my deepest and darkest nightmares

data_null__
Jade | Level 19

Lets make sure you are actually seeing what you have.  Compare your imported value to the value in question using HEX16 format.

24         data _null_;
25            x=-10037753.42;
26            put x=hex16.;
27            put x=;
28            put x=best11.;
29            run;

x=C163253F2D70A3D7
x=-
10037753.42
x=-
10037753.4
LosEndos
Calcite | Level 5

All

Thanks for taking the time to respond.  As stated in my original post, due to driver incompatibilities on our server data step options are very limited and, even using the Import Wizard in EG on an individual file and using various lengths of the comma and best informats makes no difference to the truncation. Even using a character informat imports the truncated value. 

I've asked one of the SAS consultants that works with us to check this on their internal database of known issues and there are numerous examples. It does appear to be caused by a combination of custom formats in the workbook cells and the interface between the SAS PC fileserver and Microsoft Jet Database engines that are involved in the import process.  There appears to be no workaround other than writing a VBA script to remove the custom formatting in the relevant cells (reformatting as Number  with 2 dp is sufficient does appear to resolve the problem) in all 4000 reports.

khari
Calcite | Level 5

I had truncation issues with Proc Import a while back..so I used the INFILE statement with lrecl=32767 and that seemed to work.

TahaAzizi
Calcite | Level 5

I am a new SAS user and I had similar problem with numbers with commas. What I did was to copy and paste the log part of the Proc Import into my program. That was basically a data step. Then, I modified the formats and informats for the variables of interest. The program worked well. Hope that helps.

-Taha

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 5287 views
  • 0 likes
  • 7 in conversation