An Idea Exchange for SAS software and services

by Super User
on ‎03-08-2017 11:53 AM

You don't show the EXCEL file you imported. Did you by any chance examime what you get if your second column contains TEXT values?


Since EXCEL files are subject to random (and fairly often undocumented) behaviors at Microsoft whim I don't expect consistency. For instance XLXS is ZIPPED XML but XLS is a proprietary binary format.


CSV and any other delimited file behaves the same.And if I have to import the same structured file then I use a data step so I control the variable names, labels, formats, informats. IMNSHO the best use for proc import is to generate base code to read delimited files so that I can edit it to behave correctly. Why trust any guesses to determine my data.

As a side effect of controlling my informats I don't get things like variable type mismatches or length variance issues when I combine data.

by Super Contributor
on ‎03-08-2017 12:32 PM

I feel your pain Phil. And I do support your ballot idea.


One thing all theses scenarios for automatic column naming have in common though is they all yield equally meaningless and useless results. You will have to deal with them in one way or another. ;-)


- Jan.

by Contributor hollandnumerics
on ‎03-08-2017 03:01 PM



It actually doesn't matter what the contents of the Excel file look like, as they are given the same variable name whether character or numeric. The point is that I would like to have some control, and therefore some consistency, in the variable names generated by PROC IMPORT, so I don't get any unpleasant surprises.


Way back in the dim and distant past I was a FORTRAN programmer, and so I tend to use i, j and k for DO loop variables. While using DBMS=EXCEL or CSV I am fairly safe, but switching from DBMS=EXCEL to DBMS=XLSX could cause me considerable problems if my spreadsheet has more than 8 columns, and I have a Data Step containing a DO loop using i !!



by Super User
on ‎03-08-2017 05:07 PM

WATFOR, WATFIV, FORTRAN77 or other dialect? I also started with FORTRAN and that is one of the reasons I don't like Excel in just about any flavor. I programmed for something like 20 years before working with an actual spreadsheet. The fact that they do not have variables or type restrictions in columns means that spreadsheets are just about so much garbase. (Merged cells are the devil's playground)


Which is why I turn them into CSV, import with a datastep so I know what I actually have.

by Valued Guide
on ‎04-11-2017 03:25 AM
In a word of support for Base SAS - I believe that the code generated by the IMPORT procedure was established outside the dev team for Base SAS - probably it was a brainchild of the interfaces team responsible for upgrades to DMS. For example, 1 generating an automatic conversion of a constant from numeric to char (very much discouraged among those who dislike such NOTEs); and 2, overloading the INPUT statement with redundant $ symbols. I could go on....
The generated code as a template does not lead to better understanding of data steps (imo).
At least we have to support Phil's suggestion!🤓Perhaps a reviewer at SAS might consider not only how straight forwardly Phil's proposal could be implemented, but also, how good generated code should become a teaching aid for those who are still learning about the amazing functionality available in the languages of SAS.
Idea Statuses
Top Liked Authors