Import from EXCEL SAS 9.3 vs. SAS 9.4

Accepted Solution Solved
Reply
Super Contributor
Posts: 543
Accepted Solution

Import from EXCEL SAS 9.3 vs. SAS 9.4

Hi!

 

Yesterday I ran into a peculiar problem:

I have a simple EXCEL spreadsheet (attached) with 3 sheets: HSA1, HSA2, HSA3

In SAS 9.4 - using PROC IMPORT, I need to specify which sheet I want,otherwise it reads first sheet (by default?)

proc import datafile = "test.xlsx"
    out = test dbms = xlsx replace;
run;
proc print;run;

 

In SAS 9.3,it seems to be reading just what seems to me random sheet (without me ever sepcifying anything)...

proc import datafile = "test.xlsx"
    out = test dbms = excel replace;
run;
proc print;run;

 

I wonder if anyone else can replicate this behavior, and why would this happen?

 

Thank you!

 

Anca.


Accepted Solutions
Solution
‎06-07-2016 12:13 PM
Esteemed Advisor
Esteemed Advisor
Posts: 7,232

Re: Import from EXCEL SAS 9.3 vs. SAS 9.4

Well, firstly I don't think there is a dbms=excel, that is for libname access to Excel, so that isn't going to work.  The second point is that each DBMS uses a different "engine" under the hood.  How much control SAS has over these engines I can't say, but normally they would be proprietary.  Its a bit like if you write a program in SAS to import a file, store that in a compiled macro library so no one else can read it.  Then someone else writes a program which does a very similar thing, but say orders the file in a different way, then compiles to macro catalog.  You can use each and they will do a similar thing, but they are in essence doing different code.  There is many reasons why that happens, new versions of software, new methodologies etc.  Now in your example the XLSX engine is built specifically and can only imort the latest version of M$ file formats - that being Open Office XML documents zipped up (or XLSX), that engine can't read the older binary XLS files.  Thus the underlying engine there is going to be different to the old way of reading in binary files which have different strcuture.  Even something simple like Excel having an inconsistent sheet index item across the engines will change the output.  I personally would only be worried if the same engine was giving results, but if you want to follow up pop a ticket into the help desk and they will likely be able to point you to the documentation on the changes between versions in Excel and SAS.  User guide on DBMS:

http://support.sas.com/documentation/cdl/en/acpcref/63181/HTML/default/viewer.htm#p0jf3o1i67m044n1j0...

View solution in original post


All Replies
Esteemed Advisor
Esteemed Advisor
Posts: 7,232

Re: Import from EXCEL SAS 9.3 vs. SAS 9.4

Hi,

 

I don't have 9.3 to test this out (and I don't download Excel files).  However at first glance you are using two different DMBS= statements there, Excel and XLSX.  Just that alone may be the difference.  It could also be that in 9.4 they applied some additional logic to ascertain sheet index, I think only the developer would be able to confirm that. 

 

However the emphasis, as with any data, is on you, the owner of the data to specify to the SAS system what you want to read, and how you want to read it.  If you let any software start guessing for you then your bound to run into odd things like this - and Excel is a particularly bad example of this. 

Super Contributor
Posts: 543

Re: Import from EXCEL SAS 9.3 vs. SAS 9.4

Right, I tend to agree with you, however, there should be some level of "trust" in a behaviour of a given software;i.e., PROC IMPORT should work consistently - whichever way that might be and if one does not specify the sheet, the first sheet should be the default.

While importing is not the best procedure, it is applied over and over in both academic and work settings.

 

I am merely bringing up a point of inconsistencies and a rather curious one to me.

 

SAS 9.4 does not allow for dbms = EXCEL specification, as you might know.

 

 

Solution
‎06-07-2016 12:13 PM
Esteemed Advisor
Esteemed Advisor
Posts: 7,232

Re: Import from EXCEL SAS 9.3 vs. SAS 9.4

Well, firstly I don't think there is a dbms=excel, that is for libname access to Excel, so that isn't going to work.  The second point is that each DBMS uses a different "engine" under the hood.  How much control SAS has over these engines I can't say, but normally they would be proprietary.  Its a bit like if you write a program in SAS to import a file, store that in a compiled macro library so no one else can read it.  Then someone else writes a program which does a very similar thing, but say orders the file in a different way, then compiles to macro catalog.  You can use each and they will do a similar thing, but they are in essence doing different code.  There is many reasons why that happens, new versions of software, new methodologies etc.  Now in your example the XLSX engine is built specifically and can only imort the latest version of M$ file formats - that being Open Office XML documents zipped up (or XLSX), that engine can't read the older binary XLS files.  Thus the underlying engine there is going to be different to the old way of reading in binary files which have different strcuture.  Even something simple like Excel having an inconsistent sheet index item across the engines will change the output.  I personally would only be worried if the same engine was giving results, but if you want to follow up pop a ticket into the help desk and they will likely be able to point you to the documentation on the changes between versions in Excel and SAS.  User guide on DBMS:

http://support.sas.com/documentation/cdl/en/acpcref/63181/HTML/default/viewer.htm#p0jf3o1i67m044n1j0...

Super Contributor
Posts: 543

Re: Import from EXCEL SAS 9.3 vs. SAS 9.4

Thank you!

 

Esteemed Advisor
Posts: 6,690

Re: Import from EXCEL SAS 9.3 vs. SAS 9.4

DBMS=EXCEL relies on a Microsoft-supplied module to access Excel files. This module has a rich history of unexpected changes in it's behaviour over the different releases I had to cope with (note that I intentionally did not use "work"!).

DBMS=XLSX, OTOH, is written following the offical documentation of the Microsoft "Open" Office Format, is platform-independent, and hopefully will be more stable.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 614 views
  • 1 like
  • 3 in conversation