Desktop productivity for business analysts and programmers

Import from Excel Problem: Data step vs Proc Import

Reply
Occasional Contributor
Posts: 5

Import from Excel Problem: Data step vs Proc Import

I'm importing a dataset from an Excel file that contains a date column. In Excel, this column is properly a data (i.e., it's a date serial) formatted as a short date (m/d/yyyy).  In EG, I'm using the Import Data task to, well, import the data.  On the Define Field Attributes screen, I have the following settings for the date column:

  • Type: Date
  • Source Informat: ANYDTDTE9.
  • Len.: 8
  • Output Format: Date9.
  • Output Informat: ANYDTDTE9.

Now here's the confusing part: on the Advanced Options page, if I select "Import the data using SAS/Access Interface to PC Files whenever possible" (yes, I have this component), the dates all import as 01Jan1960 (and when formatted as a number, they are all 0).  However, if I uncheck that box, the dates import just fine.  For reference, the SAS code generated by checking the box, or leaving it blank are below.  Does anyone know why simply checking that box would cause it to not import correctly, despite the same import settings?

 

Here's the code generated by the import data task for the SAS/ACCESS version

PROC IMPORT
        DATAFILE="I:\SASWork\_TD6320_CMHPRDSASMET01_\#LN00957.xlsx"
        OUT=WORK.VACATION_SWAPS_TO_LOAD2
        REPLACE
        DBMS=EXCEL;
    RANGE="Sheet1$A1:C101";
    GETNAMES=YES;
RUN;

/* --------------------------------------------------------------------
   This DATA step is used to both modify the attributes of fields
   imported by PROC IMPORT and perform any field type specific
   conversions needed.
   -------------------------------------------------------------------- */

DATA WORK.VACATION_SWAPS_TO_LOAD2;
    LENGTH
        FLEET            $ 32
        RANK             $ 3
        WEEK               8 ;
    SET WORK.VACATION_SWAPS_TO_LOAD2;
    FORMAT
        FLEET            $CHAR32.
        RANK             $CHAR3.
        WEEK             DATE9. ;
    INFORMAT
        FLEET            $CHAR32.
        RANK             $CHAR3.
        WEEK             ANYDTDTE9. ;
    /* ----------------------------------------------------------------
       For fields that were imported as datetime values but need to be
       treated as date or time values, a conversion is required in
       order to extract the date or time portion from the datetime
       value.
       ---------------------------------------------------------------- */
    WEEK = DATEPART(WEEK);
RUN;

...And here's the code generated by the import data task for the non-SAS/ACCESS version

/* --------------------------------------------------------------------
   This DATA step reads the data values from a temporary text file
   created by the Import Data wizard. The values within the temporary
   text file were extracted from the Excel source file.
   -------------------------------------------------------------------- */

DATA WORK.VACATION_SWAPS_TO_LOAD;
    LENGTH
        FLEET            $ 32
        RANK             $ 3
        WEEK               8 ;
    FORMAT
        FLEET            $CHAR32.
        RANK             $CHAR3.
        WEEK             DATE9. ;
    INFORMAT
        FLEET            $CHAR32.
        RANK             $CHAR3.
        WEEK             ANYDTDTE9. ;
    INFILE 'I:\SASWork\_TD6320_CMHPRDSASMET01_\#LN00954'
        LRECL=27
        ENCODING="WLATIN1"
        TERMSTR=CRLF
        DLM='7F'x
        MISSOVER
        DSD ;
    INPUT
        FLEET            : $CHAR32.
        RANK             : $CHAR3.
        WEEK             : ANYDTDTE9. ;
RUN;

 

Super User
Posts: 11,134

Re: Import from Excel Problem: Data step vs Proc Import

This code destroyed in actual imported dataset:

DATA WORK.VACATION_SWAPS_TO_LOAD2;
    LENGTH
        FLEET            $ 32
        RANK             $ 3
        WEEK               8 ;
    SET WORK.VACATION_SWAPS_TO_LOAD2;

So you are not looking at the data as imported.

 

Please rerun the Proc import and then run Proc Contents and show the results.

 

I suspect that your treatment of an original DATE value as datetime reduced the values so much. Since a Date value in Excel of 1/1/2015 would be 42005 and you treated it as a Datetime (which is seconds) then that became day 0 which is 1Jan1960.

See this code:

data junk;
   x=42005;
   put x= datetime20.;
   y = datepart(42005);
   put y;

run;

I suspect the actual Import created a DATE value that you may have assumed was a datetime.

 

Occasional Contributor
Posts: 5

Re: Import from Excel Problem: Data step vs Proc Import

It looks like you are right, but that goes a bit beyond my question.  To be clear, I did not write any of the code above, it is all autogenerated by EG, and the only difference between the two is due to the SAS/ACCESS checkbox being checked or not.  I'm more interested in figuring out if what I'm expecting from checking that box is different from what I should be expecting to happen. 

Super User
Posts: 11,134

Re: Import from Excel Problem: Data step vs Proc Import

Please post the proc contents after the import. The comments you say about "all the dates are " indicate to me that you did not look at the imported data before manipulating it.

Or import it to one dataset using the proc import and then use the datastep to read the file into a differently named dataset and then run PROC COMPARE on the two before doing any manipulation.

 

You do not say how the TEXT file was created. It is possible that the step creating the file manipulates values.

Super User
Posts: 7,447

Re: Import from Excel Problem: Data step vs Proc Import

Excel files are among the worst types of data sources I ever came across in my time as a DWH admin/developer.

You are much better off saving the data to a textual format and importing that.

Why the MS-supplied Data Access Objects treats Excel dates as datetime values when exporting is a secret that probably will never be lifted.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Occasional Contributor
Posts: 5

Re: Import from Excel Problem: Data step vs Proc Import

Judging from the first few responses, I think I need to restate my problem/question.  I am not asking how to fix the code to get data into SAS correctly (which, again, is wholly autogenerated by EG); I can get the data in just fine so long as I don't use SAS/ACCESS.  I'm also not asking whether I should be using Excel or not.  For various reasons, Excel is the preferred method here for entering the data. In fact, I would probably prefer to use a CSV here, but as I have a method that can import the Excel data fine, and Excel is the preferred method for this project, it's staying in Excel.

 

The question I am asking is this:  Why does Enterprise Guide's (v7.1) Import Data task, when given identical input parameters, import data differently when using SAS/ACCESS or not?  Is this expected behavior?  If I use SAS/ACCESS do I need to use different input parameters?

Super User
Super User
Posts: 6,848

Re: Import from Excel Problem: Data step vs Proc Import

Why?  Because you are using two different methods.

If you ask it to use SAS/Access then it will use PROC IMPORT to read the file. Otherwise Enterprise Guide will convert the Excel file to a text file, copy the text file to your SAS server and write a data step to read it.

 

If you are always going to use Enterprise Guide and the source file is available to use on the machine where Enterprise Guide is running then I would recommend using that tool as I think you will have better control over the result.

 

Using PROC IMPORT to convert a file requires it to guess at what the file contains.

Ask a Question
Discussion stats
  • 6 replies
  • 620 views
  • 0 likes
  • 4 in conversation