Desktop productivity for business analysts and programmers

Import Data with mixed column

Reply
New Contributor
Posts: 2

Import Data with mixed column

[ Edited ]

Hi 

I have a exel table (xlsx) with Date column where some values are text 'NULL' while some values have date in format mmddyy.

I'm able to import this data in sas eg thr' import wizard only with defining field attributes as date and not any other method.

the import wizard creates sas program which i can not use 

so everytime i have to use same folder and location to immport and i have many files

which is a lot of manual wok 

can i reduce manual work of importing xlsx files with mixed columns to a simple sas program which i can modify and use for other files too.

 

plz inform if i'm missing something. 

Thanks in advance

Super Contributor
Posts: 406

Re: Import Data with mixed column

What do you mean when you say "I cannot use"? When the text NULL is translated to a missing value in SAS that would be a valid result. Could you share that code and its log? 

 

Otherwise you would have to create a character variable from your date and do some postprocessing like

 

if datevar_char=' NULL' then 
   datevar_num=.;
else
   datevar_num=input(datevar_char, anydtdte.);

The informat ANYDTDTE could be replaced with any informat that works for your specific data.

 

Hope this helps,

- Jan.

New Contributor
Posts: 2

Re: Import Data with mixed column

*proc import datafile="C:\Attrition Data\1607\Active_Employees.xlsx" dbms=xlsx out=abc replace; run;



options validvarname=any;


data at;
    LENGTH
        EmployeeCode     $ 6
        DateofJoin         8
        DateOfRelieving    8;
    FORMAT
        EmployeeCode     $CHAR6.
        DateofJoin       DATE9.
        DateOfRelieving  DATE9.;
    INFORMAT
        EmployeeCode     $CHAR6.
        DateofJoin       DATE9.
        DateOfRelieving  DATE9.;
infile 'C:\Attrition Data\1607\Book1.xlsx'
       LRECL=915
        DLM='7F'x
        MISSOVER
        DSD ;
    INPUT
        EmployeeCode     : $CHAR6.
        DateofJoin       : BEST32.
        DateOfRelieving  : ANYDTDTE10.;
RUN;

table i am using is Book1

 EmployeeCode        DateofJoin          DateOfRelieving 
R11/1/2000NULL
R21/2/200010/12/2012
R31/3/200010/12/2012
R41/4/2000NULL
R51/5/2000NULL
R61/6/200010/9/2012
R71/7/200010/15/2012
R81/8/2000NULL
R91/9/2000NULL
R101/10/2000NULL
R111/11/200010/15/2012
R121/12/2000NULL
R131/13/2000NULL
R141/14/200010/15/2012
R151/15/2000NULL

and the output i am getting is for data step >at

EmployeeCode DateofJoin DateOfRelieving
PK . .

output for proc import is (whre dates are from 01/01/1900 hence i can not do post processing after import.)

R1 01/01/2000 NULL
R2 01/02/2000 41194
R3 01/03/2000 41194
R4 01/04/2000 NULL
R5 01/05/2000 NULL
R6 01/06/2000 41191
R7 01/07/2000 41197
R8 01/08/2000 NULL
R9 01/09/2000 NULL
R10 01/10/2000 NULL
R11 01/11/2000 41197
R12 01/12/2000 NULL
R13 01/13/2000 NULL
R14 01/14/2000 41197
R15 01/15/2000 NULL

 

and expected output is 

 EmployeeCode        DateofJoin          DateOfRelieving 
R11/1/2000.
R21/2/200010/12/2012
R31/3/200010/12/2012
R41/4/2000.
R51/5/2000.
R61/6/200010/9/2012
R71/7/200010/15/2012
R81/8/2000.
R91/9/2000.
R101/10/2000.
R111/11/200010/15/2012
R121/12/2000.
R131/13/2000.
R141/14/200010/15/2012
R151/15/2000.
Grand Advisor
Posts: 10,251

Re: Import Data with mixed column

Did you assign a FORMAT to DateOfRelieving?

 

I would expect to see a Format DateOfRelieving  mmddyy10.; statement.

Grand Advisor
Posts: 17,464

Re: Import Data with mixed column

In short not easily. SAS doesn't have an easy way to specify the field types from an Excel. Your also compounded by using EG which may mean your files have to be uploaded to a server before importing them in an automated fashion. 

 

In the EG wizard you can manually change the field type. 

Valued Guide
Posts: 505

Re: Import Data with mixed column

SAS Forum: Import Data with mixed column

https://goo.gl/HEclpw
https://communities.sas.com/t5/SAS-Enterprise-Guide/Import-Data-with-mixed-column/m-p/292974

HAVE this in Excel
(I cut and pasted the excel sheet from the solution BELOW)


EMPLOYEECODE  DATEOFJOIN       DATEOFRELIEVING
R1              1/1/2000            NULL
R2              1/2/2000            10/12/2012
R3              1/3/2000            10/12/2012
R4              1/4/2000            NULL
R5              1/5/2000            NULL
R6              1/6/2000            10/9/2012
R7              1/7/2000            10/15/2012
R8              1/8/2000            NULL
R9              1/9/2000            NULL
R10             1/10/2000           NULL
R11             1/11/2000           10/15/2012
R12             1/12/2000           NULL
R13             1/13/2000           NULL
R14             1/14/2000           10/15/2012
R15             1/15/2000           NULL

WANT SAS DATASET

Up to 40 obs from sample total obs=15

Obs    EMPLOYEECODE    DATEOFJOIN    DATEOFRELIEVING

  1        R1          2000-01-01      .
  2        R2          2000-01-02      2012-10-12
  3        R3          2000-01-03      2012-10-12
  4        R4          2000-01-04      .
  5        R5          2000-01-05      .
  6        R6          2000-01-06      2012-10-09
  7        R7          2000-01-07      2012-10-15
  8        R8          2000-01-08      .
  9        R9          2000-01-09      .
 10        R10         2000-01-10      .
 11        R11         2000-01-11      2012-10-15
 12        R12         2000-01-12      .
 13        R13         2000-01-13      .
 14        R14         2000-01-14      2012-10-15
 15        R15         2000-01-15      .


SOLUTION (This is the working code)

proc sql dquote=ansi;
  connect to excel (Path="d:\xls\sample.xlsx" mixed=yes);
    create table sample as
    select * from connection to Excel
        (
         Select
               EmployeeCode
              ,format(DateofJoin,'yyyy-mm-dd') as DateofJoin
              ,iif(DateOfRelieving="NULL",".",FORMAT(DateOfRelieving, 'yyyy-mm-dd')) as DateOfRelieving
         from
              sample
        );
    disconnect from Excel;
Quit;

COMMENTS

I believe it is possible to create an excel sheet where
the meta data on a cell type(format) does not
match text in the cell. Manually editing
a cell that has a date format does not change the underlying
meta data(type). Just type ROGER in a date formatted cell.
In this case I think you would
need to programatically create an image file and use
the SAS tesseract tool to convert the image to text.
Python OpenxL can extract the cell format(c.type)

full solution

%utlfkil(d:/xls/sample.xlsx);
libname xls "d:/xls/sample.xlsx";
data xls.sample;
  informat EmployeeCode $3.
  DateofJoin mmddyy10.
  DateOfRelieving $10.;
  format DateofJoin mmddyys10.;
  input
    EmployeeCode   DateofJoin   DateOfRelieving;
cards4;
R1 1/1/2000 NULL
R2 1/2/2000 10/12/2012
R3 1/3/2000 10/12/2012
R4 1/4/2000 NULL
R5 1/5/2000 NULL
R6 1/6/2000 10/9/2012
R7 1/7/2000 10/15/2012
R8 1/8/2000 NULL
R9 1/9/2000 NULL
R10 1/10/2000 NULL
R11 1/11/2000 10/15/2012
R12 1/12/2000 NULL
R13 1/13/2000 NULL
R14 1/14/2000 10/15/2012
R15 1/15/2000 NULL
;;;;
;run;quit;
libname xls clear;

libname xls "d:/xls/sample.xlsx";
proc print data=xls.sample width=min;
run;quit;
libname xls clear;


proc sql dquote=ansi;
  connect to excel (Path="d:\xls\sample.xlsx" mixed=yes);
    create table sample as
    select * from connection to Excel
        (
         Select
               EmployeeCode
              ,format(DateofJoin,'yyyy-mm-dd') as DateofJoin
              ,iif(DateOfRelieving="NULL",".",FORMAT(DateOfRelieving, 'yyyy-mm-dd')) as DateOfRelieving
         from
              sample
        );
    disconnect from Excel;
Quit;


WANT SAS DATASET

Up to 40 obs from sample total obs=15

Obs    EMPLOYEECODE    DATEOFJOIN    DATEOFRELIEVING

  1        R1          2000-01-01      .
  2        R2          2000-01-02      2012-10-12
  3        R3          2000-01-03      2012-10-12
  4        R4          2000-01-04      .
  5        R5          2000-01-05      .
  6        R6          2000-01-06      2012-10-09
  7        R7          2000-01-07      2012-10-15
  8        R8          2000-01-08      .
  9        R9          2000-01-09      .
 10        R10         2000-01-10      .
 11        R11         2000-01-11      2012-10-15
 12        R12         2000-01-12      .
 13        R13         2000-01-13      .
 14        R14         2000-01-14      2012-10-15
 15        R15         2000-01-15      .



Ask a Question
Discussion stats
  • 5 replies
  • 313 views
  • 2 likes
  • 5 in conversation