Hi
I have a exel table (xlsx) with Date column where some values are text 'NULL' while some values have date in format mmddyy.
I'm able to import this data in sas eg thr' import wizard only with defining field attributes as date and not any other method.
the import wizard creates sas program which i can not use
so everytime i have to use same folder and location to immport and i have many files
which is a lot of manual wok
can i reduce manual work of importing xlsx files with mixed columns to a simple sas program which i can modify and use for other files too.
plz inform if i'm missing something.
Thanks in advance
What do you mean when you say "I cannot use"? When the text NULL is translated to a missing value in SAS that would be a valid result. Could you share that code and its log?
Otherwise you would have to create a character variable from your date and do some postprocessing like
if datevar_char=' NULL' then
datevar_num=.;
else
datevar_num=input(datevar_char, anydtdte.);
The informat ANYDTDTE could be replaced with any informat that works for your specific data.
Hope this helps,
- Jan.
Thanks jklaverstijn,
here is code i wrote
*proc import datafile="C:\Attrition Data\1607\Active_Employees.xlsx" dbms=xlsx out=abc replace; run;
options validvarname=any;
data at;
LENGTH
EmployeeCode $ 6
DateofJoin 8
DateOfRelieving 8;
FORMAT
EmployeeCode $CHAR6.
DateofJoin DATE9.
DateOfRelieving DATE9.;
INFORMAT
EmployeeCode $CHAR6.
DateofJoin DATE9.
DateOfRelieving DATE9.;
infile 'C:\Attrition Data\1607\Book1.xlsx'
LRECL=915
DLM='7F'x
MISSOVER
DSD ;
INPUT
EmployeeCode : $CHAR6.
DateofJoin : BEST32.
DateOfRelieving : ANYDTDTE10.;
RUN;
table i am using is Book1
EmployeeCode | DateofJoin | DateOfRelieving |
R1 | 1/1/2000 | NULL |
R2 | 1/2/2000 | 10/12/2012 |
R3 | 1/3/2000 | 10/12/2012 |
R4 | 1/4/2000 | NULL |
R5 | 1/5/2000 | NULL |
R6 | 1/6/2000 | 10/9/2012 |
R7 | 1/7/2000 | 10/15/2012 |
R8 | 1/8/2000 | NULL |
R9 | 1/9/2000 | NULL |
R10 | 1/10/2000 | NULL |
R11 | 1/11/2000 | 10/15/2012 |
R12 | 1/12/2000 | NULL |
R13 | 1/13/2000 | NULL |
R14 | 1/14/2000 | 10/15/2012 |
R15 | 1/15/2000 | NULL |
and the output i am getting is for data step >at
EmployeeCode DateofJoin DateOfRelieving
PK . .
output for proc import is (whre dates are from 01/01/1900 hence i can not do post processing after import.)
R1 01/01/2000 NULL
R2 01/02/2000 41194
R3 01/03/2000 41194
R4 01/04/2000 NULL
R5 01/05/2000 NULL
R6 01/06/2000 41191
R7 01/07/2000 41197
R8 01/08/2000 NULL
R9 01/09/2000 NULL
R10 01/10/2000 NULL
R11 01/11/2000 41197
R12 01/12/2000 NULL
R13 01/13/2000 NULL
R14 01/14/2000 41197
R15 01/15/2000 NULL
and expected output is
EmployeeCode | DateofJoin | DateOfRelieving |
R1 | 1/1/2000 | . |
R2 | 1/2/2000 | 10/12/2012 |
R3 | 1/3/2000 | 10/12/2012 |
R4 | 1/4/2000 | . |
R5 | 1/5/2000 | . |
R6 | 1/6/2000 | 10/9/2012 |
R7 | 1/7/2000 | 10/15/2012 |
R8 | 1/8/2000 | . |
R9 | 1/9/2000 | . |
R10 | 1/10/2000 | . |
R11 | 1/11/2000 | 10/15/2012 |
R12 | 1/12/2000 | . |
R13 | 1/13/2000 | . |
R14 | 1/14/2000 | 10/15/2012 |
R15 | 1/15/2000 | . |
Did you assign a FORMAT to DateOfRelieving?
I would expect to see a Format DateOfRelieving mmddyy10.; statement.
In short not easily. SAS doesn't have an easy way to specify the field types from an Excel. Your also compounded by using EG which may mean your files have to be uploaded to a server before importing them in an automated fashion.
In the EG wizard you can manually change the field type.
SAS Forum: Import Data with mixed column
https://goo.gl/HEclpw
https://communities.sas.com/t5/SAS-Enterprise-Guide/Import-Data-with-mixed-column/m-p/292974
HAVE this in Excel
(I cut and pasted the excel sheet from the solution BELOW)
EMPLOYEECODE DATEOFJOIN DATEOFRELIEVING
R1 1/1/2000 NULL
R2 1/2/2000 10/12/2012
R3 1/3/2000 10/12/2012
R4 1/4/2000 NULL
R5 1/5/2000 NULL
R6 1/6/2000 10/9/2012
R7 1/7/2000 10/15/2012
R8 1/8/2000 NULL
R9 1/9/2000 NULL
R10 1/10/2000 NULL
R11 1/11/2000 10/15/2012
R12 1/12/2000 NULL
R13 1/13/2000 NULL
R14 1/14/2000 10/15/2012
R15 1/15/2000 NULL
WANT SAS DATASET
Up to 40 obs from sample total obs=15
Obs EMPLOYEECODE DATEOFJOIN DATEOFRELIEVING
1 R1 2000-01-01 .
2 R2 2000-01-02 2012-10-12
3 R3 2000-01-03 2012-10-12
4 R4 2000-01-04 .
5 R5 2000-01-05 .
6 R6 2000-01-06 2012-10-09
7 R7 2000-01-07 2012-10-15
8 R8 2000-01-08 .
9 R9 2000-01-09 .
10 R10 2000-01-10 .
11 R11 2000-01-11 2012-10-15
12 R12 2000-01-12 .
13 R13 2000-01-13 .
14 R14 2000-01-14 2012-10-15
15 R15 2000-01-15 .
SOLUTION (This is the working code)
proc sql dquote=ansi;
connect to excel (Path="d:\xls\sample.xlsx" mixed=yes);
create table sample as
select * from connection to Excel
(
Select
EmployeeCode
,format(DateofJoin,'yyyy-mm-dd') as DateofJoin
,iif(DateOfRelieving="NULL",".",FORMAT(DateOfRelieving, 'yyyy-mm-dd')) as DateOfRelieving
from
sample
);
disconnect from Excel;
Quit;
COMMENTS
I believe it is possible to create an excel sheet where
the meta data on a cell type(format) does not
match text in the cell. Manually editing
a cell that has a date format does not change the underlying
meta data(type). Just type ROGER in a date formatted cell.
In this case I think you would
need to programatically create an image file and use
the SAS tesseract tool to convert the image to text.
Python OpenxL can extract the cell format(c.type)
full solution
%utlfkil(d:/xls/sample.xlsx);
libname xls "d:/xls/sample.xlsx";
data xls.sample;
informat EmployeeCode $3.
DateofJoin mmddyy10.
DateOfRelieving $10.;
format DateofJoin mmddyys10.;
input
EmployeeCode DateofJoin DateOfRelieving;
cards4;
R1 1/1/2000 NULL
R2 1/2/2000 10/12/2012
R3 1/3/2000 10/12/2012
R4 1/4/2000 NULL
R5 1/5/2000 NULL
R6 1/6/2000 10/9/2012
R7 1/7/2000 10/15/2012
R8 1/8/2000 NULL
R9 1/9/2000 NULL
R10 1/10/2000 NULL
R11 1/11/2000 10/15/2012
R12 1/12/2000 NULL
R13 1/13/2000 NULL
R14 1/14/2000 10/15/2012
R15 1/15/2000 NULL
;;;;
;run;quit;
libname xls clear;
libname xls "d:/xls/sample.xlsx";
proc print data=xls.sample width=min;
run;quit;
libname xls clear;
proc sql dquote=ansi;
connect to excel (Path="d:\xls\sample.xlsx" mixed=yes);
create table sample as
select * from connection to Excel
(
Select
EmployeeCode
,format(DateofJoin,'yyyy-mm-dd') as DateofJoin
,iif(DateOfRelieving="NULL",".",FORMAT(DateOfRelieving, 'yyyy-mm-dd')) as DateOfRelieving
from
sample
);
disconnect from Excel;
Quit;
WANT SAS DATASET
Up to 40 obs from sample total obs=15
Obs EMPLOYEECODE DATEOFJOIN DATEOFRELIEVING
1 R1 2000-01-01 .
2 R2 2000-01-02 2012-10-12
3 R3 2000-01-03 2012-10-12
4 R4 2000-01-04 .
5 R5 2000-01-05 .
6 R6 2000-01-06 2012-10-09
7 R7 2000-01-07 2012-10-15
8 R8 2000-01-08 .
9 R9 2000-01-09 .
10 R10 2000-01-10 .
11 R11 2000-01-11 2012-10-15
12 R12 2000-01-12 .
13 R13 2000-01-13 .
14 R14 2000-01-14 2012-10-15
15 R15 2000-01-15 .
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.