BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Barkat
Pyrite | Level 9

I have a dataset that contains both text and date value in  one column. When I import the dataset in SAS, that column is imported as char. How do I keep the dates in DATE9. format and convert the texts as missing in SAS?

 

identifiereffectiveFromTimeeffectiveThruTime
14/23/1987 23:00NULL
26/4/1987 0:557/1/1997 10:50
35/27/1987 8:207/15/2015 11:20
45/28/1987 5:208/4/2011 0:00
56/1/1987 14:56NULL
65/28/1987 4:50NULL
74/15/1987 0:20NULL
85/11/1987 17:069/6/1996 21:30
96/8/1987 10:407/1/1997 11:50

 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Columns that contained non numeric values were not recognized as numeric (including dates) during the import process. This is why you have to do some gymnastics, such as:

 

proc import datafile="/..../have.xlsx"
out=have
dbms=xlsx
replace;
getnames=yes;
run;

data want;
set have;
/* This field was recognized as a datetime during the import process. Convert SAS datetime to SAS date */ 
effectiveFromDate = datepart(effectiveFromTime);
/* This field was NOT recognized as a datetime during the import process. Read Windows date value */
WindowsDateValue = input(effectiveThruTime, ?? best.);
/* Convert to SAS date: Correct for reference date difference between Windows and SAS */
if not missing(WindowsDateValue) then effectiveThruDate = WindowsDateValue + '30DEC1899'd;
format effectiveFromDate effectiveThruDate date9.;
drop effectiveFromTime effectiveThruTime WindowsDateValue;
run;

proc print data=want noobs; run;
identifier 	personIdentifier 	closureReasonCode 	effectiveFromDate 	effectiveThruDate
1 	4 	NULL 	23APR1987 	.
2 	6 	NULL 	04JUN1987 	01JUL1997
3 	10 	2268 	27MAY1987 	15JUL2015
4 	13 	2425 	28MAY1987 	04AUG2011
5 	14 	NULL 	01JUN1987 	.
6 	18 	NULL 	28MAY1987 	.
7 	25 	NULL 	15APR1987 	.
8 	27 	NULL 	11MAY1987 	06SEP1996
9 	29 	NULL 	08JUN1987 	01JUL1997
10 	43 	NULL 	23JUN1987 	.
11 	46 	2425 	17NOV1987 	04AUG2011
12 	47 	2268 	11MAY1987 	04AUG2011
13 	48 	2425 	11MAY1987 	04AUG2011
14 	62 	NULL 	12AUG1987 	.
15 	63 	NULL 	30JUL1987 	.
16 	68 	2425 	12JUN1987 	04AUG2011
17 	81 	2572 	02JUN1987 	04AUG2011
18 	86 	2425 	28JUL1987 	04AUG2011
19 	87 	2425 	28JUL1987 	04AUG2011
20 	94 	NULL 	22SEP1987 	.
21 	98 	2425 	20APR1987 	04AUG2011
22 	99 	2140 	11JUN1987 	04AUG2011
23 	110 	NULL 	21AUG1987 	03JUL1997
24 	119 	2425 	22JUN1987 	04AUG2011
25 	121 	NULL 	07JUL1987 	.
PG

View solution in original post

10 REPLIES 10
PGStats
Opal | Level 21

Use the SCAN function to get the first part of the text, and the INPUT function with the ?? modifier to convert to a SAS date value or a missing value without issuing a warning:

 

data have;
input Identifier   effectiveFromTime :&$20.   effectiveThruTime :&$20.;
datalines;
1   4/23/1987 23:00  NULL
2   6/4/1987 0:55   7/1/1997 10:50
3   5/27/1987 8:20  7/15/2015 11:20
4   5/28/1987 5:20  8/4/2011 0:00
5   6/1/1987 14:56  NULL
6   5/28/1987 4:50  NULL
7   4/15/1987 0:20  NULL
8   5/11/1987 17:06  9/6/1996 21:30
9   6/8/1987 10:40  7/1/1997 11:50
;

data want;
set have;
effectiveFromDate = input(scan(effectiveFromTime, 1, " "), ?? mmddyy.);
effectiveThruDate = input(scan(effectiveThruTime, 1, " "), ?? mmddyy.);
format effectiveFromDate effectiveThruDate date9.;
drop effectiveFromTime effectiveThruTime;
run;
PG
Barkat
Pyrite | Level 9
Thanks PGStats. That worked. Could you explain the purpose of "??", please.
PGStats
Opal | Level 21

?? tells SAS not to issue a warning message when a data conversion fails, such as when you try to convert "NULL" into a SAS date.

PG
Barkat
Pyrite | Level 9
I am sorry to respond late. I just realized that the want dataset is not giving the correct date. The output dataset is as bellow.

Identifier effectiveFromDate effectiveThruDate
1 23-Apr-01 .
2 4-Jun-19 1-Jul-19
3 27-May-01 15-Jul-02
4 28-May-01 4-Aug-20
5 1-Jun-19 .
6 28-May-01 .
7 15-Apr-01 .
8 11-May-01 6-Sep-19
9 8-Jun-19 1-Jul-19
PGStats
Opal | Level 21

Oups. You need to give a length to the informat (the default is 6 - too short)

 

data want;
set have;
effectiveFromDate = input(scan(effectiveFromTime, 1, " "), ?? mmddyy10.);
effectiveThruDate = input(scan(effectiveThruTime, 1, " "), ?? mmddyy10.);
format effectiveFromDate effectiveThruDate date9.;
drop effectiveFromTime effectiveThruTime;
run;
PG
Barkat
Pyrite | Level 9

Thanks so much!
Unfortunately when I import the data from my excel file (attached here) it gives weird value. Could you help me survive, please. I used the following step to import file. I also saved the excel file into csv and imported, but that did not work as well.

proc import datafile="C:\Users\bullah\Desktop\have.xlsx"
out=have
dbms=xlsx
replace;
getnames=yes;
run;

 

PGStats
Opal | Level 21

Columns that contained non numeric values were not recognized as numeric (including dates) during the import process. This is why you have to do some gymnastics, such as:

 

proc import datafile="/..../have.xlsx"
out=have
dbms=xlsx
replace;
getnames=yes;
run;

data want;
set have;
/* This field was recognized as a datetime during the import process. Convert SAS datetime to SAS date */ 
effectiveFromDate = datepart(effectiveFromTime);
/* This field was NOT recognized as a datetime during the import process. Read Windows date value */
WindowsDateValue = input(effectiveThruTime, ?? best.);
/* Convert to SAS date: Correct for reference date difference between Windows and SAS */
if not missing(WindowsDateValue) then effectiveThruDate = WindowsDateValue + '30DEC1899'd;
format effectiveFromDate effectiveThruDate date9.;
drop effectiveFromTime effectiveThruTime WindowsDateValue;
run;

proc print data=want noobs; run;
identifier 	personIdentifier 	closureReasonCode 	effectiveFromDate 	effectiveThruDate
1 	4 	NULL 	23APR1987 	.
2 	6 	NULL 	04JUN1987 	01JUL1997
3 	10 	2268 	27MAY1987 	15JUL2015
4 	13 	2425 	28MAY1987 	04AUG2011
5 	14 	NULL 	01JUN1987 	.
6 	18 	NULL 	28MAY1987 	.
7 	25 	NULL 	15APR1987 	.
8 	27 	NULL 	11MAY1987 	06SEP1996
9 	29 	NULL 	08JUN1987 	01JUL1997
10 	43 	NULL 	23JUN1987 	.
11 	46 	2425 	17NOV1987 	04AUG2011
12 	47 	2268 	11MAY1987 	04AUG2011
13 	48 	2425 	11MAY1987 	04AUG2011
14 	62 	NULL 	12AUG1987 	.
15 	63 	NULL 	30JUL1987 	.
16 	68 	2425 	12JUN1987 	04AUG2011
17 	81 	2572 	02JUN1987 	04AUG2011
18 	86 	2425 	28JUL1987 	04AUG2011
19 	87 	2425 	28JUL1987 	04AUG2011
20 	94 	NULL 	22SEP1987 	.
21 	98 	2425 	20APR1987 	04AUG2011
22 	99 	2140 	11JUN1987 	04AUG2011
23 	110 	NULL 	21AUG1987 	03JUL1997
24 	119 	2425 	22JUN1987 	04AUG2011
25 	121 	NULL 	07JUL1987 	.
PG
Barkat
Pyrite | Level 9
Thanks! You are a life saver.
Patrick
Opal | Level 21

One of the challenges is that SAS will import/convert the source date string differently if there is or isn't a string in one of the cells. 

To ensure SAS always imports the data the same way below code uses getnames=no for Proc Import. This will treat the header row as data and as these headers are always strings will create all SAS columns of type character. This guarantees a stable outcome of Proc Import independent what values you've actually got in the cells (like all Date strings or also come cells with word NULL).

Now that you've got a stable result from the import into SAS you can write code to post-process your data and create table want.

/* import Excel sheet
   - use getnames=no to guarantee that columns always character
*/
proc import 
  out=inter
  datafile='~/test/have.xlsx' 
  dbms=xlsx 
  replace
  ;
  sheet='Sheet1';
  getnames=no;
 run;

proc print data=inter;
run;

/* convert data to desired result */
data want;
  /* Exclude column names: start reading from obs 2 */
  set inter(firstobs=2);
  /* remove NULL string from source data */
  array vars {*} _character_;
  do _i=1 to dim(vars);
    vars[_i]=tranwrd(vars[_i],'NULL',' ');
  end;
  /* convert Excel date values to SAS Date values */
  format effectiveFromTime effectiveThruTime date9.;
  effectiveFromTime=input(d,?? best32.)-21916;
  effectiveThruTime=input(e,?? best32.)-21916;
  /* rename and drop columns to get desired output structure */
  rename 
    a=identifier
    b=personIdentifier
    c=closureReason
    ;
  drop d e _:;
run;

proc print data=want;
run;
Barkat
Pyrite | Level 9
Thanks so much!