Hi there,
I'm getting some raw data that has the following dates in the file: 01.01.0001 but when I read it in, it comes in as 01.01.2001. Here is an example:
DATA temp;
input polno $
prdtstrt $10.
prdtend $10.;
cards;
AA1 01.01.000131.12.9999
AA2 01.01.000131.12.9999
AA3 01.01.200115.10.2009
AA4 01.11.201001.11.2011
AA5 01.03.201101.03.2012
AA6 01.05.201101.05.2011
;
run;
I need to be able to distinguish 0001 from 2001. I don't mind if all these values are set to missing, just don't want them in the dataset as 2001.
Thanks,
Christy
There's a disconnect here. The code uses a character informat ($10.) but the narrative describes behavior of a date informat (ddmmyy10.). So consider instead this example:
options yearcutoff=5050 ;
data _null_ ;;
input vtext $10.
@1 vdate ddmmyy10. ;
put vtext= vdate= date9. / ;
cards ;
01.01.0000
01.01.0001
31.12.0049
01.01.0050
31.12.0099
01.01.0100
31.12.1581
01.01.1582
;
Results:
vtext=01.01.0000 vdate=01JAN5100
vtext=01.01.0001 vdate=01JAN5101
vtext=31.12.0049 vdate=31DEC5149
vtext=01.01.0050 vdate=01JAN5050
vtext=31.12.0099 vdate=31DEC5099
vtext=01.01.0100 vdate=.
vtext=31.12.1581 vdate=.
vtext=01.01.1582 vdate=01JAN1582
So years from 0001 to 0099, which should be rejected as out-of-range, are instead processed using the YEARCUTOFF option, as if the two leading zeroes weren't there. Documentation says that YEARCUTOFF only applies to 2-digit years.
Here's a workaround. Define an informat like
proc format ;
invalue dmy111_
'01.01.0001' = .
other = [ddmmyy10.] ;
run ;
and use it to process the date fields.
christyh wrote:
Hi there,
I'm getting some raw data that has the following dates in the file: 01.01.0001 but when I read it in, it comes in as 01.01.2001. Here is an example:
DATA temp;
input polno $
prdtstrt $10.
prdtend $10.;
cards;
AA1 01.01.000131.12.9999
AA2 01.01.000131.12.9999
AA3 01.01.200115.10.2009
AA4 01.11.201001.11.2011
AA5 01.03.201101.03.2012
AA6 01.05.201101.05.2011
;
run;
I need to be able to distinguish 0001 from 2001. I don't mind if all these values are set to missing, just don't want them in the dataset as 2001.
Thanks,
Christy
Hi Crhirsy,
Use YEARCUTOFFoption
If you attempt to read in an external file without scrubbing the data, SAS will convert dates older than 1582 A.D. to null dates.
SAS can perform calculations on dates ranging from A.D. 1582 to A.D. 19,900
http://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a002200738.htm
Randy
Hi Randy,
Thanks for your response... so in this case, the raw year is 0001, why is it being imputted as 2001? Based on your response I believe the value should be null (and that would be great) but it actually is being read in as a 2001 date which I don't want.
Cheers,
Christy
You could always simply parse the string into numbers that represent month, day and year. That way, you could treat years like 0001 however you care to treat them.
e.g.,
DATA temp;
input polno $
prdtstrt $10.
prdtend $10.;
prdtstrt_day=input(scan(prdtstrt,1,"."),best12.);
prdtstrt_month=input(scan(prdtstrt,2,"."),best12.);
prdtstrt_year=input(scan(prdtstrt,3,"."),best12.);
prdtend_day=input(scan(prdtend,1,"."),best12.);
prdtend_month=input(scan(prdtend,2,"."),best12.);
prdtend_year=input(scan(prdtend,3,"."),best12.);
cards;
AA1 01.01.000131.12.9999
AA2 01.01.000131.12.9999
AA3 01.01.200115.10.2009
AA4 01.11.201001.11.2011
AA5 01.03.201101.03.2012
AA6 01.05.201101.05.2011
;
run;
There's a disconnect here. The code uses a character informat ($10.) but the narrative describes behavior of a date informat (ddmmyy10.). So consider instead this example:
options yearcutoff=5050 ;
data _null_ ;;
input vtext $10.
@1 vdate ddmmyy10. ;
put vtext= vdate= date9. / ;
cards ;
01.01.0000
01.01.0001
31.12.0049
01.01.0050
31.12.0099
01.01.0100
31.12.1581
01.01.1582
;
Results:
vtext=01.01.0000 vdate=01JAN5100
vtext=01.01.0001 vdate=01JAN5101
vtext=31.12.0049 vdate=31DEC5149
vtext=01.01.0050 vdate=01JAN5050
vtext=31.12.0099 vdate=31DEC5099
vtext=01.01.0100 vdate=.
vtext=31.12.1581 vdate=.
vtext=01.01.1582 vdate=01JAN1582
So years from 0001 to 0099, which should be rejected as out-of-range, are instead processed using the YEARCUTOFF option, as if the two leading zeroes weren't there. Documentation says that YEARCUTOFF only applies to 2-digit years.
Here's a workaround. Define an informat like
proc format ;
invalue dmy111_
'01.01.0001' = .
other = [ddmmyy10.] ;
run ;
and use it to process the date fields.
christyh wrote:
Hi there,
I'm getting some raw data that has the following dates in the file: 01.01.0001 but when I read it in, it comes in as 01.01.2001. Here is an example:
DATA temp;
input polno $
prdtstrt $10.
prdtend $10.;
cards;
AA1 01.01.000131.12.9999
AA2 01.01.000131.12.9999
AA3 01.01.200115.10.2009
AA4 01.11.201001.11.2011
AA5 01.03.201101.03.2012
AA6 01.05.201101.05.2011
;
run;
I need to be able to distinguish 0001 from 2001. I don't mind if all these values are set to missing, just don't want them in the dataset as 2001.
Thanks,
Christy
Thanks so much! that's exactly what I was after
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.