BookmarkSubscribeRSS Feed
Solph
Pyrite | Level 9

I know I can use INPUT to convert a character variable to a date variable. But what if one of the date components is missing or out of range. Here is an example with some of the wrong day component value. Is there an easy way or a function to tell SAS to replace the out of range or missing as say day 1 or day 31 or any day?

Thanks.

data aa; input id date $;

datalines;

1  20110304

2  20090405

3  20100399

4  20100300

5  201004

;

run;

data bb; set aa;

format newdate yymmdd10.;

newdate=input(date, yymmdd10.);

proc print noobs; run;

id   date        newdate     (Ideal output)

1 20110304  2011-03-04

2 20090405  2009-04-05

3 20100399    .                2010-03-01 (or 2010-03-15, or 2010-03-31)

4 20100300    .                2010-03-01

4 201004     1920-10-04    2010-04-01

14 REPLIES 14
MikeZdeb
Rhodochrosite | Level 12

hi ... if it's always a problem with just the day part (out of range or missing) ...

data bb;

set aa;

newdate = input(catt(date,'00'), ?? yymmdd8.);

if missing(newdate) then

     newdate = mdy(input(substr(date,5,2),2.), 1 , input(substr(date,5,2),2.));

format newdate yymmdd10.;

run;


id      date         newdate

1    20110304    2011-03-04

2    20090405    2009-04-05

3    20100399    2003-03-01

4    20100300    2003-03-01

5    201004      2004-04-01

PGStats
Opal | Level 21

Hi Mike, nice approach! I guess you meant

     newdate = mdy(input(substr(date,5,2),2.), 1 , input(substr(date,1,4),4.));

PG

PG
MikeZdeb
Rhodochrosite | Level 12

hi ... thanks, so much for my cutting/pasting my own code

also the CATT function part should have had '01' ... the FINAL version (using Astounding's idea about the year portion)

data bb;

set aa;

newdate = input(catt(date,'01'), ?? yymmdd8.);

if missing(newdate) then

     newdate = mdy(input(substr(date,5,2),2.), 1 , input(date,4.));

format newdate yymmdd10.;

run;

Linlin
Lapis Lazuli | Level 10

Hi Mike,

I made changes to your code.

data bb;

set aa;

newdate = input(catt(date,'00'), ?? yymmdd8.);

if missing(newdate) then

     newdate = mdy(input(substr(date,5,2),2.), 1 , input(substr(date,1,4),4.));

format newdate yymmdd10.;

run;

proc print;run;

Astounding
PROC Star

OK, as long as we are working on each other's code, consider this:

input(substr(date,1,4),4.)

It could be replaced with:

input(date,4.)

Good luck all.

PGStats
Opal | Level 21

Something like :

newdate=mdy(

     input(substr(date,5,2),2.),

     max(1,min(31,input(coalescec(substr(date,7,2),"01"),2.))),

     input(substr(date,1,4),4.) );

PG

PG
data_null__
Jade | Level 19

The ISO 8601 INFORMATS can read partial 8601 dates but you have to get ride of that silly 99 and 00.

data aa;
  
input id date $;
   if length(date) eq 8 and substr(date,7,2) in('00','99') then substr(date,7)='  ';
   iso = input(date,
B8601DA8.);
   format iso yymmddn.;
  
datalines;
1  20110304
8  2011
2  20090405
3  20100399
4  20100300
5  201004
;;;;
   run;
Solph
Pyrite | Level 9

Thanks all. I just tested them all and they all worked.

Regarding data_null_'s solution, Does SAS have a similar function like B8601DA8. to handle the month part being blank?

data_null__
Jade | Level 19

Yes see my example.  When month is blank the date is set to first day of the year.

Solph
Pyrite | Level 9

With your example, it worked great. But I added another case with "2010  31", it returned with a missing iso value for that record. Any solution?

data aa;

   input id @4 date $char8.;

   if length(date) eq 8 and substr(date,7,2) in('00','99') then substr(date,7)='  ';

   iso = input(date,B8601DA8.);

   format iso yymmddn.;

   datalines;

1  20110304

8  2011 

9  2010  31

2  20090405

3  20100399

4  20100300

5  201004

;;;;

run;

proc print; run;

data_null__
Jade | Level 19

If you don't know the month how can the day be useful.  The ISO 8601 standard only alows truncation from the right.

MikeZdeb
Rhodochrosite | Level 12

hi ... you can add the TRANWRD function to the data step if you want a blank month as January and a blank day as the 1st of the month ...

if you do that, then you can revert to YYMMDD8. if you also correct the 00 and 99

data aa;

   input id @4 date $char8.;

   if length(date) eq 8 and substr(date,7,2) in('00','99') then substr(date,7)='  ';

    date = tranwrd(date,'  ','01');

   newdate = input(date,yymmdd8.);

   format iso yymmddn.;

   datalines;

1  20110304

8  2011

9  2010  31

2  20090405

3  20100399

4  20100300

5  201004

;

Solph
Pyrite | Level 9

Many thanks Mike. Indeed the code works both ways (with either day or month, or both, missing).

> If you don't know the month how can the day be useful.

Data _null, it's just that the date variable wasn't properly processed as a date variable, the integrity of year/month/day isn't always checked, and we'd rather have proxy date values than assign them as missing simply because some date components are missing.

Again thanks everyone for your extremely helpful answers (helped me to see how it can be resolved differently).

Linlin
Lapis Lazuli | Level 10

Hi Solph,

I am curious why you always check your questions as "Assumed Answered"?

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 3227 views
  • 0 likes
  • 6 in conversation