Join Now

Tired of annoying messages using date informats and the DMY function on invalid dates?

by Occasional Contributor PeterKellberg on ‎10-13-2014 07:23 AM (763 Views)

Hopefully English is OK. Then our SAS friends in other countries may benefit from the article as well.

Dates in text variables and their conversion into SAS dates

If you are dealing with dates in text variables you would probably want to convert them into SAS dates.

The INPUT function

One option is to use the INPUT function using a suitable informat that corresponds to the date structure. The list of date informats is long so you should be able to pick up the right one.

If you  are dealing with a complete mess of different date structures the ANYDTDTE informat may be the solution.

 

An example

 

dateC="02/05/2014";

SASDate=input(dateC,ddmmyy10.);

This is what happens if the date is invalid:

 

NOTE: Invalid argument to function INPUT at line 5376 column 12.

dateC=02/13/2014 SASDate=. _ERROR_=1 _N_=1

NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to missing values.

      Each place is given by: (Number of times) at (Line)Smiley SadColumn).

      1 at 5376:12

 

In order to get rid of those messages use the following construction:

SASDate=input(dateC,?? ddmmyy10.);

The MDY function

The MDY function also yields messages if the combination of month, day and year is invalid:

NOTE: Invalid argument to function MDY(13,12,2014) at line 5382 column 12.

month=13 day=12 year=2014 SASDate=. _ERROR_=1 _N_=1

NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to missing values.

      Each place is given by: (Number of times) at (Line)Smiley SadColumn).

      1 at 5382:12

There is - as far as I know - no reasonable way to get rid of the messages from the MDY function. Unless you make your own DMY function.

Your own MDY function

This homemade function is stable regarding invalid month, day,year combinations. It returns a SAS date in case of valid dates and a missing value if not. No hiccups in the LOG.

proc fcmp outlib=work.funcs.funcs;
   function MDYDate(month, day, year);

      if month < 1 or month > 12 then /* month 1-12 */
         return;

      if day < 1 or day > 31 then /* day 1-31 */
         return;

      if year < 1582 or year > 19900 then /* year 1582-19900  */
         return;

      /* How many days in each month? */
      array daysInMonth [12] _temporary_;
      daysInMonth[1]= 31;
      daysInMonth[2]= ifn((mod(year, 4)=0 and mod(year, 100)^=0) or mod(year, 400)=0,29,28); /* test for leap year */
      daysInMonth[3]= 31;
      daysInMonth[4]= 30;
      daysInMonth[5]= 31;
      daysInMonth[6]= 30;
      daysInMonth[7]= 31;
      daysInMonth[8]= 31;
      daysInMonth[9]= 30;
      daysInMonth[10]=31;
      daysInMonth[11]=30;
      daysInMonth[12]=31;

      if day > daysInMonth[month] then /* day cannot exceed number of days in actual month */
         return;

      /* date from day/month/year is OK. Return the SAS date */   
      return(MDY(month,day,year));
   endsub;
run;

And the test:

options cmplib=work.funcs;

data test;
   SASDate=MDYdate(2,29,2011); /* invalid date */

   output;

   SASDate=MDYdate(2,28,2011); /* valid date */

   output; 

   format SASDate ddmmyy10.;
run;

proc print data=test;

run;

The LOG is nice and clean:

93   options cmplib=work.funcs;
94
95   data test;
96      SASDate=MDYdate(2,29,2011); /* invalid date */
97      output;
98      SASDate=MDYdate(2,28,2011); /* valid date */
99      output;
100     format SASDate ddmmyy10.;
101  run;

NOTE: The data set WORK.TEST has 2 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.06 seconds
      cpu time            0.06 seconds

Comments
by SAS Employee GeorgMorsing
on ‎10-13-2014 09:29 AM

Rigtig godt eksempel på hvor nyttigt det kan være at lave sine egne SAS funktioner.

by New Contributor OskarBak
on ‎10-21-2014 05:37 AM

Interessant metode, kunne forestille mig at den er specielt brugbar når ved at der eksisterer kendte/gentagne fejl i data som man automatisk vil sorterer fra.

Hvis man forventer at data kun indeholder korrekte datoer, så giver den "almindelige" metode stadig brugbar info i loggen ift. hvor det er fejlene er.

by Occasional Contributor PeterKellberg
on ‎10-21-2014 05:48 AM

Jo, det kan du have ret i, men vi bruger jo netop dato-informater og MDY-funktionen, når vi har tekstdatoer, som vi vil validere. Og ugyldige datoer bliver til missing values. Min pointe er bare, at i hvert fald DMY-funktionen ikke nødvendigvis burde hoste over ugyldige datoer fx hvis den havde denne udforming: dato=MDY(month,day,year,'NOERROR').

by New Contributor OskarBak
on ‎10-21-2014 05:57 AM

Ja det kan jeg godt se. Outputtet fra de to forskellige MDY funktioner er det samme (missing values ved invalid datoer), men du slipper for den lange fejllog. Fejlene kan du lokalisere ved at kigge missing values igennem, og så kan jeg godt se, at det ikke er nødvendigt at finde de samme fejl i loggen også.