Help using Base SAS procedures

Compund date informats

Accepted Solution Solved
Reply
Contributor KAZ
Contributor
Posts: 25
Accepted Solution

Compund date informats

I have a data sources that contains a column (string) that is not consistent with respect to how it stores dates.  Some are in the form mm/dd/yyyy, some are mm/yyyy and some are yyyy.

 

The anydtdte21. informat performs admirably, with the excpetion of just a year given.  In that case, it returns a missing value.

 

How do I create an informat that tries anydtdte21. first, and if it fails assumes the input is in the format YYYY?

 

This code is used as an example

data _a;
    length dtTXT $24.;
    dtTxt = "03/01/2015";     output;
    dtTxt = "02/2015";         output;
    dtTXT = "2015";            output;
run;

data _b;
    set _a;
    format _dt _dt2 date.;

    *** I would like to replace this code ***;
    _dt = input(dtTxt, anydtdte21.);
    if _dt = . then _dt = input("01/01/"||dtTxt, anydtdte21.);

    **** REPLACE anydtdte21. here with a custome informat to achieve the same results as the above code ****;
    _dt2 = input(dtTxt, anydtdte21.);

run;

proc print data=_b; run;

I would like to create an informat that correctly translates all three obs.

 

 


Accepted Solutions
Solution
‎07-27-2017 02:07 PM
Super User
Super User
Posts: 7,076

Re: Compund date informats

The B8601DA4. format will read four digit years as the first day of the year.

 

proc format;   
  invalue messyDate
    "/^\d{1,2}\D\d{1,2}\D\d{4}/" (regexp) = [mmddyy10.]    
    "/^\d{4}/" (regexp)       = [B8601DA.]
    other           = [anydtdte32.]
  ;
run;

View solution in original post


All Replies
Super User
Posts: 11,343

Re: Compund date informats

You aren't going to find an informat that will handle all three. I've been somewhat curious about how SAS goes about implementing the Anydtdte and related informats as the tools we have for building informats with Proc Format don't seem possible to build them.

 

If you insist on a single line of code you could create a function that would contain your existing code to return a date value but I doubt the added complexity helps much.

 

Super User
Posts: 19,861

Re: Compund date informats

If you only have 3 variations you may as well read it in as a character, check the length and use the appropriate format for the length.

 

if length(char_date) = 4 then date=mdy(1, 1,  input(char_date, 4.));
else date=input(char_date, anydtdte.);
PROC Star
Posts: 7,491

Re: Compund date informats

I really wish that my/our colleagues would stop saying that something can't be done. That has taken me away from my real work for more hours than I'd care to count.

 

Regardless, here is your custom informat:

data _a;
    length dtTXT $24.;
    dtTxt = "03/01/2015";     output;
    dtTxt = "02/2015";        output;
    dtTXT = "2015";           output;
run;

data infmt;
  retain fmtname "yearonly" type "I" ;
  do start=1900 to 2100;
    label = mdy(1,1,start); 
    output ;
  end ;
run;
proc format cntlin = infmt ; run ;
proc format; invalue dte 1900-3000=yearonly. other=anydtdte21.; run; data _b; set _a; format _dt _dt2 date.; *** I would like to replace this code ***; _dt = input(dtTxt, anydtdte21.); if _dt = . then _dt = input("01/01/"||dtTxt, anydtdte21.); **** REPLACE anydtdte21. here with a custom informat to achieve the same results as the above code ****; _dt2 = input(dtTxt, dte.); run;

Art, CEO, AnalystFinder.com

 

 

Super User
Posts: 7,854

Re: Compund date informats

Actually, the anydtdte format does not perform "admirably", it results in wrong values for your dates:

data test;
input datestring :$21.;
date = input(datestring,anydtdte21.);
format date date9.;
cards;
03/01/2015
12/12/2017
12/10/2017
10/12/2017
10/13/2017
12/2017
2017
;
run;

proc print data=test noobs;
run;

Result:

datestring         date

03/01/2015    03JAN2015
12/12/2017    12DEC2017
12/10/2017    12OCT2017
10/12/2017    10DEC2017
10/13/2017    13OCT2017
12/2017       01DEC2017
2017                  .

As you can see, it only uses MMDDYY when the second digits exceed 12, otherwise you get DDMMYY.

I would never use that format because of such shortcomings. Instead expand shorter strings to your likings with clearly defined default values, and then use one matching format that only accepts one defined sequence. If you allow the computer to do your thinking for you, you'll end up with grief. Computers are dumb.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super Contributor
Posts: 345

Re: Compund date informats

[ Edited ]

idea: Using regex in proc format and a function created with proc fcmp. The function is necessary because in a format defnition only functions with zero or one argument can be called and we need mdy().

 

EDIT: Fixed the format definition, still uses anydtdte - could be replaced by another function.

EDIT2: Fixed bad format of code.

 

data work.have;
    length dtTXT $24.;
    dtTxt = "03/01/2015";     output;
    dtTxt = "02/2015";        output;
    dtTXT = "2015";           output;
run;

proc fcmp outlib=work.funcs.dates;
   function createDate(YearStr $);
      return (mdy(1, 1, left(YearStr)));
   endsub;
run;

options cmplib= work.funcs;


proc format;   
   invalue GetDate
      "/^\d\d\/\d\d\/\d{4}/" (regexp) = [mmddyy10.]
      "/^\d\d\/\d{4}/" (regexp) = [anydtdte.]
      "/^\d{4}/" (regexp) = [createDate()]
      other = 0
   ;
run;

data work.want;
   set have;

   format d Date9.;

   put dtTxt=;

   d = input(dtTxt, GetDate.);
run;

 

 

Contributor KAZ
Contributor
Posts: 25

Re: Compund date informats

Thanks all for the feedback.  I am trying to avoid using a user defned function, and want to accomplish this with informats.

 

I've combined several of the suggestions and have come up with this:

(Thanks @KurtBremser for pointing out anydtdte's, ummm, "feature").

 

data infmt;
  retain fmtname "yyyy" type "I" ;
  do start=0001 to 9999;
    label = mdy(1,1,start); 
    output ;
  end ;
run;
proc format cntlin = infmt ; run ;

proc format;   
	invalue messyDate
		"/^\d{1,2}\D\d{1,2}\D\d{4}/" (regexp)	= [mmddyy10.]    
		"/^\d{4}/" (regexp) 			= [yyyy.]
		other 					= [anydtdte32.]
	;
run;

data test;
infile datalines truncover dlm="|";
input datestring :$64.; 			format datestring $64.;
formatedDate = input(datestring, messyDate.);	format formatedDate date11.; 
cards;
12/12/2017
12/10/2017
10/12/2017
10/13/2017
03/01/2015
03-01-2015
03.01.2015
03 01 2015 3/1/2015 03/2015 3/2015 2017 1March2015 01March2015 1MAR2015 Mar 01 2015 MAR 01, 2015 March 1, 2015 March 1 2015 Mar 2015 March 2015 March, 2015 ; run; proc print data=test; run;

As you can see, I was a little coy regarding how messy the data are.  This seems to work with the exception of the last two rows, i.e. when the full month is spelled out but no day is given.

Solution
‎07-27-2017 02:07 PM
Super User
Super User
Posts: 7,076

Re: Compund date informats

The B8601DA4. format will read four digit years as the first day of the year.

 

proc format;   
  invalue messyDate
    "/^\d{1,2}\D\d{1,2}\D\d{4}/" (regexp) = [mmddyy10.]    
    "/^\d{4}/" (regexp)       = [B8601DA.]
    other           = [anydtdte32.]
  ;
run;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 650 views
  • 7 likes
  • 7 in conversation