How do I read differently formatted dates within a single dataline?

Occasional Contributor
Posts: 5

data club600;
informat full_name & $25. nHR DOB DATE11. DOHR MMDDYY8.;
input full_name nHR DOB DOHR;
format DOB MMDDYY10. DOHR MMDDYY10.;
datalines;
Barry Bonds          762   24-Jul-1964   8/8/02
Hank Aaron	      755   5-Feb-1934    4/27/71
Babe Ruth	      714   6-Feb-1895    8/21/31
Alex Rodriguez      696   27-Jul-1975   8/4/10
Willie Mays            660   6-May-1931    9/22/69
Ken Griffey Jr.	      630   21-Nov-1969   6/9/08
Jim Thome            612   27-Aug-1970   8/15/11
Sammy Sosa 	      609   12-Nov-1968   6/2/07
;
proc print noobs;
run;

I'm brand new to SAS but my goal with this is to eventually sort the data by the difference between column 4 and column 3. I've read quite a bit on the basics but I'm making no progress here, because I suspect that the text parsing issues aren't covered in introductory material on SAS.


Initially, I was using a single line:

input full_name & $25. nHR DOB DATE11. DOHR MMDDYY8.;

 rather than

informat full_name & $25. nHR DOB DATE11. DOHR MMDDYY8.;
input full_name nHR DOB DOHR;

 

The former method didn't handle date cases with single digit values for the DAY.

 

I switched format for no reason other than coming across this resource:

http://support.sas.com/publishing/pubcat/chaps/55126.pdf

 

I found that layout more readable.

 

My current issue (I suspect) is that something happens while parsing the first date that leaves the line in a state unfit for the next informat specification, so the printed result has an empty entry for 'DOHR' in each row.


Accepted Solutions
Solution
2 weeks ago
Super User
Posts: 2,061

data club600;
/* Every variable gets its own informat; nHR needs one so it does not pick up DATE11. */
informat full_name $25.  nHR 8. DOB DATE11. DOHR MMDDYY8.;
/* The & modifier lets full_name contain single blanks; two blanks end the value */
input full_name &  nHR DOB DOHR;
format DOB MMDDYY10. DOHR MMDDYY10.;
datalines;
Barry Bonds          762   24-Jul-1964   8/8/02
Hank Aaron           755   5-Feb-1934    4/27/71
Babe Ruth            714   6-Feb-1895    8/21/31
Alex Rodriguez       696   27-Jul-1975   8/4/10
Willie Mays          660   6-May-1931    9/22/69
Ken Griffey Jr.      630   21-Nov-1969   6/9/08
Jim Thome            612   27-Aug-1970   8/15/11
Sammy Sosa           609   12-Nov-1968   6/2/07
;
run;

All Replies
Occasional Contributor
Posts: 5

Posted in reply to novinosrin

How did you know to address the 'nHR' column when that column was being read correctly, along with the following column? I have a little experience with regex parsing and tokenizing and I would suspect that an informat error would cascade into more problems with reading the subsequent columns of a line.

 

And also, thank you.

Super User
Posts: 2,061

You are most welcome. Just experience. The code comes to mind the very second the eyes look at the data. 

Occasional Contributor
Posts: 5

Additionally, does anybody understand why splitting the informat and input statements affected the ability to handle dates whose day is not zero-padded?

Super User
Posts: 13,941

You might want to look into the ANYDTDTE informat, which will read a fairly wide range of date layouts.
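
As a rough sketch of that suggestion, both date columns can share the one informat. The dataset name club600_any is made up for illustration, and the colon modifier and the width of 32 are assumptions on my part, not something the original poster used:

```sas
/* Sketch only: club600_any is a hypothetical dataset name. */
data club600_any;
   length full_name $25;
   /* ANYDTDTE tries several date layouts for each field it reads. */
   input full_name & nHR DOB :anydtdte32. DOHR :anydtdte32.;
   format DOB DOHR mmddyy10.;
datalines;
Barry Bonds          762   24-Jul-1964   8/8/02
Hank Aaron           755   5-Feb-1934    4/27/71
;
run;

proc print data=club600_any noobs;
run;
```

Values such as 8/4/10 are ambiguous and are resolved according to the DATESTYLE= system option, and two-digit years according to YEARCUTOFF=, so it is worth checking the printed dates against the raw lines.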

 

 

Also, when you specify a bare informat on an INPUT statement, such as

input full_name & $25. nHR DOB DATE11. DOHR MMDDYY8.;

 

it forces SAS to read exactly as many columns as the informat width specifies, whatever the value actually looks like. Hence it may read "long" into the next variable's field on occasion. This can often be fixed by placing a : before the informat, which changes the behavior so that reading stops at the next delimiter and "short" values are read correctly.
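
A small sketch of the difference, reading one record twice. The dataset and variable names (compare_modes, dob_fixed, dob_colon) are invented for the illustration. With the spacing in this record, the fixed-width read should start on the blanks before the date and stop before the end of the year, while the colon version reads the whole token:

```sas
/* Sketch only: one record read twice to compare the two input styles. */
data compare_modes;
   /* Formatted input: DATE11. takes the next 11 columns, blanks included. */
   input nHR dob_fixed date11. @;
   /* Modified list input: the colon skips blanks and stops at the next one. */
   input @1 nHR dob_colon :date11.;
   format dob_fixed dob_colon date9.;
datalines;
755   5-Feb-1934    4/27/71
;
run;

proc print data=compare_modes noobs;
run;
```

Comparing the two date columns in the printed output, and any "Invalid data" notes in the log, shows which style coped with the short value.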

 

With

informat full_name & $25. nHR DOB DATE11. DOHR MMDDYY8.;

since you do not specify an informat after nHR, SAS attempts to use the same informat for it as for DOB (DATE11.). I would suggest adding something like F5. after nHR.

The syntax for the FORMAT and INFORMAT statements is
INFORMAT <variable list> informat <variable list> informat (repeat as needed). So if you accidentally leave an informat out, the variable is grouped with the variables that follow it, and you may end up assigning it an unexpected informat or format.
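
One way to see the grouping for yourself is to declare the variables without reading any data and then list their attributes. This is only a sketch: the dataset name informat_check is hypothetical, and the STOP statement is there so that no observations are written.

```sas
/* Sketch only: inspect which informat each variable actually receives. */
data informat_check;
   /* nHR has no informat of its own here, so it is grouped with DOB. */
   informat full_name $25. nHR DOB date11. DOHR mmddyy8.;
   /* Nothing is read; only the variable attributes are of interest. */
   stop;
run;

/* The Informat column of the variable listing shows the assignments. */
proc contents data=informat_check;
run;
```

Putting a numeric informat such as 5. right after nHR and rerunning the step should change the listing for that variable.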

Occasional Contributor
Posts: 5

Informative response, thank you.

Interesting that nHR absorbs an informat to the right rather than the left side, but it explains the error perfectly.

Esteemed Advisor
Posts: 5,625

The SAS INPUT statement enters (yet another) special mode when dealing with variables that have an assigned informat. I much prefer reading with modified list input, signaled by prefixing the informat with a colon, as in:

 

data club600;
/* LENGTH sets the storage size; & lets the name contain single blanks */
length full_name $25;
/* The colon reads each date up to the next blank, whatever its width */
input full_name & nHR DOB :date32. DOHR :mmddyy32. ;
format DOB MMDDYY10. DOHR MMDDYY10.;
datalines;
Barry Bonds          762   24-Jul-1964   8/8/02
Hank Aaron           755   5-Feb-1934    4/27/71
Babe Ruth            714   6-Feb-1895    8/21/31
Alex Rodriguez       696   27-Jul-1975   8/4/10
Willie Mays          660   6-May-1931    9/22/69
Ken Griffey Jr.      630   21-Nov-1969   6/9/08
Jim Thome            612   27-Aug-1970   8/15/11
Sammy Sosa           609   12-Nov-1968   6/2/07
;
PG
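
For the stated goal of sorting by the difference between the two date columns, something along these lines might work once the dates read correctly. It is a sketch under assumptions: the names club600_diff, days_between and years_between are invented here, and the YEARCUTOFF= value is my guess at a setting that places all the two-digit years in this data in the intended century.

```sas
/* Assumption: a cutoff of 1920 maps two-digit years to 1920-2019. */
options yearcutoff=1920;

data club600_diff;
   length full_name $25;
   input full_name & nHR DOB :date11. DOHR :mmddyy10.;
   /* SAS dates are day counts, so subtracting them gives elapsed days. */
   days_between = DOHR - DOB;
   /* YRDIF with the 'AGE' basis expresses the same gap in years. */
   years_between = yrdif(DOB, DOHR, 'AGE');
   format DOB DOHR mmddyy10. years_between 5.1;
datalines;
Barry Bonds          762   24-Jul-1964   8/8/02
Hank Aaron           755   5-Feb-1934    4/27/71
Babe Ruth            714   6-Feb-1895    8/21/31
;
run;

/* Sort ascending by the gap between the two dates. */
proc sort data=club600_diff;
   by days_between;
run;

proc print data=club600_diff noobs;
run;
```

Without a suitable cutoff, a value such as 8/21/31 could be placed in the wrong century, which would make the difference wrong without raising an error.
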
Occasional Contributor
Posts: 5

I have learned more in 10 minutes on this community than I have learned in months on SO.

Thank you.
