09-02-2013 01:36 AM
Could anyone help me read dates that come in different formats?
I have a dataset like,
I want the admission date to be in a SAS date format (any SAS date format will do), and I also want a missing day set to 07 and a missing month set to 01.
I want the dataset to be as
09-02-2013 07:22 AM
The ANYDTDTE. informat reads many different date representations, but some of your values cannot be read as a date as they stand. So you will need some logic to adapt the text so that the ANYDTDTE. informat can read it as a date.
Find below an example that uses regular expressions to check for some patterns, but not all the cases. You can add additional logic to handle the other cases as well:
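A minimal sketch of this kind of regular-expression clean-up, not the original poster's code. It assumes values shaped like DD-MM-YYYY with "UU" marking an unknown day or month, or a bare four-digit year. The dataset names `have` and `want`, the helper variable `fixed`, and the sample values are all hypothetical. The cleaned text is read with DDMMYY10. rather than ANYDTDTE. so that the day-month order does not depend on the session's DATESTYLE setting.

```sas
/* Hypothetical sample values; the real data may look different */
data have ;
   input adm :$20. ;
   datalines ;
12-03-1987
UU-03-1987
UU-UU-1987
1987
;
run ;

data want ;
   set have ;
   length fixed $20 ;
   fixed = strip(adm) ;
   /* Unknown day: replace a leading UU with the default day 07 */
   fixed = prxchange('s/^UU-/07-/i', 1, strip(fixed)) ;
   /* Unknown month: replace UU in second position with the default month 01 */
   fixed = prxchange('s/^(\d{2})-UU-/$1-01-/i', 1, strip(fixed)) ;
   /* Year only: prepend the default day and month */
   if prxmatch('/^\d{4}$/', strip(fixed)) then fixed = cats('07-01-', fixed) ;
   /* ?? suppresses log notes for values that still cannot be read */
   date = input(fixed, ?? ddmmyy10.) ;
   format date date9. ;
run ;
```

Values that match none of the patterns pass through unchanged, so any that still fail to convert end up with a missing `date` and can be handled with further patterns.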
09-02-2013 07:43 AM
The ANYDTDTEw. informat might provide a means of interpreting some of these dates but you will have a problem with ambiguity: 121987 might be interpreted as 01-02-1987 or 02-10-1987 depending on locale; neither is what you want.
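To see the ambiguity for yourself, a small check along these lines may help. It is only a sketch: it assumes the DATESTYLE= system option is what decides the day-month order for ANYDTDTE. in your session, and the literal '07-01-1987' is just an illustrative value.

```sas
/* Same text read under two DATESTYLE settings */
options datestyle=dmy ;
data _null_ ;
   d = input('07-01-1987', anydtdte10.) ;
   put 'DATESTYLE=DMY: ' d date9. ;
run ;

options datestyle=mdy ;
data _null_ ;
   d = input('07-01-1987', anydtdte10.) ;
   put 'DATESTYLE=MDY: ' d date9. ;
run ;

options datestyle=locale ;   /* restore the default setting */
```

If the two log lines show different dates, the informat is resolving the text by session settings rather than by anything in the data itself.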
You will have to clean up your data, inserting the default values, before you can input it as a date.
If the following are always true you may have a chance:
* "UU" (or nothing) is always used to indicate an unknown day or month
* Admission year must not be missing and must always be the last (or only) value given
* A six-digit value would be MMYYYY or DDMMYY
The suggested strategy would be to identify records with day and/or month missing and substitute the default values. Using 'adm' as shorthand for admission date, try
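    data want ;                       /* 'have' and 'want' are placeholder names */
       set have ;
       /* adm must be a character variable at least 10 bytes long */
       select ;
          when (length(adm) = 2)      /* YY only */
             adm = cats ('07-01-', adm) ;
          when (length(adm) = 4 and adm = compress (adm, ' ', 'KD')) do ;
             if 1920 < input(adm, 4.) < 2020
                then adm = cats ('07-01-', adm) ;                                   /* YYYY */
                else adm = catx ('-', '07', substr(adm, 1, 2), substr(adm, 3, 2)) ; /* MMYY */
          end ;
          when (length(adm) = 6 and adm = compress (adm, ' ', 'KD')) do ;
             if 1920 < input(substr(adm, 3, 4), 4.) < 2020
                then adm = catx ('-', '07', substr(adm, 1, 2), substr(adm, 3, 4)) ; /* MMYYYY */
             /* else leave adm unchanged, in DDMMYY format */
          end ;
          when (index (upcase(adm), 'U') > 0) do ;
             if index (upcase(adm), 'U') = 1
                then adm = cats ('07', substr (adm, 3)) ;        /* unknown day becomes 07 */
             if index (upcase(adm), 'U') > 1
                then adm = tranwrd (upcase (adm), 'UU', '01') ;  /* unknown month becomes 01 */
          end ;
          otherwise ;                 /* complete dates need no clean-up */
       end ;
       /* width 10 so that DD-MM-YYYY is not truncated (the default width is 9); */
       /* ambiguous day/month order follows the DATESTYLE= system option         */
       date = input (adm, ?? anydtdte10.) ;
       format date ddmmyy10. ;
       if date = .
          then put 'Record ' _N_ adm= ;
    run ;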
The code should write any values that cannot be interpreted to the log, so that they can be examined and either edited manually (if there are only a few) or handled by additional WHEN statements added to the SELECT group.