BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jerry898969
Pyrite | Level 9

I'm have to import an excel spreadsheet that had a date column but the values are not a consistant format.  For example :

SDate

------

Jun-07
Nov-08
2009
2000
11/01/2009

When I try

                        
Jun-04
Nov-04
2007
2002



07/01/2008


proc sql ;
create table xls
as
select
    input(sdate, ddmmyy10.) as sdate
from exl
;
quit ;

I get

NOTE: Invalid date value
NOTE: Invalid argument to function INPUT. Missing values may be generated.
NOTE: Table WORK.XLS created, with 5 rows and 1 columns.

But the column is empty except for the last row.  Is there a way to apply a default format to all rows?

Thanks for any help

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

Very close!  But I would preface the if statements with one to catch actual dates that were readable with the anydtdte informat.  E.g.,

data have;

  informat sdate $10.;

  input sdate  ;

  cards;

Jun-07

Nov-08

14-Jun-07

12-Nov-08

2009

2000

11/01/2009

;

run;

data want;

  set have;

  format newdate mmddyy10.;

  if input(sdate, ? anydtdte21.) then newdate=input(sdate, ? anydtdte21.);

  else if find(sdate,'-') then newdate=input(catx('-','1',sdate),date9.);

    else if find(sdate,'/') then newdate=input(sdate,mmddyy10.);

       else newdate=mdy(1,1,input(sdate,4.));

run;

View solution in original post

14 REPLIES 14
art297
Opal | Level 21

I proposed a couple of methods for very similar data in a presentation last year.  Take a look at the powerpoint at: http://torsas.ca/attachments/File/06112010/Worst_Data.ppt .  Good luck!

Linlin
Lapis Lazuli | Level 10

How about:

data have;

informat sdate $10.;

input sdate  ;

cards;

Jun-07

Nov-08

2009

2000

11/01/2009

;

run;

data want;

  set have;

  format newdate mmddyy10.;

  if find(sdate,'-') then newdate=input(catx('-','1',sdate),date9.);

    else if find(sdate,'/') then newdate=input(sdate,mmddyy10.);

       else newdate=mdy(1,1,input(sdate,4.));

proc print;

run;

art297
Opal | Level 21

Very close!  But I would preface the if statements with one to catch actual dates that were readable with the anydtdte informat.  E.g.,

data have;

  informat sdate $10.;

  input sdate  ;

  cards;

Jun-07

Nov-08

14-Jun-07

12-Nov-08

2009

2000

11/01/2009

;

run;

data want;

  set have;

  format newdate mmddyy10.;

  if input(sdate, ? anydtdte21.) then newdate=input(sdate, ? anydtdte21.);

  else if find(sdate,'-') then newdate=input(catx('-','1',sdate),date9.);

    else if find(sdate,'/') then newdate=input(sdate,mmddyy10.);

       else newdate=mdy(1,1,input(sdate,4.));

run;

Linlin
Lapis Lazuli | Level 10

Hi Art297,

Thank you!  I have looked at your powerpoint, it is very helpful!

jerry898969
Pyrite | Level 9

Art,


Thank you for your help.

I looked at the last bit of code you posted.  It all looks fine except for 2009 converts to 09/01/1920.

I changed the code around a bit

data have;

  informat sdate $10.;

  input sdate  ;

  cards;

Jun-07

Nov-08

14-Jun-07

12-Nov-08

2009

2000

11/01/2009

;

run;

data want;

  set have;

  format newdate mmddyy10.;

  if length(sdate) = 4 then newdate=mdy(1,1,input(sdate,4.));

  else if input(sdate, ? anydtdte21.) then newdate=input(sdate, ? anydtdte21.);

  else if find(sdate,'-') then newdate=input(catx('-','1',sdate),date9.);

  else if find(sdate,'/') then newdate=input(sdate,mmddyy10.);   

run;

Thanks again for all your help. 

Jerry

art297
Opal | Level 21

As long as you got what you needed.  When I run that type of code I also typicallly set the system option "datestyle" the type most prevalent form the data has.  e.g.,

options datestyle=mdy;

Also, it always helps if one ensures that their year cutoff option is set appropriately for the current year.

FriedEgg
SAS Employee

options datestyle=mdy;

data have;

format sdate yymmdd10.;

informat sdate anydtdte10.;

input @;

if length(_infile_)=4 then _infile_='01-01-'||_infile_;

input sdate;

cards;

Jun-07

Nov-08

14-Jun-07

12-Nov-08

2009

2000

11/01/2009

;

run;

jerry898969
Pyrite | Level 9

Thank you everyone for all your help.  All your input helped.

Thanks

jerry898969
Pyrite | Level 9

FriedEgg,

I tried doing the following code which is coming from a sas data set

options datestyle=mdy;

data tab1 ;

    set tab0 ;

    informat sdate anydtdte10.;

    format sdate   yymmdd10.;   

    if length(sdate)=4 then sdate='01-01-'||sdate;   

run ;

I keep getting the following errors. 

ERROR 48-59: The informat $ANYDTDTE was not found or could not be loaded.

ERROR 48-59: The format $YYMMDD was not found or could not be loaded.

Currently sdate is text in the data table that is being set.  I have it working using the code that art posted but I just realized I have more then one column with data like this so I don't want to use if's for every variable.

Thank you for any more help you can give me

art297
Opal | Level 21

Jerry,

You apparently first read sdate in as a character variable.  As such, your probably have to run something like:

data tab0;

  informat sdate $10.;

  input sdate;

  cards;

Jun-07

Nov-08

14-Jun-07

12-Nov-08

2009

2000

11/01/2009

;

run;

options datestyle=mdy;

data tab1 ;

    set tab0 (rename=(sdate=sdate_in));

    format sdate   yymmdd10.;  

    if length(sdate_in)=4 then sdate_in='01-01-'||sdate_in;  

    sdate=input(sdate_in,anydtdte10.);

run ;

Phaani
Calcite | Level 5

Hello Art and Fried,

Just a question regarding options datestyle=mdy; do I need to use it for using anydtdte informat option or is that option for something else ??

thanks
kumar

art297
Opal | Level 21

Need to: no!  Advisable: yes!  Why make SAS guess if you know the true order of the month, day and year components?

FriedEgg
SAS Employee

Phaani wrote:

Hello Art and Fried,

Just a question regarding options datestyle=mdy; do I need to use it for using anydtdte informat option or is that option for something else ??

thanks
kumar

The datestyle system option tells the anydt* formats what style date to use as a default when it cannot determine the proper order for the given date.  For instance in this example we have:

11/01/2009

This could be November 1st 2009 or January 11th 2009, I want is to be Nov. 1 so I assign datestyle=mdy.  With bad data always comes mistakes and assumptions so this should be assigned to the occurance you view as being most frequent.

jerry898969 wrote:

ERROR 48-59: The informat $ANYDTDTE was not found or could not be loaded.

ERROR 48-59: The format $YYMMDD was not found or could not be loaded.

I avoid this issue by using the special _infile_ variable, if you are changing the format from a dataset you will need to use input function for my method.  Art's post would be my solution for this situation.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 14 replies
  • 9060 views
  • 7 likes
  • 5 in conversation