BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jerry898969
Pyrite | Level 9

I'm have to import an excel spreadsheet that had a date column but the values are not a consistant format.  For example :

SDate

------

Jun-07
Nov-08
2009
2000
11/01/2009

When I try

                        
Jun-04
Nov-04
2007
2002



07/01/2008


proc sql ;
create table xls
as
select
    input(sdate, ddmmyy10.) as sdate
from exl
;
quit ;

I get

NOTE: Invalid date value
NOTE: Invalid argument to function INPUT. Missing values may be generated.
NOTE: Table WORK.XLS created, with 5 rows and 1 columns.

But the column is empty except for the last row.  Is there a way to apply a default format to all rows?

Thanks for any help

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

Very close!  But I would preface the if statements with one to catch actual dates that were readable with the anydtdte informat.  E.g.,

data have;

  informat sdate $10.;

  input sdate  ;

  cards;

Jun-07

Nov-08

14-Jun-07

12-Nov-08

2009

2000

11/01/2009

;

run;

data want;

  set have;

  format newdate mmddyy10.;

  if input(sdate, ? anydtdte21.) then newdate=input(sdate, ? anydtdte21.);

  else if find(sdate,'-') then newdate=input(catx('-','1',sdate),date9.);

    else if find(sdate,'/') then newdate=input(sdate,mmddyy10.);

       else newdate=mdy(1,1,input(sdate,4.));

run;

View solution in original post

14 REPLIES 14
art297
Opal | Level 21

I proposed a couple of methods for very similar data in a presentation last year.  Take a look at the powerpoint at: http://torsas.ca/attachments/File/06112010/Worst_Data.ppt .  Good luck!

Linlin
Lapis Lazuli | Level 10

How about:

data have;

informat sdate $10.;

input sdate  ;

cards;

Jun-07

Nov-08

2009

2000

11/01/2009

;

run;

data want;

  set have;

  format newdate mmddyy10.;

  if find(sdate,'-') then newdate=input(catx('-','1',sdate),date9.);

    else if find(sdate,'/') then newdate=input(sdate,mmddyy10.);

       else newdate=mdy(1,1,input(sdate,4.));

proc print;

run;

art297
Opal | Level 21

Very close!  But I would preface the if statements with one to catch actual dates that were readable with the anydtdte informat.  E.g.,

data have;

  informat sdate $10.;

  input sdate  ;

  cards;

Jun-07

Nov-08

14-Jun-07

12-Nov-08

2009

2000

11/01/2009

;

run;

data want;

  set have;

  format newdate mmddyy10.;

  if input(sdate, ? anydtdte21.) then newdate=input(sdate, ? anydtdte21.);

  else if find(sdate,'-') then newdate=input(catx('-','1',sdate),date9.);

    else if find(sdate,'/') then newdate=input(sdate,mmddyy10.);

       else newdate=mdy(1,1,input(sdate,4.));

run;

Linlin
Lapis Lazuli | Level 10

Hi Art297,

Thank you!  I have looked at your powerpoint, it is very helpful!

jerry898969
Pyrite | Level 9

Art,


Thank you for your help.

I looked at the last bit of code you posted.  It all looks fine except for 2009 converts to 09/01/1920.

I changed the code around a bit

data have;

  informat sdate $10.;

  input sdate  ;

  cards;

Jun-07

Nov-08

14-Jun-07

12-Nov-08

2009

2000

11/01/2009

;

run;

data want;

  set have;

  format newdate mmddyy10.;

  if length(sdate) = 4 then newdate=mdy(1,1,input(sdate,4.));

  else if input(sdate, ? anydtdte21.) then newdate=input(sdate, ? anydtdte21.);

  else if find(sdate,'-') then newdate=input(catx('-','1',sdate),date9.);

  else if find(sdate,'/') then newdate=input(sdate,mmddyy10.);   

run;

Thanks again for all your help. 

Jerry

art297
Opal | Level 21

As long as you got what you needed.  When I run that type of code I also typicallly set the system option "datestyle" the type most prevalent form the data has.  e.g.,

options datestyle=mdy;

Also, it always helps if one ensures that their year cutoff option is set appropriately for the current year.

FriedEgg
SAS Employee

options datestyle=mdy;

data have;

format sdate yymmdd10.;

informat sdate anydtdte10.;

input @;

if length(_infile_)=4 then _infile_='01-01-'||_infile_;

input sdate;

cards;

Jun-07

Nov-08

14-Jun-07

12-Nov-08

2009

2000

11/01/2009

;

run;

jerry898969
Pyrite | Level 9

Thank you everyone for all your help.  All your input helped.

Thanks

jerry898969
Pyrite | Level 9

FriedEgg,

I tried doing the following code which is coming from a sas data set

options datestyle=mdy;

data tab1 ;

    set tab0 ;

    informat sdate anydtdte10.;

    format sdate   yymmdd10.;   

    if length(sdate)=4 then sdate='01-01-'||sdate;   

run ;

I keep getting the following errors. 

ERROR 48-59: The informat $ANYDTDTE was not found or could not be loaded.

ERROR 48-59: The format $YYMMDD was not found or could not be loaded.

Currently sdate is text in the data table that is being set.  I have it working using the code that art posted but I just realized I have more then one column with data like this so I don't want to use if's for every variable.

Thank you for any more help you can give me

art297
Opal | Level 21

Jerry,

You apparently first read sdate in as a character variable.  As such, your probably have to run something like:

data tab0;

  informat sdate $10.;

  input sdate;

  cards;

Jun-07

Nov-08

14-Jun-07

12-Nov-08

2009

2000

11/01/2009

;

run;

options datestyle=mdy;

data tab1 ;

    set tab0 (rename=(sdate=sdate_in));

    format sdate   yymmdd10.;  

    if length(sdate_in)=4 then sdate_in='01-01-'||sdate_in;  

    sdate=input(sdate_in,anydtdte10.);

run ;

Phaani
Calcite | Level 5

Hello Art and Fried,

Just a question regarding options datestyle=mdy; do I need to use it for using anydtdte informat option or is that option for something else ??

thanks
kumar

art297
Opal | Level 21

Need to: no!  Advisable: yes!  Why make SAS guess if you know the true order of the month, day and year components?

FriedEgg
SAS Employee

Phaani wrote:

Hello Art and Fried,

Just a question regarding options datestyle=mdy; do I need to use it for using anydtdte informat option or is that option for something else ??

thanks
kumar

The datestyle system option tells the anydt* formats what style date to use as a default when it cannot determine the proper order for the given date.  For instance in this example we have:

11/01/2009

This could be November 1st 2009 or January 11th 2009, I want is to be Nov. 1 so I assign datestyle=mdy.  With bad data always comes mistakes and assumptions so this should be assigned to the occurance you view as being most frequent.

jerry898969 wrote:

ERROR 48-59: The informat $ANYDTDTE was not found or could not be loaded.

ERROR 48-59: The format $YYMMDD was not found or could not be loaded.

I avoid this issue by using the special _infile_ variable, if you are changing the format from a dataset you will need to use input function for my method.  Art's post would be my solution for this situation.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 8168 views
  • 7 likes
  • 5 in conversation