I'm have to import an excel spreadsheet that had a date column but the values are not a consistant format. For example :
SDate
------
Jun-07
Nov-08
2009
2000
11/01/2009
When I try
Jun-04 |
Nov-04 |
2007 |
2002 |
07/01/2008 |
proc sql ;
create table xls
as
select
input(sdate, ddmmyy10.) as sdate
from exl
;
quit ;
I get
NOTE: Invalid date value
NOTE: Invalid argument to function INPUT. Missing values may be generated.
NOTE: Table WORK.XLS created, with 5 rows and 1 columns.
But the column is empty except for the last row. Is there a way to apply a default format to all rows?
Thanks for any help
Very close! But I would preface the if statements with one to catch actual dates that were readable with the anydtdte informat. E.g.,
data have;
informat sdate $10.;
input sdate ;
cards;
Jun-07
Nov-08
14-Jun-07
12-Nov-08
2009
2000
11/01/2009
;
run;
data want;
set have;
format newdate mmddyy10.;
if input(sdate, ? anydtdte21.) then newdate=input(sdate, ? anydtdte21.);
else if find(sdate,'-') then newdate=input(catx('-','1',sdate),date9.);
else if find(sdate,'/') then newdate=input(sdate,mmddyy10.);
else newdate=mdy(1,1,input(sdate,4.));
run;
I proposed a couple of methods for very similar data in a presentation last year. Take a look at the powerpoint at: http://torsas.ca/attachments/File/06112010/Worst_Data.ppt . Good luck!
How about:
data have;
informat sdate $10.;
input sdate ;
cards;
Jun-07
Nov-08
2009
2000
11/01/2009
;
run;
data want;
set have;
format newdate mmddyy10.;
if find(sdate,'-') then newdate=input(catx('-','1',sdate),date9.);
else if find(sdate,'/') then newdate=input(sdate,mmddyy10.);
else newdate=mdy(1,1,input(sdate,4.));
proc print;
run;
Very close! But I would preface the if statements with one to catch actual dates that were readable with the anydtdte informat. E.g.,
data have;
informat sdate $10.;
input sdate ;
cards;
Jun-07
Nov-08
14-Jun-07
12-Nov-08
2009
2000
11/01/2009
;
run;
data want;
set have;
format newdate mmddyy10.;
if input(sdate, ? anydtdte21.) then newdate=input(sdate, ? anydtdte21.);
else if find(sdate,'-') then newdate=input(catx('-','1',sdate),date9.);
else if find(sdate,'/') then newdate=input(sdate,mmddyy10.);
else newdate=mdy(1,1,input(sdate,4.));
run;
Hi Art297,
Thank you! I have looked at your powerpoint, it is very helpful!
Art,
Thank you for your help.
I looked at the last bit of code you posted. It all looks fine except for 2009 converts to 09/01/1920.
I changed the code around a bit
data have;
informat sdate $10.;
input sdate ;
cards;
Jun-07
Nov-08
14-Jun-07
12-Nov-08
2009
2000
11/01/2009
;
run;
data want;
set have;
format newdate mmddyy10.;
if length(sdate) = 4 then newdate=mdy(1,1,input(sdate,4.));
else if input(sdate, ? anydtdte21.) then newdate=input(sdate, ? anydtdte21.);
else if find(sdate,'-') then newdate=input(catx('-','1',sdate),date9.);
else if find(sdate,'/') then newdate=input(sdate,mmddyy10.);
run;
Thanks again for all your help.
Jerry
As long as you got what you needed. When I run that type of code I also typicallly set the system option "datestyle" the type most prevalent form the data has. e.g.,
options datestyle=mdy;
Also, it always helps if one ensures that their year cutoff option is set appropriately for the current year.
options datestyle=mdy;
data have;
format sdate yymmdd10.;
informat sdate anydtdte10.;
input @;
if length(_infile_)=4 then _infile_='01-01-'||_infile_;
input sdate;
cards;
Jun-07
Nov-08
14-Jun-07
12-Nov-08
2009
2000
11/01/2009
;
run;
FriedEgg: Very nice!
Thank you everyone for all your help. All your input helped.
Thanks
FriedEgg,
I tried doing the following code which is coming from a sas data set
options datestyle=mdy;
data tab1 ;
set tab0 ;
informat sdate anydtdte10.;
format sdate yymmdd10.;
if length(sdate)=4 then sdate='01-01-'||sdate;
run ;
I keep getting the following errors.
ERROR 48-59: The informat $ANYDTDTE was not found or could not be loaded.
ERROR 48-59: The format $YYMMDD was not found or could not be loaded.
Currently sdate is text in the data table that is being set. I have it working using the code that art posted but I just realized I have more then one column with data like this so I don't want to use if's for every variable.
Thank you for any more help you can give me
Jerry,
You apparently first read sdate in as a character variable. As such, your probably have to run something like:
data tab0;
informat sdate $10.;
input sdate;
cards;
Jun-07
Nov-08
14-Jun-07
12-Nov-08
2009
2000
11/01/2009
;
run;
options datestyle=mdy;
data tab1 ;
set tab0 (rename=(sdate=sdate_in));
format sdate yymmdd10.;
if length(sdate_in)=4 then sdate_in='01-01-'||sdate_in;
sdate=input(sdate_in,anydtdte10.);
run ;
Hello Art and Fried,
Just a question regarding options datestyle=mdy; do I need to use it for using anydtdte informat option or is that option for something else ??
thanks
kumar
Need to: no! Advisable: yes! Why make SAS guess if you know the true order of the month, day and year components?
Phaani wrote:
Hello Art and Fried,
Just a question regarding options datestyle=mdy; do I need to use it for using anydtdte informat option or is that option for something else ??
thanks
kumar
The datestyle system option tells the anydt* formats what style date to use as a default when it cannot determine the proper order for the given date. For instance in this example we have:
11/01/2009
This could be November 1st 2009 or January 11th 2009, I want is to be Nov. 1 so I assign datestyle=mdy. With bad data always comes mistakes and assumptions so this should be assigned to the occurance you view as being most frequent.
jerry898969 wrote:
ERROR 48-59: The informat $ANYDTDTE was not found or could not be loaded.
ERROR 48-59: The format $YYMMDD was not found or could not be loaded.
I avoid this issue by using the special _infile_ variable, if you are changing the format from a dataset you will need to use input function for my method. Art's post would be my solution for this situation.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.