I'm have to import an excel spreadsheet that had a date column but the values are not a consistant format. For example :
SDate
------
Jun-07
Nov-08
2009
2000
11/01/2009
When I try
Jun-04 |
Nov-04 |
2007 |
2002 |
07/01/2008 |
proc sql ;
create table xls
as
select
input(sdate, ddmmyy10.) as sdate
from exl
;
quit ;
I get
NOTE: Invalid date value
NOTE: Invalid argument to function INPUT. Missing values may be generated.
NOTE: Table WORK.XLS created, with 5 rows and 1 columns.
But the column is empty except for the last row. Is there a way to apply a default format to all rows?
Thanks for any help
Very close! But I would preface the if statements with one to catch actual dates that were readable with the anydtdte informat. E.g.,
data have;
informat sdate $10.;
input sdate ;
cards;
Jun-07
Nov-08
14-Jun-07
12-Nov-08
2009
2000
11/01/2009
;
run;
data want;
set have;
format newdate mmddyy10.;
if input(sdate, ? anydtdte21.) then newdate=input(sdate, ? anydtdte21.);
else if find(sdate,'-') then newdate=input(catx('-','1',sdate),date9.);
else if find(sdate,'/') then newdate=input(sdate,mmddyy10.);
else newdate=mdy(1,1,input(sdate,4.));
run;
I proposed a couple of methods for very similar data in a presentation last year. Take a look at the powerpoint at: http://torsas.ca/attachments/File/06112010/Worst_Data.ppt . Good luck!
How about:
data have;
informat sdate $10.;
input sdate ;
cards;
Jun-07
Nov-08
2009
2000
11/01/2009
;
run;
data want;
set have;
format newdate mmddyy10.;
if find(sdate,'-') then newdate=input(catx('-','1',sdate),date9.);
else if find(sdate,'/') then newdate=input(sdate,mmddyy10.);
else newdate=mdy(1,1,input(sdate,4.));
proc print;
run;
Very close! But I would preface the if statements with one to catch actual dates that were readable with the anydtdte informat. E.g.,
data have;
informat sdate $10.;
input sdate ;
cards;
Jun-07
Nov-08
14-Jun-07
12-Nov-08
2009
2000
11/01/2009
;
run;
data want;
set have;
format newdate mmddyy10.;
if input(sdate, ? anydtdte21.) then newdate=input(sdate, ? anydtdte21.);
else if find(sdate,'-') then newdate=input(catx('-','1',sdate),date9.);
else if find(sdate,'/') then newdate=input(sdate,mmddyy10.);
else newdate=mdy(1,1,input(sdate,4.));
run;
Hi Art297,
Thank you! I have looked at your powerpoint, it is very helpful!
Art,
Thank you for your help.
I looked at the last bit of code you posted. It all looks fine except for 2009 converts to 09/01/1920.
I changed the code around a bit
data have;
informat sdate $10.;
input sdate ;
cards;
Jun-07
Nov-08
14-Jun-07
12-Nov-08
2009
2000
11/01/2009
;
run;
data want;
set have;
format newdate mmddyy10.;
if length(sdate) = 4 then newdate=mdy(1,1,input(sdate,4.));
else if input(sdate, ? anydtdte21.) then newdate=input(sdate, ? anydtdte21.);
else if find(sdate,'-') then newdate=input(catx('-','1',sdate),date9.);
else if find(sdate,'/') then newdate=input(sdate,mmddyy10.);
run;
Thanks again for all your help.
Jerry
As long as you got what you needed. When I run that type of code I also typicallly set the system option "datestyle" the type most prevalent form the data has. e.g.,
options datestyle=mdy;
Also, it always helps if one ensures that their year cutoff option is set appropriately for the current year.
options datestyle=mdy;
data have;
format sdate yymmdd10.;
informat sdate anydtdte10.;
input @;
if length(_infile_)=4 then _infile_='01-01-'||_infile_;
input sdate;
cards;
Jun-07
Nov-08
14-Jun-07
12-Nov-08
2009
2000
11/01/2009
;
run;
FriedEgg: Very nice!
Thank you everyone for all your help. All your input helped.
Thanks
FriedEgg,
I tried doing the following code which is coming from a sas data set
options datestyle=mdy;
data tab1 ;
set tab0 ;
informat sdate anydtdte10.;
format sdate yymmdd10.;
if length(sdate)=4 then sdate='01-01-'||sdate;
run ;
I keep getting the following errors.
ERROR 48-59: The informat $ANYDTDTE was not found or could not be loaded.
ERROR 48-59: The format $YYMMDD was not found or could not be loaded.
Currently sdate is text in the data table that is being set. I have it working using the code that art posted but I just realized I have more then one column with data like this so I don't want to use if's for every variable.
Thank you for any more help you can give me
Jerry,
You apparently first read sdate in as a character variable. As such, your probably have to run something like:
data tab0;
informat sdate $10.;
input sdate;
cards;
Jun-07
Nov-08
14-Jun-07
12-Nov-08
2009
2000
11/01/2009
;
run;
options datestyle=mdy;
data tab1 ;
set tab0 (rename=(sdate=sdate_in));
format sdate yymmdd10.;
if length(sdate_in)=4 then sdate_in='01-01-'||sdate_in;
sdate=input(sdate_in,anydtdte10.);
run ;
Hello Art and Fried,
Just a question regarding options datestyle=mdy; do I need to use it for using anydtdte informat option or is that option for something else ??
thanks
kumar
Need to: no! Advisable: yes! Why make SAS guess if you know the true order of the month, day and year components?
Phaani wrote:
Hello Art and Fried,
Just a question regarding options datestyle=mdy; do I need to use it for using anydtdte informat option or is that option for something else ??
thanks
kumar
The datestyle system option tells the anydt* formats what style date to use as a default when it cannot determine the proper order for the given date. For instance in this example we have:
11/01/2009
This could be November 1st 2009 or January 11th 2009, I want is to be Nov. 1 so I assign datestyle=mdy. With bad data always comes mistakes and assumptions so this should be assigned to the occurance you view as being most frequent.
jerry898969 wrote:
ERROR 48-59: The informat $ANYDTDTE was not found or could not be loaded.
ERROR 48-59: The format $YYMMDD was not found or could not be loaded.
I avoid this issue by using the special _infile_ variable, if you are changing the format from a dataset you will need to use input function for my method. Art's post would be my solution for this situation.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.