- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello everyone. I recently ran into a 'feature' of SAS that I actually never noticed in a little over 2 years using the system. Noting the error, I am actually not sure how to handle it going forward, so any help and ideas would be greatly appreciated!
Basically I am reading from a csv file one field, called "date1". I am reading the field as an Informat Date1 MMDDYY10. as I am expecting the date to come in a known date format of 03/18/2015.
However, one day the users sent me a file with the following in the date field. "41974". What happened is the users got their data from Excel, and accidentally formatted the date field as "numeric", and 41,974 is the number of days between 1-1-1900 and "12-1-2014"
However when SAS read this number in, it actually produced the value "April 19 1974". I was pretty shocked at this, but looking back I realize why it would do this..
However, what informat would I use to get around this problem? I can't use MMDDYY10. anymore, since it incorrectly reads in numbers above as dates....
Any and all help would be appreciated!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I'm not sure it's a problem with the informat. It's a change in the way the file was formatted. You'll have to translate excel date to sas date.
If you can put today() in for the date that would work nicely. Depending on frequency and amount of data this might not be a good option but it could be a start.
data data2;
infile cards;
informat date1 mmddyy10.;
input date1;
cards;
04/19/1974
;
run;
data have2;
set data2;
number = intck('day',date1,'18mar2015'd);
run;
data want2;
format new_date mmddyy10.;
set have;
new_date = date1 + number;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You could just use something like:
data have;
input date1 $10.;
format datewant date9.;
if anypunct(date1) then datewant=input(date1,anydtdte10.);
else datewant=input(date1,12.)-21916;
cards;
41974
12-1-2014
03/18/2015
;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
DC I think you might have missed the point of my question. I don't want to have to know what the format of the string is coming in at the start of the function... I actually want this file to error out when the value is "41974", or to be smart enough to convert it to the correct date.
My question was more along the line of "is there a date informat that is smart enough to not try to read this number the way that sas is doing it." There might not be.
Arthur Tabachneck Unfortunately this is part of a production process, so I really need a specific informat so I can grab any format errors into the log I am outputting..
The reason being, I have code that actually outputs the SAS log to a text file, and then I am scanning that text file for any formatting errors found through the infile statement.
Theoritically I could make the change that you noted; if I could figure out how to flag errors into a log (or data_set) somehow. Example: If someone put the value "heyman" into the date field, I would want sas to error on and for me to be able to flag the error.
I might have to just read in all of the dates as varchars in the informats as step 1, and then do your trick with added error flagging as a secondary data-step.....
I will do a little bit of testing to see if this works for me (regarldess of if it does or not, that's a very nifty solution. Thanks!.... I also especially like the trick of subtracting 21916 (days between 12/31/1899 and 1/1/1960)).
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Since you have to trap errors how about something like:
data have;
input date1 $10.;
format datewant date9.;
length error $20;
if count(date1,'/') eq 2 then
datewant=input(date1,?? mmddyy10.);
if missing(datewant) then do;
if anypunct(date1) then do;
datewant=input(date1,anydtdte10.);
error='unexpected format';
end;
else if not missing(input(date1,?? 12.)) then do;
datewant=input(date1,12.)-21916;
error='Excel number as date';
end;
else error='Invalid value';
end;
cards;
41974
heyman
12-1-2014
03/18/2015
;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Probably easiest to read it as character and then add your own logic to validate and convert the value. You can even trap the invalid values and generate you own error messages.
data _null_ ;
input datestr $10.;
select ;
when (. ne input(datestr,??comma32.)) date=input(datestr,comma32.)+'01JAN1900'd - 2 ;
when (. ne input(datestr,??mmddyy10.)) date=input(datestr,mmddyy10.);
when (. ne input(datestr,??yymmdd10.)) date=input(datestr,yymmdd10.);
when (. ne input(datestr,??ddmmyy10.)) date=input(datestr,ddmmyy10.);
otherwise date=.;
end;
if date=. and datestr ne ' ' then put 'ERROR: Invalid Date String. ' datestr=;
format date date9.;
put (datestr date) (=);
cards;
1/1/1965
41,974
41974
Bad Date.
2014/01/30
30/12/2013
;;;;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Actually that difference is 41972
1 data _null_ ;
2 days_between = '01dec2014'd - '01jan1900'd ;
3 put days_between= ;
4 run ;
days_between=41972
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content