- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I am trying to import data using the data step and infile, however one of my data fields has the format '31/12/2015 5:45:00 PM'.
How can I tell SAS to read this as a date (preferably remove time all together).
Ive tried the following.
Data testdata;
infile 'filepath' dsd;
input Date DDMMYY10.;
run;
but Im guessing I need a different Date format, as this gives me an error in the log?
Thanks
Thanks
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
OK, so it's clear. You mixed list and formatted input, which usually fails. Happens to everyone at least once 😉
Do this instead:
data testing;
informat Created_On mmddyy10.;
length Location $16;format Created_On mmddyy10.;
infile
'C:\Users\****\Desktop\SAStest.csv'
dlm=','
firstobs=2
dsd
;
input
Case_Number $
Created_On
Location
;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If your data looks exactly like your example, your code should work. Please post the ERROR message and preceding code from the log
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Here is the exact code I am using:
data testing;
infile 'C:\Users\****\Desktop\SAStest.csv' dlm=',' firstobs=2 dsd;
input Case_Number $ Created_On DDMMYY10. Location $16.;
run;
Here is the error
NOTE: Increase available buffer lines with the INFILE n= option.
Case_Number=Cas 1 Created_On=. Location=Cas 2,12/26/2015 _ERROR_=1 _N_=1
NOTE: Invalid data for Created_On in line 4 7-16.
5 Cas 4,12/23/2015 10:27,Victoria 31
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
12/23/2015 10:27
Sas isnt reading this as a date, even with the ANYDTDTM10. format I get the Invalid data for Created_On in line 63 8-17.
Error.
Any way around this?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
To further explain what my version does:
The DLM=',' option sets the delimiter. By not using a format in the INPUT statement, I give SAS free hand in snipping the input line into columns (formatted input prevents this, as each format reads exactly the number of bytes specified by the format length, and delimiters are ignored).
Since I specified the date-only informat for Created_On, the remaining bytes of that column in the infile are simply discarded.
If you also want to read the time, read the whole input column into a character variable, split it with SCAN(), and use the INPUT() function with proper informats on the slices to convert to date/time.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Kurt_Bremser wrote:
Since I specified the date-only informat for Created_On, the remaining bytes of that column in the infile are simply discarded.
I'm under the impression it's not that simple. If I add AM or PM after the time part (as suggested by the initial post), I get "Invalid data" messages and missing values. It's surprising enough to me that MMDDYY10. used with modified list input (the informat length is ignored in this situation) is able to discard the time part without AM/PM. This is not mentioned in the documentation. Even ANYDTDTE. fails (but without messages in the log!) as soon as AM or PM are added, although the documentation states the contrary (incorrectly, I think).
Reading the field into a character variable, as you've described, would be the straightforward remedy in case of such problems.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You are right. Adding something to the string causes problems, so reading the whole field into character and splitting is the way to go.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
OK, so it's clear. You mixed list and formatted input, which usually fails. Happens to everyone at least once 😉
Do this instead:
data testing;
informat Created_On mmddyy10.;
length Location $16;format Created_On mmddyy10.;
infile
'C:\Users\****\Desktop\SAStest.csv'
dlm=','
firstobs=2
dsd
;
input
Case_Number $
Created_On
Location
;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content