Hi,
I am trying to import data using the data step and infile, however one of my data fields has the format '31/12/2015 5:45:00 PM'.
How can I tell SAS to read this as a date (preferably remove time all together).
Ive tried the following.
Data testdata;
infile 'filepath' dsd;
input Date DDMMYY10.;
run;
but Im guessing I need a different Date format, as this gives me an error in the log?
Thanks
Thanks
OK, so it's clear. You mixed list and formatted input, which usually fails. Happens to everyone at least once 😉
Do this instead:
data testing;
informat Created_On mmddyy10.;
length Location $16;
format Created_On mmddyy10.;
infile
'C:\Users\****\Desktop\SAStest.csv'
dlm=','
firstobs=2
dsd
;
input
Case_Number $
Created_On
Location
;
run;
If your data looks exactly like your example, your code should work. Please post the ERROR message and preceding code from the log
Here is the exact code I am using:
data testing;
infile 'C:\Users\****\Desktop\SAStest.csv' dlm=',' firstobs=2 dsd;
input Case_Number $ Created_On DDMMYY10. Location $16.;
run;
Here is the error
NOTE: Increase available buffer lines with the INFILE n= option.
Case_Number=Cas 1 Created_On=. Location=Cas 2,12/26/2015 _ERROR_=1 _N_=1
NOTE: Invalid data for Created_On in line 4 7-16.
5 Cas 4,12/23/2015 10:27,Victoria 31
To further explain what my version does:
The DLM=',' option sets the delimiter. By not using a format in the INPUT statement, I give SAS free hand in snipping the input line into columns (formatted input prevents this, as each format reads exactly the number of bytes specified by the format length, and delimiters are ignored).
Since I specified the date-only informat for Created_On, the remaining bytes of that column in the infile are simply discarded.
If you also want to read the time, read the whole input column into a character variable, split it with SCAN(), and use the INPUT() function with proper informats on the slices to convert to date/time.
@Kurt_Bremser wrote:
Since I specified the date-only informat for Created_On, the remaining bytes of that column in the infile are simply discarded.
I'm under the impression it's not that simple. If I add AM or PM after the time part (as suggested by the initial post), I get "Invalid data" messages and missing values. It's surprising enough to me that MMDDYY10. used with modified list input (the informat length is ignored in this situation) is able to discard the time part without AM/PM. This is not mentioned in the documentation. Even ANYDTDTE. fails (but without messages in the log!) as soon as AM or PM are added, although the documentation states the contrary (incorrectly, I think).
Reading the field into a character variable, as you've described, would be the straightforward remedy in case of such problems.
You are right. Adding something to the string causes problems, so reading the whole field into character and splitting is the way to go.
OK, so it's clear. You mixed list and formatted input, which usually fails. Happens to everyone at least once 😉
Do this instead:
data testing;
informat Created_On mmddyy10.;
length Location $16;
format Created_On mmddyy10.;
infile
'C:\Users\****\Desktop\SAStest.csv'
dlm=','
firstobs=2
dsd
;
input
Case_Number $
Created_On
Location
;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.