- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi there,
When import csv file with date column like below, SAS data will have a small percentage of rows showing incorrect date, some dates become future dated. i.e. date 2020-03-09 may be converted to 3rd September instead of 09 March. I had similar issue with reading CSV using date format ddmmyy10. of which now resolved by using format date9. instead of using ddmmyy10.
I guess the solution might be similar to using a different format other than
anydtdtm40. or datetime. ?
Can anyone help me out?
it is frustrating when you spot some of the dates were incorrectly converted and you have no idea how to make it right.
Thanks 🙂
date in CSV file:
SAS data:
Left with wrong dates, right column is the correct date it should be showing.
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Good that you could resolve the issue for now. Reading through this discussion the following just for the future.
Difference between .csv and .xlsx
A .csv is a comma delimited text file. The .csv format is closely tied to MS Excel and you can easily open it using Excel - but: Excel will parse the .csv text file and convert the "raw" values to what Excel "thinks" they are. And it will represent these internal values using default Excel formats. So opening a .csv with Excel is anything but "What You See is What You Get".
With SAS you read .csv without any Excel involvement. SAS will basically do what Excel does: Impute and convert the text in the .csv into SAS variables. That's why everybody is always asking to show data not how it looks in Excel but in a Text editor like Notepad (which doesn't change anything).
Reading a .csv into SAS
You can either use Proc Import - which like Excel will first parse the text data, make some assumptions and then read the data into a SAS table. The result is very often what you need - but as it's based on assumptions SAS makes it's not always what you need/expect.
You can also use a SAS data step. There you have to code for everything but you also have full control and can get what you need.
Now in your case you're obviously using SAS macro %FTP_SAS_PC(test_date,Import); someone else wrote. It's this SAS macro which creates the code reading the .csv text file into SAS. The SAS log you've posted shows what code the SAS macro generates to read the text file - so it's not only transferring the .csv it's also reading the data into a SAS file.
The relevant bit which caused you pain is here:
A SAS informat instructs SAS how to read a text string into a SAS variable. Here it instructs SAS how to read a text string representing a date into a SAS numerical variable representing a date (which is the count of days since 1/1/1960). The special thing about the anydtdtm. informat: It accepts different patterns of source strings representing a date (like: 01Mar2020 or 01/03/2020 or 03/01/2020). It just tries one pattern after the other until it can convert the string. The documentation here shows you which text patterns the informat tries. The problem is: "Some people" make a mess and have the month before the day. This leads to a situation where a text pattern representing a date becomes ambiguous. So 01/03/2020 can be 01Mar2020 or 03Jan2012. Because informat anydtdtm. will just try until it can convert a source string to a SAS date, the sequence how it tries is relevant - like does it first try a pattern of mm/dd/yyyy or first dd/mm/yyyy. If your .csv comes from the US then you likely want the informat to try mm/dd/yyyy first; else it's likely first dd/mm/yyyy.
You can instruct SAS which sequence to use by setting option DATESTYLE documented here. If your macro reads some date values wrongly into SAS then investigate what you've actually got in the .csv and then set the DATESTYLE option accordingly prior to calling macro %FTP_SAS_PC()
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Show us the code you are using. Show us a portion of the exact data set and then show us the results of running the code. You did not show us the data from the csv file that relates to the output you show. The data you showed does not match in any way the results you show.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
DO NOT POST DATA AS PICTURES!!
Copy/paste text data into a window opened with </>, so we can readily use it.
Post your code in a similar way, you can also use the next button (little running man) for this.
Your timestamps can be easily read using the e8601dt informat:
data want;
input dt e8601dt19.;
format dt datetime19.;
datalines;
2020-02-21 00:00:00
;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
What do the lines in the CSV file look like? You posted a photograph of what looks like part of a spreadsheet. A CSV file is a TEXT file. Just copy and paste the lines of text from the file. Do NOT open the CSV file with Excel or any other spreadsheet program, they will change the contents into a spreadsheet instead of showing you the actual text in the file.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi Thank you all for your suggestions. My apologies not making the issue easier for helpers 🙂
there are a couple of macros being used in the programme to transfer CSV file from PC to server.
The datetime column in csv looks like this (note the difference in the formula field to the cell):
Below are the codes used to get the data:
%FTP_SAS_PC(test_date,Import);
data test;
set test_date;
format date DTDATE9.;
run;
proc sort data=test nodupkey;by date; run;
the test data will show some incorrect dates where the day & month being misplaced.
after changed the format for the dates column in CSV from ddmmyyyy to mmddyyyy, the issue resolved using the same code above.
When I changed the format to
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
It does NOT(!!) look like this AT ALL.
You are opening the csv file with Excel, what you see on the screen is what Excel makes out of it, which is usually quite some distance away from what's actually in there. It is quite possible that Excel converted the dates wrongly when reading the csv.
DO NOT(!!) post data as pictures, pictures are USELESS.
Open the csv file with a text editor, and copy/paste the text into a window opened with </>. THIS IS MANDATORY AND ABSOLUTELY REQUIRED for us to give proper help.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi, my dear helpers 🙂
This issue exist when import CSV from PC to server.
I had an earlier issue which is similar but exist when export data to CSV from sas server to PC. It was resolved with the help from @Reeza who suggested change dates format from ddmmyy to mmddyy as that is what excel likes (I guess it is to do with default settings in excel). I don't have any export dates issue ever since I changed the date format from ddmmyy10. to date9. for all sas data sets to be exported. This stopped excel to incorrectly switch numbers (under 13) between month and day.
I finally managed a work around to ensure dates are correctly imported from CSV file from PC to server by change the format in CSV(in excel) to mmddyy format (customized format).
I wondered if there is an option set the default dates format that will in line with default settings between excel/csv to SAS environment?
in the macro there is nothing set the dates format, but the log shows the date format(datetime.) and informat(anydtdtm40.):
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Show us the REAL contents of the csv file.
Everything else is just a waste of time.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
03/10/2020 0:00
03/11/2020 0:00
03/11/2020 0:00
03/16/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/15/2020 0:00
03/15/2020 0:00
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Use the proper informats, and everything will be fine:
data want;
input datepart :mmddyy10. timepart time5.;
dt = dhms(datepart,0,0,timepart);
format
datepart yymmddd10.
timepart time8.
dt datetime19.
;
datalines;
03/17/2020 0:00
03/10/2020 0:00
03/11/2020 0:00
03/11/2020 0:00
03/16/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/15/2020 0:00
03/15/2020 0:00
;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The dates will be correctly imported when csv with data formatted below (mm/dd/yyyy):
03/14/2020 0:00
03/15/2020 0:00
03/15/2020 0:00
Some dates will be wrong when import from CSV with below data (yyyy-mm-dd):
2020-03-17 00:00:00
2020-03-10 00:00:00
or (dd/mm/yyyy):
14/03/2020 0:00
09/03/2020 0:00
10/03/2020 0:00
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you have mixed formats in the spreadsheets, you should correct that first. SAS expects a consistent format for a given column throughout all rows.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
the data itself is nice and clean. but it will be converted to wrong dates if it were not using the format starting with mm instead of dd. (error occurred when days number <=12)
So, it worked when I manually changed the format to mm/dd/yy in excel. This is not an ideal solution and needs extra step on a daily basis.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Anyway, it worked by manually change the CSV file (using format mm/dd/yy) before import the CSV to SAS.
Thanks for all the helps 🙂
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Good that you could resolve the issue for now. Reading through this discussion the following just for the future.
Difference between .csv and .xlsx
A .csv is a comma delimited text file. The .csv format is closely tied to MS Excel and you can easily open it using Excel - but: Excel will parse the .csv text file and convert the "raw" values to what Excel "thinks" they are. And it will represent these internal values using default Excel formats. So opening a .csv with Excel is anything but "What You See is What You Get".
With SAS you read .csv without any Excel involvement. SAS will basically do what Excel does: Impute and convert the text in the .csv into SAS variables. That's why everybody is always asking to show data not how it looks in Excel but in a Text editor like Notepad (which doesn't change anything).
Reading a .csv into SAS
You can either use Proc Import - which like Excel will first parse the text data, make some assumptions and then read the data into a SAS table. The result is very often what you need - but as it's based on assumptions SAS makes it's not always what you need/expect.
You can also use a SAS data step. There you have to code for everything but you also have full control and can get what you need.
Now in your case you're obviously using SAS macro %FTP_SAS_PC(test_date,Import); someone else wrote. It's this SAS macro which creates the code reading the .csv text file into SAS. The SAS log you've posted shows what code the SAS macro generates to read the text file - so it's not only transferring the .csv it's also reading the data into a SAS file.
The relevant bit which caused you pain is here:
A SAS informat instructs SAS how to read a text string into a SAS variable. Here it instructs SAS how to read a text string representing a date into a SAS numerical variable representing a date (which is the count of days since 1/1/1960). The special thing about the anydtdtm. informat: It accepts different patterns of source strings representing a date (like: 01Mar2020 or 01/03/2020 or 03/01/2020). It just tries one pattern after the other until it can convert the string. The documentation here shows you which text patterns the informat tries. The problem is: "Some people" make a mess and have the month before the day. This leads to a situation where a text pattern representing a date becomes ambiguous. So 01/03/2020 can be 01Mar2020 or 03Jan2012. Because informat anydtdtm. will just try until it can convert a source string to a SAS date, the sequence how it tries is relevant - like does it first try a pattern of mm/dd/yyyy or first dd/mm/yyyy. If your .csv comes from the US then you likely want the informat to try mm/dd/yyyy first; else it's likely first dd/mm/yyyy.
You can instruct SAS which sequence to use by setting option DATESTYLE documented here. If your macro reads some date values wrongly into SAS then investigate what you've actually got in the .csv and then set the DATESTYLE option accordingly prior to calling macro %FTP_SAS_PC()