SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Suzy_Cat
Pyrite | Level 9

Hi there,

 

When import csv file with date column like below, SAS data will have a small percentage of rows showing incorrect date, some dates become future dated. i.e. date 2020-03-09  may be converted to 3rd September instead of 09 March.  I had similar issue with reading CSV using date format ddmmyy10. of which now resolved by using format date9. instead of using ddmmyy10. 

 

I guess the solution might be similar to using a different format other than

anydtdtm40. or datetime. ?

 

Can anyone help me out?

it is frustrating when you spot some of the dates were incorrectly converted and you have no idea how to make it right.

 

Thanks 🙂 

 

 

date in CSV file:

Capture.PNG

 

SAS data:

Left with wrong dates, right column is the correct date it should be showing.

 

Capture1.PNG

 

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

@Suzy_Cat 

Good that you could resolve the issue for now. Reading through this discussion the following just for the future.

 

Difference between .csv and .xlsx

A .csv is a comma delimited text file. The .csv format is closely tied to MS Excel and you can easily open it using Excel - but: Excel will parse the .csv text file and convert the "raw" values to what Excel "thinks" they are. And it will represent these internal values using default Excel formats. So opening a .csv with Excel is anything but "What You See is What You Get".

With SAS you read .csv without any Excel involvement. SAS will basically do what Excel does: Impute and convert the text in the .csv into SAS variables. That's why everybody is always asking to show data not how it looks in Excel but in a Text editor like Notepad (which doesn't change anything).

 

Reading a .csv into SAS

You can either use Proc Import - which like Excel will first parse the text data, make some assumptions and then read the data into a SAS table. The result is very often what you need - but as it's based on assumptions SAS makes it's not always what you need/expect.

You can also use a SAS data step. There you have to code for everything but you also have full control and can get what you need.

 

Now in your case you're obviously using SAS macro %FTP_SAS_PC(test_date,Import); someone else wrote. It's this SAS macro which creates the code reading the .csv text file into SAS. The SAS log you've posted shows what code the SAS macro generates to read the text file - so it's not only transferring the .csv it's also reading the data into a SAS file.

The relevant bit which caused you pain is here:

Capture.JPG

A SAS informat instructs SAS how to read a text string into a SAS variable. Here it instructs SAS how to read a text string representing a date into a SAS numerical variable representing a date (which is the count of days since 1/1/1960). The special thing about the anydtdtm. informat: It accepts different patterns of source strings representing a date (like: 01Mar2020 or 01/03/2020 or 03/01/2020). It just tries one pattern after the other until it can convert the string. The documentation here shows you which text patterns the informat tries. The problem is: "Some people" make a mess and have the month before the day. This leads to a situation where a text pattern representing a date becomes ambiguous. So 01/03/2020 can be 01Mar2020 or 03Jan2012. Because informat anydtdtm. will just try until it can convert a source string to a SAS date, the sequence how it tries is relevant - like does it first try a pattern of mm/dd/yyyy or first dd/mm/yyyy. If your .csv comes from the US then you likely want the informat to try mm/dd/yyyy first; else it's likely first dd/mm/yyyy. 

You can instruct SAS which sequence to use by setting option DATESTYLE documented here. If your macro reads some date values wrongly into SAS then investigate what you've actually got in the .csv and then set the DATESTYLE option accordingly prior to calling macro %FTP_SAS_PC()

 

 

View solution in original post

17 REPLIES 17
PaigeMiller
Diamond | Level 26

Show us the code you are using. Show us a portion of the exact data set and then show us the results of running the code. You did not show us the data from the csv file that relates to the output you show. The data you showed does not match in any way the results you show.

--
Paige Miller
Kurt_Bremser
Super User

DO NOT POST DATA AS PICTURES!!

Copy/paste text data into a window opened with </>, so we can readily use it.

Post your code in a similar way, you can also use the next button (little running man) for this.

Your timestamps can be easily read using the e8601dt informat:

data want;
input dt e8601dt19.;
format dt datetime19.;
datalines;
2020-02-21 00:00:00
;
Tom
Super User Tom
Super User

What do the lines in the CSV file look like? You posted a photograph of what looks like part of a spreadsheet. A CSV file is a TEXT file.  Just copy and paste the lines of text from the file. Do NOT open the CSV file with Excel or any other spreadsheet program, they will change the contents into a spreadsheet instead of showing you the actual text in the file.

Suzy_Cat
Pyrite | Level 9

Hi Thank you all for your suggestions. My apologies not making the issue easier for helpers 🙂

 

there are a couple of macros being used in the programme to transfer CSV file from PC to server.

The datetime column in csv looks like this (note the difference in the formula field to the cell):

Capture.PNG

 

Below are the codes used to get the data:

 

%FTP_SAS_PC(test_date,Import);

data test;

set test_date;

format date DTDATE9.;

run;

proc sort data=test nodupkey;by date; run;

 

the test data will show some incorrect dates where the day & month being misplaced.

 

after changed the format for the dates column in CSV from ddmmyyyy to mmddyyyy, the issue resolved using the same code above.

 

 

 

When I changed the format to

 

 

 

Kurt_Bremser
Super User

It does NOT(!!) look like this AT ALL.

You are opening the csv file with Excel, what you see on the screen is what Excel makes out of it, which is usually quite some distance away from what's actually in there. It is quite possible that Excel converted the dates wrongly when reading the csv.

DO NOT(!!) post data as pictures, pictures are USELESS.

Open the csv file with a text editor, and copy/paste the text into a window opened with </>. THIS IS MANDATORY AND ABSOLUTELY REQUIRED for us to give proper help.

Suzy_Cat
Pyrite | Level 9

Hi, my dear helpers 🙂

 

This issue exist when import CSV from PC to server.

 

I had an earlier issue which is similar but exist when export data to CSV from sas server to PC.  It was resolved with the help from @Reeza who suggested change dates format from ddmmyy to mmddyy as that is what excel likes (I guess it is to do with default settings in excel). I don't have any export dates issue ever since I changed the date format from ddmmyy10. to date9. for all sas data sets to be exported.  This stopped excel to incorrectly switch numbers (under 13) between month and day.

 

I finally managed a work around to ensure dates are correctly imported from CSV file from PC to server by change the format in CSV(in excel) to mmddyy format (customized format).

 

I wondered if there is an option set the default dates format that will in line with default settings between excel/csv to SAS environment?

 

in the macro there is nothing set the dates format, but the log shows the date format(datetime.) and informat(anydtdtm40.):

Capture.PNG

 

 

Suzy_Cat
Pyrite | Level 9
03/17/2020 0:00
03/10/2020 0:00
03/11/2020 0:00
03/11/2020 0:00
03/16/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/15/2020 0:00
03/15/2020 0:00
Kurt_Bremser
Super User

Use the proper informats, and everything will be fine:

data want;
input datepart :mmddyy10. timepart time5.;
dt = dhms(datepart,0,0,timepart);
format
  datepart yymmddd10.
  timepart time8.
  dt datetime19.
;
datalines;
03/17/2020 0:00
03/10/2020 0:00
03/11/2020 0:00
03/11/2020 0:00
03/16/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/14/2020 0:00
03/15/2020 0:00
03/15/2020 0:00
;
Suzy_Cat
Pyrite | Level 9
put the story short:

The dates will be correctly imported when csv with data formatted below (mm/dd/yyyy):
03/14/2020 0:00
03/15/2020 0:00
03/15/2020 0:00

Some dates will be wrong when import from CSV with below data (yyyy-mm-dd):
2020-03-17 00:00:00
2020-03-10 00:00:00

or (dd/mm/yyyy):
14/03/2020 0:00
09/03/2020 0:00
10/03/2020 0:00
Suzy_Cat
Pyrite | Level 9
@Kurt_Bremser Thank you very much for your great support. I am really appreciated your time and effort in helping people out :).

the data itself is nice and clean. but it will be converted to wrong dates if it were not using the format starting with mm instead of dd. (error occurred when days number <=12)
So, it worked when I manually changed the format to mm/dd/yy in excel. This is not an ideal solution and needs extra step on a daily basis.


Suzy_Cat
Pyrite | Level 9
The problem is that I can not see anywhere in the code the format anydtdtm40. or datetime. being used (as highlighted in the screenshot from the log file for the macro). Therefore I have no control to use the correct informat or format to read the date data from the CSV 😞

Anyway, it worked by manually change the CSV file (using format mm/dd/yy) before import the CSV to SAS.
Thanks for all the helps 🙂
Patrick
Opal | Level 21

@Suzy_Cat 

Good that you could resolve the issue for now. Reading through this discussion the following just for the future.

 

Difference between .csv and .xlsx

A .csv is a comma delimited text file. The .csv format is closely tied to MS Excel and you can easily open it using Excel - but: Excel will parse the .csv text file and convert the "raw" values to what Excel "thinks" they are. And it will represent these internal values using default Excel formats. So opening a .csv with Excel is anything but "What You See is What You Get".

With SAS you read .csv without any Excel involvement. SAS will basically do what Excel does: Impute and convert the text in the .csv into SAS variables. That's why everybody is always asking to show data not how it looks in Excel but in a Text editor like Notepad (which doesn't change anything).

 

Reading a .csv into SAS

You can either use Proc Import - which like Excel will first parse the text data, make some assumptions and then read the data into a SAS table. The result is very often what you need - but as it's based on assumptions SAS makes it's not always what you need/expect.

You can also use a SAS data step. There you have to code for everything but you also have full control and can get what you need.

 

Now in your case you're obviously using SAS macro %FTP_SAS_PC(test_date,Import); someone else wrote. It's this SAS macro which creates the code reading the .csv text file into SAS. The SAS log you've posted shows what code the SAS macro generates to read the text file - so it's not only transferring the .csv it's also reading the data into a SAS file.

The relevant bit which caused you pain is here:

Capture.JPG

A SAS informat instructs SAS how to read a text string into a SAS variable. Here it instructs SAS how to read a text string representing a date into a SAS numerical variable representing a date (which is the count of days since 1/1/1960). The special thing about the anydtdtm. informat: It accepts different patterns of source strings representing a date (like: 01Mar2020 or 01/03/2020 or 03/01/2020). It just tries one pattern after the other until it can convert the string. The documentation here shows you which text patterns the informat tries. The problem is: "Some people" make a mess and have the month before the day. This leads to a situation where a text pattern representing a date becomes ambiguous. So 01/03/2020 can be 01Mar2020 or 03Jan2012. Because informat anydtdtm. will just try until it can convert a source string to a SAS date, the sequence how it tries is relevant - like does it first try a pattern of mm/dd/yyyy or first dd/mm/yyyy. If your .csv comes from the US then you likely want the informat to try mm/dd/yyyy first; else it's likely first dd/mm/yyyy. 

You can instruct SAS which sequence to use by setting option DATESTYLE documented here. If your macro reads some date values wrongly into SAS then investigate what you've actually got in the .csv and then set the DATESTYLE option accordingly prior to calling macro %FTP_SAS_PC()

 

 

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 17 replies
  • 14465 views
  • 8 likes
  • 5 in conversation