Hi all,
I have received an Excel file, where the date column is written in two formats: 7/nov/13 and 21oct2010. I can manually change the three month letter into /mon/ format in Excel.
However, I would like to do that only in SAS. When I use
Libname abc xlsx "~\abc.xlsx";
data abc;
set abc.Sheet1;
run;
The date column returns SAS Output 41585 and 21oct2010. I check with proc contents and the format for the date is char.
Do you have any idea of reading mixture date formats?
Thank you for reading my question!
I'd return such **** to sender.
Since you now have it as character in SAS, this means the following:
- 21oct2010 is stored in Excel as a string, causing the column to be considered character by libname xlsx
- therefore you get the raw numeric value of the real Excel date converted to a string in the same column
You can conditionally convert such values:
data have;
input column :$10.;
cards;
21oct2010
41585
;
run;
data want;
set have;
if length(column) > 5
then mydate = input(column,date9.);
else mydate = input(column,5.) + '30dec1899'd;
format mydate e8601da10.;
run;
This will of course work only until the sender comes up with another surprise for you.
Edit: corrected the date offset for Excel according to @Tom's advice.
Save as CSV, read in that variable using the anydtdte. informat.
Save as CSV, read in as text, then convert.
Fix the bad data at source.
Ditch Excel.
I'd return such **** to sender.
Since you now have it as character in SAS, this means the following:
- 21oct2010 is stored in Excel as a string, causing the column to be considered character by libname xlsx
- therefore you get the raw numeric value of the real Excel date converted to a string in the same column
You can conditionally convert such values:
data have;
input column :$10.;
cards;
21oct2010
41585
;
run;
data want;
set have;
if length(column) > 5
then mydate = input(column,date9.);
else mydate = input(column,5.) + '30dec1899'd;
format mydate e8601da10.;
run;
This will of course work only until the sender comes up with another surprise for you.
Edit: corrected the date offset for Excel according to @Tom's advice.
Great answer, except the conversion factor is a little off.
1) SAS considers zero the first day and Excel considers the first day as number one.
2) Excel thinks that 1900 was a leap year.
So you can either use '30DEC1899'd as your constant or subtract an additional 2 days.
Thank you all for your help!
I will try to ask the source to fix the column first, otherwise the solution here is perfect.
The goal has to be to reduce the process to the least necessary steps, and to make it error-proof in the sense that the process detects problems at the earliest possible moment.
The way to do this is to use a stable and reliable data transfer format (which implicitly disqualifies Excel files), and data steps that are custom-written to the file specification and will throw an error when unexpected data (unexpected either through content or format) is fed into them.
All my ETL jobs follow this guideline, and the only problems that "slip through" are of a semantic type where completely plausible data is given that later is found to be logically defective.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.