Hi SAS community,
I ran into a data change issue when using PROC IMPORT to read an XLSX file into SAS. There is one column, let us call it date1. The column is text in the Excel file and contains date information in different formats; please see screenshot1. When it is imported into SAS, the value of date1 changes as shown in screenshot2. You can see that the highlighted part in screenshot2 becomes 4xxxx. If I try to use the PUT function to convert it to character, the value changes to other dates, as shown in screenshot3.
Is there any solution to avoid this data change issue?
Thanks!
Accepted Solutions
What you are describing is what happens when you mix date values and strings in the same column in your Excel file. When SAS sees mixed types in the same column, it creates the variable as character. But then the date values come across as digit strings that represent the actual number Excel uses to store a date, instead of the formatted value that Excel shows you when you look at the spreadsheet.
The easiest fix is to find the cells in the Excel sheet that hold date values instead of strings and convert them from date to text. Then they will import the same way as the other text cells.
If you cannot do that, you can write code to check whether a value looks like a raw Excel date and then convert it.
So, assuming the variable holding the mixed-up column is named RAW, this code will make a new variable named DATE with a SAS date value.
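data want;
  set have;
  /* First try to read RAW as a text date such as 01JAN2021 or 01-JAN-2021 */
  date=input(raw,??date11.);
  format date date9.;
  /* Otherwise, if RAW is a number in the plausible Excel date range, convert it */
  if missing(date) and 1 <= input(raw,??32.) <= 50000 then do;
    date = int(input(raw,??32.))+'30DEC1899'd ;
  end;
run;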
SAS counts dates as days since 1 January 1960 (day 0), while Excel counts from 1900 and numbers 1 January 1900 as day 1 rather than 0. Excel also mistakenly treats 1900 as a leap year. Those two quirks add up to a 2-day difference from simply adding the SAS date for 1/1/1900, which is why the code adds '30DEC1899'd instead (remember that dates before 1960 are negative numbers in SAS).
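As a quick sanity check of that offset, here is a minimal sketch using one known value: Excel stores 1 January 2021 as the serial number 44197. The variable names are only for illustration.

```sas
data _null_;
  /* 44197 is the serial number Excel uses for 1 January 2021 */
  excel_serial = 44197;
  /* '30DEC1899'd is a negative SAS date, so adding it shifts the Excel base to the SAS base */
  sas_date = excel_serial + '30DEC1899'd;
  put sas_date= date9.;   /* writes sas_date=01JAN2021 to the log */
run;
```

If the log shows the date you expect for a value taken from your own sheet, the same offset should work for the rest of the column.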
Sometimes, depending on how badly the data has been mixed up, saving the spreadsheet as a CSV file and importing that instead may fix this sort of issue. When importing the CSV, set the GUESSINGROWS option (available for delimited files) to a large value, either in the import wizard or in your PROC IMPORT code.
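A minimal sketch of the PROC IMPORT version is below. The file path and the output dataset name are placeholders, not taken from the original post, and GUESSINGROWS=MAX assumes a reasonably recent SAS 9.4 release; on older releases a large number can be used instead.

```sas
/* The path and dataset name here are hypothetical; substitute your own */
proc import datafile="/path/to/dates.csv"
    out=work.have
    dbms=csv
    replace;
  getnames=yes;
  /* Scan all rows before deciding each column's type and length */
  guessingrows=max;
run;
```

Because a CSV holds whatever text Excel wrote out for each cell, the column should arrive as the displayed values rather than as Excel serial numbers, though it is worth checking the CSV in a text editor first.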