I am reading xlsx. data into SAS and am using USEDATE=YES. I have two date variables and both are in the exact same format in Excel. (In the cells themselves they appear such as 11-Feb-98 and in the function bar 2/11/1998.) When I use USEDATE and then run a proc print on the resulting dataset, only one of the date variables is transformed to date9 format. Does anyone have ideas why this is happening?
PS: Apologies if my language is off, I am very new to SAS.
Go back to the Excel an make sure the value is a simple date. Sometimes you will find the value is actually a character value that only looks like a date (because Excel provides NO control of any cell content type). If the value is character, and there aren't many of these, then reenter the value and reread the file. Another approach may be to export the xlsx to CSV using File>Save As and read that file. Sometimes the created CSV is more amenable to reading scrambled data.
You might share the entire code you used to read the xlsx file as different approaches may have different alternatives to "fixing" the issue.
Are you sure that both are actually dates? Perhaps one is a string. Try changing the date display style/format in Excel and see if the value changes.
Excel stores values into cells, but a SAS dataset uses variables. A variable has to have the same type of data for every observation. In a spreadsheet any cell can have anything in it. Are they in the same column of the sheet? SAS makes decisions on what type of variable to create based on the values in the column If they are in different columns then they might be getting converted to different types in SAS because of other differences between those two columns.
When SAS converts the sheet into a dataset it must impose a single type (character or numeric) for whole column. If any of the cells in that column have character strings then the whole column are character strings. Once the column is defined as character then any cells with date values in that column will be converted into character strings that contain the decimal representation of the number that Excel uses to store that date.
Thanks! This appears to have been the issue; SAS was reading the variable values as string. I went into the Excel sheet and manually changed the format to a date format for the column. Is there a way to make this fix on the SAS end instead of manually going in and editing the raw data itself?
If you have to use Excel as a data source then just make sure to be consistent in how you enter the data into a column so that every cell in the column is the same type.
You can convert the resulting digit strings that SAS will make for dates by just adjusting the offset date. If Excel is using 1900 as the base date then use:
realdate = input(excel_string,32.) + '30DEC1899'd ;
If it was using 1904 then use:
realdate = input(excel_string,32.) + '31DEC1903'd ;
The one day difference is because Excel treats 1900 as a leap year.
Go back to the Excel an make sure the value is a simple date. Sometimes you will find the value is actually a character value that only looks like a date (because Excel provides NO control of any cell content type). If the value is character, and there aren't many of these, then reenter the value and reread the file. Another approach may be to export the xlsx to CSV using File>Save As and read that file. Sometimes the created CSV is more amenable to reading scrambled data.
You might share the entire code you used to read the xlsx file as different approaches may have different alternatives to "fixing" the issue.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.