SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
User12321
Obsidian | Level 7

I have several challenges with dates.

1) I am reading data from Excel.

All  but one columns reads automatically as normal dd/mm/yyyy i.e. MMDDYY10. according to proc contents.

However, one column originally had  mixture of dates in format  12/31/2020 and TBD.  As a result date comes as Char 5. And all dates are automatically converted to some kind on numbers like 44196 (instead of 12/31/2020).

I fist replaced TBD with space using proq sql and translate(BadDate, '  ', 'TBD') as Date. 

Now I have column full of 5  digits instead of regular dates (like 44196) or empty cells. I need to convert these 5 digit numbers to normal dates. 

I tried NewDate=input(Date,MMDDYY10.) and get column full of  "." instead of dates.

Any suggestions? What am I doing wrong? Goal is to have in column normal dates or empty cells.

2)Is there a way to compare dates in SAS. For example, if I need to compare dates in two columns and keep latest can I use normal comparison or should I convert all dates to SAS dates first?

3) Similarly, if I need to remove all records with dates that expired 3 months ago, is there a simple solution like read current date, calculate 90 dates determine cutoff date.

Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions
User12321
Obsidian | Level 7
Final solution that worked (if someone will have same problem) :
NewDate=input(Date,best12.);
NewDate=NewDate -365*60 -16; /*I have not idea why, but that gave me correct original dates; */
format NewDate mmddyy10.;

View solution in original post

12 REPLIES 12
SASKiwi
PROC Star

You could try this:

NewDate=input(Date,best12.);
format NewDate mmddyy10.;

Your character variable is no longer in date format but probably the number of days since 1 Jan 1960. Converting that to a number and applying a date format should fix your problem.

User12321
Obsidian | Level 7
Thanks.It almost worked but did not provide accurate results.I now have correct format with column full of wrong dates...Here first 5 lines of original dates:12/31/202012/31/20215/31/20215/31/20218/31/2021

It was read by Excel into SAS as4419644561443474434744439

After using suggested code:
NewDate=input(Date,best12.);
format NewDate mmddyy10.;
My output is:01/01/208101/01/208206/01/208109/01/2081

I am off by 1 day and 60 years :). Should I subtract something prior to conversion?Thanks.
User12321
Obsidian | Level 7

It looks that I need to subtract 365*60+1. Checking...

Tom
Super User Tom
Super User

You need to adjust for difference in where SAS and Excel start counting (also because Excel treats 1900 as a leap year).

Note their is no "best" informat, just use regular informat.

NewDate=input(Date,32.) + '30DEC1899'd;
format NewDate mmddyy10.;

PS If you are sharing this data with other you might want to use a format that does not use either MDY or DMY ordering of date parts as either choice will confuse half of you audience.  I like to use YYMMDD10. or DATE9.

User12321
Obsidian | Level 7

Thanks!!! That worked too.

User12321
Obsidian | Level 7
Final solution that worked (if someone will have same problem) :
NewDate=input(Date,best12.);
NewDate=NewDate -365*60 -16; /*I have not idea why, but that gave me correct original dates; */
format NewDate mmddyy10.;
Tom
Super User Tom
Super User

It works because that is the value SAS uses to store 2 days before start of 1900.  SAS counts from 1960 and Excel counts from 1900.  They use different choices for whether to count from 0 or 1 and Excel treats 1900 has a leap year.

 

596    data test;
597     x = -365*60 -16;
598     y = '30DEC1899'd ;
599     put (x y) (= comma12. / );
600   run;

x=-21,916
y=-21,916
Kurt_Bremser
Super User
What you subtracted is '30dec1899'd.
Excel starts (nominally) at 1900-01-01, but it starts with day 1 (1960-01-01 is day zero in SAS), and mistakes 1900 for a leap year.
Kurt_Bremser
Super User

The best method is this:

  • save the data to a textual format, like CSV
  • create a custom informat that reads 'TBD' as missing, and other with DDMMYY10.
  • use this informat in a DATA step that reads the text file

With your current data, you need to to this:

if chardate = 'TBD'
then numdate = .;
else numdate = input(chardate,5.) + '30dec1899'd;
format numdate yymmdd10.;

The addition of 30dec1899 corrects the different zero day of Excel vs. SAS.

User12321
Obsidian | Level 7

Thanks a lot to everyone who replied! 

If some one has idea how to answer items 2 and 3 from original post, it would be greatly appreciated.

ballardw
Super User

1) I am reading data from Excel.

All  but one columns reads automatically as normal dd/mm/yyyy i.e. MMDDYY10. according to proc contents.

However, one column originally had  mixture of dates in format  12/31/2020 and TBD.  As a result date comes as Char 5. And all dates are automatically converted to some kind on numbers like 44196 (instead of 12/31/2020).

I fist replaced TBD with space using proq sql and translate(BadDate, '  ', 'TBD') as Date. 

Now I have column full of 5  digits instead of regular dates (like 44196) or empty cells. I need to convert these 5 digit numbers to normal dates. 

 

You do not say how you are reading the values for Excel. This sort of smells like using Proc Import. Which often results in sub-optimal results because Excel imposes absolutely NO restrictions on what goes into any single cell and so results can be pretty poor.

 

If you save the file as a CSV and use a data step to read the resulting text file you have much more control over the result and you can use custom informats to read problematic data such as you show.

Example:

proc format;
invalue tbddate  (upcase)
'TBD' = .
other = [mmddyy10.]
;

data example;
   input x :tbddate.;
   format x mmddyy10.;
datalines;
TBD
tbd
tBD
01/01/2021
10/12/1995
;

This approach has several advantages. First, your code to remove 'TBD' only works if it always all upper case. The informat as defined with the (upcase) option converts all text to uppercase before comparsion so would get any of 3-letter versions correctly assigned to missing. Second if you wanted to assign a specific date value for some reason other than missing you could do that as:  'TBD' = '31DEC2099'd  for example.

Another is if at a latter date you have multiple date columns with this headache you can use the same format to read all of them.

 

2) SAS date values are numeric. So if you need the "latest" that would be the largest of a group of values and the MAX function would work:  LatestDate = max(date1, date2, newdate);

 

3) Similarly, if I need to remove all records with dates that expired 3 months ago, is there a simple solution like read current date, calculate 90 dates determine cutoff date.

The function TODAY() returns the current date. You could use the INTCK function to get the number of intervals specified. The question becomes how you define "3 months ago".

The basic approach would be

IF intck('month',date,Today()) ge 3 then delete.

You specifically asked for month so that is the interval I show. You can ask for 'year', 'week', 'day', time intervals and if the variables are datetime values ask for 'dtyear' 'dtmonth' 'dtweek' 'dtday' to use the part of the algorithm to return the intervals correctly. There are also offsets available.

The details can come with exact values you need for a specific purpose. Sometimes "3 months" might depend on the day of the month. So the INTCK function has an additional parameter that works with where the boundaries may fall. Or you can shift either of dates using the INTNX function as well.

The basic Intck as shown would return 1 for the interval between (today as) 1May2021 and 30Apr2021. It would also return 1 for 1Apr2021.

 

 

 

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 12 replies
  • 3700 views
  • 3 likes
  • 5 in conversation