BookmarkSubscribeRSS Feed
Sathish_jammy
Lapis Lazuli | Level 10

Hi,

I need to calculate the Record_date column using Duration and Diagnosis_Date.

The Duration column in different formats. I simply updated the Record_date column as manual calculation

but suggest me to find the accurate date in the Record_date column.

 

    OUTPUT
Duration Diag_Date Record_Date
4 Years 2 Days  23-06-2016 21-06-2012
12 Years  27-06-2008 27-06-1996
5 Years 1 Month 3 Days  14-05-2015 11-04-2010
3 Month 17 Days  09-03-2020 21-11-2019
5 Days  18-03-2020 13-03-2020
1 MONTH 07-05-2020 07-04-2020
3 Weeks 18-03-2019 25-02-2019

Thanks in advance!

2 REPLIES 2
Reeza
Super User
1. Use COUNTW() to find the number of terms.This is used to control the loop ie first row has years and days whereas the last has only weeks.
2. Use SCAN to extract the components within a loop.
3. Pass parameters to INTNX() - you need to likely remove the s from weeks etc.

data want;
set have;
x= countw(duration);
new_date = diag_date;
do i=1 to x by 2;
increment = scan(duration, i);
interval = scan(duration, i+1);
new_date = intnx(increment, new_date, duration, 's');
end;

run;

This is only an outline. Untested. You likely need to verify your increments come in correctly but I think the solution as outlined would work.

I'd also explore a custom informat but that seems like a lot more work.
mkeintz
PROC Star

@Sathish_jammy 

 

Your sample observation with

   duration=3 month 17 days,  and diag_date=09-03-2020

is interesting

 

You have the expected record_date=21-11-2019, which is not the result SAS will generate if you

  1. First, subtract 3 months from 09-03-2020,    yielding  09-12-2019,    09-12-2020
       followed by
  2. Subtract 17 days from   09-12-2019,    yielding  record_date= 22-11-2019

Yet if you went in the other direction, starting with record_date=22-11-2019, add three months, then add 17 days you would get diag_date=09-03-2020, as per your sample.

 

You will have to decide which date is the reference point.  Given the text in duration ("x years y months z days") the actual number of days between record_date and diag_date is not symmetrical.   BTW, you would still have asymmetry even if you reversed the elements in duration (z days y months x years).

 

If, since most people think of duration as going forward, you want to make record_date as the reference point, even though it is derived from diag_date, you'll have to do a two-stage process:

  1. Calculate a provisional record_date first, going backwards from diag_date.
  2. Then see if going forward from the provisional record_date yields diag_date.  If it doesn't, adjust the provisional record_date accordingly.
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 921 views
  • 0 likes
  • 3 in conversation