BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
crimdoc
Calcite | Level 5

Hi,

 

I'm new to the forum, so I'm looking forward to not only getting some support from the community forums but also hoping to also provide some where possible.

 

I can't seem to figure out the following (without considering transposing the data in some way).  I'm trying to do some calculations using dates but the dates I'm working with are on two separate records for the same individual.   I suspect there is some lag function, but this is an area that I am unfamiliar with.

 

Here is what the data looks like:

 

ID     DATE1                   DATE2

1      2009-06-14       2009-08-11

1      2009-10-29

1      200912-31

2      2009-09-05       2009-11-10

2      2009-10-09

2      2010-05-23

and so on.....

 

The output I'm looking for is the following:

ID     DATE1                 DATE2            DAYS

1     2009-06-14       2009-08-11         79  

1     2009-10-29

1     2009-12-31

2     2009-09-05       2009-11-10        194

2     2009-10-09

2     2010-05-23

 

and so on.....

 

So, for the first person (ID #1)  (LAG)DATE1 - DATE2 = 20091029 - 20090811 = 79 days.   So, no further calculations are required.

 

For the 2nd person (ID #2)  (LAG)DATE1 - DATE2 =  20091009 - 20091110 = -32 days.  When a negative "DAYS" is encountered, I would like to be able to check the next record for that person to see if there is a non-negative count.  So, it would loop to the 3rd record whereby (LAG)DATE1 - DATE2 = 20100523 - 20091110 = 194 days.

 

For illustrative purposes, I've included the actual dates, but I did convert the dates into numeric values using  NEWDATE1 = input (DATE1, yymmdd8.) to allow for the calculations.

 

Essentially, I would like one record per person which tells me when did the next event (DATE1) occur after DATE2.

 

Any help with this would be GREATLY appreciated.

 

Thanks in advance for your time,

M. Anthony

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

This will do it:

 


proc sql;
create table want as
select 
    a.*, 
    case 
        when date2 is missing then .
        else intck( "DAY", date2,
            (select min(date1) from have where id=a.id and date1 > a.date2) )
        end as days
from 
    have as a
order by id, date1;
quit;
PG

View solution in original post

6 REPLIES 6
LinusH
Tourmaline | Level 20
Since you need to access the information on the next row, I think you need to re-sort your data in descending order on date1.
The you can use lag () or RETAIN (with BY and first./last.-logic). Which to use is a matter of taste, and for me I find RETAIN easier to follow. There a tons of lag () and RETAIN examples on support.Sas.com.
Data never sleeps
Steelers_In_DC
Barite | Level 11

I agree, you can't look down and then pull up.  If you want the presentation you have I think you need to sort:

 

data have;
infile cards dsd;
informat d 5. date1 date2 yymmdd10.;
format d 5. date1 date2 yymmdd10.;
input D DATE1 DATE2;
cards;
1,2009-06-14,2009-08-11
1,2009-10-29,
1,2009-12-31,
2,2009-09-05,2009-11-10
2,2009-10-09,
2,2010-05-23,
;

proc sort data=have;by d descending date1;

data want;
set have;
by d descending date1;
days2 = lag3(date1);
DAYS = lag(date1)-date2;
if days < 0 and not missing(days) then days = date2-days2;
drop days2;
run;

proc sort data=want;by d date1;

PGStats
Opal | Level 21

This will do it:

 


proc sql;
create table want as
select 
    a.*, 
    case 
        when date2 is missing then .
        else intck( "DAY", date2,
            (select min(date1) from have where id=a.id and date1 > a.date2) )
        end as days
from 
    have as a
order by id, date1;
quit;
PG
slchen
Lapis Lazuli | Level 10

If date1 of 3 or more later record in each ID is larger than date2, in such condition, you could try hash.

 

data want;
   if 0 then set have(keep=id date1);
   if _n_=1 then do;
    declare hash h(dataset:'have(keep=id date1 rename=(date1=_date1))',multidata&colon;'y');
    declare hiter hi('h');
      h.definekey('id');
      h.definedata('_date1');
      h.definedone();
    end;
    set have;
    by id notsorted;
    if first.id then do;
       rc=hi.setcur();
       do while (rc=0);
         if _date1>date2 then do;
           temp=_date1;      
           goto day;
         end;
         rc=hi.next();
    end;
    day: day=temp-date2;
    end;
    drop temp _date1 rc;
run;

 

crimdoc
Calcite | Level 5

Hi,

 

Thank you for all the quick replies and for the various suggestions.  For my current purposes, the solution by PGStats seemed to be relatively straightfoward and did the trick.   The other solutions are certain to come in hand in the future, so they are definitely worth adding to my syntax list.

 

Thanks again for the help.

FreelanceReinh
Jade | Level 19

Just for the records:

Solutions to this kind of problem using the RETAIN statement or (especially) the LAGn functions, as have been suggested, are somewhat prone to unwanted carry-over effects from one BY group to another. Care has to be taken to avoid these. Neither RETAIN nor LAG/LAGn respect BY groups by default.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1119 views
  • 3 likes
  • 6 in conversation