BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ammarhm
Lapis Lazuli | Level 10

Hi everyone

I have a dataset that looks like this:

Patient_ID Visit_Date
1 13-08-2016
1 28-05-2013
1 24-12-2013
2 23-05-2012
2 11-01-2014
2 03-10-2017
2 16-02-2015
3 03-08-2013
3 28-10-2017
3 27-10-2013
3 18-04-2015
4 28-04-2016
4 07-08-2012
4 19-03-2013
4 25-05-2017
5 21-11-2014
5 20-03-2016
5 08-01-2015
5 21-09-2012

 

What i want to do is to extract a list of patients who have been seen in any 2 consecutive  years.

Could anyone help please?

Kind regards

 

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

You don't really tell us how the desired result should look like - especially for cases where you have two visits in year 1 and two visits in year 2.

Below one way to go:

data have;
  infile cards expandtabs truncover;
  input Patient_ID  Visit_Date : ddmmyy12.;
  year=year(visit_date);
  format visit_date ddmmyy10.;
  cards;
1 13-08-2016
1 28-05-2013
1 24-12-2013
2 23-05-2012
2 11-01-2014
2 03-10-2017
2 16-02-2015
3 03-08-2013
3 28-10-2017
3 27-10-2013
3 18-04-2015
4 28-04-2016
4 07-08-2012
4 19-03-2013
4 25-05-2017
5 21-11-2014
5 20-03-2016
5 08-01-2015
5 21-09-2012
;

proc sql;
  create table want as
    select l.*, r.visit_date as next_years_date format=ddmmyy10.
    from have as l inner join have as r
      on 
        l.patient_id=r.patient_id 
        and intck('year',l.Visit_Date,r.Visit_Date)=1
  ;
quit;

View solution in original post

6 REPLIES 6
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Something like (assumes sorted):

data want;
  set have;
  retain lst_v;
  if _n_=1 then lst_v=visit_date;
  else do;
    if year(visit_date)=year(lst_v)+1 then output;
  end;
  lst_v=visit_date;
run;
proc sort data=want nodupkey;
  by patient_id;
run;
Patrick
Opal | Level 21

You don't really tell us how the desired result should look like - especially for cases where you have two visits in year 1 and two visits in year 2.

Below one way to go:

data have;
  infile cards expandtabs truncover;
  input Patient_ID  Visit_Date : ddmmyy12.;
  year=year(visit_date);
  format visit_date ddmmyy10.;
  cards;
1 13-08-2016
1 28-05-2013
1 24-12-2013
2 23-05-2012
2 11-01-2014
2 03-10-2017
2 16-02-2015
3 03-08-2013
3 28-10-2017
3 27-10-2013
3 18-04-2015
4 28-04-2016
4 07-08-2012
4 19-03-2013
4 25-05-2017
5 21-11-2014
5 20-03-2016
5 08-01-2015
5 21-09-2012
;

proc sql;
  create table want as
    select l.*, r.visit_date as next_years_date format=ddmmyy10.
    from have as l inner join have as r
      on 
        l.patient_id=r.patient_id 
        and intck('year',l.Visit_Date,r.Visit_Date)=1
  ;
quit;
Ksharp
Super User
data have;
infile cards expandtabs truncover;
input Patient_ID	Visit_Date : ddmmyy12.;
year=year(visit_date);
format visit_date ddmmyy10.;
cards;
1	13-08-2016
1	28-05-2013
1	24-12-2013
2	23-05-2012
2	11-01-2014
2	03-10-2017
2	16-02-2015
3	03-08-2013
3	28-10-2017
3	27-10-2013
3	18-04-2015
4	28-04-2016
4	07-08-2012
4	19-03-2013
4	25-05-2017
5	21-11-2014
5	20-03-2016
5	08-01-2015
5	21-09-2012
;
proc sort data=have out=temp nodupkey;
by Patient_ID year;
run;
data temp;
 set temp;
 by Patient_ID;
 dif=dif(year);
 if first.Patient_ID then call missing(dif);
run;
proc sql;
select distinct Patient_ID
 from temp
  group by Patient_ID
   having sum(dif=1) ne 0;
quit;
mkeintz
PROC Star

I suggest a merge of HAVE with itself, in which one of the merged datasteams starts with firstobs=2.

 

 

data want (keep=patient year1 year2);
  set have (keep=patient);
  by patient;

  merge have
        have (firstobs=2 keep=date rename=(date=nxt_date));
  year1=year(date);
  year2=year(nxt_date);

  if last.patient=0 and year2=year1+1;
run;

 

This writes out one record for each instance of two consecutive records belonging to 2 consecutive years.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Peter_C
Rhodochrosite | Level 12
Patrick's approach is the way to go especially if the input could sit in memory.
s_lassen
Meteorite | Level 14
data want;
  used=0; /* this ID has not been output yet */
  do until(last.id);
    set have;
    by id;
    if used then continue;
    if dif(year(visit_date)) ne 1 then continue;
    if first.id then continue;
    used=1;
    output;
    end;
  keep id;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 2004 views
  • 7 likes
  • 7 in conversation