Combine periods across observations

Accepted Solution Solved
Reply
Regular Contributor
Posts: 161
Accepted Solution

Combine periods across observations

Dear All:

I'm wondering how I can combine a few date range into one single date range if the variable of interest is the same across periods.  Basically, the following is my HAVE table:

doctorperiod_startperiod_endhospital
A19972000X
A20002003X
A20032009X
B19852001Y
B20012009Y
B20092013Z
C19821990K
C19902001M

After combining redundant periods, here is my WANT table:

doctorperiod_startperiod_endhospital
A19972009X
B19852009Y
B20092013Z
C19821990K
C19902001M

How could I best do this, is there a coalesce-like function across observations?  Thank you!


Accepted Solutions
Solution
‎07-25-2013 03:17 AM
SAS Super FREQ
Posts: 709

Re: Combine periods across observations

Posted in reply to caveman529

Have a look at the following code:

*
* create test data
*;

data test;
  infile cards dlm=",";
 
input
    doctor :
$8.
    period_start :
8.
    period_end :
8.
    hospital :
$8.
  ;
cards;
a,1997,2000,x
a,2000,2003,x
a,2003,2009,x
b,1985,2001,y
b,2001,2009,y
b,2009,2013,z
c,1982,1990,k
c,1990,2001,m
;

*
* using SAS Proc
*;

proc summary data=test nway;
 
class doctor hospital;
 
  output out=want(drop=_Smiley Happy min(period_start)=period_start max(period_end)=period_end;
run;

*
* using SQL
*;

proc sql;
 
create table want_sql as
 
select
    doctor
    , min(period_start)
as period_start
    , max(period_end)
as period_end
    , hospital
 
from
    test
 
group by
    doctor
    , hospital
  ;
quit;

View solution in original post


All Replies
Solution
‎07-25-2013 03:17 AM
SAS Super FREQ
Posts: 709

Re: Combine periods across observations

Posted in reply to caveman529

Have a look at the following code:

*
* create test data
*;

data test;
  infile cards dlm=",";
 
input
    doctor :
$8.
    period_start :
8.
    period_end :
8.
    hospital :
$8.
  ;
cards;
a,1997,2000,x
a,2000,2003,x
a,2003,2009,x
b,1985,2001,y
b,2001,2009,y
b,2009,2013,z
c,1982,1990,k
c,1990,2001,m
;

*
* using SAS Proc
*;

proc summary data=test nway;
 
class doctor hospital;
 
  output out=want(drop=_Smiley Happy min(period_start)=period_start max(period_end)=period_end;
run;

*
* using SQL
*;

proc sql;
 
create table want_sql as
 
select
    doctor
    , min(period_start)
as period_start
    , max(period_end)
as period_end
    , hospital
 
from
    test
 
group by
    doctor
    , hospital
  ;
quit;
Super Contributor
Posts: 297

Re: Combine periods across observations

Posted in reply to caveman529

You should get a successful result with the following.  The retain statement is very powerful in that it causes the value of an observation to be held across multiple observations and when combined with a by statement we can control what values we want retained and when they can be released.

DATA HAVE;

INFILE DATALINES DLM=",";

RETAIN DOCTOR HOSPITAL;

INPUT DOCTOR $ PERIOD_START $ PERIOD_END $ HOSPITAL $;

DATALINES;

A,1997,2000,X

A,2000,2003,X

A,2003,2009,X

B,1985,2001,Y

B,2001,2009,Y

B,2009,2013,Z

C,1982,1990,K

C,1990,2001,M

;

RUN;

PROC SORT DATA=HAVE;

  BY DOCTOR HOSPITAL;

RUN;

DATA WANT;

  SET HAVE;

  BY DOCTOR HOSPITAL;

  RETAIN _PERIOD_START _PERIOD_END;

  IF FIRST.HOSPITAL THEN DO;

   _PERIOD_START = PERIOD_START;

   _PERIOD_END = PERIOD_END;

    END;

  IF LAST.HOSPITAL;

    IF _PERIOD_START < PERIOD_START THEN PERIOD_START = _PERIOD_START;

    IF _PERIOD_END > PERIOD_END THEN PERIOD_END = _PERIOD_END;

  DROP _PERIOD_END _PERIOD_START;

RUN;

Regular Contributor
Posts: 161

Re: Combine periods across observations

Posted in reply to Scott_Mitchell

Thank you so much for your illustration!  Very educational Smiley Happy

Frequent Contributor
Posts: 85

Re: Combine periods across observations

Posted in reply to caveman529

Good answers already but try this if your dates are not continuous for a doctor at a hospital (I've tweaked the second start date) ...

data a;

input doctor $

period_start

period_end

hospital $;

cards;

A 1997 2000 X

A 2001 2003 X

A 2003 2009 X

B 1985 2001 Y

B 2001 2009 Y

B 2009 2013 Z

C 1982 1990 K

C 1990 2001 M

;

run;

proc sort data=a out=b;

  by doctor hospital period_start;

run;

data wanted (keep=doctor hospital real_start real_end);

  set b;

  by doctor hospital;

  retain start last_end;

 

  if first.hospital then  

    do;

   start = period_start;

   last_end = period_end;

    end;

  if period_start > last_end then

    do;

   real_start = start;

   real_end = last_end;

      output;

   start = period_start;

   last_end = period_end;

    end;

  if last.hospital  then

    do;

    real_start = start;

    real_end = period_end;

        output;

   end;

  else

   last_end = period_end;

run;


Regular Contributor
Posts: 161

Re: Combine periods across observations

Posted in reply to JerryLeBreton

Thank you for your guys all!   They all works as charm.  I learned a lot of different way of doing this.  Smiley Happy

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 291 views
  • 7 likes
  • 4 in conversation