BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
APU_007
Fluorite | Level 6

I have found some help but not quite able to resolve the issues so looking for a help to get it done correctly. Here is the example of sample data-set and result data-set that is desired.

 

Data have:

 

IDSTART_DATEEnd_Date
123-Jan-0427-Mar-08
127-Mar-1027-Mar-13
127-Mar-1027-Aug-18
127-Mar-1327-Mar-15
125-Sep-1325-May-16
216-Dec-0612-Dec-18
216-Jan-0916-Jan-14
216-Jan-1416-Jan-17
323-Jan-0427-Mar-08
323-Jan-0927-Mar-12
31-May-141-May-18
425-Feb-0825-Feb-15
47-Mar-087-Mar-15
428-Feb-1928-Feb-20

 

Data want:

IDSTART_DATEEnd_DateStayGapNet_Stay
123-Jan-0427-Aug-1853307304600
216-Dec-0612-Dec-18437904379
323-Jan-041-May-18521210674145
425-Feb-0828-Feb-20438614542932

 

What is actually needed:

START_DATE - minimum for a given ID

End_Date - maximum for a given ID

Stay - Total Stayed

Gap - Any gap during the stay

NetStay - Total Stay - Gap

 

Started thinking about going this route but appears with this logic - not able to salvage min(Start_Date) and Max(End_Date).

data diff;
merge have
HAVE(firstobs=2 keep=ID START_DATE end_date rename=(ID=nID START_DATE=nSTART_DATE end_date=nend_date));
diff_date=end_date-start_date;
if ID=NID then do;
diff_btwn_date=nSTART_DATE-end_date;
end;
run;

 

Note: i am using SAS9.4

 

So, GAP is calculated as nStart Date - End_date.

Example: For ID1 - There is no coverage between 3/27/2008 and 3/27/2010 so the GAP = 730 days. If there are multiple gaps then it has to be summed up to get the total gap days (for ID3 = 302 + 765 = 1067 days)

 

Thank you in advance for your help!

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Hello @APU_007  can you please review and confirm for ID1

 

Stay: 

23-Jan-04 27-Aug-18 5330

 isn't this 27aug2018-23jan2004+1 meaning an inclusive of start_date and end_date making it 5331 rather than 5330?

 



data have;
    infile cards;
	input id start_date:anydtdte. end_date:anydtdte.;
	format start_date end_date date9.;
cards;
1	23-Jan-04	27-Mar-08
1	27-Mar-10	27-Mar-13
1	27-Mar-10	27-Aug-18
1	27-Mar-13	27-Mar-15
1	25-Sep-13	25-May-16
2	16-Dec-06	12-Dec-18
2	16-Jan-09	16-Jan-14
2	16-Jan-14	16-Jan-17
3	23-Jan-04	27-Mar-08
3	23-Jan-09	27-Mar-12
3	1-May-14	1-May-18
4	25-Feb-08	25-Feb-15
4	7-Mar-08	7-Mar-15
4	28-Feb-19	28-Feb-20
;

/*Get the number of dimensions in the array*/
proc sql;
select min(start_date), max(end_date) into :s trimmed, :e trimmed
from have;
quit;
/*Key Indexing method*/
data want;
 do until(last.id);
  set have;
  by id;
  array t(&s:&e) _temporary_;
  do _n_=start_date to end_date;
   t(_n_)=1;
  end;
  _min=min(_min,start_date);
  _max=max(_max,end_date);
 end;
 stay=sum(of t(*));
 do _n_=_min to _max;
  gap=sum(gap,nmiss(t(_n_)));
 end;
 totay_stay=_max-_min+1;
 call missing(of t(*));
rename _min=start_date _max=end_date;
drop start_date end_date;
format _: date9.;
run;

View solution in original post

5 REPLIES 5
PeterClemmensen
Tourmaline | Level 20

So there is not logik determining Gap? Should this just be selected randomly or?

PaigeMiller
Diamond | Level 26

This is a bit of a "brute force" way to program it, but it should work

 

data have;
    infile cards;
	input id start_date:anydtdte. end_date:anydtdte.;
cards;
1	23-Jan-04	27-Mar-08
1	27-Mar-10	27-Mar-13
1	27-Mar-10	27-Aug-18
1	27-Mar-13	27-Mar-15
1	25-Sep-13	25-May-16
2	16-Dec-06	12-Dec-18
2	16-Jan-09	16-Jan-14
2	16-Jan-14	16-Jan-17
3	23-Jan-04	27-Mar-08
3	23-Jan-09	27-Mar-12
3	1-May-14	1-May-18
4	25-Feb-08	25-Feb-15
4	7-Mar-08	7-Mar-15
4	28-Feb-19	28-Feb-20
;

/* Compute max and min dates of all stays */
proc summary data=have nway;
    class id;
    var start_date end_date;
    output out=stay min(start_date)=min_start_date max(end_date)=max_end_date;
run;

/* Compute if each date between min_start_date and max_start_date is "stay" or "gap" */
data stay2;
    merge have stay;
	by id;
	do thisdate=min_start_date to max_end_date;
            if start_date<=thisdate<=end_date then gap=0;
	    else gap=1;
            output;
	end;
run;

/* Find days that are "gap" */
proc summary data=stay2 nway;
	class id thisdate;
	var gap;
	output out=gap min=min_gap;
run;

/* Add up number of gap days */
proc summary data=gap nway;
	class id;
	var min_gap;
	output out=want sum=gap;
run;

/* Now that you have total length of stay and number of gap days, you do the rest */
--
Paige Miller
novinosrin
Tourmaline | Level 20

Sir @PaigeMiller  Neat work with right set of comments to follow. Very nice!

novinosrin
Tourmaline | Level 20

Hello @APU_007  can you please review and confirm for ID1

 

Stay: 

23-Jan-04 27-Aug-18 5330

 isn't this 27aug2018-23jan2004+1 meaning an inclusive of start_date and end_date making it 5331 rather than 5330?

 



data have;
    infile cards;
	input id start_date:anydtdte. end_date:anydtdte.;
	format start_date end_date date9.;
cards;
1	23-Jan-04	27-Mar-08
1	27-Mar-10	27-Mar-13
1	27-Mar-10	27-Aug-18
1	27-Mar-13	27-Mar-15
1	25-Sep-13	25-May-16
2	16-Dec-06	12-Dec-18
2	16-Jan-09	16-Jan-14
2	16-Jan-14	16-Jan-17
3	23-Jan-04	27-Mar-08
3	23-Jan-09	27-Mar-12
3	1-May-14	1-May-18
4	25-Feb-08	25-Feb-15
4	7-Mar-08	7-Mar-15
4	28-Feb-19	28-Feb-20
;

/*Get the number of dimensions in the array*/
proc sql;
select min(start_date), max(end_date) into :s trimmed, :e trimmed
from have;
quit;
/*Key Indexing method*/
data want;
 do until(last.id);
  set have;
  by id;
  array t(&s:&e) _temporary_;
  do _n_=start_date to end_date;
   t(_n_)=1;
  end;
  _min=min(_min,start_date);
  _max=max(_max,end_date);
 end;
 stay=sum(of t(*));
 do _n_=_min to _max;
  gap=sum(gap,nmiss(t(_n_)));
 end;
 totay_stay=_max-_min+1;
 call missing(of t(*));
rename _min=start_date _max=end_date;
drop start_date end_date;
format _: date9.;
run;
APU_007
Fluorite | Level 6

Hello @novinosrin,

 

This is what I was exactly looking for. Thank you so much for this piece of code. Regarding, adding +1, I was excluding the start date for a final total stay. It can be included/excluded as this would be a rule for every observation!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1531 views
  • 6 likes
  • 4 in conversation