BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
APU_007
Fluorite | Level 6

I have found some help but not quite able to resolve the issues so looking for a help to get it done correctly. Here is the example of sample data-set and result data-set that is desired.

 

Data have:

 

IDSTART_DATEEnd_Date
123-Jan-0427-Mar-08
127-Mar-1027-Mar-13
127-Mar-1027-Aug-18
127-Mar-1327-Mar-15
125-Sep-1325-May-16
216-Dec-0612-Dec-18
216-Jan-0916-Jan-14
216-Jan-1416-Jan-17
323-Jan-0427-Mar-08
323-Jan-0927-Mar-12
31-May-141-May-18
425-Feb-0825-Feb-15
47-Mar-087-Mar-15
428-Feb-1928-Feb-20

 

Data want:

IDSTART_DATEEnd_DateStayGapNet_Stay
123-Jan-0427-Aug-1853307304600
216-Dec-0612-Dec-18437904379
323-Jan-041-May-18521210674145
425-Feb-0828-Feb-20438614542932

 

What is actually needed:

START_DATE - minimum for a given ID

End_Date - maximum for a given ID

Stay - Total Stayed

Gap - Any gap during the stay

NetStay - Total Stay - Gap

 

Started thinking about going this route but appears with this logic - not able to salvage min(Start_Date) and Max(End_Date).

data diff;
merge have
HAVE(firstobs=2 keep=ID START_DATE end_date rename=(ID=nID START_DATE=nSTART_DATE end_date=nend_date));
diff_date=end_date-start_date;
if ID=NID then do;
diff_btwn_date=nSTART_DATE-end_date;
end;
run;

 

Note: i am using SAS9.4

 

So, GAP is calculated as nStart Date - End_date.

Example: For ID1 - There is no coverage between 3/27/2008 and 3/27/2010 so the GAP = 730 days. If there are multiple gaps then it has to be summed up to get the total gap days (for ID3 = 302 + 765 = 1067 days)

 

Thank you in advance for your help!

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Hello @APU_007  can you please review and confirm for ID1

 

Stay: 

23-Jan-04 27-Aug-18 5330

 isn't this 27aug2018-23jan2004+1 meaning an inclusive of start_date and end_date making it 5331 rather than 5330?

 



data have;
    infile cards;
	input id start_date:anydtdte. end_date:anydtdte.;
	format start_date end_date date9.;
cards;
1	23-Jan-04	27-Mar-08
1	27-Mar-10	27-Mar-13
1	27-Mar-10	27-Aug-18
1	27-Mar-13	27-Mar-15
1	25-Sep-13	25-May-16
2	16-Dec-06	12-Dec-18
2	16-Jan-09	16-Jan-14
2	16-Jan-14	16-Jan-17
3	23-Jan-04	27-Mar-08
3	23-Jan-09	27-Mar-12
3	1-May-14	1-May-18
4	25-Feb-08	25-Feb-15
4	7-Mar-08	7-Mar-15
4	28-Feb-19	28-Feb-20
;

/*Get the number of dimensions in the array*/
proc sql;
select min(start_date), max(end_date) into :s trimmed, :e trimmed
from have;
quit;
/*Key Indexing method*/
data want;
 do until(last.id);
  set have;
  by id;
  array t(&s:&e) _temporary_;
  do _n_=start_date to end_date;
   t(_n_)=1;
  end;
  _min=min(_min,start_date);
  _max=max(_max,end_date);
 end;
 stay=sum(of t(*));
 do _n_=_min to _max;
  gap=sum(gap,nmiss(t(_n_)));
 end;
 totay_stay=_max-_min+1;
 call missing(of t(*));
rename _min=start_date _max=end_date;
drop start_date end_date;
format _: date9.;
run;

View solution in original post

5 REPLIES 5
PeterClemmensen
Tourmaline | Level 20

So there is not logik determining Gap? Should this just be selected randomly or?

PaigeMiller
Diamond | Level 26

This is a bit of a "brute force" way to program it, but it should work

 

data have;
    infile cards;
	input id start_date:anydtdte. end_date:anydtdte.;
cards;
1	23-Jan-04	27-Mar-08
1	27-Mar-10	27-Mar-13
1	27-Mar-10	27-Aug-18
1	27-Mar-13	27-Mar-15
1	25-Sep-13	25-May-16
2	16-Dec-06	12-Dec-18
2	16-Jan-09	16-Jan-14
2	16-Jan-14	16-Jan-17
3	23-Jan-04	27-Mar-08
3	23-Jan-09	27-Mar-12
3	1-May-14	1-May-18
4	25-Feb-08	25-Feb-15
4	7-Mar-08	7-Mar-15
4	28-Feb-19	28-Feb-20
;

/* Compute max and min dates of all stays */
proc summary data=have nway;
    class id;
    var start_date end_date;
    output out=stay min(start_date)=min_start_date max(end_date)=max_end_date;
run;

/* Compute if each date between min_start_date and max_start_date is "stay" or "gap" */
data stay2;
    merge have stay;
	by id;
	do thisdate=min_start_date to max_end_date;
            if start_date<=thisdate<=end_date then gap=0;
	    else gap=1;
            output;
	end;
run;

/* Find days that are "gap" */
proc summary data=stay2 nway;
	class id thisdate;
	var gap;
	output out=gap min=min_gap;
run;

/* Add up number of gap days */
proc summary data=gap nway;
	class id;
	var min_gap;
	output out=want sum=gap;
run;

/* Now that you have total length of stay and number of gap days, you do the rest */
--
Paige Miller
novinosrin
Tourmaline | Level 20

Sir @PaigeMiller  Neat work with right set of comments to follow. Very nice!

novinosrin
Tourmaline | Level 20

Hello @APU_007  can you please review and confirm for ID1

 

Stay: 

23-Jan-04 27-Aug-18 5330

 isn't this 27aug2018-23jan2004+1 meaning an inclusive of start_date and end_date making it 5331 rather than 5330?

 



data have;
    infile cards;
	input id start_date:anydtdte. end_date:anydtdte.;
	format start_date end_date date9.;
cards;
1	23-Jan-04	27-Mar-08
1	27-Mar-10	27-Mar-13
1	27-Mar-10	27-Aug-18
1	27-Mar-13	27-Mar-15
1	25-Sep-13	25-May-16
2	16-Dec-06	12-Dec-18
2	16-Jan-09	16-Jan-14
2	16-Jan-14	16-Jan-17
3	23-Jan-04	27-Mar-08
3	23-Jan-09	27-Mar-12
3	1-May-14	1-May-18
4	25-Feb-08	25-Feb-15
4	7-Mar-08	7-Mar-15
4	28-Feb-19	28-Feb-20
;

/*Get the number of dimensions in the array*/
proc sql;
select min(start_date), max(end_date) into :s trimmed, :e trimmed
from have;
quit;
/*Key Indexing method*/
data want;
 do until(last.id);
  set have;
  by id;
  array t(&s:&e) _temporary_;
  do _n_=start_date to end_date;
   t(_n_)=1;
  end;
  _min=min(_min,start_date);
  _max=max(_max,end_date);
 end;
 stay=sum(of t(*));
 do _n_=_min to _max;
  gap=sum(gap,nmiss(t(_n_)));
 end;
 totay_stay=_max-_min+1;
 call missing(of t(*));
rename _min=start_date _max=end_date;
drop start_date end_date;
format _: date9.;
run;
APU_007
Fluorite | Level 6

Hello @novinosrin,

 

This is what I was exactly looking for. Thank you so much for this piece of code. Regarding, adding +1, I was excluding the start date for a final total stay. It can be included/excluded as this would be a rule for every observation!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 452 views
  • 6 likes
  • 4 in conversation