SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Counting gaps in days and rolling up to individual level

Reply
Contributor
Posts: 22

Counting gaps in days and rolling up to individual level

I have a data-set with individual id, date variable for each day, and a filled date variable, as shown below. I wanted to know the gaps in filled dates, so I created a count variable which starts as 1 on the first gap for each gap interval and then increments by 1. I would now like to create an id -level data-set (one observation per person) with indicator variables for number of gaps within each gap interval (for instance, Jan 3 to Jan 5, 3 days gap, and Jan 8 to Jan 9, 2 days gap - for id A). How could I do this? Thank you!

IDeach datefilled datecount
A1-Jan1-Jan0
A2-Jan2-Jan0
A3-Jan.1
A4-Jan.2
A5-Jan.3
A6-Jan6-Jan0
A7-Jan7-Jan0
A8-Jan.1
A9-Jan.2
A10-Jan10-Jan0
PROC Star
Posts: 1,547

Re: Counting gaps in days and rolling up to individual level

Can you please post a sample of your expected output?

Contributor
Posts: 22

Re: Counting gaps in days and rolling up to individual level

Posted in reply to novinosrin

Something like this would work (I want how many days were continuously filled and how many continuous gap intervals). Thank you!

IDeach datecount
A1-Jan0
A3-Jan3
6-Jan0
A8-Jan2
Super User
Posts: 6,622

Re: Counting gaps in days and rolling up to individual level

 

Is the variable COUNT already in your data set, or do you need the program to compute it as well?

Contributor
Posts: 22

Re: Counting gaps in days and rolling up to individual level

Posted in reply to Astounding

I created that variable to count the continuous gaps, so it is currently on the data-set. 

Super User
Posts: 6,622

Re: Counting gaps in days and rolling up to individual level

[ Edited ]

Here's a start:

 

data temp;

set have;

by id;

where count > 0;

if first.id then group=1;

else if count=1 then group + 1;

run;

 

That creates an artificial GROUP variable that can identify when groups of observations having count > 0 begin and end.

 

data want;

set have;

by id group;

if last.group;

drop group;

run;

 

That's closer to what you asked for.  It gives you the last observation in each grouping.  Take a look at that much, and then reconsider.  Trying to change this into one observation per ID is a bad idea.  You will end up with different numbers of variables populated for each ID, and the structure of the data will be difficult to program with.  But if you still want to pursue that as the final result, it can be done using PROC TRANSPOSE (perhaps more than one PROC TRANSPOSE depending on what you want the final result to look like).

 

********** EDITED:

 

While I was composing this,  you must have posted what you want the result to look like.  I may be able to get to that later, to include some of the "0" records.  Right now, all you get is the non-zero gaps.  But that will have to wait (at least for me).

PROC Star
Posts: 1,547

Re: Counting gaps in days and rolling up to individual level

data have;
input (ID	each_date	filled_date) ($)	count;
datalines;
A	1-Jan	1-Jan	0
A	2-Jan	2-Jan	0
A	3-Jan	.	1
A	4-Jan	.	2
A	5-Jan	.	3
A	6-Jan	6-Jan	0
A	7-Jan	7-Jan	0
A	8-Jan	.	1
A	9-Jan	.	2
A	10-Jan	10-Jan	0
;
data temp;
set have;
by id ;
if first.id then k=0;
if count=1 or count=0 and lag(count) ne 0 then k+1;
run;

proc sql;
create table want as
select id,each_date,max(count) as _count
from temp
group by id, k
having each_date=min(each_date) and count(k)>1;
quit;
Super User
Posts: 10,679

Re: Counting gaps in days and rolling up to individual level

data have;
input (ID	each_date	filled_date) ($)	count;
datalines;
A	1-Jan	1-Jan	0
A	2-Jan	2-Jan	0
A	3-Jan	.	1
A	4-Jan	.	2
A	5-Jan	.	3
A	6-Jan	6-Jan	0
A	7-Jan	7-Jan	0
A	8-Jan	.	1
A	9-Jan	.	2
A	10-Jan	10-Jan	0
;
data have;
 set have;
 flag=missing(filled_date);
run;
data have;
 set have;
 by id flag notsorted;
 group+first.flag;
run;
data want;
 do until(last.group);
   set have;
   by group;
   if first.group then new_date=each_date;
 end;
 drop each_date	filled_date group flag;
run;
Ask a Question
Discussion stats
  • 7 replies
  • 151 views
  • 3 likes
  • 4 in conversation