Contributor
Posts: 22

# Counting gaps in days and rolling up to individual level

I have a data-set with individual id, date variable for each day, and a filled date variable, as shown below. I wanted to know the gaps in filled dates, so I created a count variable which starts as 1 on the first gap for each gap interval and then increments by 1. I would now like to create an id -level data-set (one observation per person) with indicator variables for number of gaps within each gap interval (for instance, Jan 3 to Jan 5, 3 days gap, and Jan 8 to Jan 9, 2 days gap - for id A). How could I do this? Thank you!

 ID each date filled date count A 1-Jan 1-Jan 0 A 2-Jan 2-Jan 0 A 3-Jan . 1 A 4-Jan . 2 A 5-Jan . 3 A 6-Jan 6-Jan 0 A 7-Jan 7-Jan 0 A 8-Jan . 1 A 9-Jan . 2 A 10-Jan 10-Jan 0
PROC Star
Posts: 1,547

Contributor
Posts: 22

## Re: Counting gaps in days and rolling up to individual level

Something like this would work (I want how many days were continuously filled and how many continuous gap intervals). Thank you!

 ID each date count A 1-Jan 0 A 3-Jan 3 A 6-Jan 0 A 8-Jan 2
Super User
Posts: 6,622

## Re: Counting gaps in days and rolling up to individual level

Is the variable COUNT already in your data set, or do you need the program to compute it as well?

Contributor
Posts: 22

## Re: Counting gaps in days and rolling up to individual level

I created that variable to count the continuous gaps, so it is currently on the data-set.

Super User
Posts: 6,622

## Re: Counting gaps in days and rolling up to individual level

[ Edited ]

Here's a start:

data temp;

set have;

by id;

where count > 0;

if first.id then group=1;

else if count=1 then group + 1;

run;

That creates an artificial GROUP variable that can identify when groups of observations having count > 0 begin and end.

data want;

set have;

by id group;

if last.group;

drop group;

run;

That's closer to what you asked for.  It gives you the last observation in each grouping.  Take a look at that much, and then reconsider.  Trying to change this into one observation per ID is a bad idea.  You will end up with different numbers of variables populated for each ID, and the structure of the data will be difficult to program with.  But if you still want to pursue that as the final result, it can be done using PROC TRANSPOSE (perhaps more than one PROC TRANSPOSE depending on what you want the final result to look like).

********** EDITED:

While I was composing this,  you must have posted what you want the result to look like.  I may be able to get to that later, to include some of the "0" records.  Right now, all you get is the non-zero gaps.  But that will have to wait (at least for me).

PROC Star
Posts: 1,547

## Re: Counting gaps in days and rolling up to individual level

``````data have;
input (ID	each_date	filled_date) (\$)	count;
datalines;
A	1-Jan	1-Jan	0
A	2-Jan	2-Jan	0
A	3-Jan	.	1
A	4-Jan	.	2
A	5-Jan	.	3
A	6-Jan	6-Jan	0
A	7-Jan	7-Jan	0
A	8-Jan	.	1
A	9-Jan	.	2
A	10-Jan	10-Jan	0
;
data temp;
set have;
by id ;
if first.id then k=0;
if count=1 or count=0 and lag(count) ne 0 then k+1;
run;

proc sql;
create table want as
select id,each_date,max(count) as _count
from temp
group by id, k
having each_date=min(each_date) and count(k)>1;
quit;``````
Super User
Posts: 10,679

## Re: Counting gaps in days and rolling up to individual level

``````data have;
input (ID	each_date	filled_date) (\$)	count;
datalines;
A	1-Jan	1-Jan	0
A	2-Jan	2-Jan	0
A	3-Jan	.	1
A	4-Jan	.	2
A	5-Jan	.	3
A	6-Jan	6-Jan	0
A	7-Jan	7-Jan	0
A	8-Jan	.	1
A	9-Jan	.	2
A	10-Jan	10-Jan	0
;
data have;
set have;
flag=missing(filled_date);
run;
data have;
set have;
by id flag notsorted;
group+first.flag;
run;
data want;
do until(last.group);
set have;
by group;
if first.group then new_date=each_date;
end;
drop each_date	filled_date group flag;
run;``````
Discussion stats
• 7 replies
• 151 views
• 3 likes
• 4 in conversation