Solved
Contributor
Posts: 22

# Collapsing observations based on allowable gaps

I have a dataset (created after bunch of data management steps) which is arranged as below. The gap indicator shows the number of gap days ( 1 indicates gap). Now, I would want to allow a gap of 30 days (i.e., I would count them as having no gap if the gap is less than or equal to 30 days). I have tried a few things (used retain, lag statements), but haven't been able to do so.

 ID Gap_indicator Days A 0 90 A 1 20 A 0 60 A 1 10 A 0 60 A 1 40 A 0 90

This is how the dataset is laid out. My expected output is (I want to add 30 days of gap to the previous no gap days)

 ID Gap_indicator Days A 0 270 A 1 10 A 0 90

I am not able to figure out how to do this. Your help would be much appreciated. Thank you!

Accepted Solutions
Solution
‎04-17-2018 05:50 PM
Posts: 5,474

## Re: Collapsing observations based on allowable gaps

Try this:

``````data have;
input ID \$ Gap Days;
datalines;
A 0 90
A 1 20
A 0 60
A 1 10
A 0 60
A 1 40
A 0 90
;

data want;
gDays = 0;
do until(last.id);
set have; by id;
if gap then do;
if days > 30 then do;
gDays = gDays + 30;
gGap = 0;
output;
gDays = days - 30;
gGap = 1;
output;
gDays = 0;
end;
else gDays = gDays + days;
end;
else gDays = gDays + days;
end;
if gDays > 0 then do;
gGap = 0;
output;
end;
drop gap days;
rename gGap=Gap gDays=Days;
run;``````
PG

All Replies
Super User
Posts: 23,224

## Re: Collapsing observations based on allowable gaps

Post what you've tried so far please.

Contributor
Posts: 22

## Re: Collapsing observations based on allowable gaps

Based on some of the readings, I did the following:

data test;
set problem;
by id;
lg_days=lag(days);
if first.id then lg_days=.;
run;

data test1;
merge test test (firstobs=2 rename=(id =id2 days=new_days gap=gap2));
if id ne id2 then new_days=.;
drop id2 gap2;
run;

I created a dataset by bringing the next observation in the same line. I am, however, struggling to collapse the observations with respect to 30 days allowable gap.

Solution
‎04-17-2018 05:50 PM
Posts: 5,474

## Re: Collapsing observations based on allowable gaps

Try this:

``````data have;
input ID \$ Gap Days;
datalines;
A 0 90
A 1 20
A 0 60
A 1 10
A 0 60
A 1 40
A 0 90
;

data want;
gDays = 0;
do until(last.id);
set have; by id;
if gap then do;
if days > 30 then do;
gDays = gDays + 30;
gGap = 0;
output;
gDays = days - 30;
gGap = 1;
output;
gDays = 0;
end;
else gDays = gDays + days;
end;
else gDays = gDays + days;
end;
if gDays > 0 then do;
gGap = 0;
output;
end;
drop gap days;
rename gGap=Gap gDays=Days;
run;``````
PG
Contributor
Posts: 22

## Re: Collapsing observations based on allowable gaps

This works! I will try to parse the codes. One thing I haven't understood so far is, what is the difference between  initializing a variable before and after the set statement? Thank you once again.

Posts: 5,474

## Re: Collapsing observations based on allowable gaps

Imported dataset variables (with a set statement) are overwritten when the set statement executes. Other variables (such as gDays above) are reset to missing at the beginning of the implicit dataset loop (unless they are RETAINed).

PG
☑ This topic is solved.