SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Collapsing observations based on allowable gaps

Accepted Solution Solved
Reply
Contributor
Posts: 22
Accepted Solution

Collapsing observations based on allowable gaps

I have a dataset (created after bunch of data management steps) which is arranged as below. The gap indicator shows the number of gap days ( 1 indicates gap). Now, I would want to allow a gap of 30 days (i.e., I would count them as having no gap if the gap is less than or equal to 30 days). I have tried a few things (used retain, lag statements), but haven't been able to do so. 

IDGap_indicator Days
A090
A120
A060
A110
A060
A140
A090

 

This is how the dataset is laid out. My expected output is (I want to add 30 days of gap to the previous no gap days)

 

IDGap_indicatorDays
A0270
A110
A090


I am not able to figure out how to do this. Your help would be much appreciated. Thank you!


Accepted Solutions
Solution
‎04-17-2018 05:50 PM
Esteemed Advisor
Posts: 5,474

Re: Collapsing observations based on allowable gaps

Try this:

 

data have;
input ID $ Gap Days; 
datalines;
A 0 90 
A 1 20 
A 0 60 
A 1 10 
A 0 60 
A 1 40 
A 0 90 
;

data want;
gDays = 0;
do until(last.id);
    set have; by id;
    if gap then do;
        if days > 30 then do;
            gDays = gDays + 30;
            gGap = 0;
            output;
            gDays = days - 30;
            gGap = 1;
            output;
            gDays = 0;
            end;
        else gDays = gDays + days;
        end;
    else gDays = gDays + days;
    end;
if gDays > 0 then do;
    gGap = 0;
    output;
    end;
drop gap days;
rename gGap=Gap gDays=Days;
run;
PG

View solution in original post


All Replies
Super User
Posts: 23,224

Re: Collapsing observations based on allowable gaps

Post what you've tried so far please.

Contributor
Posts: 22

Re: Collapsing observations based on allowable gaps

Based on some of the readings, I did the following:

data test;
set problem;
by id;
lg_days=lag(days);
if first.id then lg_days=.;
run;


data test1;
merge test test (firstobs=2 rename=(id =id2 days=new_days gap=gap2));
if id ne id2 then new_days=.;
drop id2 gap2;
run;

 

I created a dataset by bringing the next observation in the same line. I am, however, struggling to collapse the observations with respect to 30 days allowable gap.

Solution
‎04-17-2018 05:50 PM
Esteemed Advisor
Posts: 5,474

Re: Collapsing observations based on allowable gaps

Try this:

 

data have;
input ID $ Gap Days; 
datalines;
A 0 90 
A 1 20 
A 0 60 
A 1 10 
A 0 60 
A 1 40 
A 0 90 
;

data want;
gDays = 0;
do until(last.id);
    set have; by id;
    if gap then do;
        if days > 30 then do;
            gDays = gDays + 30;
            gGap = 0;
            output;
            gDays = days - 30;
            gGap = 1;
            output;
            gDays = 0;
            end;
        else gDays = gDays + days;
        end;
    else gDays = gDays + days;
    end;
if gDays > 0 then do;
    gGap = 0;
    output;
    end;
drop gap days;
rename gGap=Gap gDays=Days;
run;
PG
Contributor
Posts: 22

Re: Collapsing observations based on allowable gaps

This works! I will try to parse the codes. One thing I haven't understood so far is, what is the difference between  initializing a variable before and after the set statement? Thank you once again. 

Esteemed Advisor
Posts: 5,474

Re: Collapsing observations based on allowable gaps

Imported dataset variables (with a set statement) are overwritten when the set statement executes. Other variables (such as gDays above) are reset to missing at the beginning of the implicit dataset loop (unless they are RETAINed).

PG
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 179 views
  • 0 likes
  • 3 in conversation