BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Question
Fluorite | Level 6

Hi All,

 

I would like to calculate the gap between purchases for ecah customer. I have for example the file below, an dwould like to calculate the 3 variables on the righ (First Gaps, Second Gap etc...)

 

Your help would be much appreciated,

 

Thank You

 

 

customer_id w1 w2 w3 w4 w5 w6 w7 w8 w9 w10 w11 w12 w13   First Gap Secong Gap Third Gap
1 10 4 4 10 8 8 ~ ~ ~ ~ ~ ~ ~   7 weeks    
2 9 ~ 2 2 ~ ~ ~ ~ ~ ~ ~ ~ ~   1 week 9 weeks  
3 29 4 2 2 ~ ~ ~ ~ ~ ~ ~ 4 ~   7 weeks 1 week  
4 8 4 2 4 ~ ~ ~ 2 ~ 4 ~ ~ ~   3 weeks 1 week 3 weeks
1 ACCEPTED SOLUTION

Accepted Solutions
collinelliot
Barite | Level 11

I think the following changes fix the issue. I checked more thoroughly than I did before, but you'll want to check yourself, obviously. Looks like you have alternative solutions, too.

 

data have;
    input customer_id w1 w2 w3 w4 w5 w6 w7 w8 w9 w10 w11 w12 w13;
datalines;
1 10 4 4 10 8 8 . . . . . . .
2 9 . 2 2 . . . . . . . . .
3 29 4 2 2 . . . . . . . 4 . 
4 8 4 2 4 . . . 2 . 4 . . .
5 . . 1 2 3 4 . 5 . 7 . . .
6 . 0 1 2 3 4 . 5 . 7 . . .
7 10 4 4 10 8 8 . . . . . 1 2
8 9 . 2 2 . . . . . . . . 1
9 . 1 . 2 . 4 . 4 . 9 . . 1
10 . 1 . 2 . 4 . 4 . 9 . 1 .
11 8 . 1 . 2 . 4 . 4 . 9 . 1
12 1 1 1 1 1 1 1 1 1 1 1 1 1
13 . . . 2 3 4 . 5 . 7 . . .
;

data want;
    set have;
    array w{*} w1 - w13;
    array len_gap_{7};
    array wk_gap_{7};

    gapLen = 0;
    gapNum = 0;

    do i = 1 to dim(w);
        if missing(w(i)) then do;
            gapLen = gapLen + 1;
            if gapLen = 1 then gapNum = gapNum + 1;
            if gapNum > 0 and gapLen = 1 then wk_gap_(gapNum) = i;
            if gapNum > 0 then len_gap_(gapNum) = gapLen;
        end;
        else do;
            gapLen = 0;
        end;
    end;

run;

proc print;
    var customer_id len_gap:;
run;

View solution in original post

6 REPLIES 6
ballardw
Super User

Because of the way you posted data the "last 3 columns" are appearing over the forum marginal information and almost impossible to read.

If you have a SAS data set with your example data then you can turn that into data step code: Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

 

I think we are going to have to assume that the values you show as ~ are actually missing values in your SAS data.

In which case it appears that your request is actually to "count consecutive missing values from variables w1 to w13" in up to 3 groups. Does that sound like your need?

collinelliot
Barite | Level 11

I agree with @ballardw 100% with respect to his comments about the data and description of the problem.  That said, I took a shot. I also had doubts whether the data you presented represented all the possible cases (# of gaps, when they start and end...), so I added some additional cases, including no gaps. The code below should be a start for you. I also added output to store the week in which the gap started, since that seems like it might be useful to have. 

 

data have;
    input customer_id w1 w2 w3 w4 w5 w6 w7 w8 w9 w10 w11 w12 w13;
datalines;
1 10 4 4 10 8 8 . . . . . . .
2 9 . 2 2 . . . . . . . . .
3 29 4 2 2 . . . . . . . 4 . 
4 8 4 2 4 . . . 2 . 4 . . .
5 . . 1 2 3 4 . 5 . 7 . . .
6 . 0 1 2 3 4 . 5 . 7 . . .
7 10 4 4 10 8 8 . . . . . 1 2
8 9 . 2 2 . . . . . . . . 1
9 . 1 . 2 . 4 . 4 . 9 . . 1
10 . 1 . 2 . 4 . 4 . 9 . 1 .
11 8 . 1 . 2 . 4 . 4 . 9 . 1
12 1 1 1 1 1 1 1 1 1 1 1 1 1
;

data want;
    set have;
    array w{*} w1 - w13;
    array len_gap_{6};
    array wk_gap_{6};
    gapNum = 0; /* Initialize a counter for the gap number. */
    do i = 1 to dim(w);
        if missing(w(i)) then do;
            gapLen = gapLen + 1;
            if gapLen = 1 then gapNum = gapNum + 1;
            if gapNum > 0 then len_gap_(gapNum) = gapLen;
            if gapLen = 1 then wk_gap_(gapNum) = i;
        end;
        else do;
            gapLen = 0;
        end;
    end;

run;

proc print;
    var customer_id len_gap:;
run;
Question
Fluorite | Level 6
Hi ,

Thank you very much for your response. the code gives almost what I want,
but it seems it doesn't count when you have 2 consecutive missings at the
beginning. For example for the customer 5, has 2 missing weeks at the
beginnin, I am expecting to have len_gap_1=2 but the code seems to ignore
the first time they have missed purchases.. Is that right?


Thank You

customer_id w1 w2 len_gap_1
1 10 4 7
2 9 . 1
3 29 4 7
4 8 4 3
5 . . 1
6 . 0 1
7 10 4 5
8 9 . 1
9 . 1 1
10 . 1 1
11 8 . 1
12 1 1 .
##- Please type your reply above this line. Simple formatting, no
attachments. -##
collinelliot
Barite | Level 11

I think the following changes fix the issue. I checked more thoroughly than I did before, but you'll want to check yourself, obviously. Looks like you have alternative solutions, too.

 

data have;
    input customer_id w1 w2 w3 w4 w5 w6 w7 w8 w9 w10 w11 w12 w13;
datalines;
1 10 4 4 10 8 8 . . . . . . .
2 9 . 2 2 . . . . . . . . .
3 29 4 2 2 . . . . . . . 4 . 
4 8 4 2 4 . . . 2 . 4 . . .
5 . . 1 2 3 4 . 5 . 7 . . .
6 . 0 1 2 3 4 . 5 . 7 . . .
7 10 4 4 10 8 8 . . . . . 1 2
8 9 . 2 2 . . . . . . . . 1
9 . 1 . 2 . 4 . 4 . 9 . . 1
10 . 1 . 2 . 4 . 4 . 9 . 1 .
11 8 . 1 . 2 . 4 . 4 . 9 . 1
12 1 1 1 1 1 1 1 1 1 1 1 1 1
13 . . . 2 3 4 . 5 . 7 . . .
;

data want;
    set have;
    array w{*} w1 - w13;
    array len_gap_{7};
    array wk_gap_{7};

    gapLen = 0;
    gapNum = 0;

    do i = 1 to dim(w);
        if missing(w(i)) then do;
            gapLen = gapLen + 1;
            if gapLen = 1 then gapNum = gapNum + 1;
            if gapNum > 0 and gapLen = 1 then wk_gap_(gapNum) = i;
            if gapNum > 0 then len_gap_(gapNum) = gapLen;
        end;
        else do;
            gapLen = 0;
        end;
    end;

run;

proc print;
    var customer_id len_gap:;
run;
Question
Fluorite | Level 6
Thank You Ksharp,

This woks perfectly 🙂


##- Please type your reply above this line. Simple formatting, no
attachments. -##
Ksharp
Super User

data have;
    input customer_id w1 w2 w3 w4 w5 w6 w7 w8 w9 w10 w11 w12 w13;
datalines;
1 10 4 4 10 8 8 . . . . . . .
2 9 . 2 2 . . . . . . . . .
3 29 4 2 2 . . . . . . . 4 . 
4 8 4 2 4 . . . 2 . 4 . . .
5 . . 1 2 3 4 . 5 . 7 . . .
6 . 0 1 2 3 4 . 5 . 7 . . .
7 10 4 4 10 8 8 . . . . . 1 2
8 9 . 2 2 . . . . . . . . 1
9 . 1 . 2 . 4 . 4 . 9 . . 1
10 . 1 . 2 . 4 . 4 . 9 . 1 .
11 8 . 1 . 2 . 4 . 4 . 9 . 1
12 1 1 1 1 1 1 1 1 1 1 1 1 1
;
proc transpose data=have out=temp;
by customer_id;
var w:;
run;
proc means data=temp noprint;
by customer_id col1 notsorted;
output out=temp1;
run;
data temp2;
 set temp1(where=(col1 is missing));
 by customer_id;
 if first.customer_id then count=0;
 count+1;
run;
proc transpose data=temp2 out=temp3 prefix=gap_;
by customer_id;
id count;
var _freq_;
run;
data want;
 merge have temp3;
 by customer_id;
 drop _name_;
run;


sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1009 views
  • 1 like
  • 4 in conversation