Counting values in sequence by group

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 7
Accepted Solution

Counting values in sequence by group

I wanted to count number of visits in a row (each week) for each customer. The problem I'm having is, if I'm using if.first counters, i need to sort the data by the two variables - customer id and a flag whether they visited that week or not. However, I need to keep the data sorted by date and customer id, as I'm interested in sequential observations. 

 

For example: 

 

Week_end        Cust_id      Visit_flag

01JAN2017      01                     0
08JAN2017      01                     1    
15JAN2017      01                     1 
22JAN2017      01                     0 
29JAN2017      01                     1 
05FEB2017      01                     0 

01JAN2017      02                     0
08JAN2017      02                     1    
15JAN2017      02                     1 
22JAN2017      02                     1 
29JAN2017      02                     1 
05FEB2017      02                     0 

 

 

The output i want, is how many weeks in a row (max) a customer has visited:

Cust_id      max_seq_visits

01                         2

02                         4

 

 

Thanks in advance for any help. 


Accepted Solutions
Solution
‎01-23-2018 04:34 AM
PROC Star
Posts: 1,218

Re: Counting values in sequence by group

ok. do something like this

 

data have;
input Week_end:date9. Cust_id$ Visit_flag;
format Week_end date9.;
datalines;
01JAN2017 01 0
08JAN2017 01 1    
15JAN2017 01 1 
22JAN2017 01 0 
29JAN2017 01 1 
05FEB2017 01 0 
01JAN2017 02 0
08JAN2017 02 1    
15JAN2017 02 1 
22JAN2017 02 1 
29JAN2017 02 1 
05FEB2017 02 0 
;

data temp;
   set have;
   by Cust_id;
   if Visit_flag=0 then seq=0;
   else if Visit_flag then seq+1;
   retain seq;
run;

proc sql;
   create table want as
   select Cust_id
         ,max(seq) as max_seq_visits
   from temp
   group by Cust_id;
quit;

View solution in original post


All Replies
PROC Star
Posts: 1,218

Re: Counting values in sequence by group

Is there always 1 week between the dates? 

 

If U understand you, you want the nunmber of 1s in the longest sequece og 1s by cust_id?

Occasional Contributor
Posts: 7

Re: Counting values in sequence by group

Yes, there is always 1 week between the dates, and each customer has the same number of weeks assigned to them. And correct - I want the longest sequence of 1s by cust_id,

Solution
‎01-23-2018 04:34 AM
PROC Star
Posts: 1,218

Re: Counting values in sequence by group

ok. do something like this

 

data have;
input Week_end:date9. Cust_id$ Visit_flag;
format Week_end date9.;
datalines;
01JAN2017 01 0
08JAN2017 01 1    
15JAN2017 01 1 
22JAN2017 01 0 
29JAN2017 01 1 
05FEB2017 01 0 
01JAN2017 02 0
08JAN2017 02 1    
15JAN2017 02 1 
22JAN2017 02 1 
29JAN2017 02 1 
05FEB2017 02 0 
;

data temp;
   set have;
   by Cust_id;
   if Visit_flag=0 then seq=0;
   else if Visit_flag then seq+1;
   retain seq;
run;

proc sql;
   create table want as
   select Cust_id
         ,max(seq) as max_seq_visits
   from temp
   group by Cust_id;
quit;
Occasional Contributor
Posts: 7

Re: Counting values in sequence by group

That worked out just the way i wanted. Thanks! 

PROC Star
Posts: 1,218

Re: Counting values in sequence by group

No problem. If you want to do this in a single data step alone, then do like this

 

data have;
input Week_end:date9. Cust_id$ Visit_flag;
format Week_end date9.;
datalines;
01JAN2017 01 0
08JAN2017 01 1    
15JAN2017 01 1 
22JAN2017 01 0 
29JAN2017 01 1 
05FEB2017 01 0 
01JAN2017 02 0
08JAN2017 02 1    
15JAN2017 02 1 
22JAN2017 02 1 
29JAN2017 02 1 
05FEB2017 02 0 
;

data want(keep=cust_id max_seq_visits);
   set have;
   by cust_id;

   if Visit_flag=0 then cnt=0;
   else if Visit_flag=1 then cnt+1;
   max_seq_visits=ifn(cnt>max_seq_visits, cnt, max_seq_visits);

   if last.cust_id then output;
   retain cnt max_seq_visits;
run;
Super User
Super User
Posts: 9,441

Re: Counting values in sequence by group

Post test data in the form of a datastep, as such this is not tested:

data want;
  set have;
  by cust_id;
  retain cnt max_cnt;
  if first.cust_id then do;
    cnt_max=0;
    cnt=0;
  end;
  if flag=1 then cnt=cnt+1;
  else if flag=0 and cnt > 0 then do;
    if cnt > cnt_max then cnt_max=cnt;
    cnt=0;
  end;
run;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 95 views
  • 1 like
  • 3 in conversation