I wanted to count number of visits in a row (each week) for each customer. The problem I'm having is, if I'm using if.first counters, i need to sort the data by the two variables - customer id and a flag whether they visited that week or not. However, I need to keep the data sorted by date and customer id, as I'm interested in sequential observations.
For example:
Week_end Cust_id Visit_flag
01JAN2017 01 0
08JAN2017 01 1
15JAN2017 01 1
22JAN2017 01 0
29JAN2017 01 1
05FEB2017 01 0
01JAN2017 02 0
08JAN2017 02 1
15JAN2017 02 1
22JAN2017 02 1
29JAN2017 02 1
05FEB2017 02 0
The output i want, is how many weeks in a row (max) a customer has visited:
Cust_id max_seq_visits
01 2
02 4
Thanks in advance for any help.
ok. do something like this
data have;
input Week_end:date9. Cust_id$ Visit_flag;
format Week_end date9.;
datalines;
01JAN2017 01 0
08JAN2017 01 1
15JAN2017 01 1
22JAN2017 01 0
29JAN2017 01 1
05FEB2017 01 0
01JAN2017 02 0
08JAN2017 02 1
15JAN2017 02 1
22JAN2017 02 1
29JAN2017 02 1
05FEB2017 02 0
;
data temp;
set have;
by Cust_id;
if Visit_flag=0 then seq=0;
else if Visit_flag then seq+1;
retain seq;
run;
proc sql;
create table want as
select Cust_id
,max(seq) as max_seq_visits
from temp
group by Cust_id;
quit;
Is there always 1 week between the dates?
If U understand you, you want the nunmber of 1s in the longest sequece og 1s by cust_id?
Yes, there is always 1 week between the dates, and each customer has the same number of weeks assigned to them. And correct - I want the longest sequence of 1s by cust_id,
ok. do something like this
data have;
input Week_end:date9. Cust_id$ Visit_flag;
format Week_end date9.;
datalines;
01JAN2017 01 0
08JAN2017 01 1
15JAN2017 01 1
22JAN2017 01 0
29JAN2017 01 1
05FEB2017 01 0
01JAN2017 02 0
08JAN2017 02 1
15JAN2017 02 1
22JAN2017 02 1
29JAN2017 02 1
05FEB2017 02 0
;
data temp;
set have;
by Cust_id;
if Visit_flag=0 then seq=0;
else if Visit_flag then seq+1;
retain seq;
run;
proc sql;
create table want as
select Cust_id
,max(seq) as max_seq_visits
from temp
group by Cust_id;
quit;
That worked out just the way i wanted. Thanks!
No problem. If you want to do this in a single data step alone, then do like this
data have;
input Week_end:date9. Cust_id$ Visit_flag;
format Week_end date9.;
datalines;
01JAN2017 01 0
08JAN2017 01 1
15JAN2017 01 1
22JAN2017 01 0
29JAN2017 01 1
05FEB2017 01 0
01JAN2017 02 0
08JAN2017 02 1
15JAN2017 02 1
22JAN2017 02 1
29JAN2017 02 1
05FEB2017 02 0
;
data want(keep=cust_id max_seq_visits);
set have;
by cust_id;
if Visit_flag=0 then cnt=0;
else if Visit_flag=1 then cnt+1;
max_seq_visits=ifn(cnt>max_seq_visits, cnt, max_seq_visits);
if last.cust_id then output;
retain cnt max_seq_visits;
run;
Post test data in the form of a datastep, as such this is not tested:
data want; set have; by cust_id; retain cnt max_cnt; if first.cust_id then do; cnt_max=0; cnt=0; end; if flag=1 then cnt=cnt+1; else if flag=0 and cnt > 0 then do; if cnt > cnt_max then cnt_max=cnt; cnt=0; end; run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.