I wanted to count number of visits in a row (each week) for each customer. The problem I'm having is, if I'm using if.first counters, i need to sort the data by the two variables - customer id and a flag whether they visited that week or not. However, I need to keep the data sorted by date and customer id, as I'm interested in sequential observations.
For example:
Week_end Cust_id Visit_flag
01JAN2017 01 0
08JAN2017 01 1
15JAN2017 01 1
22JAN2017 01 0
29JAN2017 01 1
05FEB2017 01 0
01JAN2017 02 0
08JAN2017 02 1
15JAN2017 02 1
22JAN2017 02 1
29JAN2017 02 1
05FEB2017 02 0
The output i want, is how many weeks in a row (max) a customer has visited:
Cust_id max_seq_visits
01 2
02 4
Thanks in advance for any help.
ok. do something like this
data have;
input Week_end:date9. Cust_id$ Visit_flag;
format Week_end date9.;
datalines;
01JAN2017 01 0
08JAN2017 01 1
15JAN2017 01 1
22JAN2017 01 0
29JAN2017 01 1
05FEB2017 01 0
01JAN2017 02 0
08JAN2017 02 1
15JAN2017 02 1
22JAN2017 02 1
29JAN2017 02 1
05FEB2017 02 0
;
data temp;
set have;
by Cust_id;
if Visit_flag=0 then seq=0;
else if Visit_flag then seq+1;
retain seq;
run;
proc sql;
create table want as
select Cust_id
,max(seq) as max_seq_visits
from temp
group by Cust_id;
quit;
Is there always 1 week between the dates?
If U understand you, you want the nunmber of 1s in the longest sequece og 1s by cust_id?
Yes, there is always 1 week between the dates, and each customer has the same number of weeks assigned to them. And correct - I want the longest sequence of 1s by cust_id,
ok. do something like this
data have;
input Week_end:date9. Cust_id$ Visit_flag;
format Week_end date9.;
datalines;
01JAN2017 01 0
08JAN2017 01 1
15JAN2017 01 1
22JAN2017 01 0
29JAN2017 01 1
05FEB2017 01 0
01JAN2017 02 0
08JAN2017 02 1
15JAN2017 02 1
22JAN2017 02 1
29JAN2017 02 1
05FEB2017 02 0
;
data temp;
set have;
by Cust_id;
if Visit_flag=0 then seq=0;
else if Visit_flag then seq+1;
retain seq;
run;
proc sql;
create table want as
select Cust_id
,max(seq) as max_seq_visits
from temp
group by Cust_id;
quit;
That worked out just the way i wanted. Thanks!
No problem. If you want to do this in a single data step alone, then do like this
data have;
input Week_end:date9. Cust_id$ Visit_flag;
format Week_end date9.;
datalines;
01JAN2017 01 0
08JAN2017 01 1
15JAN2017 01 1
22JAN2017 01 0
29JAN2017 01 1
05FEB2017 01 0
01JAN2017 02 0
08JAN2017 02 1
15JAN2017 02 1
22JAN2017 02 1
29JAN2017 02 1
05FEB2017 02 0
;
data want(keep=cust_id max_seq_visits);
set have;
by cust_id;
if Visit_flag=0 then cnt=0;
else if Visit_flag=1 then cnt+1;
max_seq_visits=ifn(cnt>max_seq_visits, cnt, max_seq_visits);
if last.cust_id then output;
retain cnt max_seq_visits;
run;
Post test data in the form of a datastep, as such this is not tested:
data want; set have; by cust_id; retain cnt max_cnt; if first.cust_id then do; cnt_max=0; cnt=0; end; if flag=1 then cnt=cnt+1; else if flag=0 and cnt > 0 then do; if cnt > cnt_max then cnt_max=cnt; cnt=0; end; run;
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.