Solved
Contributor
Posts: 46

Find number of repeat visits within a year

Hello, I need to find the number of repeat visits within a year, starting from the initial visit. Ex., data is as below. A has 0 repeat visits within a year, B has 6, C has 1. I tried the LAG and INTCK functions, but couldn't get the correct answer.

 ID TIME_IN A 5/8/2015 A 10/25/2016 B 6/3/2016 B 7/6/2016 B 8/22/2016 B 10/31/2016 B 1/9/2017 B 2/24/2017 B 4/7/2017 C 7/7/2014 C 7/29/2014 C 8/24/2015 C 10/28/2015 C 1/4/2016

Accepted Solutions
Solution
‎04-26-2018 11:02 AM
Super User
Posts: 6,933

Re: Find number of repeat visits within a year

Here's a simple way:

data want;

set have;

by id;

if first.id then do;

repeat_visits=0;

baseline_date = time_in;

end;

else if time_in <= baseline_date + 365 then repeat_visits + 1;

if last.id;

retain baseline_date;

keep id baseline_date repeat_visits;

run;

There are fancier ways (other than adding 365) to define "within a year" including ways that account for leap years.  So it depends on how exact you want to be.

All Replies
Super User
Posts: 9,840

Re: Find number of repeat visits within a year

And what defines a repeat visit?  Post test data in the form of a datastep and show what you want to have output.  Then explain the logic of why.  As for your problem, simply:

```data want;
set have;
retain repeat;
by id;
if first.id then repeat=0;
else repeat=repeat+1;
run;```
PROC Star
Posts: 1,400

Re: Find number of repeat visits within a year

I'm not sure I follow your logic? How does B have 6 repeat visits? What is the exact definition for a repeat visit?

Solution
‎04-26-2018 11:02 AM
Super User
Posts: 6,933

Re: Find number of repeat visits within a year

Here's a simple way:

data want;

set have;

by id;

if first.id then do;

repeat_visits=0;

baseline_date = time_in;

end;

else if time_in <= baseline_date + 365 then repeat_visits + 1;

if last.id;

retain baseline_date;

keep id baseline_date repeat_visits;

run;

There are fancier ways (other than adding 365) to define "within a year" including ways that account for leap years.  So it depends on how exact you want to be.

Regular Contributor
Posts: 164

Re: Find number of repeat visits within a year

Hi,

there are several ways to perform.

I have 4 in mind:

- use the classic retain

- use proc SQL update

- use call execute

- use the new function dosubl for a more straight forward approach which blows my mind

I've described the retain and dosubl in the code below:

``````data test;
informat ID \$1. TIME_IN mmddyy10.;
input id time_in;
format TIME_IN mmddyy10.;
datalines;
A 5/8/2015
A 10/25/2016
B 6/3/2016
B 7/6/2016
B 8/22/2016
B 10/31/2016
B 1/9/2017
B 2/24/2017
B 4/7/2017
C 7/7/2014
C 7/29/2014
C 8/24/2015
C 10/28/2015
C 1/4/2016
;
run;``````
``````/* RETAIN approach */
data test1;
set test;
by id time_in;
retain min max h_repeatVisits .;
if first.id then do;
min=intnx("day", time_in, 1, "sameday");
max=intnx("year", time_in, 1, "sameday");
h_repeatVisits=0;
end;
if min<time_in<=max then h_repeatVisits=h_repeatVisits+1;
run;
proc sort data=test1; by id descending time_in;run;

data test2;
set test1;
by id descending time_in;
retain repeatVisits .;
if first.id then repeatVisits=h_repeatVisits;
drop min max h_repeatVisits;
run;
proc sort data=test2; by id time_in; run;``````
``````/* DOSUBL approach */
data test3;
set test;
by id time_in;
retain repeatVisits 0;
if first.id then do;
rc=dosubl('%LET n=0;
PROC SQL noprint;
SELECT count(*) INTO :n
FROM test
WHERE id EQ "'||strip(id)||'"
AND '||put(time_in,best8.)||'<time_in<= '||put(intnx("year", time_in, 1, "sameday"),best8.)
||';QUIT;');
repeatVisits=input(symget('n'),best8.);
end;
drop rc;
run;``````
________________________

- Cheers -

Contributor
Posts: 46

Re: Find number of repeat visits within a year

Thank you for the various methods and increasing my knowledge ... I really appreciate it !!

Super User
Posts: 13,939

Re: Find number of repeat visits within a year

Count the occurrence of year within each ID and subtract 1?

```proc summary data=have nway;
class id date;
format date year4.;
output out=yearcount (drop=_type_);
run;```

will have a variable _freq_ with the count.

Assumes that your date is a SAS date value and NOT a character variable.

A data step could subtract the 1.

☑ This topic is solved.