DATA Step, Macro, Functions and more

Find number of repeat visits within a year

Accepted Solution Solved
Reply
Contributor
Posts: 46
Accepted Solution

Find number of repeat visits within a year

Hello, I need to find the number of repeat visits within a year, starting from the initial visit. Ex., data is as below. A has 0 repeat visits within a year, B has 6, C has 1. I tried the LAG and INTCK functions, but couldn't get the correct answer. 

ID TIME_IN
A 5/8/2015
A 10/25/2016
B 6/3/2016
B 7/6/2016
B 8/22/2016
B 10/31/2016
B 1/9/2017
B 2/24/2017
B 4/7/2017
C 7/7/2014
C 7/29/2014
C 8/24/2015
C 10/28/2015
C 1/4/2016

Accepted Solutions
Solution
‎04-26-2018 11:02 AM
Super User
Posts: 6,933

Re: Find number of repeat visits within a year

Here's a simple way:

 

data want;

set have;

by id;

if first.id then do;

   repeat_visits=0;

   baseline_date = time_in;

end;

else if time_in <= baseline_date + 365 then repeat_visits + 1;

if last.id;

retain baseline_date;

keep id baseline_date repeat_visits;

run;

 

There are fancier ways (other than adding 365) to define "within a year" including ways that account for leap years.  So it depends on how exact you want to be.

 

View solution in original post


All Replies
Super User
Super User
Posts: 9,840

Re: Find number of repeat visits within a year

And what defines a repeat visit?  Post test data in the form of a datastep and show what you want to have output.  Then explain the logic of why.  As for your problem, simply:

data want;
  set have;
  retain repeat;
  by id;
  if first.id then repeat=0;
  else repeat=repeat+1;
run;
PROC Star
Posts: 1,400

Re: Find number of repeat visits within a year

I'm not sure I follow your logic? How does B have 6 repeat visits? What is the exact definition for a repeat visit?

Solution
‎04-26-2018 11:02 AM
Super User
Posts: 6,933

Re: Find number of repeat visits within a year

Here's a simple way:

 

data want;

set have;

by id;

if first.id then do;

   repeat_visits=0;

   baseline_date = time_in;

end;

else if time_in <= baseline_date + 365 then repeat_visits + 1;

if last.id;

retain baseline_date;

keep id baseline_date repeat_visits;

run;

 

There are fancier ways (other than adding 365) to define "within a year" including ways that account for leap years.  So it depends on how exact you want to be.

 

Regular Contributor
Posts: 164

Re: Find number of repeat visits within a year

Hi,

 

there are several ways to perform.

I have 4 in mind:

- use the classic retain

- use proc SQL update

- use call execute

- use the new function dosubl for a more straight forward approach which blows my mind Smiley Happy

 

I've described the retain and dosubl in the code below:

data test;
informat ID $1. TIME_IN mmddyy10.;
input id time_in;
format TIME_IN mmddyy10.;
datalines;
A 5/8/2015 
A 10/25/2016 
B 6/3/2016 
B 7/6/2016 
B 8/22/2016 
B 10/31/2016 
B 1/9/2017 
B 2/24/2017 
B 4/7/2017 
C 7/7/2014 
C 7/29/2014 
C 8/24/2015 
C 10/28/2015 
C 1/4/2016 
;
run;
/* RETAIN approach */
data test1;
   set test;
   by id time_in;
   retain min max h_repeatVisits .;
   if first.id then do;
      min=intnx("day", time_in, 1, "sameday");
      max=intnx("year", time_in, 1, "sameday");
      h_repeatVisits=0;
   end;
   if min<time_in<=max then h_repeatVisits=h_repeatVisits+1;
run;
proc sort data=test1; by id descending time_in;run; 

data test2;
   set test1;
   by id descending time_in;
   retain repeatVisits .;
   if first.id then repeatVisits=h_repeatVisits;
   drop min max h_repeatVisits;
run;
proc sort data=test2; by id time_in; run;
/* DOSUBL approach */
data test3;
   set test;
   by id time_in;
   retain repeatVisits 0;
   if first.id then do;
      rc=dosubl('%LET n=0; 
                 PROC SQL noprint; 
                    SELECT count(*) INTO :n 
                    FROM test 
                    WHERE id EQ "'||strip(id)||'" 
                    AND '||put(time_in,best8.)||'<time_in<= '||put(intnx("year", time_in, 1, "sameday"),best8.)
             ||';QUIT;');
      repeatVisits=input(symget('n'),best8.);
   end;
   drop rc;
run;
________________________

- Cheers -

Contributor
Posts: 46

Re: Find number of repeat visits within a year

Thank you for the various methods and increasing my knowledge ... I really appreciate it !!

Super User
Posts: 13,939

Re: Find number of repeat visits within a year

Count the occurrence of year within each ID and subtract 1?

proc summary data=have nway;
   class id date;
   format date year4.;
   output out=yearcount (drop=_type_);
run;

will have a variable _freq_ with the count.

 

Assumes that your date is a SAS date value and NOT a character variable.

A data step could subtract the 1.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 183 views
  • 0 likes
  • 6 in conversation